Senior Systems Engineer - L3 Operations (Data Analytics & AI) (Ref 26210a)

Jobline Resources · Singapore

Sector
AI
Function
Product & Engineering
Level
Mid-Level
Employment type
Contract
Posted
2026-05-13
Source
mycareersfuture

Responsibilities
• Monitor and maintain production data pipelines to ensure 99.9% uptime and optimal performance
• Implement comprehensive logging, alerting, and monitoring systems using application monitoring tools
• Perform regular health checks on pipeline performance, job execution times, and resource utilization to identify and resolve bottlenecks proactively
• Manage incident response procedures for pipeline failures, including root cause analysis, resolution, and post-incident reviews
• Establish and maintain disaster recovery procedures and backup strategies for critical data assets within the Databricks environment
• Conduct regular performance tuning of Spark jobs and Databricks cluster configurations to optimize cost and execution efficiency
• Maintain comprehensive documentation for operational procedures, runbooks, and troubleshooting guides
• Coordinate scheduled maintenance windows and system upgrades with minimal business impact
• Manage user access controls, workspace configurations, and security policies within application environments

Requirements
• Degree in Computer Science or Computer Engineering
• Minimum 5 years of working experience in system operations, compliance, and management areas
• Hands-on project experience with the AWS platform (primary requirement), cloud operations, or cloud architecture
• Must be AWS cloud certified
• Proficiency in the Databricks platform, including workspace management, cluster configuration, and job orchestration
• Strong expertise in Apache Spark within the Databricks environment, including Spark SQL, DataFrames, and RDDs
• In-depth understanding of data warehouse concepts, data profiling, data verification, and advanced analytics techniques
• Strong knowledge of monitoring, incident management, and cloud cost control
• Technology stack experience:
  • Databricks
  • AWS cloud services and architecture
  • IDMC (Informatica Data Management Cloud)
  • Tableau for data visualization
  • Oracle Database management
  • MLOps practices within the Databricks environment
  • STATA for statistical analysis (an advantage)
  • Amazon SageMaker integration with Databricks
  • DataRobot platform integration
• Good interpersonal skills with the ability to work with different groups of stakeholders
• Strong problem-solving skills and the ability to work independently in a fast-paced environment with minimal supervision
• Excellent communication skills for technical documentation and cross-team collaboration

Licence no: 12C6060

Apply on mycareersfuture →