ML/AI Engineer
HCL Singapore · Singapore
Job Objectives

Design and deliver scalable real-time data and machine learning solutions by building robust ingestion and transformation frameworks across Hadoop ecosystems. Enable end-to-end ML model operationalization and performance optimization, while supporting multi-modal data processing and the development of engineering tools and applications.

Key Responsibilities

Design and develop highly scalable, real-time systems using Hadoop ecosystem components (Iceberg, Spark, Ozone, Trino, Hive, Ranger, Kafka, Flink, and NiFi).
Build robust data ingestion and transformation frameworks using Java, Spark, Python, and shell scripting to ingest multi-modal data (image, audio, video, unstructured documents) in both batch and real-time modes.
Develop full-stack applications and internal engineering tools using Python, shell scripting, and modern web frameworks (e.g., Flask, React).
Collaborate closely with data scientists to operationalize machine learning models using Cloudera Machine Learning (CML).
Perform performance tuning and optimization of data applications on Hadoop to ensure optimal resource utilization.

Skillset

Experience working with ML platforms such as CML, Spark MLlib, and Python ML libraries (scikit-learn, XGBoost), including model deployment.
Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field.
Minimum of 6 years of professional experience.

Key Skills

Experience with Python, Java, Scala, or C++
ML frameworks and libraries: XGBoost, scikit-learn, TensorFlow/Keras, Hugging Face (NLP/NLQ/Gen AI use cases)
Full-stack development
Performance optimization
Data engineering and ingestion frameworks
Collaboration with data science teams