Data Engineer (AI Enablement)
Manpower Staffing Services Singapore · Singapore
Key Responsibilities
- Build and maintain scalable data pipelines using Python
- Write production-grade Python code specifically for data processing, transformation, and ETL workflows
- Perform data cleaning, preprocessing, and feature preparation for analytics and AI use cases
- Use data analysis and manipulation tools to handle large datasets efficiently
- Develop reusable Python modules for data ingestion and pipeline automation
- Perform exploratory data analysis (EDA) to understand data patterns and quality issues
- Optimize data workflows for performance, scalability, and reliability
- Support data requirements for AI/ML and Generative AI systems
- Build data services and APIs to support downstream AI applications
- Ensure data quality, consistency, and observability across pipelines

Must have skills:

Python & Data Libraries (Hands-on Experience Mandatory)
Candidates must have solid practical experience with:
- Pandas: data manipulation, transformation, and analysis
- NumPy: numerical operations and array-based processing
- Matplotlib: data visualization and reporting
- scikit-learn: basic ML workflows and model evaluation
- PyTorch: deep learning and AI model experimentation

AI / Generative AI Enablement
- Prepare and structure datasets for ML and LLM-based systems
- Support integration of AI models into data pipelines and applications
- Enable workflows for Generative AI use cases (RAG systems, agent workflows)
- Work with multiple AI model providers:
  - OpenAI
  - Anthropic
  - LLaMA
  - Mistral
- Exposure to AI orchestration frameworks such as LangChain, AutoGen, and CrewAI

Core Requirements
- Solid hands-on Python coding expertise focused on data systems (critical requirement)
- Ability to write clean, efficient, production-grade Python code
- Thorough understanding of data structures, ETL pipelines, and data workflows
- Experience working with large-scale structured and unstructured data
- Solid SQL skills for data extraction and manipulation
- Understanding of data modeling and analytics workflows
- Ability to support end-to-end data-to-AI pipelines

Good to Have
- Experience with big data or distributed processing systems
- Understanding of vector databases and embedding-based retrieval systems
- Experience building APIs or services for data/AI systems
- Familiarity with cloud platforms (AWS, Azure, GCP)
- Exposure to production monitoring and data observability tools