
Company | Revolut People
Job title | Machine Learning Operations Engineer
Job location | Saudi Arabia, UAE
Type | Full Time
Responsibilities:
- Design, build, and maintain end-to-end ML pipelines, including data processing, model training, evaluation, and deployment.
- Automate model deployment and lifecycle management across cloud and, potentially, on-prem environments.
- Establish CI/CD workflows for ML models, ensuring reproducibility and traceability.
- Implement monitoring, logging, and alerting for model performance and drift detection.
- Optimize ML training and inference workloads for cost and performance efficiency.
- Collaborate with DevOps and engineering teams to integrate ML workloads with broader infrastructure.
- Define and implement MLOps best practices, including experiment tracking, versioning, and governance.
- Evaluate and recommend tools and frameworks for MLOps, considering both cloud and on-prem scenarios.
Requirements & Skills:
- 2-7+ years of experience in MLOps, DevOps, or related fields with a strong AI/ML focus.
- Hands-on experience with cloud platforms (GCP preferred) and container orchestration (Kubernetes, Docker).
- Proficiency in AI/ML pipeline frameworks (Kubeflow, MLflow, TFX, or similar).
- Strong knowledge of CI/CD tools (GitHub Actions, ArgoCD, or similar) for ML models.
- Experience with monitoring AI/ML models in production.
- Strong programming skills in Python, Bash, or Go.
- Familiarity with model-serving frameworks (TF Serving, Triton, BentoML) and distributed computing frameworks (Ray, Spark).
- Experience optimizing AI/ML workloads for GPUs and CPUs.
- Excellent problem-solving skills and ability to work in a fast-paced, evolving environment.
