Lead Machine Learning Engineer, EPAM

Company: EPAM
Job title: Lead Machine Learning Engineer
Job location: Lithuania, Europe
Type: Full time

Responsibilities:

  • Transition machine learning algorithms to a production environment and integrate them with the enterprise ecosystem
  • Design, create, maintain, troubleshoot, and optimize the complete end-to-end machine learning life cycle, which includes:
    • machine-learning model optimization
    • data preparation
    • feature extraction
    • model performance monitoring
    • A/B, canary, blue-green, and similar testing strategies
    • integration with the enterprise ecosystem, IoT devices, and mobile devices
  • Write specifications, documentation, and user guides for developed solutions
  • Build frameworks for data scientists to accelerate the development of production-grade machine learning models
  • Collaborate with data scientists and the engineering team to optimize the performance of the ML pipeline
  • Help improve SDLC practices
  • Explore new tools and techniques and propose improvements
  • Establish and configure CI/CD/CT processes
  • Design and maintain continuous training for ML models
  • Provide capabilities for early detection of various types of drift (data, concept, schema, etc.)
  • Continuously identify technical risks and gaps and devise mitigation strategies
  • Identify and eliminate technical debt in machine learning systems

Requirements & Skills:

  • 5+ years of experience in enterprise software development
  • 3+ years of solid background in machine learning
  • Expertise in NLP/LLM, RecSys, Time Series
  • Experience with designing, building, and deploying production applications and data pipelines
  • Experience in the development of highly available, highly scalable, ML-driven applications and systems
  • Able to work closely with customers and other stakeholders
  • Strong knowledge and experience in Python development
  • Practical experience with one or more Cloud-native services (GCP, AWS, Azure) and Apache Spark Ecosystem (Spark SQL, MLlib/Spark ML)
  • Deep understanding of the Python ML ecosystem (PyTorch, TensorFlow, NumPy, pandas, scikit-learn, XGBoost)
  • Hands-on experience in the implementation of Data Products
  • Deep understanding of data preparation and feature engineering
  • Deep hands-on experience with the implementation of SDLC best practices in complex IT projects
  • Experience with automated data pipeline and workflow management tools (Airflow)
  • Knowledge and experience in computer science disciplines such as data structures, algorithms, and software design patterns
  • Hands-on experience in different data processing paradigms (batch, micro-batch, streaming)
  • Deep understanding of MLOps concepts and best practices
  • Experience with one or more MLOps-related platforms/technologies such as AWS SageMaker, Azure ML, GCP Vertex AI / AI Platform, Databricks MLflow, Kubeflow, Airflow, Argo Workflows, TensorFlow Extended (TFX), etc.
  • Production experience in integrating ML models into complex data-driven systems/IoT devices/Mobile devices
  • Experience with basic software engineering tools (CI/CD environments such as Jenkins or Buildkite, PyPI, Docker, Kubernetes)
  • Experience with at least one infrastructure-as-code (IaC) framework (Terraform / CDK for Terraform, Ansible, AWS CloudFormation / AWS CDK)
