Machine Learning Engineer, Procore

Company: Procore
Job title: Machine Learning Engineer
Job location: Austin, TX, United States
Type: Full-time

Responsibilities:

  • Maintain deployment pipelines for ML models.
  • Review code changes and pull requests from the data science team.
  • Trigger CI/CD pipelines after code approval.
  • Monitor pipelines, ensuring all tests pass and model artifacts are generated and stored correctly.
  • Deploy updated models to production after pipeline completion.
  • Work closely with the software engineering and DevOps teams to ensure smooth integration.
  • Containerize models using Docker and deploy them on cloud platforms (AWS, GCP, or Azure).
  • Set up monitoring tools to track metrics such as response time, error rate, and resource utilization.
  • Establish alerts and notifications to quickly detect anomalies or deviations from expected behavior.
  • Collaborate with the data science team to develop updated pipelines that cover any faults; analyze monitoring data, log files, and system metrics.
  • Document and troubleshoot changes and optimizations.
  • Work alongside our Product, UX, and Prototype Engineering teams, leveraging your experience and expertise in the AI space to influence our product roadmap and develop innovative solutions that add capabilities to our product suite.
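The monitoring and alerting duties above can be sketched as a small health check over collected metrics. This is an illustrative sketch only: the thresholds, the `check_health` function, and its signature are assumptions for the example, not Procore's actual tooling, where real limits would come from service-level objectives.

```python
import statistics

# Hypothetical thresholds, chosen for illustration only.
P95_LATENCY_MS_LIMIT = 500.0
ERROR_RATE_LIMIT = 0.05


def check_health(latencies_ms, error_count, request_count):
    """Return a list of alert messages for metrics outside expected bounds."""
    alerts = []
    # statistics.quantiles with n=20 returns 19 cut points; index 18 is p95.
    p95 = statistics.quantiles(latencies_ms, n=20)[18]
    if p95 > P95_LATENCY_MS_LIMIT:
        alerts.append(f"p95 latency {p95:.0f}ms exceeds {P95_LATENCY_MS_LIMIT:.0f}ms")
    error_rate = error_count / request_count
    if error_rate > ERROR_RATE_LIMIT:
        alerts.append(f"error rate {error_rate:.1%} exceeds {ERROR_RATE_LIMIT:.0%}")
    return alerts
```

In practice this kind of check would run inside a monitoring system (e.g., as an alert rule) rather than application code, but the logic of comparing observed metrics against thresholds is the same.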

Requirements & Skills:

  • 2+ years of experience in a Data/ML Engineer role, with a degree in Computer Science, Information Systems, or another quantitative field, or equivalent relevant experience.
  • Proficiency in programming languages such as Python, Java, and C++.
  • Experience building data pipelines (real-time and batch) on large, complex datasets using Spark or Flink.
  • Experience with AWS services, including EC2, S3, Glue, EMR, and RDS, as well as Snowflake, Elasticsearch, Cassandra, and data pipeline/streaming tools (Airflow, NiFi, Kafka).
  • Hands-on experience developing systems for the machine learning lifecycle: data preprocessing and feature extraction, model training and evaluation, and deployment and monitoring. Familiarity with the associated open-source ecosystem (e.g., TensorFlow, PyTorch, MLflow, Ray, Kubeflow, TFX) is a plus.
  • Hands-on experience developing large-scale distributed, fault-tolerant, and scalable data processing systems capable of processing terabytes of structured and unstructured data, whether in batch with Spark or streaming with Flink or Kafka Streams.
  • Experience working with data scientists: you can speak knowledgeably about the major machine learning paradigms, algorithms, and software tools, and can translate data science problem statements into corresponding data, infrastructure, or workflow needs.
  • Experience working with relational and non-relational databases, data warehousing, and data streaming frameworks (think Apache Kafka/Spark/SQL).
  • Familiarity with the Python ML ecosystem (PySpark; Python tooling: setuptools, pytest and pytest mocking for unit testing, mypy, pylint, and SonarQube for code quality) and at least one high-concurrency language such as Java, Elixir, Python, or Golang.
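The testing tooling named above (pytest with mocking) can be illustrated with a minimal sketch. The `predict_with_fallback` function and its client interface are hypothetical, invented purely for the example; the mocking pattern uses the standard-library `unittest.mock`, which pytest tests commonly rely on.

```python
from unittest.mock import MagicMock


def predict_with_fallback(model_client, features, default=0.0):
    """Call a model endpoint; fall back to a default value on connection failure."""
    try:
        return model_client.predict(features)
    except ConnectionError:
        return default


def test_prediction_success():
    # Mock the remote client so the test needs no live endpoint.
    client = MagicMock()
    client.predict.return_value = 0.87
    assert predict_with_fallback(client, [1.0, 2.0]) == 0.87
    client.predict.assert_called_once_with([1.0, 2.0])


def test_prediction_fallback():
    # side_effect raises the exception when the mock is called.
    client = MagicMock()
    client.predict.side_effect = ConnectionError("endpoint down")
    assert predict_with_fallback(client, [1.0, 2.0], default=-1.0) == -1.0
```

Run with `pytest`; mocking the client keeps the unit tests fast and independent of deployed infrastructure.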
