Develops software that processes, stores, and serves data and machine learning models for use by others.
Develops large-scale data structures and pipelines to organize, collect, and standardize data that helps generate insights and intelligence to support business needs.
Writes ETL (Extract / Transform / Load) or ELT processes, designs data stores, and develops tools for real-time and offline analytic processing on-premises or on cloud infrastructure.
Develops and maintains optimal data pipelines into the ML and advanced analytics platform, including design of data flows, procedures, and schedules.
Ensures that data pipelines are scalable, repeatable, and secure.
Troubleshoots software and processes for data consistency and integrity. Integrates data from a variety of sources, ensuring that they adhere to data quality and accessibility standards.
Anticipates and prevents problems and roadblocks before they occur.
Interacts with internal and external peers and managers to exchange complex information related to areas of specialization.
Collaborates with AI/ML scientists and data scientists to prepare data for model development and to deploy models to production.
Requirements & Skills:
Bachelor’s degree and at least 4 years of experience in machine learning, software engineering, or data engineering
Deep knowledge of SQL
Significant experience programming in one or more of the following: Python, C, C++, Spark, Scala, and/or Java
Experience establishing and maintaining key relationships with internal stakeholders (peers, business partners, and leadership) and external stakeholders (business community, clients, and vendors) within a matrix organization to ensure quality standards for service.
Experience diagnosing, isolating, and resolving complex business issues, and recommending and implementing strategies to address them.
Experience presenting to all levels of an organization
At least 2 years of experience contributing to financial decisions in the workplace
At least 2 years of direct leadership, indirect leadership, and/or cross-functional team leadership
Willing to travel up to 10% of the time for business purposes (within state and out of state).
Graduate degree in a technical discipline and at least 2 years of experience in machine learning, software engineering, or data engineering.
Strong experience designing and implementing monitoring and alerting systems for cloud-based applications, including log management and analysis tools (e.g., ELK stack, Splunk).
Experience with cloud platforms (AWS, Azure, GCP) and their AI/ML services, as well as deploying ML models at scale in production using open-source tools (e.g., Kubeflow, Seldon).
Proficiency in CI/CD practices for AI/ML model development and deployment, with experience using tools such as Azure DevOps, Tekton, or GitHub Actions.
Experience with Infrastructure as Code (IaC) tools, particularly Terraform.
Familiarity with DAG-based workflow orchestration systems (e.g., Airflow, Prefect) and data processing pipelines using Apache Spark or Databricks.
Experience working with ML registries (e.g., MLflow) and deploying event-driven or reactive ML applications.
Strong background in deploying and maintaining ML systems for both batch and streaming data.
Expertise in troubleshooting distributed systems, optimizing performance, and reducing costs in cloud environments.
Proficient in writing and deploying production-grade Python applications and libraries.
Experience with REST API development and configuring Kubernetes in multi-tenant environments.