Data Scientist, CPChem

Company: CPChem
Job title: Data Scientist – Causal Discovery
Job location: Woodlands, TX, US
Type: Full Time

Responsibilities:

  • Perform exploratory data analysis (EDA) on raw datasets and develop visualizations that present the key findings; communicate with stakeholders to ensure they understand potential opportunities, form testable hypotheses, and obtain regular feedback
  • Identify potential machine learning (ML) or Generative AI opportunities that drive business value and work them through our intake process so they are formally placed on our backlog
  • Apply knowledge of statistics, machine learning, programming, and data modeling to recognize patterns, identify opportunities, test business hypotheses, and make valuable discoveries leading to operational savings or growth
  • Wrangle data to cleanse and prepare it for ML model development, including the engineering of features to improve model performance
  • Apply causal discovery and causal inference techniques to understand the relationships between variables and identify potential causal effects in datasets at scale
  • Manage the ML model lifecycle within our MLOps framework for all models developed
  • Collaborate with cross-functional and cross-organizational teams to integrate causal models into business processes and decision-making

Requirements & Skills:

  • Master’s degree in a quantitative field such as engineering, computer science, mathematics, statistics, or data science. Significant equivalent work experience in causal inference may be considered as a substitute
  • Strong understanding of and experience with causal discovery and causal inference methods
  • Experience with programming languages used in data science, such as Python, R, C, C++, and SQL
  • Experience with Big Data platforms, such as Spark, would be advantageous. Databricks experience is a bonus
  • Proven ability to apply advanced analytics techniques to real-world problems
  • Strong written and verbal communication skills
  • Ability to be effective in a team environment
  • Curiosity and enthusiasm to learn new domains
  • Experience with managing ML models through an MLOps lifecycle, from experimentation through production deployment to model version updates
  • Experience with distributed cloud providers such as Microsoft Azure, AWS, or GCP

Preferred Qualifications:

  • PhD in a quantitative field such as engineering, computer science, mathematics, statistics, or data science; especially if the area of research is related to causal inference
  • Experience applying causal inference techniques in an oil, gas, and/or chemical manufacturing context
  • Experience or familiarity with working in an Agile team
  • Direct experience using MLflow as an MLOps platform
  • Experience with Microsoft Azure
