Lead the development of end-to-end pipelines for training, evaluating and deploying ML models to solve a variety of problems related to the many steps that make up a data visualisation workflow (data sourcing, data preparation, chart generation, insight extraction, visual presentation, etc.)
Experiment with and evaluate the effectiveness of open-source and third-party models for Canva’s bespoke needs in the data visualisation space, and explore the parameters that define success for AI in this domain.
Work alongside data visualisation experts to identify and advise on opportunities to apply ML models to high-impact business problems.
Develop performant data pipelines that provide AI model training and evaluation workflows with access to data assets that are legally compliant and aligned with Canva’s strict policies around data usage for AI.
Work closely with product and design, as well as backend and frontend engineers, to ideate on and design product features that will enable millions of users to leverage AI in a way that meaningfully improves their ability to use data and data storytelling in their visual communication.
Requirements & Skills:
You bring 5+ years of industry experience building machine learning applications in a software engineering (or equivalent) role, using industry-standard technologies for production-focused machine learning (Python, SQL, PyTorch/TensorFlow, Kubernetes/Docker, etc.)
You have experience building and deploying machine learning models to distributed production environments, including a strong understanding of end-to-end machine learning pipelines and their components.
You have experience developing robust evaluation methodologies and workflows for complex or bespoke problem domains.
You possess strong research synthesis skills: the ability to dig through deep/machine learning literature and translate this into products and value for users.
You have industry experience working with and optimizing state-of-the-art computer vision and/or natural language processing models for use in settings with high-performance requirements and/or compute constraints.