Scale data pipelines to allow our data to go from research to platform quickly and reliably
Manage sources that contain both semi-structured and unstructured biological data that contribute to the evolution of BenchSci’s Knowledge Graph
Integrate public life science data into the biological ontology in our Knowledge Graph
Collaborate with the ML, Data Engineering, and Science teams to solve complex data mining and extraction challenges, enabling us to capture and model scientific experiments and results
Seek out leadership opportunities and act as a Technical DRI (directly responsible individual) on multiple projects/epics
Design testable, scalable solutions to complex problems using the latest frameworks and tools
Use your experience and knowledge to help define and apply best practices for a broad platform of technologies in a cloud-based environment
Write and review engineering design proposals in accordance with BenchSci’s engineering best practices
Contribute to your team’s processes including sprint planning, task estimation, and code review
Work both independently and in pair-programming settings within an agile team of talented engineers to solve interesting data problems
Be given an unmatched opportunity to grow and to learn from a team of outstanding engineers
Liaise closely with stakeholders from other functions including product and science
Feel challenged and engaged as you’re exposed to new opportunities that require you to push yourself
Requirements & Skills:
Degree in Software Engineering, Computer Science, or a similar area
5+ years of industry experience working as a professional software or data engineer
Expertise in Python, including its type system
Expertise in writing SQL (GQL, PostgreSQL, and BigQuery are a plus)
Experience with building both batch and streaming ETL pipelines using data processing engines
Deep understanding of building Knowledge Graphs entailing biological ontologies, and leveraging graph DBs for their storage
Experience with cloud development (we use GCP and Terraform) including reference architectures and developing specialized stacks on cloud services
A strong background in data modeling, data structures, and large-scale data manipulation/transformations
A can-do, proactive, and assertive attitude: your manager believes in freedom and responsibility and in helping you own what you do; you will excel if this environment suits you
You have experience working in cross-functional teams with product managers, scientists, project managers, and engineers from other disciplines (e.g., machine learning)
Outstanding verbal and written communication skills
Can clearly explain complex technical concepts/systems to engineering peers and non-engineering stakeholders across teams
Experience executing as part of high-performance engineering teams using industry-standard software delivery practices
Proficient with agile processes (sprint planning, estimation, retros, standups, etc.)
Ideally, you have worked in the scientific/biological domain with scientists on your team
A growth mindset, continuously seeking to stay up to date with cutting-edge advances in tech and software/data engineering