Collaborating with cross-functional teams to understand requirements and specify technical solutions.
Developing algorithms and heuristics for extracting information from unstructured and semi-structured data sources, and for cleansing and normalizing data at scale.
Scaffolding new projects, testing new ideas, pairing with developers, and reviewing pull requests.
Producing clean, maintainable, and efficient code for deployment at scale in the Azure cloud.
Contributing to team stand-ups and software development lifecycle activities.
Participating in firmwide data science and machine learning forums and communities to share ideas and contribute to our body of knowledge.
Coaching and mentoring junior data scientists.
Requirements & Skills:
Strong Python development experience with modern ML/LLM frameworks.
Knowledge of ML models such as regression and boosting models.
Knowledge of deep learning, including CNNs or RNNs, preferably in the domains of CV and NLP.
Experience with RAG pipelines, including evaluation and optimisation with industry-standard packages.
Strong experience with SQL databases such as PostgreSQL (or equivalents); experience with vector databases is preferred.
Strong understanding of applying algorithms at scale, selecting statistical methods, and using the scientific method to derive robust conclusions.
Capability to identify emerging theory and apply it to practical situations.
Strong critical thinking skills, an analytical mindset and outstanding attention to detail.
Ability to work efficiently with remote teams using collaboration technology.
Ability to identify issues and solve complex problems as part of a team.
Good written and verbal communication skills.
A proactive approach to resolving problems.
Knowledge of proper source code management and the use of Git repositories.
A research background in ML/LLM model development.
Prior experience with microservices architectures and containerization, including good knowledge of Docker.
Strong prompt engineering skills.
Knowledge of agile software development lifecycles (SDLC) and experience working on agile projects.
Prior experience with message-queueing solutions (e.g. RabbitMQ, Kafka).
Prior experience with data pipelines at scale and building on top of them.
Prior experience with observability standards and frameworks such as OpenTelemetry.
Prior experience developing in cloud environments such as Azure.
TypeScript development experience with ReactJS and NodeJS.