Design and implement the best data pipeline for our text-based products (ingestion, processing, serving):
Test and design state-of-the-art data ingestion pipelines
Implement efficient streaming services
Take part in the acquisition of new data sources
For each new data source, assess its feasibility and potential
Create and maintain data collection and centralization pipelines
Integrate data enrichment modules created by Data Scientists
Develop data-querying tooling for technical teams
Simplify the use of data-querying engines
Optimize architecture and data pipelines
Implement and maintain critical data systems
Process and integrate data in our systems
Ensure maintainability and efficiency
Requirements & Skills:
Degree from an engineering school or university with a specialization in IT, software engineering, or data science; other profiles are welcome to apply provided they have significant IT experience
At least 3 years of experience in data engineering, including the successful implementation of a cloud-based data-processing pipeline
Good understanding of different databases and data storage technologies
Very good knowledge of distributed computing systems, such as Spark
Good knowledge of cloud platforms, such as AWS, GCP, or Azure
Development: proficient and at ease with Python
Good communication skills, including the ability to explain technical topics to non-specialists: understand technical teams' needs and issues, and collaborate with several internal teams. Team player.
A plus: a strong interest in Data Science / Natural Language Processing.