Data Engineer, G42

Company G42
Job title Data Engineer
Job location Abu Dhabi, ARE
Type Full Time

Responsibilities:

  • Solve challenging problems using Python coding skills.
  • Design, build and launch new data extraction, transformation & loading processes in production.
  • Web crawling, data cleaning, data annotation, data ingestion and data processing.
  • Reading and collating complex data sets.
  • Creating and maintaining data pipelines.
  • Continual focus on process improvement to drive efficiency and productivity within the team.
  • Use Python, SQL, ES, Shell, etc. to build the infrastructure required for optimal extraction, transformation, and loading of data.
  • Provide insights into key business performance metrics by building analytical tools that utilize the data pipeline.
  • Support the wider business with their data needs on an ad hoc basis.
  • Comply with QHSE (Quality, Health, Safety and Environment), Business Continuity, Information Security, Privacy, Risk, Compliance Management and Governance of Organizations policies, procedures, plans and related risk assessments.

Requirements & Skills:

  • Bachelor’s degree in Computer Engineering, Computer Science, or Electrical Engineering and Computer Sciences.
  • 3+ years of programming experience, with solid coding skills in Python, Shell, and Java.
  • Good corporate capacity and good communication skills.
  • Experience with web crawling and data cleaning.
  • Experience with solution architecture, data ingestion, query optimization, data segregation, ETL, ELT, AWS, EC2, S3, SQS, Lambda, Elasticsearch, Redshift, and CI/CD frameworks and workflows.
  • Working knowledge of data platform concepts – data lake, data warehouse, ETL, big data processing (designing and supporting variety/velocity/volume), real-time processing architecture for data platforms, scheduling and monitoring of ETL/ELT jobs
  • Experience with PostgreSQL and programming (preferably Java or Python); proficiency in understanding data, entity relationships, structured & unstructured data, and SQL and NoSQL databases.
  • Knowledge of best practices in optimizing columnar and distributed data processing systems and infrastructure
  • Experienced in designing and implementing dimensional modeling
  • Knowledge of machine learning and data mining techniques in one or more areas of statistical modeling, text mining, and information retrieval.
  • In-depth market and domain knowledge
  • A passion for constant improvement
  • An innovative and creative approach to problem-solving
  • Excellent communication skills
