Augment and maintain the existing repositories and data structures within AWS (used to process and store large amounts of data from unrelated sources)
Experience ingesting data in multiple formats and by multiple means, including data types (structured, semi-structured, and unstructured) and sources (on-premises and in the cloud), using the most appropriate technique in each case
Continue to expand and enhance the data model, following best practices for organizing data and its relationships
Optimize existing and future models for fast, scalable queries (while staying within performance and cost thresholds)
Work with the team to define, construct, and maintain self-service dashboards in Power BI for the Business and Advanced Analytics teams
Implement scalable, flexible, high-performance data pipelines on AWS to support analytics
Develop and maintain data maps and their relationships
Generate associated technical documentation, including follow-up reports
Work with Data Governance to implement quality rules and data governance measures (data dictionary, metadata, traceability, …)
Propose improvements and actions based on the provided results
Communicate results effectively to the relevant teams
Requirements & Skills:
Bachelor’s degree with 6+ years of experience
Advanced knowledge and experience using Python, Airflow, Spark, AWS, and Snowflake