Develops large-scale data structures and pipelines to collect, organize, and standardize data that helps generate insights and addresses analytical needs.
Collaborates with product, business, and data science teams to collect user stories, translate them into technical specifications, and implement data transformations, algorithms, and models as automated processes.
Builds highly scalable and extensible data marts and data models in the cloud to support Data Science and other internal customers. Integrates data from a variety of sources, ensuring that it adheres to data quality and accessibility standards.
Provides guidance to a team of data engineers and collaborates closely with data scientists and analysts to support data-driven decision-making.
Analyzes current information technology environments to identify and assess critical capabilities and recommend solutions.
Requirements & Skills:
3+ years of experience with SQL and NoSQL databases
3+ years of experience with Python (or a comparable scripting language)
3+ years of experience with data warehouses (including data modeling and technical architecture) and infrastructure components
3+ years of experience with ETL/ELT and building high-volume data pipelines (see the batch pipeline sketch after this list)
3+ years of experience with reporting/analytic tools
3+ years of experience with query optimization, data structures, transformation, metadata, dependency, and workload management
3+ years of experience with big data and cloud architecture
3+ years of hands-on experience building modern data pipelines within a major cloud platform (GCP, AWS, Azure)
3+ years of experience with deployment and scaling of applications in containerized environments (e.g., Kubernetes, AKS)
3+ years of experience with real-time and streaming technologies (e.g., Azure Event Hubs, Azure Functions, Kafka, Spark Streaming; see the streaming consumer sketch after this list)
1+ year(s) of experience soliciting complex requirements and managing relationships with key stakeholders
1+ year(s) of experience independently managing deliverables
Experience in designing and building data engineering solutions in cloud environments (preferably GCP)
Experience with Git, CI/CD pipelines, and other DevOps principles/best practices
Experience with Bash shell scripts, UNIX utilities, and UNIX commands
Understanding of software development methodologies, including waterfall and agile
Ability to leverage multiple tools and programming languages to analyze and manipulate data sets from disparate data sources
Knowledge of API development
Experience with complex systems and solving challenging analytical problems
Strong collaboration and communication skills within and across teams
Knowledge of data visualization and reporting
Experience with schema design and dimensional data modeling (see the star-schema sketch after this list)
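
To make the ETL/ELT item concrete, here is a minimal batch pipeline sketch in Python. It is an illustration only, not a prescribed stack: the source file events.csv, its columns (user_id, event, amount), and the SQLite target warehouse.db are hypothetical stand-ins for a real extract, a real schema, and a real warehouse.

    import csv
    import sqlite3

    def extract(path):
        # Extract: stream raw rows from a hypothetical CSV export.
        with open(path, newline="") as f:
            yield from csv.DictReader(f)

    def transform(rows):
        # Transform: standardize values and drop rows failing a basic quality check.
        for row in rows:
            if not row.get("user_id"):
                continue  # data-quality gate: require a user id
            yield (row["user_id"], row["event"].strip().lower(), float(row["amount"]))

    def load(records, conn):
        # Load: write standardized records into a staging table.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS stg_events (user_id TEXT, event TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO stg_events VALUES (?, ?, ?)", records)

    if __name__ == "__main__":
        # The connection context manager commits on success and rolls back on error.
        with sqlite3.connect("warehouse.db") as conn:
            load(transform(extract("events.csv")), conn)

Generators keep memory usage flat on high-volume inputs; in production the same shape maps onto a cloud warehouse loader or an orchestrated DAG task.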
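For the streaming item, a minimal consumer sketch using the kafka-python package; the topic page_views, the broker address, and the JSON message shape are assumptions made for illustration.

    import json

    from kafka import KafkaConsumer  # assumes the kafka-python package is installed

    # Subscribe to a hypothetical topic; the broker address is an assumption.
    consumer = KafkaConsumer(
        "page_views",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    for message in consumer:
        event = message.value
        # In a real pipeline this would feed an aggregation, sink, or alerting step.
        print(event.get("user_id"), event.get("url"))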
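And for the dimensional modeling item, a minimal star-schema sketch (one dimension, one fact table) expressed as SQLite DDL executed from Python; the table and column names are illustrative, not a prescribed model.

    import sqlite3

    DDL = """
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key INTEGER PRIMARY KEY,  -- surrogate key
        customer_id  TEXT NOT NULL,        -- natural/business key
        segment      TEXT
    );

    CREATE TABLE IF NOT EXISTS fact_orders (
        order_key    INTEGER PRIMARY KEY,
        customer_key INTEGER NOT NULL REFERENCES dim_customer (customer_key),
        order_date   TEXT NOT NULL,        -- often a key into a date dimension instead
        amount       REAL NOT NULL
    );
    """

    with sqlite3.connect("warehouse.db") as conn:
        conn.executescript(DDL)

Facts stay numeric and additive while dimensions carry descriptive attributes; surrogate keys insulate the model from changes in source business keys.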