
Company | Trailmix Games
Job title | Senior DevOps Engineer
Job location | London, UK
Type | Full Time
Responsibilities:
- Implement and maintain CI/CD pipelines for the data team using Cloud Build, Terraform, and Docker.
- Manage cloud infrastructure (GCP, AWS, Azure) to support data transformation, orchestration, and analytics workflows.
- Monitor and optimise cloud resource utilisation and costs, ensuring efficient performance across AWS S3, EC2, CloudFront, Azure Functions, and PlayFab.
- Ensure security and compliance by managing user permissions, access controls, and security policies across cloud platforms.
- Support the self-hosted platform, including security enhancements and penetration tests to identify vulnerabilities.
- Support and improve data pipelines by integrating PlayFab, Azure, and S3 for real-time and batch processing.
- Develop and maintain Grafana, Kibana, and Tableau dashboards, consolidating data sources for real-time monitoring and alerting.
- Monitor infrastructure health and automate alerts to detect and resolve issues before they impact operations.
- Standardise platform methodologies across AWS, Azure, and PlayFab, ensuring best practices in cloud infrastructure.
Requirements & Skills:
- Significant demonstrable experience in a similar DevOps, Cloud Engineering, or SRE role.
- Advanced expertise in CI/CD tools and automation (Cloud Build, Docker, GCloud CLI, Bash) to streamline and support the development process.
- Experience managing Kubernetes (GKE) deployments, Helm, and cloud data infrastructure at scale.
- Experience with Infrastructure as Code (IaC) tools, such as Terraform.
- Hands-on experience with AWS and Azure.
- Experience in cloud infrastructure management and data workflow orchestration (Kubernetes, Helm, Airflow, Dataproc).
- Strong knowledge of monitoring and security tools (Grafana, Kibana).
- Proficiency in infrastructure cost monitoring and optimisation across cloud services.
- Strong problem-solving skills and the ability to collaborate across teams to improve system reliability and performance.
- Nice to have: familiarity with PlayFab, Metaplay, and cloud-based data pipelines.
