Engineering Manager - DevOps, General Motors

Company	General Motors
Job title	Engineering Manager, DevOps
Job location	Austin, Texas; Mountain View, California; Warren, Michigan
Type	Full Time

Responsibilities:

Team Leadership & Strategy
- Lead, mentor, and develop a high-performing team focused on DevOps, observability, and system reliability.
- Establish team priorities that align with organizational objectives, ensuring a scalable and efficient infrastructure.
- Set and evolve the technical vision and roadmap, integrating best practices in monitoring, logging, and alerting.

Technical Execution & Oversight
- Design, implement, and optimize observability solutions, including metrics, tracing, logging, and alerting, to enhance system reliability.
- Define and maintain scalable monitoring systems that proactively detect and prevent system failures.
- Oversee CI/CD pipelines, ensuring smooth deployments and minimizing downtime.
- Review system architecture and development code, ensuring efficiency, testability, and adherence to best practices.

Collaboration & Process Improvement
- Work closely with Product Managers, Engineers, and cross-functional teams to ensure seamless integration of observability practices.
- Define incident response strategies, improving recovery time and overall platform stability.
- Identify and integrate new technologies to enhance observability, performance monitoring, and automation.

Qualifications:

- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 5+ years of experience in DevOps or SRE roles.
- 2+ years of experience in engineering management.
- Expertise in observability, monitoring, and logging solutions (e.g., Prometheus, Grafana, Datadog, OpenTelemetry).
- Strong knowledge of cloud platforms and tools.
- Experience with CI/CD automation and configuration management tools.
- Proficiency in containerization and orchestration (Docker, Kubernetes).
- Strong understanding of system reliability, incident response, and performance optimization.
- Experience implementing scalable monitoring, alerting, and logging strategies to ensure system health and reliability.
- Excellent leadership, communication, and stakeholder management skills, with the ability to collaborate across teams.