DevOps Engineer, AI21 Labs

Company: AI21 Labs
Job title: DevOps Engineer
Job location: Tel Aviv-Yafo, Israel
Type: Full Time

Responsibilities:

  • Cloud & Kubernetes Expertise: Design and implement highly scalable multi-cluster Kubernetes environments across GCP & AWS.
  • Developer Experience & Enablement: Lead the development of self-service tools and automation that improve efficiency for R&D teams.
  • Incident & Reliability Engineering: Work with engineering teams to optimize cost, performance, and reliability of production infrastructure through monitoring, capacity planning, and scaling strategies.
  • Security & Governance: Contribute to best practices for RBAC, IAM, cloud security, and compliance while ensuring infrastructure reliability.
  • Automation & Infrastructure as Code: Drive adoption of GitOps workflows and Infrastructure as Code (Terraform, Helm, Crossplane) to enhance automation and consistency.
  • Mentorship & Team Growth: Provide technical mentorship within the platform engineering team and contribute to knowledge-sharing across R&D.
  • Cross-Team Collaboration: Work closely with engineering teams to align cloud infrastructure goals with business needs and reliability requirements.

Requirements & Skills:

  • 5+ years of DevOps or SRE experience
  • 3+ years working with public cloud platforms (AWS, GCP) at scale
  • Deep Kubernetes expertise, including managing large-scale, multi-cluster enterprise-grade Kubernetes environments
  • Experience designing and managing Custom Resource Definitions (CRDs) and custom controllers
  • Strong background in Infrastructure as Code (Terraform, Helm) and GitOps principles and tooling (ArgoCD, Crossplane, FluxCD, etc.)
  • Hands-on experience in observability & monitoring (Prometheus, Grafana, Datadog, OpenTelemetry, etc.)
  • Proficiency in scripting (Python, Go, Bash) for infrastructure automation
  • Expertise in cloud networking (VPC, load balancers, service meshes) and security best practices (RBAC, IAM, security groups, network policies, etc.)
  • Experience with CI/CD pipelines, optimizing for performance, security, and developer velocity

Nice-to-Have:

  • Experience with self-hosted on-prem deployments and managed private VPC deployments (Bring Your Own Cloud models)
  • Advanced expertise in Helm and Crossplane for Kubernetes resource management
  • Other cloud provider experience
  • Experience in GenAI or large-scale SaaS platforms
  • Familiarity with SQL/NoSQL databases and distributed systems
  • DevSecOps experience, with a strong understanding of security automation and compliance frameworks
