Design, deploy, and manage virtual machine environments using platforms like VMware, Hyper-V, Proxmox, or cloud services (AWS, Azure, GCP).
Optimise VMs for performance and cost-efficiency, ensuring high availability and reliability.
CPU/GPU Configuration:
Configure and optimise CPU/GPU resources for AI inferencing tasks.
Monitor GPU utilisation and performance metrics to ensure efficient processing.
Oversee the implementation of platform infrastructure projects, including cloud migrations, network upgrades, server deployments, and storage solutions.
Manage the day-to-day operations of the platform infrastructure, including monitoring, troubleshooting, and maintenance activities to ensure high availability and performance.
Implement and maintain robust security measures for the platform infrastructure, including access controls, encryption, and vulnerability management, to protect against cyber threats and ensure compliance with industry standards and regulations.
Utilise cloud platforms for scalable infrastructure solutions, implementing containerisation and orchestration technologies like Docker and Kubernetes.
Ensure security and compliance within cloud environments, and leverage tools like Azure Machine Learning and NVIDIA’s RAPIDS for AI workloads.
Requirements & Skills:
Bachelor’s degree in Computer Science, Engineering, or a related field.
3+ years of experience in infrastructure engineering or related roles.
Strong knowledge of virtualisation technologies (VMware, Hyper-V, Proxmox).
Experience with CPU/GPU configuration and optimisation.
Proficiency in cloud platforms (AWS, Azure, GCP).
Familiarity with AI frameworks and inferencing systems (TensorFlow, PyTorch) is a plus.
Experience with containerisation and orchestration tools (Docker, Kubernetes).