Machine Learning Infrastructure Engineer
💰 Salary: Up to £45k
📍 Location: Central London – Hybrid/Onsite flexibility
🚀 Support high-performance AI and ML environments as part of a collaborative, forward-thinking tech team
💡 Your Role:
We’re hiring a Machine Learning Infrastructure Engineer to support and optimise mission-critical HPC environments used in AI and machine learning. You’ll be part of a dynamic service desk and infrastructure team, working directly with users, solving technical challenges, and improving infrastructure reliability across complex environments.
💡 What You’ll Be Doing:
- Administering and supporting HPC systems used in AI/ML workflows
- Troubleshooting scheduled job queues and task automation issues
- Managing incoming service requests and incidents across end-user and infrastructure layers
- Leading user onboarding/offboarding and internal IT provisioning
- Supporting client-facing projects and infrastructure upgrades, and recommending improvements
- Maintaining and improving internal tools like Microsoft 365, Teams, and Confluence
- Collaborating on hardware procurement and IT service improvements
(And if you don’t tick every box – that’s okay. We’d still love to hear from you.)
💡 The Stack & Environment:
A diverse, modern environment spanning:
- Linux, Windows, macOS, Microsoft 365, Azure AD, Intune, Teams, NICE DCV, NVIDIA CUDA, Slurm, Jira Service Desk, Terraform, Azure Resource Manager
💡 What We’re Looking For:
- 2+ years of experience administering HPC infrastructure
- Hands-on experience with InfiniBand, Slurm, and GPU compute platforms (e.g. CUDA)
- Proficiency in systems administration and troubleshooting
- Strong documentation habits and a customer-focused mindset
- Experience with VDI solutions and monitoring tools
💡 Bonus Points:
- Familiarity with Jira Service Desk and Terraform scripting
- Exposure to SSL management, infrastructure-as-code, or cloud database platforms
- Comfort with a wide range of operating systems and tools, from Azure to SiteGround
💡 How You Work:
- Clear communicator with a strong sense of ownership
- Comfortable operating independently and solving problems proactively
- Enjoys creating simplicity from complexity
- Always looking to learn, improve, and collaborate
💡 Why This Role?
- Work on meaningful AI/ML infrastructure challenges
- Join a collaborative, high-performing support and engineering team
- Enjoy autonomy and ownership while contributing to visible improvements
- Flexible hybrid working in Central London
📩 Interested? Let’s talk – email Thana@engagewithus.com
Contact Details:
The Engage Partnership Recruitment Recruiting Team