At a Glance
- Tasks: Lead the design of scalable multi-GPU infrastructure and drive performance at scale.
- Company: Join Humanoid, the UK's pioneering AI and robotics company.
- Benefits: Competitive salary, stock options, paid vacation, and office perks.
- Other info: Startup culture with opportunities for travel and collaboration with top experts.
- Why this job: Be at the forefront of robotics and AI, making a real-world impact.
- Qualifications: 7+ years in DevOps/MLOps with expertise in multi-GPU systems.
The predicted salary is between £80,000 and £100,000 per year.
Humanoid is the first AI and robotics company in the UK, creating the world’s most advanced, reliable, commercially scalable, and safe humanoid robots. Our first humanoid robot, HMND 01, is a next-gen labour automation unit that provides highly efficient services across various use cases, starting with industrial applications.
Our Mission
At Humanoid, we strive to create the world’s leading, commercially scalable, safe, and advanced humanoid robots that seamlessly integrate into daily life and amplify human capacity. We are building large-scale compute infrastructure to train next-generation robotics models, including transformer-based systems such as vision-language-action (VLA) models.
What You’ll Do:
- Lead the design and evolution of scalable multi-GPU infrastructure across cloud environments (AWS, GCP, etc.)
- Own architecture and long-term technical direction of model training platforms
- Drive reliability, performance, and cost-efficiency at scale
- Define and implement best practices for infrastructure, DevOps, and MLOps across the organization
- Build and evolve infrastructure-as-code and automation for provisioning, orchestration, and lifecycle management
- Architect and improve CI/CD systems for both infrastructure and ML training workflows
- Optimize distributed training workloads (scheduling, resource utilization, observability)
- Partner with ML engineers and researchers to enable efficient experimentation and productionization
- Lead troubleshooting and resolution of complex system issues across distributed, GPU-heavy environments
- Mentor engineers and raise the bar for engineering quality and operational excellence
- Document architecture, systems, and key technical decisions
We’re Looking For:
- 7+ years of experience in DevOps, MLOps, or infrastructure engineering (Staff level)
- Proven experience designing and operating multi-GPU / distributed compute infrastructure
- Experience with GPU scheduling/orchestration (e.g., Kubernetes schedulers, Volcano, Ray)
- Strong experience with Kubernetes and containerized workloads at scale
- Deep expertise in Infrastructure-as-Code (Terraform, Helm, or similar)
- Deep familiarity with at least one major cloud provider (AWS preferred)
- Strong experience building and scaling CI/CD systems (e.g., GitHub Actions, GitLab CI, ArgoCD)
- Proficiency in Python for automation and tooling
- Strong understanding of distributed systems, networking, and system reliability
- Demonstrated ability to lead large technical initiatives and influence system design
- Experience supporting ML workloads or training pipelines (PyTorch, TensorFlow, etc.)
Nice to have:
- Experience with multi-cloud or hybrid cloud environments
- Background in performance optimization for large-scale training workloads
- Experience in robotics, simulation, or embodied AI systems
What we offer:
- Competitive salary plus participation in our Stock Option Plan
- Paid vacation with adjustments based on your location to comply with local labour laws
- Travel opportunities to our Vancouver and Boston offices
- Office perks: free breakfasts, lunches, snacks, and regular team events
- Freedom to influence the product and own key initiatives
- Collaboration with top‑tier engineers, researchers, and product experts in AI and robotics
- Startup culture prioritising speed, transparency, and minimal bureaucracy
Staff DevOps Engineer employer: Humanoid
Contact Detail:
Humanoid Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Staff DevOps Engineer role
✨Network Like a Pro
Get out there and connect with folks in the industry! Attend meetups, webinars, or even just grab a coffee with someone who works at Humanoid. Building relationships can open doors that a CV just can't.
✨Show Off Your Skills
Don’t just talk about your experience; demonstrate it! Create a portfolio showcasing your projects, especially those involving multi-GPU infrastructure or CI/CD systems. This will give us a clear picture of what you can bring to the table.
✨Ace the Interview
Prepare for technical interviews by brushing up on your knowledge of Kubernetes, distributed systems, and MLOps. Practice common interview questions and be ready to discuss how you've tackled complex system issues in the past.
✨Apply Through Our Website
Make sure to apply directly through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are genuinely interested in joining our mission at Humanoid.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Staff DevOps Engineer role. Highlight your experience with multi-GPU infrastructure and any relevant cloud platforms like AWS or GCP.
Craft a Compelling Cover Letter: Use your cover letter to tell us why you're passionate about AI and robotics. Share specific examples of how you've led technical initiatives or improved system reliability in previous roles.
Showcase Your Technical Skills: Don’t just list your skills; demonstrate them! Include projects or achievements that showcase your expertise in Kubernetes, Infrastructure-as-Code, and CI/CD systems. We love seeing real-world applications of your knowledge.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows us you’re keen on joining our team!
How to prepare for a job interview at Humanoid
✨Know Your Tech Inside Out
Make sure you’re well-versed in the technologies mentioned in the job description, especially around multi-GPU infrastructure and cloud environments like AWS and GCP. Brush up on your knowledge of Kubernetes, CI/CD systems, and Infrastructure-as-Code tools like Terraform. Being able to discuss these topics confidently will show that you’re ready to hit the ground running.
✨Showcase Your Problem-Solving Skills
Prepare to discuss specific challenges you've faced in previous roles, particularly those involving distributed systems or GPU-heavy environments. Use the STAR method (Situation, Task, Action, Result) to structure your answers, highlighting how you approached complex issues and what the outcomes were. This will demonstrate your ability to troubleshoot and resolve problems effectively.
✨Demonstrate Leadership Experience
As a Staff Engineer, you'll be expected to lead initiatives and mentor others. Be ready to share examples of how you've influenced technical direction or improved processes in past roles. Talk about any mentoring experiences you’ve had and how you’ve raised the bar for engineering quality within your teams.
✨Align with Their Mission
Familiarise yourself with Humanoid's mission to create advanced humanoid robots and how they integrate into daily life. Be prepared to discuss how your skills and experiences align with their goals. Showing genuine interest in their work and how you can contribute will set you apart from other candidates.