At a Glance
- Tasks: Design and scale AI infrastructure for cutting-edge GPU systems.
- Company: Join a hyper-growth tech company at the forefront of AI innovation.
- Benefits: Competitive salary, flexible work options, and opportunities for professional growth.
- Why this job: Make a real impact in AI by solving complex problems with a talented team.
- Qualifications: 5+ years in distributed systems, Kubernetes expertise, and a collaborative mindset.
- Other info: Dynamic environment with significant ownership and career advancement potential.
The predicted salary is between £36,000 and £60,000 per year.
We're working with a hyper-growth company that is building GPU infrastructure for the best AI labs and the biggest enterprise companies. Their solution lets researchers focus on their models while utilising the phenomenal scale and reliability of the world's best AI cloud platform. The engineering team is small, ambitious, and deeply technical, building the orchestration systems that keep thousands of GPUs running at peak performance across global data centres. This role sits at the heart of it, designing and scaling the systems that make AI at exascale possible.
What You'll Focus On
- Designing core platform services for cluster provisioning, workload orchestration, and resource management APIs.
- Building integrations with schedulers (Kubernetes, Slurm) and container runtimes for reliable, high-performance GPU workloads.
- Developing automation for deployment, imaging, and multi-tenant resource allocation.
- Optimising scheduler performance and resource utilisation across diverse workloads.
- Building lifecycle management and automated remediation systems for large-scale clusters.
- Creating Infrastructure-as-Code modules to support rapid, repeatable deployments across varied environments.
About You
You're a pragmatic systems builder who thrives in complexity, enjoys autonomy, and understands what it means to own production at scale. You'll likely bring:
- 5+ years' experience building distributed systems in Go within cloud-native environments.
- Deep hands-on experience with Kubernetes and container orchestration.
- A strong grasp of Infrastructure-as-Code (Terraform) and configuration management tools (Ansible, Puppet, or similar).
- Experience deploying and operating large-scale GPU clusters or HPC systems.
- Working knowledge of ML infrastructure and familiarity with GPU drivers, CUDA, and container runtimes.
- A low-ego, collaborative approach and a clear, proactive communication style.
In short: This is a role for engineers who like big systems, hard problems, and meaningful ownership. You'll be joining a team operating at the intersection of software, hardware, and AI.
Staff Software Engineer - Motive Group in London
Employer: Jobster
Contact Detail:
Jobster Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Staff Software Engineer - Motive Group role in London
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, especially those already working at the company you're eyeing. A friendly chat can give you insider info and maybe even a referral!
✨Tip Number 2
Show off your skills! If you’ve got a GitHub or portfolio, make sure it’s up to date with your best work. This is your chance to demonstrate your expertise in distributed systems and Kubernetes.
✨Tip Number 3
Prepare for technical interviews by brushing up on your problem-solving skills. Practise coding challenges that focus on systems design and resource management, and think about how you'd optimise GPU workloads!
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive about their job search!
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the job description. Highlight your experience with distributed systems, Kubernetes, and GPU clusters to show us you’re the right fit!
Craft a Compelling Cover Letter: Use your cover letter to tell us why you’re passionate about AI infrastructure and how your background makes you a great candidate. Be specific about your achievements and how they relate to the role.
Showcase Your Projects: If you've worked on relevant projects, whether personal or professional, make sure to mention them! We love seeing practical examples of your work, especially if they involve automation or Infrastructure-as-Code.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates from our team!
How to prepare for a job interview at Jobster
✨Know Your Tech Inside Out
Make sure you’re well-versed in the technologies mentioned in the job description, especially Go, Kubernetes, and Infrastructure-as-Code tools like Terraform. Brush up on your knowledge of GPU clusters and how they operate, as this will likely come up during technical discussions.
✨Showcase Your Problem-Solving Skills
Prepare to discuss specific challenges you've faced in previous roles, particularly those involving distributed systems or large-scale deployments. Use the STAR method (Situation, Task, Action, Result) to structure your answers and highlight your contributions effectively.
✨Demonstrate Collaboration and Communication
Since the role emphasises a low-ego, collaborative approach, be ready to share examples of how you’ve worked with others to solve complex problems. Highlight your communication style and how it has helped in team settings, especially when dealing with technical issues.
✨Ask Insightful Questions
Prepare thoughtful questions about the company’s AI infrastructure and the challenges they face. This shows your genuine interest in the role and helps you assess if the company culture aligns with your values, especially regarding autonomy and ownership.