At a Glance
- Tasks: Design and optimise high-performance infrastructure for AI and machine learning workflows.
- Company: Leading tech firm at the forefront of AI research and development.
- Benefits: Competitive salary, flexible working hours, and opportunities for professional growth.
- Why this job: Join a dynamic team and make a real impact in cutting-edge AI projects.
- Qualifications: Experience with ML compute clusters and strong problem-solving skills.
- Other info: Work in a collaborative environment with excellent career advancement potential.
The predicted salary is between £36,000 and £60,000 per year.
We are seeking a Senior HPC Engineer to design, implement, and scale the infrastructure that supports high-performance machine learning and AI-driven research workflows. You will play a critical role in bridging the gap between data science, bioinformatics, and engineering — ensuring seamless, secure, and reproducible deployment of ML models in production and research environments.
You’ll collaborate closely with AI Scientists, Data Engineers, and DevSecOps teams, building automation pipelines that accelerate model development and deployment across distributed, cloud-native systems.
Key Responsibilities
- Build, operate, and continuously optimise our high-performance GPU training and inference clusters, focusing on robust, high-availability scheduling, isolation, and automated lifecycle management.
- Drive systems design and implementation for high-throughput data paths, optimising I/O, caching, and data locality across compute and storage (including our current Lustre implementation).
- Proactively benchmark, profile, and resolve performance bottlenecks across the compute, network, and orchestration layers to maximise efficiency for distributed training and inference.
- Establish comprehensive observability, resilience, and automated security controls to ensure compliance and robust operation of sensitive research environments.
- Partner with Research, Data, and Applied teams to forecast capacity and cost for GPU and storage needs, setting quotas and streamlining ML experimentation pipelines.
Qualifications
- Proven experience leading the design, build, and operation of high-performance ML compute clusters at scale.
- A proactive, autonomous approach to systems design and the proven ability and desire to ideate, co-create and implement optimal solutions.
- Exposure to migrating or transforming ML infrastructure from traditional schedulers to modern, containerised systems.
- Expertise with high-throughput storage systems for ML/HPC workloads.
- Expert-level understanding of GPU architecture, high-speed networking for distributed training, and performance profiling to resolve bottlenecks.
- A solid grasp of IaC and CI/CD practices (e.g., Terraform, Argo CD).
Applicants must have the right to work permanently in the UK and be within commuting distance of Oxford.
HPC Engineer in Oxford
Employer: Hlx Life Sciences
Contact Details:
Hlx Life Sciences Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the HPC Engineer role in Oxford
✨Tip Number 1
Network like a pro! Reach out to folks in the HPC and AI communities on LinkedIn or at local meetups. We all know that sometimes it’s not just what you know, but who you know that can help you land that dream job.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects related to HPC and ML. We love seeing real-world applications of your expertise, so make sure to highlight any cool stuff you've built or optimised.
✨Tip Number 3
Prepare for those interviews! Brush up on your technical knowledge and be ready to discuss your experience with GPU clusters and containerised systems. We want to see your passion and problem-solving skills in action!
✨Tip Number 4
Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we’re always on the lookout for talented individuals like you to join our team!
Some tips for your application 🫡
Tailor Your CV: Make sure your CV is tailored to the HPC Engineer role. Highlight your experience with high-performance ML compute clusters and any relevant projects you've worked on. We want to see how your skills align with what we're looking for!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about HPC and how your background makes you a great fit for our team. Don’t forget to mention your experience with GPU architecture and containerised systems.
Showcase Your Projects: If you've worked on any relevant projects, make sure to include them in your application. We love seeing real-world examples of your work, especially if they involve optimising ML infrastructure or automating pipelines.
Apply Through Our Website: We encourage you to apply through our website for the best chance of getting noticed. It’s super easy, and we’ll be able to review your application more efficiently. Plus, it shows you’re serious about the role!
How to prepare for a job interview at Hlx Life Sciences
✨Know Your HPC Stuff
Make sure you brush up on your knowledge of high-performance computing, especially around GPU architecture and high-throughput storage systems. Be ready to discuss your past experiences with ML compute clusters and how you've tackled performance bottlenecks.
✨Show Off Your Collaboration Skills
Since you'll be working closely with AI Scientists, Data Engineers, and DevSecOps teams, it's crucial to demonstrate your ability to collaborate effectively. Prepare examples of how you've partnered with different teams in the past to drive successful projects.
✨Be Ready for Technical Challenges
Expect some technical questions or challenges during the interview. They might ask you to solve a problem related to systems design or optimisation. Practise explaining your thought process clearly and logically, as this will showcase your problem-solving skills.
✨Understand the Company’s Needs
Research the company’s current projects and challenges in ML infrastructure. Tailor your responses to show how your skills can directly address their needs, especially in areas like automated lifecycle management and compliance in sensitive research environments.