HAYS Specialist Recruitment

Senior HPC Infrastructure Engineer

Hampshire, Full-Time, £43,200 - £72,000 / year (est.), 100% remote

At a Glance

Tasks: Design and deliver high-performance computing clusters in a fully remote role.
Company: Join a pioneering company shaping the future of cloud infrastructure and AI.
Benefits: Enjoy unlimited holiday, share options, and 100% remote working.
Why this job: Be part of cutting-edge HPC solutions and a collaborative team culture.
Qualifications: Experience with Slurm, InfiniBand, Ansible, and scripting in Bash or Python required.
Other info: Great opportunities for career development and enhanced family-friendly policies.

The predicted salary is between £43,200 and £72,000 per year.

Job Description

Your new company
Join a pioneering organisation at the forefront of AI and High Performance Computing (HPC) infrastructure. With a strong focus on innovation and ethical computing, this company is building scalable, GPU-optimised environments that support cutting-edge research and enterprise workloads.

Your new role
This is a fully remote, hands-on technical role where you'll lead the design, deployment, and optimisation of large-scale AI and HPC clusters. You'll architect end-to-end solutions across compute, storage, and networking – working closely with internal teams, OEMs, and external suppliers to deliver high-performance infrastructure.

You'll be responsible for creating detailed technical designs, including hardware specifications, data centre layouts, cabling, and power/cooling requirements.

You'll install and tune Linux-based operating systems, configure Slurm job schedulers, and optimise high-speed networking technologies such as InfiniBand and RoCE.

The role also involves scripting and automation (Ansible, Terraform), troubleshooting complex distributed systems, and mentoring junior engineers and service teams. This is an ideal opportunity for someone who thrives in project-led infrastructure work and wants to shape the future of AI and HPC platforms.

What you'll need to succeed
To be successful in this role, you'll bring:

HPC Cluster Expertise: Proven experience designing, deploying, and scaling large HPC environments (hundreds to thousands of nodes).
Slurm Scheduler Configuration: Deep understanding of Slurm partitions, priorities, and resource management.
Networking: Strong knowledge of high-performance networking (InfiniBand, RoCE, RDMA) and troubleshooting interconnectivity issues.
Linux Systems: Advanced Linux administration skills, including performance tuning and OS-level troubleshooting.
Storage Systems: Experience with parallel/distributed file systems (e.g. Lustre, Ceph, WEKA, VAST).
Automation & Scripting: Proficiency in Bash, Python, and tools like Ansible and Terraform for deployment and maintenance.
Monitoring & Resilience: Experience implementing monitoring solutions and ensuring high availability and security compliance.
Documentation & Mentoring: Excellent written communication skills and a collaborative approach to mentoring and knowledge sharing.

Desirable Experience

Containerisation in HPC (Singularity, Docker, Apptainer)
Familiarity with AI/ML workflows, GPU-aware MPI, and NVLink
Experience in cloud, academic, or research environments
Vendor hardware validation and data centre planning

What you'll get in return

Share options.
Unlimited holiday policy.
100% Remote working.
Fantastic opportunities to develop – they make a habit of promoting in-house.
A great team with a passion for working collaboratively.
Enhanced family-friendly policies.
A truly flexible workplace!

What you need to do now
If you're interested in this role, click 'apply now' to forward an up-to-date copy of your CV, or call us now.

Hays Specialist Recruitment Limited acts as an employment agency for permanent recruitment and as an employment business for the supply of temporary workers. By applying for this job you accept the T&Cs, Privacy Policy and Disclaimers, which can be found on our website.

Senior HPC Infrastructure Engineer employer: HAYS Specialist Recruitment

This pioneering company is an exceptional employer, offering a fully remote work environment that prioritises flexibility and employee well-being with an unlimited holiday policy. With a strong focus on innovation in cloud infrastructure and sustainability, employees are encouraged to grow through fantastic development opportunities and a collaborative team culture, making it an ideal place for those looking to make a meaningful impact in the field of high-performance computing.

Contact Details:

HAYS Specialist Recruitment Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Senior HPC Infrastructure Engineer

✨Tip Number 1

Familiarise yourself with the latest trends in HPC and GPU-optimised technologies. Being able to discuss recent advancements or innovations during your interview can demonstrate your passion and knowledge in the field.

✨Tip Number 2

Network with professionals in the HPC community. Engaging with others in the industry through forums, webinars, or social media can provide insights into the role and may even lead to referrals.

✨Tip Number 3

Prepare to showcase your hands-on experience with tools like Slurm and Ansible. Be ready to discuss specific projects where you’ve successfully implemented these technologies, as practical examples can set you apart.

✨Tip Number 4

Highlight your problem-solving skills and ability to work collaboratively. Since this role involves working closely with various teams, demonstrating your teamwork and communication abilities will be crucial during the interview.

We think you need these skills to ace Senior HPC Infrastructure Engineer

Experience with Slurm job schedulers

Knowledge of InfiniBand and RoCE networking technologies

Proficiency in Ansible for automation and configuration management

Strong understanding of networking fundamentals

Familiarity with data centre infrastructure planning

End-to-end experience in deploying and scaling HPC clusters

Understanding of GPU-optimised server architecture

Comfortable scripting in Bash, Python, or similar languages

Ability to collaborate with software engineers

Experience in technical delivery and project management

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience with HPC job schedulers like Slurm, high-speed networking technologies such as InfiniBand and RoCE, and your proficiency in Ansible. Customise your CV to reflect the specific skills mentioned in the job description.

Craft a Compelling Cover Letter: Write a cover letter that showcases your passion for high-performance computing and cloud infrastructure. Mention specific projects where you've successfully deployed HPC clusters and how your skills align with the company's innovative approach.

Showcase Relevant Projects: Include a section in your application that details relevant projects or experiences. Highlight your end-to-end experience in deploying and scaling HPC clusters, as well as any scripting or automation tasks you've undertaken using Bash or Python.

Proofread and Edit: Before submitting your application, thoroughly proofread your documents. Check for any spelling or grammatical errors, and ensure that all technical terms are used correctly. A polished application reflects your attention to detail.

How to prepare for a job interview at HAYS Specialist Recruitment

✨Showcase Your Technical Expertise

Be prepared to discuss your experience with HPC job schedulers like Slurm, and how you've managed and tuned them in previous roles. Highlight specific projects where you successfully deployed and scaled HPC clusters, as this will demonstrate your hands-on experience.

✨Demonstrate Networking Knowledge

Since the role requires deep knowledge of high-speed networking technologies such as InfiniBand and RoCE, be ready to explain your understanding of these technologies. Discuss any relevant experiences where you configured networks in complex environments.

✨Emphasise Automation Skills

Proficiency in Ansible for automation and configuration management is crucial. Prepare examples of how you've used Ansible to streamline processes or improve efficiency in your previous roles. This will show your ability to enhance platform capabilities.

✨Prepare for Problem-Solving Scenarios

Expect questions that assess your problem-solving skills, especially in relation to server architecture and data centre infrastructure. Think of scenarios where you had to troubleshoot issues or optimise performance, and be ready to share your thought process.
