HPC Engineer - Generative Biology Institute in Oxford

Oxford, Full-Time, £50,000 - £60,000 / year (est.), hybrid working
Ellison Institute of Technology Oxford

At a Glance

  • Tasks: Join a dynamic team to enhance HPC platforms for groundbreaking biological research.
  • Company: Generative Biology Institute, a leader in scientific computing and innovation.
  • Benefits: Competitive salary, travel allowance, enhanced holiday pay, and private medical insurance.
  • Other info: Flexible hybrid working and opportunities for career growth in a collaborative environment.
  • Why this job: Make a real impact in cutting-edge research while learning new technologies.
  • Qualifications: Degree in relevant field or equivalent experience; Linux systems knowledge preferred.

The predicted salary is between £50,000 and £60,000 per year.

Your Role

Working as part of a new Scientific Computing team within GBI, the HPC Engineer will help operate, improve, and scale the data and computing platform that will enable cutting‑edge research in engineering biology. This is a broad, hands‑on role at the interface of Linux systems, high‑performance computing, cloud infrastructure, Kubernetes, Slurm, storage, monitoring, and researcher support. They will help turn emerging researcher needs and operational lessons into robust platform improvements, reusable tooling, and clear runbooks. This role is particularly suited to someone who enjoys practical systems work, learning new technologies, and collaborating closely with scientists and engineers. We do not expect candidates to have deep experience in every technology listed in this description. Instead, we are looking for a strong, scientifically minded systems engineer: someone who can troubleshoot complex environments, communicate clearly with multidisciplinary teams, learn unfamiliar tools quickly, and help build reliable, scalable services that advance GBI’s scientific mission.

Key Responsibilities

  • Operate, maintain, and improve GBI’s hybrid HPC platform, including Linux‑based compute environments, Slurm/Slinky workloads, Kubernetes/OKE services, Open OnDemand, GPU and CPU partitions, and shared storage.
  • Help provision, configure, scale, and validate compute, storage, networking, and platform services using infrastructure as code, configuration management, and automation tools such as Terraform, Helm and Ansible.
  • Monitor platform health, capacity, job scheduling, GPU utilisation, storage behaviour, and network performance; investigate issues using tools such as Prometheus and Grafana.
  • Support researchers in using our Scientific Computing Platform, including triaging user issues and translating common pain points into platform improvements.
  • Build and maintain reproducible runtime environments, container images, and workflow‑supporting services for scientific computing workloads, including bioinformatics, AI/ML, data processing, and simulation workflows.
  • Contribute to safe rollout and maintenance processes for Slurm images, worker node pools, scheduler configuration, container runtime changes, security updates, and monitoring improvements.
  • Create and maintain clear technical documentation, runbooks, validation checks, and issue/PR notes so the platform can be operated consistently and improved safely by the wider team.

Requirements

Essential Knowledge, Skills and Experience

  • Bachelor’s or Master’s degree in Computer Science, Computational Biology, Engineering, Physics, Mathematics, or a related discipline, or equivalent practical experience.
  • Hands‑on experience supporting or administering Linux‑based systems in an HPC, cloud, research, academic, or production environment.
  • Working knowledge of HPC or batch‑computing concepts, including schedulers, resource requests, queues/partitions, shared filesystems, and multi‑user compute environments; Slurm experience is preferred.
  • Ability to troubleshoot issues across systems, networking, storage, identity, containers, schedulers, and user workloads, and to follow problems through to a reliable operational fix.
  • Experience with scripting, automation, and version‑controlled operational changes using tools such as Git, CI/CD, Terraform, Ansible, Helm, or similar.
  • Ability to work closely with multidisciplinary research teams, understand scientific computing needs, and deliver practical services that advance scientific goals.
  • Strong communication and documentation skills, with the ability to explain technical concepts clearly to scientists, engineers, and non‑specialist audiences.
  • A proactive, learning‑oriented approach suited to a new team building and improving a platform while also operating it day to day.

Desirable Knowledge, Skills and Experience

  • Experience operating Slurm clusters, Slinky/slurm‑operator, Open OnDemand, JupyterLab services, or other researcher‑facing HPC portals and access patterns.
  • Experience with Kubernetes or managed Kubernetes platforms such as OCI OKE, EKS, GKE, or AKS, including Helm, Argo CD, operators, services, storage classes, and workload troubleshooting.
  • Experience with cloud infrastructure, particularly OCI, and with infrastructure as code and remote execution models such as Terraform Cloud.
  • Experience with shared and high‑performance storage such as Lustre, BeeGFS, GPFS, NFS, OCI File Storage, object storage, or data movement workflows for large scientific datasets.
  • Experience supporting GPU‑accelerated workloads, NVIDIA tooling, CUDA‑aware environments, DCGM metrics, GPU health monitoring, and/or AI/ML and bioinformatics workloads on shared compute platforms.
  • Experience with containerised HPC and scientific workflow tooling, such as Apptainer/Singularity, Docker/Podman, Pyxis/Enroot, Nextflow, Snakemake, CWL, or WDL.
  • Experience building monitoring and operational dashboards using Prometheus, Grafana, exporter metrics, alerting rules, or capacity and reliability reporting.
  • Familiarity with identity, access, and security controls in Linux or research environments, such as OIDC, Okta ASA/PAM, least‑privilege access, and security patching.
  • Experience working in a scientific, academic, life‑science, or research computing environment where requirements evolve through close collaboration with researchers.

Benefits

  • Salary: Competitive + travel allowance + bonus.
  • Enhanced holiday pay.
  • Pension.
  • Life Assurance.
  • Income Protection.
  • Private Medical Insurance.
  • Hospital Cash Plan.
  • Therapy Services.
  • Perk Box.
  • Electric Car Scheme.

Working Together – What It Involves

You must have the right to work permanently in the UK and be willing to travel as necessary. In certain cases we can consider sponsorship, which will be assessed on a case‑by‑case basis. You will live in, or within easy commuting distance of, Oxford (or be willing to relocate). The role offers hybrid working.

HPC Engineer - Generative Biology Institute in Oxford employer: Ellison Institute of Technology Oxford

At the Generative Biology Institute, we pride ourselves on fostering a collaborative and innovative work culture that empowers our employees to thrive. As an HPC Engineer, you will be at the forefront of scientific computing, working alongside multidisciplinary teams in a dynamic environment that encourages continuous learning and professional growth. With competitive salaries, comprehensive benefits, and a commitment to employee well-being, including hybrid working options and support for personal development, GBI is an exceptional employer for those seeking meaningful contributions to cutting-edge research in Oxford.

Contact Details:

Ellison Institute of Technology Oxford Recruitment Team

StudySmarter Expert Advice 🤫

We think this is how you could land the HPC Engineer - Generative Biology Institute role in Oxford

Tip Number 1

Get to know the company and its culture! Research GBI and understand their mission in generative biology. This will help you tailor your conversations and show that you're genuinely interested in being part of their team.

Tip Number 2

Network like a pro! Connect with current employees on LinkedIn or attend relevant events. Building relationships can give you insider info and might even lead to a referral, which is always a bonus!

Tip Number 3

Prepare for technical interviews by brushing up on your skills! Since this role involves HPC and Linux systems, practice troubleshooting scenarios and be ready to discuss your hands-on experience with tools like Slurm and Kubernetes.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re serious about joining the team at GBI and ready to contribute to their scientific mission.

We think you need these skills to ace the HPC Engineer - Generative Biology Institute role in Oxford

Linux Systems Administration
High-Performance Computing (HPC)
Cloud Infrastructure
Kubernetes
Slurm
Infrastructure as Code
Terraform

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that match the HPC Engineer role. Highlight any hands-on experience with Linux systems, high-performance computing, and cloud infrastructure. We want to see how your background aligns with our needs!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're excited about the role and how your unique skills can contribute to our Scientific Computing team. Keep it engaging and personal – we love to see your personality come through!

Showcase Your Problem-Solving Skills: In your application, don’t just list your technical skills; share examples of how you've tackled complex issues in past roles. We’re looking for someone who can troubleshoot and improve systems, so let us know how you’ve done this before!

Apply Through Our Website: We encourage you to apply directly through our website for the best chance of getting noticed. It’s super easy, and you’ll be able to upload all your documents in one go. Plus, it helps us keep track of your application better!

How to prepare for a job interview at Ellison Institute of Technology Oxford

Know Your Tech Stack

Familiarise yourself with the technologies mentioned in the job description, especially Linux systems, HPC concepts, and tools like Slurm and Kubernetes. Even if you don't have deep experience, being able to discuss how you've worked with similar technologies or your approach to learning new ones will impress the interviewers.

Showcase Your Problem-Solving Skills

Prepare examples of how you've troubleshot complex issues in previous roles. Be ready to explain your thought process and the steps you took to resolve problems, particularly in multi-user compute environments or when working with researchers.

Communicate Clearly

Practice explaining technical concepts in simple terms. Since you'll be collaborating with scientists and engineers, demonstrating your ability to communicate effectively with non-specialists will be key. Consider role-playing with a friend to refine your explanations.

Highlight Your Collaborative Spirit

This role involves working closely with multidisciplinary teams, so be prepared to discuss your experiences in collaborative settings. Share specific instances where you contributed to team projects or helped bridge gaps between technical and non-technical team members.