At a Glance
- Tasks: Join our team to build cutting-edge AI infrastructure for groundbreaking cancer research.
- Company: King's e-Research supports innovative computational research across diverse disciplines.
- Benefits: Enjoy high flexibility, remote work, personal development days, and 30 days annual leave.
- Why this job: Be part of impactful projects with technical freedom and a collaborative culture.
- Qualifications: Experience in deploying large-scale infrastructure and mentoring junior staff is essential.
- Other info: This is a full-time, fixed-term role currently funded until 31/5/2027, with a planned conversion to permanent.
The predicted salary is between £55,000 and £80,000 per year.
About us
King's e-Research department supports cutting-edge computational and data-intensive research across all disciplines at the College. We provide high-performance compute, private and public cloud infrastructure, and trusted research environments as the core building blocks of modern, data-driven research. Alongside these infrastructure services, e-Research provides Research Software Engineering, Infrastructure Engineering and Data Governance expertise to individual research projects.
About the role
We are expanding our Principal Infrastructure Engineering team to support two exciting new projects (in addition to our existing services):
- Pharos AI, a £43m grant (£18.9m DSIT, £24m partners) to build an AI development platform to unlock the value of large multi-modal cancer datasets hosted in a pair of biobank secure data environments operated by Guy's and St Thomas' and Barts NHS Trusts. This includes extensive support from, and collaboration with, leading-edge industry partners in areas such as AI precision medicine and drug discovery.
- The King's AI+ strategic investment, recruiting 20 AI-focused fellows and adding £2m of next-generation GPU capacity to King's Computational Research Engineering and Technology Environment (CREATE).
Work for these projects will be shared across a team of four principal infrastructure engineers to deliver AI/ML ops infrastructure services, scale out compute to national AI supercomputers and public cloud providers, and produce quality, portable and open-sourced infrastructure as code.
Our Principal Infrastructure Engineers work collaboratively with large amounts of technical freedom and decision making autonomy. We build almost exclusively with FOSS and wish to put more of our work back into the community over time. You can expect to work with the following technologies: Apache, Bacula, Ceph (CephFS, RADOSGW, RBD), Discourse, Flask, Git, GitLab, GLPI, Grafana, Laravel, Let's Encrypt, Linux (primarily Ubuntu), mkdocs, Nginx, OpenStack, OpenSSL, OpenSSH, OpenTofu, OpenOnDemand, OpenVPN, ProxMox, Python, Puppet, SLURM, Spack, Squid, Trivy, VSCode, Wireguard, ZFS.
This is a full-time (35 hours per week) position, offered on a fixed-term contract, currently funded until 31/5/2027, but it is planned to convert to permanent. We also anticipate that a second, similar position will become available shortly, subject to funding approval. Candidates may be considered for this additional role if funding is confirmed.
About you
To be successful in this role, candidates should have the following skills and experience:
Essential criteria
- Demonstrable ability to deploy and maintain large-scale compute, storage and/or networking infrastructure through code (e.g. Ansible, Puppet, Terraform, OpenTofu)
- Demonstrable ability to develop software with experience as the primary developer of projects with a large modular codebase, ideally dealing with issues such as concurrency, caching and performance scaling
- Demonstrable ability to diagnose network and operating system level issues with tools such as strace and tcpdump
- Demonstrable ability to mentor and train more junior technical staff including review of software and infrastructure project code
- Demonstrable ability to deploy and maintain public or private cloud infrastructure, and high performance compute clusters with experience of stability and storage engineering in relation to these
- Demonstrable ability to deploy and maintain monitoring and metrics platforms at scale
- Strong knowledge of security fundamentals and practical experience of securing Linux systems and related infrastructure
- Proven ability to work with a high degree of autonomy within a high performing engineering team, fostering a culture of transparent collaboration and building technical consensus where necessary
Desirable criteria
- Experience profiling and optimising AI/ML workloads
- Experience deploying, configuring and maintaining Kubernetes clusters
- Experience developing applications for and deployed onto Kubernetes clusters
- Performance profiling of compute and/or IO intensive workloads
- Ability to read, understand and troubleshoot open-source software written in C
Further Information
Benefits:
- KCL Grade 8: £64,139 - £73,529
- High flex, mostly remote working (typically between 1 and 8 days in the office per month, depending on personal preference)
- 1 day every 2 weeks dedicated to personal development on relevant technology of your choosing
- Conference attendance (e.g. CERN storage week, FOSDEM Brussels, CIUK, AI UK)
- 35 hour week
- 30 days annual leave (plus Christmas closure)
We pride ourselves on being inclusive and welcoming. We embrace diversity and want everyone to feel that they belong and are connected to others in our community. We are committed to working with our staff and unions on these and other issues, to continue to support our people and to develop a diverse and inclusive culture at King's.
We ask all candidates to submit a copy of their CV, and a supporting statement, detailing how they meet the essential criteria listed in the advert. If we receive a strong field of candidates, we may use the desirable criteria to choose our final shortlist, so please include your evidence against these where possible.
Principal Research Infrastructure Engineer employer: King's College London
Contact details:
King's College London Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Principal Research Infrastructure Engineer role
✨Tip Number 1
Familiarise yourself with the specific technologies mentioned in the job description, such as Apache, Ceph, and OpenStack. Having hands-on experience or projects that showcase your skills with these tools can set you apart during discussions.
✨Tip Number 2
Engage with the open-source community related to the technologies used at King's e-Research. Contributing to relevant projects or forums can demonstrate your commitment and expertise, making you a more attractive candidate.
✨Tip Number 3
Prepare to discuss your experience with mentoring junior staff, as this is a key aspect of the role. Think of specific examples where you've successfully guided others, which will highlight your leadership skills.
✨Tip Number 4
Stay updated on the latest trends in AI and ML, especially regarding infrastructure needs. Being able to speak knowledgeably about how these technologies impact research can show your enthusiasm and relevance for the position.
Some tips for your application 🫡
Understand the Role: Read the job description thoroughly to grasp the essential and desirable criteria. Tailor your application to highlight how your skills and experiences align with what they are looking for.
Craft a Strong Supporting Statement: In your supporting statement, clearly address each of the essential criteria listed in the job advert. Use specific examples from your past experiences to demonstrate your capabilities and achievements.
Highlight Relevant Skills: Make sure to emphasise your experience with technologies mentioned in the job description, such as Ansible, Puppet, Terraform, and Kubernetes. This will show that you have the technical expertise required for the role.
Review and Edit: Before submitting your application, review your CV and supporting statement for clarity and conciseness. Check for any spelling or grammatical errors, and ensure that all information is accurate and up-to-date.
How to prepare for a job interview at King's College London
✨Showcase Your Technical Skills
Be prepared to discuss your experience with deploying and maintaining large-scale compute, storage, and networking infrastructure. Highlight specific projects where you've used tools like Ansible, Puppet, or Terraform, and be ready to explain the challenges you faced and how you overcame them.
✨Demonstrate Problem-Solving Abilities
Expect questions that assess your ability to diagnose network and operating system issues. Familiarise yourself with tools such as strace and tcpdump, and be ready to provide examples of how you've used these tools in past roles to troubleshoot complex problems.
✨Emphasise Mentorship Experience
Since mentoring junior staff is a key part of the role, prepare to discuss your experience in training others. Share specific instances where you've reviewed code or provided guidance, and explain how you fostered a collaborative learning environment.
✨Align with Their Values
Research King's e-Research department and their commitment to open-source software and community contributions. Be ready to discuss how your values align with theirs, and share any relevant experiences where you've contributed to open-source projects or fostered collaboration within teams.