Site Reliability Engineer in City of London

City of London | Full-Time | £36,000 - £60,000 / year (est.) | No home office possible

At a Glance

  • Tasks: Design and manage secure Kubernetes infrastructure for high-security environments.
  • Company: Join Helsing, a defence AI company dedicated to protecting democracies.
  • Benefits: Competitive salary, relocation support, and a commitment to diversity and inclusion.
  • Why this job: Make a real impact in defence technology while working with cutting-edge AI solutions.
  • Qualifications: Experience in Kubernetes, cloud-native technologies, and strong scripting skills.
  • Other info: Collaborative culture that values integrity and ethical technology use.

The predicted salary is between £36,000 and £60,000 per year.

Helsing is a defence AI company. Our mission is to protect our democracies. We aim to achieve technological leadership so that open societies can continue to make sovereign decisions and control their ethical standards. As democracies, we believe we have a special responsibility to be thoughtful about the development and deployment of powerful technologies like AI. We take this responsibility seriously. We are an ambitious and committed team of engineers, AI specialists and customer‑facing programme managers. We are looking for mission‑driven people to join our European teams – and apply their skills to solve the most complex and impactful problems. We embrace an open and transparent culture that welcomes healthy debates on the use of technology in defence, its benefits, and its ethical implications.

The Role: Much of our work takes place in high‑security on‑premises environments, and we are looking for a Site Reliability Engineer to support them. In this role you will design, implement, and manage our on‑premises Kubernetes infrastructure. We are looking for engineers with a strong work ethic and prioritisation skills. We value team players who communicate clearly, share knowledge generously, and collaborate effectively to move their team, and our mission, forward.

Day‑to‑Day:

  • Design and build cloud‑native infrastructure platforms on‑premises, focusing on Kubernetes‑based solutions that enable our development teams to operate services at scale.
  • Create robust observability frameworks using Grafana, Prometheus, and distributed tracing to ensure system reliability and performance.
  • Architect and implement secure, multi‑tenant Kubernetes clusters with strong access controls, policy‑as‑code governance, and zero‑trust networking between red and black network domains.
  • Develop operators and controllers to automate infrastructure provisioning and compliance.
  • Build and maintain MLOps platforms enabling AI researchers to deploy, monitor, and scale machine learning models in production.
  • Collaborate closely with our Security teams to implement supply chain security, container scanning, and runtime protection across our cloud‑native stack.

Key Skills:

  • Scripting: experience in Python, Go, Rust, or Bash/Shell for automation and tooling.
  • Kubernetes Expertise: deep experience operating production Kubernetes clusters, writing custom controllers/operators, and implementing service mesh architectures (Istio/Linkerd).
  • Cloud‑Native Technologies: hands‑on experience with the CNCF ecosystem, including Helm, ArgoCD, Flux, and container runtime security tools like Falco.
  • Observability Stack: expert‑level knowledge of Grafana, Prometheus, Loki, Tempo, and OpenTelemetry. Experience building custom dashboards, alerts, and SLI/SLO frameworks.
  • Networking: expert understanding of networking concepts, protocols and security.
  • MLOps Platforms: experience with Kubeflow, MLflow, or similar platforms.
  • Infrastructure as Code: proficiency with Terraform, Ansible, and Kubernetes manifest templating. Experience with policy‑as‑code tools like OPA/Gatekeeper.
  • System Administration: deep understanding of Linux/Unix system administration and highly available, distributed systems. Comfortable building out data and telemetry pipelines for debugging and future‑proofing solutions.

You Should Apply If You:

  • Have a high level of personal integrity, reliability, and attention to detail.
  • Have a software engineering mindset with a passion for building platforms and tools that multiply developer productivity.
  • Have experience running cloud‑native workloads in on‑premises or air‑gapped environments.
  • Are willing to relocate to Munich, London, or Paris.

Helsing is an equal opportunities employer. We are committed to equal employment opportunity regardless of race, religion, sexual orientation, age, marital status, disability or gender identity.

Site Reliability Engineer in City of London employer: Helsing

Helsing is an exceptional employer for Site Reliability Engineers, offering a unique opportunity to work at the forefront of defence AI technology in vibrant cities like Munich, London, or Paris. Our open and transparent culture fosters collaboration and healthy debate, while our commitment to employee growth ensures that you will continually develop your skills in a mission-driven environment. Join us to make a meaningful impact on the future of democracies through innovative technology.

Contact Detail:

Helsing Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role in City of London

✨Tip Number 1

Network like a pro! Reach out to current employees at Helsing on LinkedIn or other platforms. Ask them about their experiences and share your passion for AI and defence tech. This can give you insider info and might even lead to a referral!

✨Tip Number 2

Prepare for the technical interview by brushing up on your Kubernetes skills. Set up a mini project where you can demonstrate your expertise in building and managing clusters. Show us what you've got!

✨Tip Number 3

Don’t underestimate the power of soft skills! Practice articulating your thoughts clearly and concisely. Being a team player is key, so be ready to discuss how you’ve collaborated effectively in past projects.

✨Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in joining our mission-driven team at Helsing. Let’s make a difference together!

We think you need these skills to ace the Site Reliability Engineer role in City of London

Kubernetes Expertise
Scripting (Python, Go, Rust, Bash/Shell)
GitOps Workflows
CI/CD Automation
Cloud-Native Technologies (Helm, ArgoCD, Flux)
Observability Stack (Grafana, Prometheus, Loki, Tempo, OpenTelemetry)
Networking Concepts and Security
MLOps Platforms (Kubeflow, MLflow)
Infrastructure as Code (Terraform, Ansible)
Policy-as-Code Tools (OPA/Gatekeeper)
Linux/Unix System Administration
Attention to Detail
Software Engineering Mindset
Building Data and Telemetry Pipelines

Some tips for your application 🫡

Show Your Passion for AI and Defence: When writing your application, let us see your enthusiasm for AI and its role in defence. Share any relevant experiences or projects that highlight your commitment to ethical technology and how you can contribute to our mission.

Tailor Your Skills to the Role: Make sure to align your skills with the key requirements listed in the job description. Highlight your experience with Kubernetes, cloud-native technologies, and scripting languages like Python or Go. We want to see how your background fits perfectly with what we need!

Be Clear and Concise: Keep your application straightforward and to the point. Use clear language to describe your experiences and achievements. We appreciate well-structured applications that make it easy for us to see your qualifications at a glance.

Apply Through Our Website: Don’t forget to submit your application through our website! It’s the best way for us to receive your details and ensures you’re considered for the role. Plus, it shows you’re serious about joining our team!

How to prepare for a job interview at Helsing

✨Know Your Kubernetes Inside Out

Make sure you brush up on your Kubernetes knowledge before the interview. Be ready to discuss your experience with production clusters, custom controllers, and service mesh architectures. They’ll want to see that you can not only operate these systems but also innovate and improve them.

✨Showcase Your Scripting Skills

Prepare to talk about your experience with scripting languages like Python, Go, or Bash. Have examples ready where you've used these skills for automation or tooling. This is a key part of the role, so demonstrating your proficiency will definitely set you apart.

✨Demonstrate Your Understanding of Observability

Familiarise yourself with tools like Grafana and Prometheus. Be prepared to explain how you've built observability frameworks in the past and how they contributed to system reliability. They’ll appreciate candidates who can articulate the importance of monitoring and performance metrics.

✨Emphasise Team Collaboration

Helsing values team players, so be ready to share examples of how you've collaborated effectively in previous roles. Discuss how you communicate with team members and share knowledge. Highlighting your ability to work well in a team will resonate with their open and transparent culture.

