At a Glance
- Tasks: Design and manage secure Kubernetes infrastructure for high-stakes AI projects.
- Company: Join Helsing, a pioneering defence AI company dedicated to protecting democracies.
- Benefits: Competitive salary, relocation support, and a commitment to diversity and inclusion.
- Other info: Dynamic team environment with opportunities for personal and professional growth.
- Why this job: Make a real impact in defence technology while working with cutting-edge AI solutions.
- Qualifications: Experience in Kubernetes, cloud-native technologies, and a passion for automation.
The predicted salary is between £36,000 and £60,000 per year.
Helsing is a defence AI company. Our mission is to protect our democracies. We aim to achieve technological leadership so that open societies can continue to make sovereign decisions and control their ethical standards. As democracies, we believe we have a special responsibility to be thoughtful about the development and deployment of powerful technologies like AI. We take this responsibility seriously. We are an ambitious and committed team of engineers, AI specialists and customer‐facing programme managers. We are looking for mission‐driven people to join our European teams – and apply their skills to solve the most complex and impactful problems. We embrace an open and transparent culture that welcomes healthy debates on the use of technology in defence, its benefits, and its ethical implications.
Much of our work takes place in high‐security on‐premises environments, and we are looking for a Site Reliability Engineer to support them. As a Site Reliability Engineer, you will design, implement, and manage our on‐premises Kubernetes infrastructure. We are looking for engineers with a strong work ethic and prioritisation skills. We value team players who communicate clearly, share knowledge generously, and collaborate effectively to move their team, and our mission, forward.
Day‐to‐Day
- Design and build cloud‐native infrastructure platforms on‐premises, focusing on Kubernetes‐based solutions that enable our development teams to operate services at scale.
- Create robust observability frameworks using Grafana, Prometheus, and distributed tracing to ensure system reliability and performance.
- Architect and implement secure, multi‐tenant Kubernetes clusters with strong access controls, policy‐as‐code governance, and zero‐trust networking between red and black network domains.
- Develop operators and controllers to automate infrastructure provisioning and compliance.
- Build and maintain MLOps platforms enabling AI researchers to deploy, monitor, and scale machine learning models in production.
- Collaborate closely with our Security teams to implement supply chain security, container scanning, and runtime protection across our cloud‐native stack.
Key Skills
- Scripting: experience in Python, Go, Rust, or Bash/Shell for automation and tooling.
- Experience with GitOps workflows and CI/CD automation.
- Kubernetes Expertise: deep experience operating production Kubernetes clusters, writing custom controllers/operators, and implementing service mesh architectures (Istio/Linkerd).
- Cloud‐Native Technologies: hands‐on experience with the CNCF ecosystem, including Helm, ArgoCD, Flux, and container runtime security tools like Falco.
- Observability Stack: expert‐level knowledge of Grafana, Prometheus, Loki, Tempo, and OpenTelemetry. Experience building custom dashboards, alerts, and SLI/SLO frameworks.
- Networking: expert understanding of networking concepts, protocols and security.
- MLOps Platforms: experience with Kubeflow, MLflow, or similar platforms.
- Infrastructure as Code: proficiency with Terraform, Ansible, and Kubernetes manifest templating. Experience with policy‐as‐code tools like OPA/Gatekeeper.
- System Administration: deep understanding of Linux/Unix system administration and highly available, distributed systems.
- Comfortable building out data and telemetry pipelines for debugging and future‐proofing solutions.
Should Apply If You
- Have a high level of personal integrity, reliability, and attention to detail.
- Have a software engineering mindset with a passion for building platforms and tools that multiply developer productivity.
- Have experience running cloud‐native workloads in on‐premises or air‐gapped environments.
- Are willing to relocate to Munich, London, or Paris.
Helsing is an equal opportunities employer. We are committed to equal employment opportunity regardless of race, religion, sexual orientation, age, marital status, disability or gender identity.
Site Reliability Engineer in London employer: Helsing
Helsing is an exceptional employer for Site Reliability Engineers, offering a unique opportunity to work at the forefront of defence AI technology in vibrant cities like Munich, London, or Paris. Our open and transparent culture fosters collaboration and healthy debate, while our commitment to employee growth ensures that you will continually develop your skills in a mission-driven environment. Join us to make a meaningful impact on the future of democracies through innovative technology.
StudySmarter Expert Advice🤫
We think this is how you could land the Site Reliability Engineer role in London
✨Tip Number 1
Network like a pro! Reach out to current employees at Helsing on LinkedIn or other platforms. Ask them about their experiences and any tips they might have for the interview process. This insider info can give us a leg up!
✨Tip Number 2
Prepare for technical interviews by brushing up on your Kubernetes skills. We should practice common scenarios and problems that might come up, especially around cloud-native technologies and observability stacks. The more we know, the more confident we'll feel!
✨Tip Number 3
Showcase our passion for the mission! When we get the chance to chat with interviewers, let’s express why we care about using technology for good, especially in defence. It’ll help us stand out as mission-driven candidates.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure our application gets seen by the right people. Plus, it shows we’re serious about joining the team at Helsing.
Some tips for your application 🫡
Show Your Passion: When writing your application, let your enthusiasm for the role shine through! We want to see how your skills and experiences align with our mission at Helsing. Make it personal and connect your background to what we do.
Tailor Your CV: Don’t just send a generic CV! We love seeing candidates who take the time to tailor their applications. Highlight your relevant experience with Kubernetes, cloud-native technologies, and any MLOps platforms you've worked with. It shows us you’re serious about the role!
Be Clear and Concise: Keep your application clear and to the point. We appreciate well-structured information that’s easy to read. Use bullet points where necessary and make sure to highlight your key achievements in a way that’s easy for us to digest.
Apply Through Our Website: We encourage you to apply directly through our website! It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it gives you a chance to explore more about our culture and values.
How to prepare for a job interview at Helsing
✨Know Your Kubernetes Inside Out
Make sure you brush up on your Kubernetes knowledge before the interview. Be ready to discuss your experience with production clusters, custom controllers, and service mesh architectures. They’ll want to see that you can not only operate but also innovate within their high-security environments.
✨Showcase Your Scripting Skills
Prepare to demonstrate your scripting abilities in Python, Go, Rust, or Bash. Have examples ready where you've automated processes or built tools that improved productivity. This will show them you have the software engineering mindset they’re looking for.
✨Understand Their Mission
Familiarise yourself with Helsing's mission to protect democracies through AI. Be prepared to discuss how your skills can contribute to this goal. Showing that you align with their values will set you apart from other candidates.
✨Be Ready for Technical Challenges
Expect technical questions or even a practical test related to cloud-native technologies and observability stacks. Brush up on Grafana, Prometheus, and MLOps platforms like Kubeflow. Being able to solve problems on the spot will demonstrate your expertise and confidence.