Site Reliability Engineer in Cambridge

Location: Cambridge
Employment type: Full-Time
Salary: £60,000 - £80,000 per year (est.)
Remote work: No working from home possible

At a Glance

Tasks: Shape the future of platform reliability and solve complex challenges.
Company: Join a leading tech firm focused on innovation and collaboration.
Benefits: Competitive salary, flexible work options, and opportunities for professional growth.
Other info: Dynamic role with excellent career advancement potential.
Why this job: Make a real impact in a cloud-native environment with expert teams.
Qualifications: Experience in Site Reliability Engineering and strong programming skills required.

The predicted salary is between £60,000 and £80,000 per year.

Requirements

Proven experience in Site Reliability Engineering, DevOps, or infrastructure engineering
Deep expertise in at least one of the following areas:

Observability & monitoring (metrics, logging, distributed tracing)
Performance engineering & capacity planning
Data infrastructure reliability (databases, streaming, pipelines)
Security-focused SRE (hardening, compliance automation, secrets management)
Network reliability & traffic management

Strong programming skills (e.g. Go, Python, or similar)
Experience with cloud platforms (AWS, GCP, Azure) and Kubernetes
Strong communication skills, with the ability to explain complex technical concepts clearly
Self-driven with the ability to identify and prioritise high-impact work independently
(Desirable) Experience building internal developer platforms or tooling
(Desirable) Contributions to open-source, technical blogs, or public speaking
(Desirable) Experience working in regulated environments
(Desirable) Familiarity with SLO frameworks and error budget management
(Desirable) Relevant certifications in your specialist domain

What the job involves

We’re looking for a Site Reliability Engineer (SRE) to bring deep expertise in a key reliability domain and help shape the future of our platform reliability strategy. SRE sits at the heart of our operational trifecta alongside Platform Engineering and DevSecOps. In this role, you’ll act as the go‑to authority in your area of specialism, working across teams to embed best practices, solve complex reliability challenges, and improve system resilience at scale.

Unlike a generalist SRE, this role focuses on a core domain of expertise—such as observability, performance engineering, data infrastructure reliability, security‑focused SRE, or network reliability—while influencing reliability standards across the wider engineering organisation.

Domain Expertise & Strategy

Act as the subject matter expert in your chosen reliability domain
Define and implement standards, frameworks, and best practices across SRE, Platform Engineering, and DevSecOps
Stay current with industry trends and bring innovative ideas into the organisation

Engineering & Delivery

Design and implement solutions to complex, cross‑cutting reliability challenges
Build tooling, automation, and frameworks to improve system resilience and scalability
Lead deep‑dive investigations into systemic issues and drive long‑term fixes

Collaboration & Platform Integration

Partner with Platform Engineering to ensure your domain is embedded within the internal developer platform
Collaborate with DevSecOps to integrate security, compliance, and resilience practices
Contribute to cross‑team initiatives that improve reliability across the stack

Incident & Operational Excellence

Play a key role in incident response, particularly within your specialism
Contribute to on‑call rotations and continuous improvement of operational processes
Develop runbooks, documentation, and training materials to support teams

Success Measures

Improved reliability and performance within your domain of specialism
Adoption of best practices across SRE, Platform Engineering, and DevSecOps
Reduction in incidents and faster resolution times
Scalable, well‑integrated solutions within the internal platform
Strong collaboration across teams and measurable improvements in operational maturity

Why this job

Shape reliability strategy in a modern, cloud‑native engineering environment
Work on complex, high‑impact systems at scale
Collaborate with expert teams across Platform Engineering and DevSecOps
Take ownership of a domain and drive meaningful, organisation‑wide impact

Site Reliability Engineer in Cambridge employer: Darktrace

As a Site Reliability Engineer at our company, you will thrive in a dynamic and innovative work culture that prioritises collaboration and continuous learning. We offer competitive benefits, including professional development opportunities and a supportive environment that encourages you to take ownership of your projects and drive impactful change. Located in a vibrant tech hub, you'll be part of a forward-thinking team dedicated to shaping the future of platform reliability while enjoying a healthy work-life balance.

Contact Details:

Darktrace Recruitment Team

We think you need these skills to ace Site Reliability Engineer in Cambridge

Site Reliability Engineering
DevOps
Infrastructure Engineering
Observability & Monitoring
Performance Engineering
Capacity Planning
Data Infrastructure Reliability
Security-focused SRE
Network Reliability
Traffic Management
Programming Skills (Go, Python)
Cloud Platforms (AWS, GCP, Azure)
Kubernetes
Communication Skills
Problem-Solving Skills
