Senior Site Reliability Engineer in London


London | Full-Time | £70,000 - £90,000 / year (est.) | Home office (partial)


At a Glance

Tasks: Enhance reliability engineering systems and mentor junior engineers in a collaborative environment.
Company: Join a pioneering consulting firm transforming businesses with innovative solutions.
Benefits: Enjoy a hybrid work model, competitive salary, and opportunities for professional growth.
Other info: Be part of a diverse team that values collaboration and continuous improvement.
Why this job: Make a real impact by driving engineering excellence and shaping the future of reliability.
Qualifications: 5-8 years in Site Reliability Engineering with strong cloud and automation skills.

The predicted salary is between £70,000 and £90,000 per year.

The company partners with leaders in business and society to tackle their most important challenges and capture their greatest opportunities. BCG was the pioneer in business strategy when it was founded in 1963. Today, we help clients with total transformation: inspiring complex change, enabling organizations to grow, building competitive advantage, and driving bottom-line impact. To succeed, organizations must blend digital and human capabilities. Our diverse, global teams bring deep industry and functional expertise and a range of perspectives to spark change. BCG delivers solutions through leading-edge management consulting along with technology and design, corporate and digital ventures, and business purpose. We work in a uniquely collaborative model across the firm and throughout all levels of the client organization, generating results that allow our clients to thrive.

The Senior Site Reliability Engineer is responsible for running the engineering capability behind a defined area of reliability across the organisation. The role works across multiple SRE disciplines including infrastructure, cloud, observability, automation, identity, security, and network operations, applying engineering thinking to reduce operational toil, improve resilience, and embed reliability and governance into delivery and operational workflows.

The role drives engineering quality and consistency within its scope of responsibility, contributes to wider engineering standards, and helps shape how reliability is delivered across the organisation. It builds reusable patterns, mentors engineers, and provides senior engineering input across a wider set of stakeholders.

Core responsibilities

Run and continuously improve the reliability engineering systems within scope, including automation, pipelines, observability, and operational tooling.
Design and implement engineering solutions that eliminate operational toil at scale and embed reliability into delivery workflows.
Help shape engineering standards, patterns, and reusable frameworks across the SRE practice.
Lead the engineering response to complex incidents within scope, drive systemic remediation, and contribute to post-incident learning.
Mentor and coach less senior engineers across reliability engineering, automation, observability, and SRE principles.
Drive cross-team collaboration with engineering, platform, and operations functions to embed reliability and governance through engineering controls.
Communicate engineering status, risks, and recommendations clearly to senior stakeholders and leadership forums.
Contribute to monthly operational reviews with structured metrics on service health, ingestion or pipeline performance, automation coverage, and improvement progress.

What You'll Bring

5–8 years of experience in Site Reliability Engineering, Platform Engineering, or related operational engineering disciplines.
Strong hands-on experience across multiple SRE domains, including cloud, automation, observability, and CI/CD.
Demonstrated experience designing and implementing automation and reliability solutions at scale.
Deep knowledge of at least one cloud platform (AWS or Azure), including networking, identity, and observability primitives.
Experience with Infrastructure-as-Code (e.g. Terraform) and CI/CD pipelines.
Strong scripting experience (e.g. Python).
Experience leading incident response and driving systemic improvement.
Strong stakeholder engagement and technical communication skills.
Deep hands-on experience with one or more enterprise observability platforms (e.g. Splunk, Datadog).
Proven experience designing and operating telemetry pipelines, ingestion controls, and observability cost management.
Proven experience designing signals (SLIs, SLOs, synthetic checks, alerts) and ops automation triggered from those signals.
Experience driving SLO/SLI practices across multiple teams.
Deep hands-on experience operating cloud infrastructure across at least two of AWS, Azure, GCP, or Alibaba Cloud.
Proven experience designing reusable IaC patterns and landing zone components across cloud providers.
Strong working knowledge of cloud networking, account management, identity primitives, and policy enforcement across providers.
Experience driving cloud platform engineering standards and governance across multiple teams.
Deep hands-on experience with identity platforms (e.g. Entra ID) and secrets management (e.g. HashiCorp Vault).
Proven experience designing OIDC, workload identity, and dynamic credential patterns.
Experience driving Zero Trust and least-privilege adoption across multiple teams.
Deep hands-on experience with security tooling embedded in CI/CD pipelines.
Proven experience designing policy-as-code controls and secure-by-default patterns.
Experience driving secure engineering adoption across multiple teams.
Deep hands-on experience with hybrid and cloud network architectures.
Proven experience designing automated network controls through IaC.
Experience driving Zero Trust segmentation and network observability adoption.

Preferred qualifications

Experience working within a federated, multi-cloud, or large enterprise environment.
Familiarity with containerisation (Docker) and orchestration (Kubernetes).
Experience with secrets management tooling (e.g. HashiCorp Vault).
Cloud certification at professional level.
Experience with policy-as-code tooling (e.g. OPA, Sentinel).
Experience contributing to engineering communities of practice.
Experience with AIOps, noise reduction, and event correlation.
Experience with event-driven ops automation platforms (e.g. ServiceNow, PagerDuty, custom workflows).
Ability to lead complex observability platform incidents and capacity reviews.
Experience with cloud FinOps, cost engineering, and chargeback tooling.
Hands-on experience with Alibaba Cloud platform architecture.
Experience with cloud policy-as-code tools (e.g. AWS Service Control Policies, Azure Policy, OPA).
Strong understanding of identity-related security risks and mitigations.
Strong understanding of common security risks and mitigations across the SDLC.
Strong understanding of network reliability, observability, and security patterns.

Who You’ll Work With

Hybrid or on-site work model.
Operates as a senior individual contributor with mentorship and cross-team influence.
Expected to participate in on-call rotation and lead incident response.
Occasional travel may be required for team or stakeholder engagement.

The company is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, age, religion, sex, sexual orientation, gender identity / expression, national origin, disability, protected veteran status, or any other characteristic protected under national, provincial, or local law, where applicable, and those with criminal histories will be considered in a manner consistent with applicable state and local laws.

Senior Site Reliability Engineer in London employer: United States Digital Space LLC

At BCG, we pride ourselves on being an exceptional employer, offering a dynamic work culture that fosters collaboration and innovation. Our commitment to employee growth is evident through mentorship opportunities and a focus on continuous improvement, particularly in the Senior Site Reliability Engineer role, where you will have the chance to shape engineering standards and drive impactful change. Located in a vibrant environment, we provide a unique blend of professional development and meaningful work that empowers our teams to thrive and make a difference.

Contact Details:

United States Digital Space LLC Recruitment Team


StudySmarter Expert Advice🤫

We think this is how you could land Senior Site Reliability Engineer in London

✨Join the IT Consultancy Buzz

Get involved in local or virtual IT consultancy meetups and forums. This is where you can rub shoulders with industry professionals, get insights into what United States Digital Space LLC values, and even spot unadvertised opportunities. Don't miss out on these chances to make a name for yourself in the IT world!

✨Show Off Your Skills

Create a personal project or case study relevant to the challenges United States Digital Space LLC might face. Use platforms like GitHub or Medium to share your findings. This not only demonstrates your consulting skills but also shows a proactive attitude, making you stand out from the crowd when applying for that full-time gig.

✨Leverage LinkedIn for Connections

Follow and engage with the relevant thought leaders and influencers in IT consultancy on LinkedIn. Share insightful content and join discussions to gain visibility. A well-placed comment or shared article could catch the attention of someone at United States Digital Space LLC!

✨Direct Apply to United States Digital Space LLC

Don't forget to apply directly through the United States Digital Space LLC website! Tailor your application to showcase your understanding of their consulting style and how you can contribute to their projects. A personalised approach can make a huge difference in landing that full-time position!

We think you need these skills to ace Senior Site Reliability Engineer in London

Site Reliability Engineering

Platform Engineering

Cloud Computing (AWS, Azure, GCP, Alibaba Cloud)

Automation

Observability

CI/CD Pipelines

Infrastructure-as-Code (Terraform)

Scripting (Python)

Incident Response

Stakeholder Engagement

Technical Communication

Telemetry Pipeline Design

SLI/SLO Practices

Identity Management (Entra ID)

Security Tooling in CI/CD

Some tips for your application 🫡

Showcase Your Problem-Solving Skills: In IT consulting, it's all about problem-solving, so make sure your CV highlights your analytical skills and any relevant projects you've tackled. Mention specific technologies or methodologies you've used to resolve issues or improve processes; this shows you can think critically and deliver results, which is vital for us at United States Digital Space LLC.

Highlight Relevant Certifications: Certifications like ITIL, PMP, or even specific tech stack qualifications can really make you stand out. Make sure to include these in your CV, as they not only demonstrate your expertise but also your commitment to staying current in the field. We love seeing candidates who are proactive about their professional development!

Tailor Your Cover Letter: Your cover letter is your chance to connect personally with us at United States Digital Space LLC. Share stories about your experiences in IT consulting, and how they shaped your desire to join our team. Mention why you’re excited about this particular role, and how you see yourself contributing to our projects.

Keep It Clear and Concise: We're all busy, so make sure your application is easy to read. Use bullet points for key achievements, and don’t overload us with jargon. A clean, professional layout goes a long way. Remember, the clearer your application, the more likely we are to invite you in for an interview!

How to prepare for a job interview at United States Digital Space LLC

✨Brush Up on Your Technical Skills

For an IT consulting role, be ready to demonstrate your technical prowess. You might face questions on systems integration, cloud technologies, or even troubleshooting specific software. If you have experience with tools like AWS, Azure, or even specific programming languages, make sure you can talk about them fluently.

✨Showcase Your Problem-Solving Approach

IT consulting is all about solving problems for clients. Think about how you can illustrate your approach to a past challenge using the STAR method (Situation, Task, Action, Result). It's a great way to show how you tackle complex issues and come up with effective solutions.

✨Know the Business Impact of IT Solutions

When discussing your experiences, focus not just on the tech solutions you implemented, but also on their business impact. Employers want to see that you can connect IT with organisational goals. Prep examples that highlight how your tech contributions improved efficiency or reduced costs for past clients or projects.

✨Prepare for Behavioural Questions

Since IT consulting often involves teamwork and client interactions, expect behavioural questions that assess your interpersonal skills. Be prepared with examples that demonstrate your adaptability, communication skills, and how you handle client feedback. Before the interview, think of situations where you worked closely with clients to create effective IT strategies or changes.
