At a Glance
- Tasks: Lead reliability engineering efforts and improve operational systems across the organisation.
- Company: Join Boston Consulting Group, a leader in business strategy and transformation.
- Benefits: Enjoy a hybrid work model, competitive salary, and opportunities for professional growth.
- Other info: Collaborative environment with a focus on mentorship and cross-team influence.
- Why this job: Make a real impact by driving innovation and mentoring future engineers.
- Qualifications: 5-8 years in Site Reliability Engineering with strong cloud and automation skills.
The predicted salary is between £80,000 and £100,000 per year.
Boston Consulting Group partners with leaders in business and society to tackle their most important challenges and capture their greatest opportunities. BCG was the pioneer in business strategy when it was founded in 1963. Today, we help clients with total transformation: inspiring complex change, enabling organizations to grow, building competitive advantage, and driving bottom‑line impact. To succeed, organizations must blend digital and human capabilities. Our diverse, global teams bring deep industry and functional expertise and a range of perspectives to spark change.
The Senior Site Reliability Engineer is responsible for running the engineering capability behind a defined area of reliability across the organisation. The role works across multiple SRE disciplines including infrastructure, cloud, observability, automation, identity, security, and network operations, applying engineering thinking to reduce operational toil, improve resilience, and embed reliability and governance into delivery and operational workflows. The role drives engineering quality and consistency within its scope of responsibility, contributes to wider engineering standards, and helps shape how reliability is delivered across the organisation. It builds reusable patterns, mentors engineers, and provides senior engineering input across a wider set of stakeholders. The ideal candidate is a senior practitioner who is comfortable operating across multiple domains, balances delivery with mentorship, and can articulate engineering trade‑offs clearly to both technical and non‑technical audiences.
Core Responsibilities
- Run and continuously improve the reliability engineering systems within scope, including automation, pipelines, observability, and operational tooling.
- Design and implement engineering solutions that eliminate operational toil at scale and embed reliability into delivery workflows.
- Help shape engineering standards, patterns, and reusable frameworks across the SRE practice.
- Lead the engineering response to complex incidents within scope, drive systemic remediation, and contribute to post‑incident learning.
- Mentor and coach senior engineers across reliability engineering, automation, observability, and SRE principles.
- Drive cross‑team collaboration with engineering, platform, and operations functions to embed reliability and governance through engineering controls.
- Communicate engineering status, risks, and recommendations clearly to senior stakeholders and leadership forums.
- Contribute to monthly operational reviews with structured metrics on service health, ingestion or pipeline performance, automation coverage, and improvement progress.
What You’ll Bring
- 5–8 years of experience in Site Reliability Engineering, Platform Engineering, or related operational engineering disciplines.
- Strong hands‑on experience across multiple SRE domains, including cloud, automation, observability, and CI/CD.
- Demonstrated experience designing and implementing automation and reliability solutions at scale.
- Deep knowledge of at least one cloud platform (AWS or Azure), including networking, identity, and observability primitives.
- Experience with Infrastructure‑as‑Code (e.g. Terraform) and CI/CD pipelines.
- Strong scripting experience (e.g. Python).
- Experience leading incident response and driving systemic improvement.
- Strong stakeholder engagement and technical communication skills.
- Deep hands‑on experience with one or more enterprise observability platforms (e.g. Splunk, Datadog).
- Proven experience designing and operating telemetry pipelines, ingestion controls, and observability cost management.
- Proven experience designing signals (SLIs, SLOs, synthetic checks, alerts) and ops automation triggered from those signals.
- Experience driving SLO/SLI practices across multiple teams.
- Deep hands‑on experience operating cloud infrastructure across at least two of AWS, Azure, GCP, or Alibaba Cloud.
- Proven experience designing reusable IaC patterns and landing‑zone components across cloud providers.
- Strong working knowledge of cloud networking, account management, identity primitives, and policy enforcement across providers.
- Experience driving cloud platform engineering standards and governance across multiple teams.
- Deep hands‑on experience with identity platforms (e.g. Entra ID) and secrets management (e.g. HashiCorp Vault).
- Proven experience designing OIDC, workload identity, and dynamic credential patterns.
- Experience driving Zero Trust and least‑privilege adoption across multiple teams.
- Deep hands‑on experience with security tooling embedded in CI/CD pipelines.
- Proven experience designing policy‑as‑code controls and secure‑by‑default patterns.
- Experience driving secure engineering adoption across multiple teams.
- Deep hands‑on experience with hybrid and cloud network architectures.
- Proven experience designing automated network controls through IaC.
- Experience driving Zero Trust segmentation and network observability adoption.
Preferred Qualifications
- Experience working within a federated, multi‑cloud, or large enterprise environment.
- Familiarity with containerisation (Docker) and orchestration (Kubernetes).
- Experience with secrets management tooling (e.g. HashiCorp Vault).
- Cloud certification at professional level.
- Experience with policy‑as‑code tooling (e.g. OPA, Sentinel).
- Experience contributing to engineering communities of practice.
- Experience with AIOps, noise reduction, and event correlation.
- Experience with event‑driven ops automation platforms (e.g. ServiceNow, PagerDuty, custom workflows).
- Ability to lead complex observability platform incidents and capacity reviews.
- Experience with cloud FinOps, cost engineering, and chargeback tooling.
- Hands‑on experience with Alibaba Cloud platform architecture.
- Experience with cloud policy‑as‑code tools (e.g. AWS Service Control Policies, Azure Policy, OPA).
- Strong understanding of identity‑related security risks and mitigations.
- Strong understanding of common security risks and mitigations across the SDLC.
- Strong understanding of network reliability, observability, and security patterns.
Who You’ll Work With
- Hybrid or on‑site work model.
- Operates as a senior individual contributor with mentorship and cross‑team influence.
- Expected to participate in on‑call rotation and lead incident response.
- Occasional travel may be required for team or stakeholder engagement.
Boston Consulting Group is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, age, religion, sex, sexual orientation, gender identity / expression, national origin, disability, protected veteran status, or any other characteristic protected under national, provincial, or local law, where applicable, and those with criminal histories will be considered in a manner consistent with applicable state and local laws. BCG is an E‑Verify Employer.
Senior Site Reliability Engineer in London employer: Boston Consulting Group (BCG)
At Boston Consulting Group, we pride ourselves on fostering a dynamic and inclusive work environment that empowers our employees to thrive. As a Senior Site Reliability Engineer, you will benefit from unparalleled opportunities for professional growth, mentorship from industry leaders, and the chance to work on transformative projects that make a real impact. Our collaborative culture, combined with a commitment to innovation and excellence, makes BCG an exceptional employer for those seeking meaningful and rewarding careers in technology.
Contact Details:
Boston Consulting Group (BCG) Recruitment Team
StudySmarter Expert Advice🤫
We think this is how you could land the Senior Site Reliability Engineer role in London
✨Tip Number 1
Network like a pro! Reach out to your connections in the industry, attend meetups, and engage in online forums. You never know who might have the inside scoop on job openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects and contributions. This gives potential employers a tangible look at what you can do, especially in SRE and automation.
✨Tip Number 3
Prepare for interviews by brushing up on both technical and soft skills. Practice explaining complex concepts in simple terms, as you'll need to communicate effectively with both technical and non-technical stakeholders.
✨Tip Number 4
Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you're genuinely interested in joining our team at BCG.
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter for the Senior Site Reliability Engineer role. Highlight your experience in SRE disciplines like cloud, automation, and observability, and show how you can bring value to our team at BCG.
Showcase Your Technical Skills: Don’t hold back on showcasing your hands-on experience with tools and technologies relevant to the role. Mention your expertise in Infrastructure-as-Code, CI/CD pipelines, and any cloud platforms you've worked with, as these are key to what we do.
Communicate Clearly: When writing your application, keep it clear and concise. Use straightforward language to explain your technical skills and experiences, making it easy for us to see how you can contribute to our engineering standards and practices.
Apply Through Our Website: We encourage you to apply directly through our website. This way, you’ll ensure your application gets to the right people and you can easily track your application status. Plus, it’s the best way to show your enthusiasm for joining our team!
How to prepare for a job interview at Boston Consulting Group (BCG)
✨Know Your SRE Domains
Make sure you brush up on your knowledge across multiple SRE disciplines like cloud, automation, and observability. Be ready to discuss specific projects where you've applied these skills, as this will show your hands-on experience and understanding of the role.
✨Prepare for Technical Questions
Expect in-depth technical questions about your experience with Infrastructure-as-Code, CI/CD pipelines, and cloud platforms. Practise explaining complex concepts in simple terms, as you'll need to communicate effectively with both technical and non-technical stakeholders.
✨Showcase Your Mentorship Skills
Since mentoring is a key part of the role, think of examples where you've guided junior engineers or led teams through challenging situations. Highlight how you foster collaboration and knowledge sharing, as this aligns with the company's collaborative model.
✨Demonstrate Incident Response Experience
Be prepared to discuss your experience leading incident responses and driving systemic improvements. Share specific incidents you've managed, what you learned from them, and how you implemented changes to prevent future issues. This will showcase your problem-solving skills and resilience.