Staff Site Reliability Engineer in Cheltenham

Cheltenham, Full-Time, £124,000 - £141,000 / year (est.), Home office (partial)
Obsidian Security

At a Glance

  • Tasks: Lead the reliability vision for a cutting-edge SaaS platform and tackle complex system challenges.
  • Company: Join Obsidian Security, a fast-growing tech company transforming SaaS security.
  • Benefits: Enjoy competitive pay, equity options, flexible time off, and comprehensive healthcare.
  • Other info: Be part of a diverse team driving innovation in SaaS security.
  • Why this job: Make a real impact by safeguarding critical infrastructure for major enterprises.
  • Qualifications: 5+ years in SRE roles with strong expertise in AWS, Kubernetes, and observability tools.

The predicted salary is between £124,000 and £141,000 per year.

Founded in 2017, Obsidian Security was created to close a critical gap: securing the SaaS applications where modern business happens, on platforms like Microsoft 365, Salesforce, and hundreds more. Backed by top investors including Greylock, Norwest Venture Partners, and IVP, we’ve built a complete SaaS security platform to reduce risk, detect and respond to threats, and prevent breaches at the source. Our team includes leaders who helped define the categories of endpoint and identity security at CrowdStrike, Okta, Cylance, and Carbon Black. Now, we’re transforming how SaaS is secured in the era of agentic AI.

Today, Obsidian is trusted by global enterprises like Snowflake, T-Mobile, and Pure Storage. We protect more than 200 organizations across North America, Europe, the Middle East, Southeast Asia, Australia, and New Zealand, including many of the world’s largest Fortune 1000 and Global 2000 companies. With strong global momentum, a growing partner ecosystem including SentinelOne, Databricks, and Google Cloud, and a major fundraise on the horizon, we’re scaling quickly toward long-term growth and IPO readiness. Join us as we define the future of SaaS security!

As a Staff SRE at Obsidian, you will define and drive the company-wide reliability vision for a complex, multi-tenant SaaS platform serving enterprise and financial customers. You will operate as a strategic partner to DevOps and Platform Engineering leadership, shaping a unified reliability strategy that scales across the organization. Your core mandate: ensure Obsidian detects, diagnoses, and communicates system issues before customers are impacted—consistently and predictably. This is a hands-on technical role that involves architecting and leading the implementation of systems that handle real-world complexity, including upstream SaaS dependencies, sparse and noisy signals, and mission-critical enterprise workloads.

Key Responsibilities
  • Reliability Strategy & Architecture - Define and lead long-term reliability strategy across services. Establish end-to-end system visibility frameworks and guide architecture for observability, detection, and resilience.
  • Cross-Org Leadership - Partner across teams to embed reliability, standardize SLI/SLOs, and serve as a technical escalation expert.
  • Detection & Observability - Build intelligent detection systems (anomaly detection, connector health models) and enable self-service observability.
  • Incident Management - Define and evolve a tiered incident communication strategy, improve response practices, and lead postmortems to strengthen reliability and customer trust.
  • Execution - Contribute hands-on to system design, monitoring, and debugging across distributed systems and data pipelines.
Required Qualifications
  • 5+ years in SRE, Production Engineering, or related roles
  • 3+ years operating at a senior or technical leadership level (Staff or equivalent scope)
  • Deep expertise in: AWS and/or GCP, Kubernetes and Helm, Observability stacks (Prometheus, Grafana, or equivalent), CI/CD systems (GitLab CI/CD, ArgoCD, etc.)
  • Proven experience designing and scaling reliability systems for multi-tenant SaaS platforms
  • Strong debugging and systems thinking across distributed microservices and legacy systems
  • Demonstrated ability to lead initiatives that improve incident detection, response, and system resilience
  • Hands-on engineering approach with a track record of building—not just configuring—reliability systems
Preferred Qualifications
  • Experience in B2B SaaS serving enterprise or financial customers
  • Familiarity with third-party SaaS connector architectures and ingestion patterns
  • Experience building anomaly detection or intelligent alerting systems
  • Experience designing customer-facing status pages and incident communication frameworks
Why This Role
  • Drive org-wide reliability strategy
  • Own and build new detection & observability systems
  • Tackle complex distributed systems challenges
  • Safeguard critical infrastructure for financial customers
What Success Looks Like
  • Issues caught and resolved before customer impact
  • Reliability is measurable and continuously improving
  • Teams self-serve observability with scalable tools
  • Clear, proactive incident communication builds trust
  • Reliability becomes a competitive advantage
Employee Benefits

Our competitive benefits packages are designed to support our employees' well-being, both at work and at home. Our US-based employees enjoy:

  • Competitive compensation with equity and 401k
  • Comprehensive healthcare with dental and vision coverage
  • Flexible paid time off and paid holiday time off
  • 12 weeks of new parent or family leave
  • Personal and professional development resources

For more details on our US benefits, or for information on our international benefits, please see here.

Pay Transparency

Please note that the base pay range is a guideline. For candidates who receive an offer, the base pay will vary based on factors such as work location and the candidate's knowledge, skills, and experience. In addition to a competitive base salary, this position is eligible for equity awards and may be eligible for sales commission or incentive compensation based on the role or function within the company.

At Obsidian, we are proud to be an equal-opportunity employer. We value diversity and hire for talent, passion, and compassion. In compliance with federal law, all persons hired will be required to submit satisfactory proof of identity and legal authorization. If you have a need that requires accommodation, please contact accommodations@obsidiansecurity.com. Information collected and processed as part of any job applications you choose to submit is subject to Obsidian’s Applicant Privacy Policy.

Base Salary Range

£124,000 - £141,000 GBP

Employer: Obsidian Security

At Obsidian Security, we pride ourselves on being an exceptional employer that fosters a culture of innovation and collaboration. Our commitment to employee growth is evident through comprehensive professional development resources and a competitive benefits package, including flexible paid time off and generous parental leave. Join us in our mission to redefine SaaS security while enjoying the unique advantages of working in a dynamic, fast-paced environment that values diversity and empowers every team member to make a meaningful impact.

Contact Details:

Obsidian Security Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Staff Site Reliability Engineer role in Cheltenham

✨Tip Number 1

Network like a pro! Reach out to current employees at Obsidian on LinkedIn or other platforms. Ask them about their experiences and any tips they might have for landing the Staff SRE role. Personal connections can make a huge difference!

✨Tip Number 2

Prepare for technical interviews by brushing up on your SRE skills. Focus on real-world scenarios you’ve faced, especially around incident management and system reliability. We want to hear how you’ve tackled challenges in distributed systems!

✨Tip Number 3

Showcase your hands-on experience! Be ready to discuss specific projects where you built or improved reliability systems. Highlight your expertise with AWS, GCP, and observability stacks—this is your chance to shine!

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in joining the Obsidian team. Let’s get you on board!

We think you need these skills to ace the Staff Site Reliability Engineer role in Cheltenham

Reliability Strategy
System Architecture
End-to-End System Visibility
Observability Frameworks
Incident Management
Anomaly Detection
Kubernetes
AWS
GCP
CI/CD Systems
Debugging Skills
Systems Thinking
Technical Leadership
Cross-Organisational Collaboration
Customer Communication

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the Staff Site Reliability Engineer role. Highlight your experience with AWS, GCP, and any relevant SRE projects you've worked on. We want to see how your skills align with our mission!

Showcase Your Technical Skills: Don’t just list your skills—demonstrate them! Include specific examples of how you've designed and implemented reliability systems in the past. We love seeing hands-on experience that shows you can tackle real-world challenges.

Communicate Clearly: When writing your application, clarity is key. Use straightforward language and structure your thoughts logically. We appreciate a well-organised application that makes it easy for us to see your qualifications.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you're genuinely interested in joining our team at Obsidian!

How to prepare for a job interview at Obsidian Security

✨Know Your Stuff

Make sure you brush up on your technical skills, especially around AWS, GCP, Kubernetes, and observability stacks like Prometheus and Grafana. Be ready to discuss how you've applied these in real-world scenarios, as this role is all about hands-on experience.

✨Understand the Company’s Vision

Familiarise yourself with Obsidian Security's mission to secure SaaS applications. Think about how your experience aligns with their goals and be prepared to share insights on how you can contribute to their reliability strategy.

✨Prepare for Scenario Questions

Expect questions that ask you to solve hypothetical problems related to incident management or system reliability. Practice articulating your thought process clearly, as they’ll want to see how you approach complex challenges.

✨Show Your Leadership Skills

Since this role involves cross-org leadership, be ready to discuss past experiences where you led initiatives or collaborated with different teams. Highlight your ability to standardise processes and improve incident detection and response.
