Site Reliability Engineer

London | Full-Time | £57,600 - £84,000 / year (est.) | No home office possible

At a Glance

  • Tasks: Shape our SRE strategy and optimise production environments for reliability and efficiency.
  • Company: Thredd is a rapidly growing company focused on building reliable, scalable systems.
  • Benefits: Enjoy a collaborative culture, high-impact role, and the chance to lead SRE best practices.
  • Why this job: Make a real impact by improving infrastructure and service reliability in an innovative environment.
  • Qualifications: Experience with SRE principles, coding skills in Python or C#, and cloud platforms like AWS required.
  • Other info: This is a full-time, mid-senior level position based in London.

The predicted salary is between £57,600 and £84,000 per year.

Are you passionate about building reliable, scalable, and high-performing systems? Do you thrive on solving complex infrastructure challenges while driving automation and observability best practices? If so, we want to hear from you!
At Thredd, we’re looking for a Site Reliability Engineer to act as a North Star for this evolving discipline. As our first engineer in this role, you’ll have the unique opportunity to shape our SRE strategy, establish best practices, and set the standard for service reliability and performance.

What You’ll Do

  • Define strategies for Application Performance Monitoring, Unit Cost, and Chaos Engineering.
  • Continuously optimize production environments to enhance reliability and efficiency.
  • Implement and apply MTTR, SLO, and SLI principles to ensure high service standards.
  • Respond to incidents, analyze root causes, and drive long-term improvements.
  • Maintain fault-tolerant, scalable, and cost-effective infrastructures and services.
  • Monitor availability, latency, and system health to keep our platform running smoothly.
  • Lead blameless postmortems and refine our incident response processes.
  • Provide feedback loops to development teams on operational gaps and resiliency concerns.
  • Support services before they go live with system design consulting, capacity planning, and launch reviews.
  • Scale systems sustainably through automation and infrastructure evolution.
  • Deeply understand our customers’ needs and the critical role Thredd plays in their businesses.

What You’ll Be Working On

  • Building and maintaining the infrastructure, tooling, and technical foundation of Thredd.
  • Ensuring high service uptime and reliability so product teams can innovate effectively.
  • Playing a key role in shaping the core technology layers that drive our platform’s success.

What You Need

  • Proven experience implementing SRE principles at scale, including deep knowledge of SLI/SLO/SLA differences.
  • A product engineering background with strong coding skills in Python, C#, or similar.
  • Experience with incident management frameworks and evolving them for efficiency.
  • Expertise in cloud platforms (AWS preferred) and container orchestration (Docker, Kubernetes, ECS).
  • Solid understanding of microservices, service mesh, and modern architectural concepts.
  • A collaborative mindset: you thrive on helping others and driving company-wide impact.

Nice to Have

  • Experience working in regulated industries (e.g., PCI compliance).
  • Background in capacity planning, performance, and load testing.
  • Sysadmin skills for troubleshooting disk, network, and infrastructure issues.

Why Join Thredd?

  • The chance to define and lead SRE best practices from the ground up.
  • A high-impact role in a rapidly growing company.
  • A collaborative, innovation-driven culture where your expertise will shape our platform’s future.

If you’re excited about scaling infrastructure, improving reliability, and making a real impact, apply now and help us build the future of Thredd!

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology

Site Reliability Engineer employer: Thredd

At Thredd, we pride ourselves on fostering a collaborative and innovation-driven culture where your expertise as a Site Reliability Engineer will directly shape our platform's future. With the unique opportunity to define and lead SRE best practices from the ground up, you'll be part of a rapidly growing company that values your contributions and offers ample opportunities for professional growth in the vibrant city of London.

Contact Details:

Thredd Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role

✨Tip Number 1

Familiarise yourself with SRE principles, especially SLI, SLO, and SLA differences. Being able to discuss these concepts confidently during your interview will demonstrate your expertise and understanding of the role.

✨Tip Number 2

Showcase your experience with cloud platforms, particularly AWS, and container orchestration tools like Docker and Kubernetes. Prepare examples of how you've used these technologies to solve real-world problems in previous roles.

✨Tip Number 3

Be ready to discuss your approach to incident management and how you've improved processes in the past. Highlight any specific frameworks you've implemented and the impact they had on service reliability.

✨Tip Number 4

Demonstrate your collaborative mindset by preparing examples of how you've worked with cross-functional teams. Emphasising your ability to drive company-wide impact will resonate well with Thredd's culture.

We think you need these skills to ace the Site Reliability Engineer role

Application Performance Monitoring
Chaos Engineering
MTTR, SLO, and SLI principles
Incident Management Frameworks
Cloud Platforms (AWS preferred)
Container Orchestration (Docker, Kubernetes, ECS)
Microservices Architecture
Service Mesh Concepts
Python or C# Programming
Capacity Planning
Performance and Load Testing
Troubleshooting Disk, Network, and Infrastructure Issues
Collaborative Mindset
Blameless Postmortems
Operational Feedback Loops

Some tips for your application 🫡

Understand the Role: Before applying, make sure you fully understand the responsibilities and requirements of the Site Reliability Engineer position at Thredd. Familiarise yourself with SRE principles, incident management frameworks, and the technologies mentioned in the job description.

Tailor Your CV: Customise your CV to highlight relevant experience and skills that align with the job description. Emphasise your background in implementing SRE principles, coding skills, and any experience with cloud platforms and container orchestration.

Craft a Compelling Cover Letter: Write a cover letter that showcases your passion for building reliable systems and your ability to solve complex infrastructure challenges. Mention specific examples from your past experiences that demonstrate your expertise in SRE practices and your collaborative mindset.

Highlight Relevant Projects: If you have worked on projects that involved application performance monitoring, automation, or incident response, be sure to include these in your application. Detail your role in these projects and the impact they had on service reliability and performance.

How to prepare for a job interview at Thredd

✨Showcase Your SRE Knowledge

Be prepared to discuss your experience with SRE principles, particularly SLI, SLO, and SLA. Highlight specific examples where you've implemented these concepts in previous roles, as this will demonstrate your understanding and capability in the field.

✨Demonstrate Problem-Solving Skills

Expect to face scenario-based questions that assess your ability to troubleshoot and resolve incidents. Share detailed accounts of past incidents you've managed, focusing on your approach to root cause analysis and the improvements you implemented afterwards.

✨Familiarise Yourself with Their Tech Stack

Research Thredd's technology stack, especially their use of cloud platforms like AWS and container orchestration tools such as Docker and Kubernetes. Being knowledgeable about their systems will allow you to engage in more meaningful discussions during the interview.

✨Emphasise Collaboration

Thredd values a collaborative mindset, so be ready to discuss how you've worked with cross-functional teams in the past. Share examples of how you've contributed to team success and driven company-wide impact through collaboration.
