Site Reliability Engineer

Full-Time, £60,000 - £80,000 / year (est.), Home office (partial)
Thredd

At a Glance

  • Tasks: Design and implement scalable, reliable network solutions while driving automation and best practices.
  • Company: Join Thredd, a next-gen payments partner transforming the fintech landscape.
  • Benefits: Flexible working model, competitive salary, and opportunities for professional growth.
  • Why this job: Shape the future of service reliability and make a real impact in a dynamic environment.
  • Qualifications: Experience in infrastructure, coding skills in Python, and knowledge of cloud platforms.
  • Other info: Be part of a diverse team committed to innovation and excellence.

The predicted salary is between £60,000 and £80,000 per year.

Are you passionate about building reliable, scalable, and high‑performing systems? Do you thrive on solving complex infrastructure challenges while driving automation and observability best practices? If so, we want to hear from you!

As our first engineer in this role, you’ll have the unique opportunity to shape our SRE strategy, establish best practices, and set the standard for service reliability and performance.

The Impact You’ll Have as a Site Reliability Engineer:
  • Design and oversee the implementation of complex, secure, and scalable network solutions that support global transaction processing.
  • Lead network innovation by identifying opportunities to adopt emerging technologies and drive efficiency.
  • Coordinate and prioritise network‑related initiatives across teams, balancing operational needs with strategic growth.
  • Mentor and support engineers within the team, fostering technical excellence and a customer‑focused mindset.
  • Drive performance and reporting, delivering insights and data that help optimise system health and uptime.
  • Collaborate with stakeholders, vendors, and service providers to ensure seamless integration and service quality.
  • Develop and enforce quality assurance protocols and documentation standards across our network landscape.
  • Own strategic network planning, ensuring infrastructure evolves in step with our product and market expansion.
What You’ll Bring to the Site Reliability Engineer Position:
  • Proven experience building and maintaining infrastructure, tooling, and technical foundations at scale.
  • Strong track record of ensuring high service uptime and reliability to empower product teams to innovate effectively.
  • Expertise in shaping and evolving core technology layers that underpin a successful, high‑growth platform.
  • Proven experience implementing SRE principles at scale, including deep knowledge of SLI/SLO/SLA differences.
  • A product engineering background with strong coding skills in Python or similar.
  • Experience with incident management frameworks and evolving them for efficiency.
  • Expertise in cloud platforms (AWS preferred) and container orchestration (Docker, Kubernetes, ECS).
  • Solid understanding of microservices, service mesh, and modern architectural concepts.
  • A collaborative mindset – you thrive on helping others and driving company‑wide impact.
Nice to Have:
  • Experience working in regulated industries (e.g., PCI compliance).
  • Background in capacity planning, performance, and load testing.
  • Sysadmin skills for troubleshooting disk, network, and infrastructure issues.

This Site Reliability Engineer position requires you to be in the London office (Holborn) one day per week.

Here at Thredd, we are committed to building a diverse and inclusive workplace where everyone feels valued, respected and empowered. We welcome applications from people of all backgrounds, experiences and identities. If you require any adjustments during the recruitment process, please let us know and we would be happy to support you.

Our Values:
  • Own it and deliver – Taking responsibility for your own performance and being successful in your own role.
  • Collaborate purposefully – Building trusted relationships with colleagues, supporting activities and being successful together.
  • Think differently – Asking questions to check understanding and sharing your ideas to support continuous improvement.
  • Act courageously – Stepping out of your comfort zone and embracing change to help you learn and grow.

Site Reliability Engineer employer: Thredd

At Thredd, we pride ourselves on being an exceptional employer that fosters a culture of innovation and collaboration. As a Site Reliability Engineer in our Holborn office, you'll not only have the chance to shape our SRE strategy but also benefit from a flexible working model, mentorship opportunities, and a commitment to diversity and inclusion. Join us to work with cutting-edge technology in a supportive environment that values your contributions and encourages professional growth.

Contact Details:

Thredd Recruiting Team

StudySmarter Expert Advice 🤫

How we think you could land the Site Reliability Engineer role

✨Tip Number 1

Network, network, network! Get out there and connect with folks in the industry. Attend meetups, webinars, or even just grab a coffee with someone who’s already in the SRE space. You never know where a casual chat might lead!

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those that highlight your experience with cloud platforms, automation, and incident management. This is your chance to shine and demonstrate what you can bring to the table.

✨Tip Number 3

Prepare for interviews by brushing up on SRE principles and be ready to discuss real-world scenarios. Think about how you’ve tackled challenges in the past and be prepared to share those stories. We love hearing about your problem-solving skills!

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team at Thredd. Let’s make it happen!

We think you need these skills to ace the Site Reliability Engineer role

Infrastructure Management
Automation
Observability Best Practices
Network Solutions Design
Emerging Technologies Adoption
Mentoring and Support
Performance Optimisation
Stakeholder Collaboration
Quality Assurance Protocols
SRE Principles Implementation
SLI/SLO/SLA Knowledge
Coding Skills in Python
Incident Management Frameworks
Cloud Platforms (AWS)
Container Orchestration (Docker, Kubernetes, ECS)
Microservices Understanding

Some tips for your application 🫡

Show Your Passion: Let us see your enthusiasm for building reliable and scalable systems right from the start. Use your application to highlight specific projects or experiences that showcase your love for solving complex infrastructure challenges.

Tailor Your Application: Make sure to customise your CV and cover letter to reflect the skills and experiences mentioned in the job description. We want to see how your background aligns with our needs, so don’t be shy about drawing those connections!

Be Clear and Concise: When writing your application, keep it straightforward and to the point. We appreciate clarity, so avoid jargon unless it’s relevant. Make it easy for us to see why you’re a great fit for the Site Reliability Engineer role.

Apply Through Our Website: We encourage you to submit your application through our website. It’s the best way for us to receive your details and ensures you’re considered for the role. Plus, it shows you’re keen on joining our team!

How to prepare for a job interview at Thredd

✨Know Your SRE Principles

Make sure you brush up on your understanding of SLI, SLO, and SLA differences. Being able to articulate these concepts clearly will show that you have a solid grasp of Site Reliability Engineering principles, which is crucial for the role.

✨Showcase Your Technical Skills

Prepare to discuss your experience with cloud platforms like AWS and container orchestration tools such as Docker and Kubernetes. Bring examples of how you've implemented these technologies in past projects to demonstrate your hands-on expertise.

✨Emphasise Collaboration

Since this role involves mentoring and working with various teams, be ready to share examples of how you've successfully collaborated with others in previous roles. Highlight your ability to foster a customer-focused mindset and drive company-wide impact.

✨Prepare for Problem-Solving Questions

Expect to face questions that assess your problem-solving skills, especially around incident management and troubleshooting. Think of specific challenges you've faced and how you approached them, as this will showcase your analytical thinking and resilience.
