At a Glance
- Tasks: Shape our SRE strategy and optimise production environments for reliability and efficiency.
- Company: Thredd is a rapidly growing company focused on building reliable, scalable systems.
- Benefits: Enjoy a collaborative culture, high-impact role, and the chance to lead SRE best practices.
- Why this job: Make a real impact by improving infrastructure and service reliability in an innovative environment.
- Qualifications: Experience with SRE principles, coding skills in Python or C#, and cloud platforms like AWS required.
- Other info: This is a full-time, mid-senior level position based in London.
The predicted salary is between £57,600 and £84,000 per year.
Are you passionate about building reliable, scalable, and high‑performing systems? Do you thrive on solving complex infrastructure challenges while driving automation and observability best practices? If so, we want to hear from you!
Site Reliability Engineer
At Thredd, we’re looking for a Site Reliability Engineer to act as a North Star for this evolving discipline. As our first engineer in this role, you’ll have the unique opportunity to shape our SRE strategy, establish best practices, and set the standard for service reliability and performance.
The Impact You’ll Have as a Site Reliability Engineer
- Design and oversee the implementation of complex, secure, and scalable network solutions that support global transaction processing.
- Lead network innovation by identifying opportunities to adopt emerging technologies and drive efficiency.
- Coordinate and prioritise network‑related initiatives across teams, balancing operational needs with strategic growth.
- Mentor and support engineers within the team, fostering technical excellence and a customer‑focused mindset.
- Drive performance and reporting, delivering insights and data that help optimise system health and uptime.
- Collaborate with stakeholders, vendors, and service providers to ensure seamless integration and service quality.
- Develop and enforce quality assurance protocols and documentation standards across our network landscape.
- Own strategic network planning, ensuring infrastructure evolves in step with our product and market expansion.
What You’ll Bring to the Site Reliability Engineer Position
- Proven experience building and maintaining infrastructure, tooling, and technical foundations at scale.
- Strong track record of ensuring high service uptime and reliability to empower product teams to innovate effectively.
- Expertise in shaping and evolving core technology layers that underpin a successful, high‑growth platform.
- Proven experience implementing SRE principles at scale, including deep knowledge of SLI/SLO/SLA differences.
- A product engineering background with strong coding skills in Python or similar.
- Experience with incident management frameworks and evolving them for efficiency.
- Expertise in cloud platforms (AWS preferred) and container orchestration (Docker, Kubernetes, ECS).
- Solid understanding of microservices, service mesh, and modern architectural concepts.
- A collaborative mindset – you thrive on helping others and driving company‑wide impact.
Nice to Have
- Experience working in regulated industries (e.g., PCI compliance).
- Background in capacity planning, performance, and load testing.
- Sysadmin skills for troubleshooting disk, network, and infrastructure issues.
Where you’ll work
Our working model varies depending on the specific role and team requirements. We strive to provide flexibility whilst ensuring that each position is best supported for optimal collaboration and performance.
This Site Reliability Engineer position requires you to be in the London office (Holborn) one day per week.
About Us
Thredd is the trusted next‑gen payments partner for innovators looking to modernise their payments offering. Certified by Mastercard, Visa and Diners & Discover, we process billions of debit, prepaid, and credit transactions annually, supporting consumer and corporate fintechs, digital banks, and embedded finance providers across the globe. Our unique offering is our client‑centric approach, combining hands‑on support with modern, reliable, and scalable technology.
Our assured solution accelerates the development and delivery of consumer and corporate payments components embedded within digital banks, as well as for expense management, B2B payments, crypto, lending, credit, Buy Now Pay Later, FX, remittance, and open banking innovators.
Since 2007, Thredd has enabled market leaders through our highly reliable, secure, and scalable platform and supported many of our clients’ growth journeys – from early‑stage startup through to globally recognized unicorns, including Monzo, Revolut, and Starling.
Diversity and Inclusion at Thredd
Here at Thredd, we are committed to building a diverse and inclusive workplace where everyone feels valued, respected and empowered. We welcome applications from people of all backgrounds, experiences and identities. If you require any adjustments during the recruitment process, please let us know and we would be happy to support you.
Our Values
- Own it and deliver – Taking responsibility for your own performance and being successful in your own role
- Collaborate purposefully – Building trusted relationships with colleagues, supporting activities and being successful together
- Think differently – Asking questions to check understanding and sharing your ideas to support continuous improvement
- Act courageously – Stepping out of your comfort zone and embracing change to help you learn and grow
Site Reliability Engineer employer: Thredd
Contact Detail:
Thredd Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Site Reliability Engineer role
✨Tip Number 1
Familiarise yourself with SRE principles, especially the differences between SLIs, SLOs, and SLAs. Being able to discuss these concepts confidently during your interview will demonstrate your expertise and understanding of the role.
✨Tip Number 2
Showcase your experience with cloud platforms, particularly AWS, and container orchestration tools like Docker and Kubernetes. Prepare examples of how you've used these technologies to solve real-world problems in previous roles.
✨Tip Number 3
Be ready to discuss your approach to incident management and how you've improved processes in the past. Highlight any specific frameworks you've implemented and the impact they had on service reliability.
✨Tip Number 4
Demonstrate your collaborative mindset by preparing examples of how you've worked with cross-functional teams. Emphasising your ability to drive company-wide impact will resonate well with Thredd's culture.
Some tips for your application 🫡
Understand the Role: Before applying, make sure you fully understand the responsibilities and requirements of the Site Reliability Engineer position at Thredd. Familiarise yourself with SRE principles, incident management frameworks, and the technologies mentioned in the job description.
Tailor Your CV: Customise your CV to highlight relevant experience and skills that align with the job description. Emphasise your background in implementing SRE principles, coding skills, and any experience with cloud platforms and container orchestration.
Craft a Compelling Cover Letter: Write a cover letter that showcases your passion for building reliable systems and your ability to solve complex infrastructure challenges. Mention specific examples from your past experiences that demonstrate your expertise in SRE practices and your collaborative mindset.
Highlight Relevant Projects: If you have worked on projects that involved application performance monitoring, automation, or incident response, be sure to include these in your application. Detail your role in these projects and the impact they had on service reliability and performance.
How to prepare for a job interview at Thredd
✨Showcase Your SRE Knowledge
Be prepared to discuss your experience with SRE principles, particularly SLI, SLO, and SLA. Highlight specific examples where you've implemented these concepts in previous roles, as this will demonstrate your understanding and capability in the field.
✨Demonstrate Problem-Solving Skills
Expect to face scenario-based questions that assess your ability to troubleshoot and resolve incidents. Share detailed accounts of past incidents you've managed, focusing on your approach to root cause analysis and the improvements you implemented afterwards.
✨Familiarise Yourself with Their Tech Stack
Research Thredd's technology stack, especially their use of cloud platforms like AWS and container orchestration tools such as Docker and Kubernetes. Being knowledgeable about their systems will allow you to engage in more meaningful discussions during the interview.
✨Emphasise Collaboration
Thredd values a collaborative mindset, so be ready to discuss how you've worked with cross-functional teams in the past. Share examples of how you've contributed to team success and driven company-wide impact through collaboration.