Site Reliability Engineer

Full-Time · £36,000 - £60,000 / year (est.) · No home office possible
Thought Machine

At a Glance

  • Tasks: Join us in building reliable, scalable applications for the future of banking.
  • Company: Thought Machine, a leading fintech revolutionising banking technology.
  • Benefits: Competitive salary, flexible hours, generous leave, and a vibrant workplace culture.
  • Other info: Diverse team culture that values learning and professional growth.
  • Why this job: Make a real impact on global finance while collaborating with brilliant minds.
  • Qualifications: Experience in software engineering, particularly with Python, Golang, or Java.

The predicted salary is between £36,000 and £60,000 per year.

Thought Machine’s mission is bold – to properly and permanently rid the world’s banks of legacy technology. To achieve this, we have developed the foundations of modern banking through core and payments technology which run natively in the cloud. What we are attempting is hard and means we need great people working together to build great technology. We have grown rapidly in the past few years, expanding our team to more than 550 people across offices in London, New York, Singapore and Sydney. We have raised more than $500m in funding and are now valued at $2.7bn.

Our investors include Molten Ventures, Eurazeo, Intesa Sanpaolo, Temasek, Nyca Partners, JPMorgan Chase Strategic Investments, Standard Chartered Ventures, and more. We have created a culture that enables our team to produce the best work in the industry while ensuring we have fun along the way. We’re regularly cited as having a fantastic workplace culture and have been recognised by Sifted magazine as having one of the highest Glassdoor ratings for a UK fintech company and the industry's most generous employee share package. Named one of the world’s most innovative fintechs by Global Finance Magazine, we were also recognised by the Financial Times as one of Europe’s fastest-growing companies for two consecutive years—and a UK Best Employer for 2026.

Thought Machine’s Site Reliability Engineers are the guardians of mission‑critical systems for the world’s most influential financial institutions. As a member of our elite, globally distributed team, you’ll be entrusted with running and maintaining the robust production infrastructure that powers our customers' cutting‑edge Core Banking and Payments platforms. This is an opportunity to make a tangible impact on the global financial landscape while collaborating with brilliant minds to solve complex engineering challenges. This role will be part of the Site Reliability Engineering team at Thought Machine HQ in London.

The team is deeply involved in tackling the technical challenges of executing Thought Machine’s growth ambitions – expect to work with senior stakeholders in the organisation and with our customers, on programmes and initiatives that are critical to the success of the company.

As an SRE at Thought Machine, you will be responsible for:

  • Supporting the product engineering teams in building highly fault‑tolerant, scalable applications by participating in design discussions, engaging in RFCs and code reviews.
  • Contributing to the execution of department strategies such as implementing disaster recovery, backup, redundancy, and capacity planning activities.
  • Participating in a global on‑call rotation responsible for identifying and fixing bottlenecks in SaaS customer environments.
  • Regular maintenance of production systems that host Vault products.
  • Contributing to the evolution of our SaaS products by building features that foster exceptional reliability and an unparalleled user experience.
  • Implementing and testing DR strategies to ensure the highest level of resilience and fault tolerance of the platform.
  • Maintaining high‑quality written documentation of assets, processes and runbooks that are used by the team in their day‑to‑day operations.
  • Collaborating effectively with team members, actively participating in knowledge sharing, and continuously growing your own technical understanding of Vault Products.

What We’re Looking For:

  • You have experience successfully delivering engineering tasks and projects with a focus on reliability and scalability.
  • You possess a good understanding of design patterns relevant to hosting and networking architectures.
  • You proactively champion product development, driven by a desire to build truly exceptional products, not just solve immediate challenges.
  • You have a strong background in Python, Golang or Java, having used at least one of these languages to build production-level software.
  • You have experience working with Kubernetes or other container orchestration systems.
  • You have experience with automation/configuration management, e.g. Terraform, Puppet, Chef, Ansible.
  • You have a good understanding of one or more of the following areas: Database Administration, Networking, Observability Tools (such as Prometheus, Jaeger) or automation infrastructure.
  • You have solid experience working with either GCP or AWS.

Benefits:

  • Highly competitive salary
  • Pension plan (match up to 5%)
  • Life insurance - three times annual salary
  • Competitive maternity (six months fully paid) and paternity leave (four weeks fully paid)
  • Shared parental leave (matched to our maternity leave for the same point in time)
  • 25 days holiday and bank holidays
  • Flexible working hours
  • Cycle‑to‑work scheme
  • Electric car scheme
  • Season ticket loan
  • Access to outstanding learning materials and courses
  • Sports and hobby clubs, subsidised by Thought Machine
  • All the latest tech you need
  • Start the day properly with fresh fruit and cereals
  • Huge range of healthy (and not‑so‑healthy) snacks, smoothies and drinks
  • A talented and experienced team as your colleagues
  • An environment where we encourage learning and progress
  • Two charity days a year
  • Weekly food pop‑up

We actively hire candidates who demonstrate technical excellence in their field and welcome people of all ages and backgrounds, providing everyone with equal access to professional development. You are encouraged to apply even if your experience doesn’t exactly match the job description. We also encourage applications from those with different abilities, including candidates with ADHD, autism, dyslexia or dyspraxia.

Site Reliability Engineer employer: Thought Machine

Thought Machine is an exceptional employer, offering a vibrant work culture that prioritises innovation and collaboration. With a commitment to employee growth, we provide extensive learning opportunities, competitive benefits including generous parental leave and a flexible working environment, all while being part of a rapidly growing fintech company at the forefront of modern banking technology in London. Join us to make a meaningful impact on the global financial landscape alongside a talented team dedicated to excellence.

Contact Detail:

Thought Machine Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role

✨Tip Number 1

Network like a pro! Reach out to current or former employees on LinkedIn and ask about their experiences at Thought Machine. A friendly chat can give you insider info and maybe even a referral!

✨Tip Number 2

Prepare for the interview by brushing up on your technical skills. Make sure you can talk confidently about your experience with Python, Golang, or Java, and be ready to discuss your work with Kubernetes and cloud platforms.

✨Tip Number 3

Show your passion for reliability and scalability! During interviews, share specific examples of how you've tackled engineering challenges in the past. This will demonstrate that you're not just a problem-solver but a proactive champion for great products.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining the Thought Machine team.

We think you need these skills to ace the Site Reliability Engineer role

Site Reliability Engineering
Fault-Tolerant Systems
Scalable Applications
Disaster Recovery
Capacity Planning
On-Call Support
Production System Maintenance
Documentation Skills
Collaboration
Python
Golang
Java
Kubernetes
Terraform
AWS

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience with reliability and scalability. We want to see how your skills align with our mission to rid banks of legacy technology!

Showcase Your Technical Skills: Don’t hold back on detailing your experience with Python, Golang, or Java. If you've worked with Kubernetes or automation tools like Terraform, let us know! We love seeing your technical prowess shine through.

Be Clear and Concise: When writing your application, keep it straightforward. Use clear language and avoid jargon where possible. We appreciate a well-structured application that gets straight to the point!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy!

How to prepare for a job interview at Thought Machine

✨Know Your Tech Inside Out

Make sure you brush up on your knowledge of Python, Golang, or Java, as well as Kubernetes and cloud platforms like GCP or AWS. Be ready to discuss how you've used these technologies in past projects, especially focusing on reliability and scalability.

✨Understand the Company Culture

Thought Machine values collaboration and innovation, so be prepared to talk about how you work in teams and contribute to a positive workplace culture. Share examples of how you've fostered teamwork or tackled challenges with colleagues.

✨Prepare for Technical Challenges

Expect to face technical questions or scenarios during the interview. Practice explaining your thought process when solving problems related to disaster recovery, capacity planning, or automation. This will show your ability to think critically under pressure.

✨Show Your Passion for Learning

Thought Machine encourages continuous growth, so highlight any recent learning experiences or courses you've taken. Discuss how you stay updated with industry trends and how you plan to contribute to the evolution of their SaaS products.
