Site Reliability Engineer (SRE) in London

London | Full-Time | £70,000 - £80,000 / year (est.) | No home office possible

At a Glance

  • Tasks: Ensure platform resilience and performance while collaborating with engineering teams.
  • Company: Join Arbor, a mission-driven tech company transforming education for happier school environments.
  • Benefits: Enjoy 32 days holiday, wellness support, and flexible working options.
  • Why this job: Make a real impact in education while working with passionate professionals.
  • Qualifications: Experience in performance monitoring, scripting, and cloud technologies required.
  • Other info: Dynamic team culture with opportunities for professional development and community volunteering.

The predicted salary is between £70,000 and £80,000 per year.

Location: Remote

Salary: £70,000 - £80,000

About us

At Arbor, we’re on a mission to transform the way schools work for the better. We believe in a future of work in schools where being challenged doesn’t mean being burnt out and overworked. Our MIS and school management tools are already making a difference in over 7,000 schools and trusts, giving time and power back to staff, turning data into clear, actionable insights, and supporting happier working days.

About the role

We are looking for an enthusiastic and proactive Site Reliability Engineer to join our SRE team and help us ensure we provide world‑class resilience and performance across the platform. The remit and focus of the role is to advise on all aspects of site reliability including availability, scalability, observability and capacity planning.

Core responsibilities

  • Proactively monitor and analyse platform performance.
  • Collaborate with engineering teams to address performance bottlenecks and ensure scalability.
  • Assist engineering teams with implementing and reviewing SLOs.
  • Continually improve observability through monitoring, alerting and dashboards, using tools such as Datadog or Prometheus.
  • Ensure the service is highly available and resilient.
  • Champion best practices in design for high availability.
  • Devise runbooks and run game sessions to test our DR plan, H/A and backups.
  • Conduct assessments of capacity and plan for scaling to meet current and future business needs.
  • Work closely with the Head of Platform Engineering and Head of SRE to strategise and implement scalable solutions.
  • Participate in blameless postmortems to identify root causes and corrective actions.
  • Develop and maintain playbooks and documentation.

About you

  • Experience in performance monitoring and analysis.
  • Capacity planning experience.
  • Scripting and automation skills, with experience in relevant technologies.
  • Experience with Infrastructure as Code, in particular, Terraform.
  • Understanding of relational database technologies and their cloud versions (e.g. AWS Aurora).
  • Experience with messaging and distributed asynchronous workloads.
  • Experience with nginx or similar technologies.
  • Familiarity with SRE processes.
  • Awareness of DevOps principles such as the Three Ways and the Five Ideals.

Bonus Skills

  • Experience with other database technologies and cloud platforms.
  • Past experience with Enterprise solutions running at scale.
  • Familiarity with Kanban and Agile development processes.
  • Experience with containerisation, for example Docker.
  • Familiarity with software best practices such as Refactoring, Clean Code, Domain-Driven Design and Test-Driven Development.

What we offer

  • A dedicated wellbeing team who champion initiatives such as mindfulness, lunch n learns, manager training, mental health first aid training and much more!
  • 32 days holiday (plus Bank Holidays).
  • Life Assurance paid out at 3x annual salary.
  • Comprehensive wellness benefit provided by AIG Smart Health.
  • Private Dental Insurance with Bupa.
  • Salary sacrifice Pension provided by Scottish Widows.
  • Enhanced maternity and adoption pay (20 weeks full pay) and paternity pay (6 weeks full pay).
  • Access to services such as Calm and Bippit (financial wellbeing coaching).
  • All of our roles champion flexible working.
  • Social committees that plan team, office and company-wide events.
  • Dedicated professional development training budget.
  • Volunteer with a charity of your choice for a day each year.
  • Dog friendly offices!

Interview process

  • Phone screen
  • 1st stage
  • 2nd stage

We are committed to a fair and comfortable recruitment process, so if you require any reasonable adjustments during your application or interview process, please reach out to a member of the team.

Arbor Education is an equal opportunities organisation. Our goal is for Arbor to be a workplace which represents, celebrates and supports people from all backgrounds.

Refer a friend

Know someone else who would be good for this role? You can refer a friend, family member or colleague. If they are offered a role with Arbor, we will say thank you with a voucher valued at up to £200!

Site Reliability Engineer (SRE) in London employer: hackajob

At Arbor, we pride ourselves on being an exceptional employer that prioritises the wellbeing and professional growth of our team members. With a strong focus on creating a supportive work culture, we offer generous benefits such as 32 days of holiday, comprehensive wellness support, and a dedicated training budget to help you thrive in your role as a Site Reliability Engineer. Join us in transforming education while enjoying a flexible working environment and the opportunity to make a real impact in over 7,000 schools.

Contact Details:

hackajob Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Site Reliability Engineer (SRE) in London

✨Tip Number 1

Network like a pro! Reach out to your connections in the tech world, especially those in SRE roles. A friendly chat can lead to insider info about job openings or even referrals that could give you a leg up.

✨Tip Number 2

Show off your skills! Create a personal project or contribute to open-source projects that highlight your SRE expertise. This not only boosts your portfolio but also gives you something tangible to discuss during interviews.

✨Tip Number 3

Prepare for the technical interview! Brush up on your performance monitoring and capacity planning skills. Be ready to discuss your experience with tools like DataDog or Prometheus, as well as your approach to incident response.

✨Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in being part of our mission to make schools better places to work.

We think you need these skills to ace Site Reliability Engineer (SRE) in London

Performance Monitoring
Capacity Planning
Scripting and Automation
Infrastructure as Code
Terraform
Relational Database Technologies
AWS Aurora
Messaging and Distributed Workloads
nginx
SRE Processes
DevOps Principles
Containerisation
Docker
Agile Development Processes
Software Best Practices

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Site Reliability Engineer role. Highlight your experience with performance monitoring, capacity planning, and any relevant technologies like Terraform or DataDog. We want to see how your skills align with our mission!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for improving school environments and how your background makes you a great fit for our team. Let us know why you’re excited about the role and what you can bring to Arbor.

Showcase Your Projects: If you've worked on any relevant projects, don’t hold back! Include links or descriptions of your work that demonstrate your skills in automation, scripting, or infrastructure as code. We love seeing real examples of your expertise!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen to join our team at Arbor!

How to prepare for a job interview at hackajob

✨Know Your Tech Stack

Make sure you’re familiar with the technologies mentioned in the job description, like Terraform and AWS Aurora. Brush up on your scripting and automation skills, as well as your understanding of performance monitoring tools like DataDog or Prometheus. Being able to discuss these confidently will show that you're ready for the role.

✨Showcase Your Problem-Solving Skills

Prepare examples of how you've tackled performance bottlenecks or capacity planning challenges in the past. Use the STAR method (Situation, Task, Action, Result) to structure your answers. This will help demonstrate your proactive approach and ability to collaborate effectively with engineering teams.

✨Understand SRE Principles

Familiarise yourself with SRE processes and DevOps principles, such as the 3 ways and 5 ideals. Be ready to discuss how you’ve applied these concepts in previous roles. This knowledge will highlight your commitment to best practices in site reliability and your fit for the team.

✨Ask Insightful Questions

Prepare thoughtful questions about the company’s approach to site reliability and how they measure success. Inquire about their current challenges and how the SRE team collaborates with other departments. This shows your genuine interest in the role and helps you assess if it’s the right fit for you.
