At a Glance
- Tasks: Join our SRE team to enhance platform resilience and performance.
- Company: Arbor is transforming education with innovative management tools for over 7,000 schools.
- Benefits: Enjoy 32 days holiday, flexible working, and a dedicated wellbeing team.
- Why this job: Make a real impact in education while working in a supportive and joyful environment.
- Qualifications: Experience in performance monitoring, scripting, and cloud technologies required.
- Other info: Dog-friendly offices and opportunities for volunteering and personal development.
The predicted salary is between £36,000 and £60,000 per year.
At Arbor, we’re on a mission to transform the way schools work for the better.
We believe in a future of work in schools where being challenged doesn’t mean being burnt out and overworked. Where data guides progress without overwhelming staff. And where everyone working in a school is reminded why they got into education every day.
Our MIS and school management tools are already making a difference in over 7,000 schools and trusts, giving time and power back to staff, turning data into clear, actionable insights, and supporting happier working days.
At the heart of our brand is a recognition that the challenges schools face today aren’t just about efficiency, outputs and productivity, but about creating happier working lives for the people who drive education every day: the staff. We want to make schools more joyful places to work, as well as learn.
About the role
We are looking for an enthusiastic and proactive Site Reliability Engineer to join our SRE team and help us ensure we provide world-class resilience and performance across the platform. The remit and focus of the role is to advise on all aspects of site reliability, including availability, scalability, observability and capacity planning. It’s a broad and exciting role, so we’re looking for someone up for a challenge. If you’re an energetic and collaborative Site Reliability Engineer, this is the role for you.
Core responsibilities
- Proactively monitor and analyse platform performance.
- Collaborate with engineering teams to address performance bottlenecks and ensure scalability.
- Assist engineering teams with implementing and reviewing SLOs
- Continually improve observability through monitoring, alerting and dashboards, using tools such as DataDog or Prometheus.
- Work with other teams to ensure observability is effective and provides full coverage.
- Ensure the service is highly available and resilient
- Champion best practices in design for high availability
- Devise runbooks and run game sessions to test our DR plan, H/A and backups
- Conduct assessments of capacity and plan for scaling to meet current and future business needs.
- Work closely with the Head of Platform Engineering and Head of SRE to strategize and implement scalable solutions.
- Work closely with the Platform team, feature teams, 2nd line support and other stakeholders to ensure a good level of service is provided for our customers and to embed SRE practices.
- Act as a key player in incident response and troubleshooting, ensuring rapid resolution and minimising downtime.
- Participate in blameless postmortems to identify root cause and corrective actions
- Develop and maintain playbooks and documentation
Requirements
About you
- Experience in performance monitoring and analysis
- Capacity planning experience
- Scripting and automation skills, with experience in relevant technologies.
- Experience with Infrastructure as Code, in particular, Terraform
- Understanding of relational database technologies and their cloud versions (e.g. AWS Aurora)
- Experience with messaging and distributed asynchronous workloads
- Experience with nginx or similar technologies
- Familiarity with SRE processes.
- Awareness of DevOps principles such as the Three Ways and the Five Ideals.
Bonus Skills
- Experience with other database technologies and cloud platforms.
- Past experience with enterprise solutions running at scale
- Familiarity with kanban and agile development processes
- Experience with containerisation, for example Docker
- Familiarity with software best practices such as Refactoring, Clean Code, Domain-Driven Design and Test-Driven Development.
What we offer
The chance to work alongside a team of hard-working, passionate people in a role where you’ll see the impact of your work every day. We also offer:
- A dedicated wellbeing team who champion initiatives such as mindfulness, lunch n learns, manager training, mental health first aid training and much more!
- 32 days holiday (plus Bank Holidays). This is made up of 25 days annual leave plus 7 extra company wide days given over Easter, Summer & Christmas
- Life Assurance paid out at 3x annual salary
- Comprehensive wellness benefit provided by AIG Smart Health, which provides a 24/7 virtual GP service, Mental health support, Counselling, and personalised Health Checks
- Private Dental Insurance with Bupa
- Salary sacrifice Pension provided by Scottish Widows
- Enhanced maternity and adoption pay (20 weeks full pay) and paternity pay (6 weeks full pay)
- 5 free return to work maternity coaching sessions, helping you adapt to this new exciting time of life!
- Access to services such as Calm and Bippit (financial wellbeing coaching)
- All of our roles champion flexible working and we are happy to discuss what this means to you
- Social committees that plan team, office and company wide events to bring people together and celebrate success
- Volunteer with a charity of your choice for a day each year
- Dog friendly offices!
Interview process
- Phone screen
- 1st stage
- 2nd stage
We are committed to a fair and comfortable recruitment process, so if you require any reasonable adjustments during your application or interview process, please reach out to a member of the team at (emailprotected) .
Our commitment is also backed by our partnership with the neurodiversity consultancy Lexxic, who provide us with training, support and advice.
Arbor Education is an equal opportunities organisation
Our goal is for Arbor to be a workplace which represents, celebrates and supports people from all backgrounds, and which gives them the tools they need to thrive, whatever their ambitions may be. We support and promote diversity and equality, and actively encourage applications from people of all backgrounds.
Refer a friend
Know someone else who would be good for this role? You can refer a friend, family member or colleague. If they are offered a role with Arbor, we will say thank you with a voucher valued up to £200! Simply email: (emailprotected)
Please note: We are unable to provide visa sponsorship at this time.
Site Reliability Engineer employer: Arbor Education
Contact Detail:
Arbor Education Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Site Reliability Engineer role
✨Tip Number 1
Familiarise yourself with Arbor's mission and values. Understanding their focus on creating happier working lives in schools will help you align your responses during interviews, showcasing how your skills as a Site Reliability Engineer can contribute to this goal.
✨Tip Number 2
Highlight your experience with performance monitoring tools like DataDog or Prometheus. Be prepared to discuss specific instances where you've improved system reliability or scalability, as this will demonstrate your hands-on expertise relevant to the role.
✨Tip Number 3
Showcase your collaborative skills by preparing examples of how you've worked with cross-functional teams in the past. Since the role involves working closely with various stakeholders, illustrating your teamwork abilities will be crucial.
✨Tip Number 4
Be ready to discuss your approach to incident response and troubleshooting. Sharing your experiences with blameless postmortems and how you've implemented corrective actions will highlight your commitment to continuous improvement in site reliability.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights relevant experience and skills that align with the Site Reliability Engineer role. Focus on your performance monitoring, capacity planning, and automation skills, as well as any experience with tools like DataDog or Prometheus.
Craft a Compelling Cover Letter: In your cover letter, express your enthusiasm for Arbor's mission to improve the educational environment. Mention specific aspects of the job description that excite you and how your background makes you a great fit for the team.
Showcase Relevant Projects: If you have worked on projects related to site reliability, performance monitoring, or cloud technologies, be sure to include these in your application. Highlight your role, the challenges faced, and the outcomes achieved.
Prepare for Technical Questions: Anticipate technical questions related to SRE practices, DevOps principles, and the tools mentioned in the job description. Be ready to discuss your experience with Infrastructure as Code, relational databases, and incident response strategies.
How to prepare for a job interview at Arbor Education
✨Understand the Role
Make sure you have a solid grasp of what a Site Reliability Engineer does, especially in the context of Arbor's mission. Familiarise yourself with key responsibilities like performance monitoring, capacity planning, and collaboration with engineering teams.
✨Showcase Your Technical Skills
Be prepared to discuss your experience with relevant technologies such as Terraform, DataDog, or Prometheus. Highlight any past projects where you've successfully implemented SRE practices or improved system reliability.
✨Demonstrate Problem-Solving Abilities
Expect to be asked about how you've handled incidents in the past. Prepare examples that showcase your ability to troubleshoot effectively and participate in blameless postmortems to identify root causes.
✨Align with Company Values
Arbor values creating happier working lives for staff. Be ready to discuss how your personal values align with this mission and how you can contribute to a positive work environment.