Senior Site Reliability Engineer

London · Full-Time · £43,200 - £72,000 / year (est.) · No home office possible

At a Glance

  • Tasks: Join us as a Senior Site Reliability Engineer, enhancing our SRE practices and system reliability.
  • Company: We're loveholidays, a fast-growing online travel agency making dream holidays a reality.
  • Benefits: Enjoy a 5% pension contribution, training budget, discounted holidays, and up to 30 days off.
  • Why this job: Be part of a tech-driven culture that values open source and innovation in travel.
  • Qualifications: Experience in SRE practices, performance testing, and observability tools is essential.
  • Other info: Work with cutting-edge cloud technologies and contribute to a vibrant engineering community.

The predicted salary is between £43,200 and £72,000 per year.

We are a rapidly growing online travel agency with technology at the heart of our success. In 2022, we sent millions of people on their dream holiday. With a million visitors a day, our 100+ services handle 8k requests per second, while maintaining p95 search latency of 150ms. Our observability captures and processes 1TB of logs a day and 350k metric samples a second. We focus on differentiation by relying heavily on open source, while also giving back through contributions to public repositories, open sourcing in-house tools and sponsoring conferences.

Responsibilities

  • Contribute to the evolution of SRE practices like incident management, blameless postmortems, SLOs and error budgets.
  • Build reliable, performant, auto-scalable and highly available systems.
  • Support the existing Platform Infrastructure team.
  • Level up SRE practices across the teams.
  • Improve reliability KPIs of the platform.
  • Help balance reliability with feature delivery using SLOs and error budgets.
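
As a rough illustration of how an error budget follows from an SLO, here is a minimal sketch. It is not loveholidays' actual tooling: the function name, the 99.9% target and the request counts are assumptions chosen for the example.

```python
def error_budget_remaining(slo_target: float,
                           total_requests: int,
                           failed_requests: int) -> float:
    """Return the unspent fraction of the error budget for a request-based SLO.

    slo_target is the required success ratio (e.g. 0.999 for "99.9%").
    The result is 1.0 when nothing has been spent, 0.0 when the budget is
    exactly used up, and negative when the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"Error budget remaining: {remaining:.1%}")
```

With these assumed numbers about three quarters of the budget is left, which would typically argue for continuing feature delivery; a negative result would argue for prioritising reliability work instead.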

Our engineering teams own the lifecycle of services from first commit to high-load operation in production. Your responsibility will be to help engineering teams succeed at operations, not to run their services for them.

What you’ll be working on

  • Exposing slow running code paths in critical applications using tools like Java Flight Recorder or Go’s pprof.
  • Writing tools or modifying existing applications with reliability and performance in mind.
  • Ensuring our systems and their individual components can withstand 10x load by improving our performance testing.
  • Shortening mean time to discovery and recovery with improvements to observability and alerting.

We place a strong focus on observability, continually evolving our monitoring and alerting stack, currently centred around the Mimir (Prometheus), Grafana, Loki, Tempo ecosystem. Our service mesh (Linkerd) provides uniform observability of all production services at 10s intervals. Performance and scalability are integral to our software and infrastructure development process, achieved by combining Computer Science fundamentals and cutting-edge cloud technologies. The role also involves low-level debugging and troubleshooting.

What we’ll give back to you

  • Company pension contributions at 5%.
  • Training budget for you to learn on the job and level yourself up.
  • Discounted holidays for you, your family and friends.
  • 25 days of holiday per annum (plus 8 public holidays), increasing by 1 day for every second year of service, up to a maximum of 30 days per annum.
  • Ability to buy and sell annual leave.
  • Cycle to work scheme, season ticket loan and eye care vouchers.

About the company: loveholidays offer a bespoke way of searching for your next getaway, giving you the chance to personalise your holiday with the ultimate flexibility. Plus, book confidently knowing your holiday is ATOL protected.

Senior Site Reliability Engineer employer: loveholidays

At loveholidays, we pride ourselves on being a forward-thinking online travel agency that places technology at the forefront of our operations. As a Senior Site Reliability Engineer, you will thrive in a dynamic work culture that encourages innovation and collaboration, with ample opportunities for professional growth through training budgets and a supportive team environment. Enjoy a range of benefits including generous holiday allowances, discounted travel for you and your loved ones, and a commitment to employee well-being, making loveholidays an exceptional place to advance your career while contributing to unforgettable travel experiences.

Contact Details:

loveholidays Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Site Reliability Engineer role

✨Tip Number 1

Familiarise yourself with the tools and technologies mentioned in the job description, such as Java Flight Recorder, Go’s pprof, and the Mimir ecosystem. Having hands-on experience or projects showcasing your skills with these tools can set you apart during discussions.

✨Tip Number 2

Engage with the open-source community related to the technologies we use. Contributing to public repositories or participating in discussions can demonstrate your commitment to the field and help you network with professionals who might influence hiring decisions.

✨Tip Number 3

Prepare to discuss your experience with incident management and blameless postmortems. Be ready to share specific examples of how you've implemented SLOs and error budgets in previous roles, as this aligns closely with our expectations for the Senior Site Reliability Engineer position.

✨Tip Number 4

Showcase your understanding of balancing reliability with feature delivery. Think of scenarios where you've had to make trade-offs between performance and new features, and be prepared to discuss how you approached those challenges.

We think you need these skills to ace the Senior Site Reliability Engineer role

Site Reliability Engineering (SRE) practices
Incident Management
Blameless Postmortems
Service Level Objectives (SLOs)
Error Budgets
Performance Testing
Observability Tools (e.g., Prometheus, Grafana, Loki, Tempo)
Java Flight Recorder
Go’s pprof
Low-level Debugging
Troubleshooting Skills
Cloud Technologies
Auto-scaling Systems
Monitoring and Alerting
Strong Programming Skills
Collaboration with Engineering Teams

Some tips for your application 🫡

Understand the Role: Before applying, make sure you fully understand the responsibilities and requirements of the Senior Site Reliability Engineer position. Familiarise yourself with SRE practices, observability tools, and performance testing methodologies mentioned in the job description.

Tailor Your CV: Customise your CV to highlight relevant experience and skills that align with the job description. Emphasise your knowledge of open source technologies, incident management, and any previous work with performance optimisation or reliability engineering.

Craft a Compelling Cover Letter: Write a cover letter that showcases your passion for technology and travel. Discuss how your background in SRE can contribute to the company's goals, particularly in improving reliability KPIs and enhancing observability.

Showcase Relevant Projects: If you have worked on projects involving observability stacks like Prometheus, Grafana, or similar tools, be sure to mention these in your application. Providing specific examples of how you've improved system performance or reliability will strengthen your application.

How to prepare for a job interview at loveholidays

✨Understand the SRE Practices

Familiarise yourself with key SRE concepts such as incident management, blameless postmortems, SLOs, and error budgets. Be prepared to discuss how you have implemented or improved these practices in your previous roles.

✨Showcase Your Technical Skills

Be ready to demonstrate your expertise in performance testing and observability tools like Prometheus, Grafana, and others mentioned in the job description. Prepare examples of how you've used these tools to enhance system reliability.

✨Discuss Load Handling Strategies

Since the role involves ensuring systems can withstand increased loads, be prepared to talk about your experience with load testing and scaling applications. Share specific instances where you successfully improved system performance under high demand.

✨Emphasise Collaboration

Highlight your ability to work with engineering teams to improve operations without taking over their services. Discuss how you’ve supported teams in achieving their goals while maintaining a focus on reliability and performance.

