Senior Site Reliability Engineer

Full-Time · 48000 - 72000 £ / year (est.) · Home office possible

At a Glance

  • Tasks: Design and maintain scalable, reliable systems in a global AWS environment.
  • Company: Join Airalo, the world's first eSIM store revolutionising telecom for travellers.
  • Benefits: Enjoy health insurance, remote work perks, and an all-expenses-paid company retreat.
  • Why this job: Make a real impact in a diverse team while enhancing your skills in SRE.
  • Qualifications: 5 years of SRE experience, strong AWS and Kubernetes knowledge required.
  • Other info: Flexible hours, guaranteed rest periods, and a blameless culture for learning.

The predicted salary is between £48,000 and £72,000 per year.

About Airalo

Alo! Airalo is the world's first eSIM store that helps people connect in over 200 countries and regions across the globe. We are building the next digital service that revolutionises the telecom industry. We are a travel-tech company and an equal-opportunity environment that values and practises diversity, inclusion, and equity. Our team is spread across 50 countries and six continents. What glues us together is our commitment to changing the way you connect.

About you

We hope that you care deeply about the quality of your work, the intrinsic worth of tasks, and the success of your team. You are self-disciplined and do not require micromanagement in terms of your skillset and work ethic. You do your best to flourish as an individual every day while working hard to foster a collaborative team environment. You believe in the importance of being — and staying — authentic, honest, positive, and kind. You are a good interlocutor with clear and concise communication. You are able to manage multiple projects, have an analytical mind, pay keen attention to detail, and love to get your hands dirty. You are cognizant, tolerant, and welcoming of vulnerabilities and cultural differences.

About the Role

  • Position: Full-time / Employee
  • Location: Remote-first
  • Benefits: Health Insurance, work-from-anywhere stipend, annual wellness & learning credits, annual all-expenses-paid company retreat in a gorgeous destination & other benefits

On-Call:

  • Participating in our on-call rotation is a core expectation of this role. It's essential for maintaining 24/7 service reliability across our global operations, ensuring our systems remain resilient and our customers experience uninterrupted service, regardless of time zone or geography.
  • Paid Rotation: We offer standby fees and overtime pay.
  • Delayed Start: No on-call duties for your first 6 months.
  • Rest & Recovery: Guaranteed rest periods and flexible hours following night incidents.
  • Shared Load: Rotations are split (Weekdays vs. Weekends) to minimise fatigue.

We are looking for a Senior Site Reliability Engineer to join our growing engineering team. We are a company that values SRE principles and practices. We believe in empowering our SREs to make data-driven decisions, automate operational tasks, and continuously improve the reliability of our systems. We foster a blameless culture where everyone is encouraged to learn from mistakes and share knowledge. If you are passionate about building and maintaining highly reliable systems, we would love to hear from you.

What you'll do:

  • Lead the design of scalable, fault-tolerant and self-healing systems in a multi-region AWS environment.
  • Define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to drive architectural decisions and error budget policies.
  • Conduct blameless post-incident reviews to uncover systemic root causes and implement long-term preventive measures.
  • Identify patterns of manual work and lead the development of internal tools/automation to permanently eliminate them.
  • Develop and maintain automated runbooks and playbooks for common operational tasks and complex incident response.
  • Shift from simple monitoring to deep observability, ensuring high-cardinality data leads to proactive, actionable insights.
  • Proactively identify and mitigate operational risks through chaos engineering and architecture reviews.
  • Work with software engineers to design systems for reliability, scalability, and maintainability from the early stages of the SDLC.
  • Continuously evaluate and optimise system performance, capacity, and cost efficiency.
  • Beyond just participating, you will refine the on-call experience to reduce alert fatigue, improve MTTR, and ensure sustainable rotation health.

Must Haves:

  • Bachelor's degree in Computer Engineering or a similar discipline.
  • 5 years of experience as a Site Reliability Engineer or in a similar role.
  • 3 years of experience with AWS services including strong knowledge of container orchestration.
  • 2 years of Kubernetes experience.
  • Deep understanding of observability principles and tools like Prometheus, Datadog, OpenTelemetry.
  • Experience with leading incident management and complex postmortem analysis.
  • Experience and interest in managing infrastructure as code (Terraform).
  • Experience with chaos engineering and other techniques for testing system resilience.
  • Experience with CI/CD tools such as GitHub Actions for automated delivery.
  • Proficiency in at least one programming language (Python, Go, Java, etc.) for building automation and internal tooling.
  • Experience with event-driven architecture (SNS, SQS, etc.).
  • Ability to work independently and collaboratively in a fast-paced environment.
  • Team player and open to new ideas.
  • Good communication skills and fluency in English.

Good to have:

  • Prior experience with Scrum and other agile methods.
  • Certification in relevant areas such as AWS Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), or similar.
  • Prior experience with Telco Core Networks (e.g., 5G/LTE Packet Core, IMS, Signalling) and low-latency networking.
  • Experience with AI-driven SRE tools for anomaly detection and improvements.
  • Contributions to open-source SRE projects or communities.
  • Prior work experience in telecommunications.
  • Deep understanding of eSIM and GSMA related technologies and services.

If you are interested in this position, please apply via the link. By applying, you acknowledge and agree that, in case of successful application, Airalo may request to run background checks as a condition for entering into an agreement with you. Rest assured that these checks will only occur upon your prior consent and at the end of the selection process, and will be strictly limited to what is allowed under the laws that are applicable to you. All data that you share or that we collect in connection with such checks will be processed in accordance with our Privacy Policy.

We sincerely thank all applicants in advance for submitting their interest in this opportunity. Airalo is an equal-opportunity employer and values diversity, equity & inclusion. We do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We are committed to providing reasonable accommodations upon request for individuals with disabilities throughout our job interview process.

Senior Site Reliability Engineer employer: Airalo

Airalo is an exceptional employer that champions a remote-first work culture, offering employees the flexibility to work from anywhere while enjoying comprehensive benefits such as health insurance, wellness credits, and annual retreats in stunning locations. With a strong commitment to diversity, inclusion, and employee growth, Airalo fosters a collaborative environment where team members are empowered to innovate and learn from each other, making it an ideal place for those passionate about technology and reliability in the telecom industry.

Contact Detail:

Airalo Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Site Reliability Engineer role

✨Tip Number 1

Network like a pro! Reach out to current or former employees at Airalo on LinkedIn. A friendly chat can give you insider info and maybe even a referral, which can really boost your chances.

✨Tip Number 2

Prepare for the interview by diving deep into SRE principles and AWS services. Brush up on your knowledge of observability tools and chaos engineering techniques. The more you know, the more confident you'll feel!

✨Tip Number 3

Show off your problem-solving skills during the interview. Be ready to discuss past incidents you've managed and how you approached them. Remember, they love a blameless culture, so focus on learning and improvement.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in joining the team at Airalo.

We think you need these skills to ace the Senior Site Reliability Engineer role

AWS Services
Container Orchestration
Kubernetes
Observability Principles
Prometheus
Datadog
OpenTelemetry
Incident Management
Postmortem Analysis
Infrastructure as Code
Terraform
Chaos Engineering
CI/CD Tools
GitHub Actions
Programming (Python, Go, Java)

Some tips for your application 🫡

Show Your Passion: When writing your application, let your enthusiasm for the role shine through! We want to see that you genuinely care about building reliable systems and improving processes. Share specific examples of how you've done this in the past.

Tailor Your CV: Make sure your CV is tailored to the Senior Site Reliability Engineer position. Highlight relevant experience with AWS, Kubernetes, and observability tools. We love seeing how your skills align with what we’re looking for!

Be Clear and Concise: We appreciate clear communication, so keep your application straightforward. Avoid jargon and get straight to the point about your skills and experiences. This will help us understand your fit for the role quickly.

Apply Through Our Website: Don’t forget to apply through our website! It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy to do!

How to prepare for a job interview at Airalo

✨Know Your Stuff

Make sure you brush up on your technical skills, especially around AWS, Kubernetes, and observability tools. Be ready to discuss your experience with these technologies in detail, as they are crucial for the Senior Site Reliability Engineer role.

✨Show Your Problem-Solving Skills

Prepare to share specific examples of how you've tackled complex incidents or improved system reliability in the past. Highlight your experience with blameless post-incident reviews and how you've implemented long-term solutions.

✨Communicate Clearly

Since clear communication is key, practice explaining technical concepts in a straightforward manner. You might be asked to describe your thought process during an incident or how you would approach a new project, so clarity is essential.

✨Emphasise Team Collaboration

Airalo values a collaborative environment, so be ready to discuss how you've worked with cross-functional teams. Share examples of how you've fostered teamwork and contributed to a positive work culture, as this aligns with their core values.
