Site Reliability Engineer

Edinburgh | Full-Time | £43,200 - £72,000 / year (est.) | Home office possible

At a Glance

  • Tasks: Improve system reliability and performance while collaborating with engineers and product owners.
  • Company: Join an inclusive team committed to innovation and professional development.
  • Benefits: Enjoy remote work options and a collaborative work culture.
  • Why this job: Make a real impact by enhancing product reliability and driving innovation.
  • Qualifications: Strong knowledge of site reliability engineering and programming languages required.
  • Other info: Flexible working hours with a focus on continuous improvement.

The predicted salary is between £43,200 and £72,000 per year.

Join us as a Site Reliability Engineer. In this key role, you’ll improve, drive, and embed non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You’ll enjoy significant stakeholder interaction, working in collaboration with engineers and product owners to ensure a principled approach to delivering change in a safe and secure way. This is a chance to join an inclusive team with a collaborative ethos and a commitment to innovation and professional development.

As our Site Reliability Engineer, you’ll work closely with our feature team and other colleagues to meet defined service level objectives and continually improve system and environment reliability. You’ll define SLOs, SLIs, and error budgets that support finding the right balance between risk, reliability, and continuous improvement. You’ll also provide structure and help to our release process, suggesting and making improvements where possible. You’ll scale systems sustainably through mechanisms like automation, evolving them by pushing for changes that improve reliability and velocity. We’ll also look to you to coach and provide guidance to colleagues and the wider team, leading where required.

In addition to this, you’ll:

  • Proactively contribute new ideas and innovations to meet short-term and longer-term goals.
  • Continually balance and manage any potential risks.
  • Be accountable for the day-to-day development and health of both production and non-production environments and respond to any incidents as required.
  • Provide technical expertise and input to establish the risk tolerance of products and services.
  • Communicate incident status updates clearly and frequently to other teams, customers, and stakeholders and support blameless post-mortems.

The skills you’ll need:

We’re looking for someone with strong knowledge of reliability systems thinking and experience of site reliability engineering. You’ll need experience of using a data-driven and scientific approach to fact finding. We’ll also look for financial services knowledge, and the ability to identify wider business impact, risk, and opportunity, and make connections across key outputs and processes. We’re also looking for:

  • Good knowledge and experience of programming languages.
  • Strong knowledge of deploy and release services, automation, and troubleshooting.
  • Experience of utilising tools and technology across the software development lifecycle.
  • Experience using mathematical and statistical models to assess trends.
  • Strong communication skills with the ability to proactively engage with a wide range of stakeholders.
  • In-depth experience with observability tools such as Grafana, Prometheus, and OpenTelemetry.
  • Strong knowledge of public cloud environments such as AWS and GCP, and Infrastructure as Code tools such as Terraform.

Hours: 35

Ways of Working: Remote First

Site Reliability Engineer employer: NatWest Group

As a Site Reliability Engineer with us, you'll be part of an inclusive and innovative team that values collaboration and professional growth. We offer a remote-first work culture that promotes flexibility and work-life balance, alongside opportunities for continuous learning and development in cutting-edge technologies. Join us to make a meaningful impact while enjoying a supportive environment that encourages new ideas and fosters your career advancement.

Contact Details:

NatWest Group Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role

✨Tip Number 1

Familiarise yourself with the specific tools and technologies mentioned in the job description, such as Grafana, Prometheus, and Terraform. Having hands-on experience or projects showcasing these tools can set you apart during discussions.

✨Tip Number 2

Engage with the Site Reliability Engineering community online. Join forums, attend webinars, or participate in relevant meetups to network with professionals in the field. This can provide insights into industry trends and potentially lead to referrals.

✨Tip Number 3

Prepare to discuss your approach to incident response and risk management. Be ready to share examples of how you've handled incidents in the past, focusing on your communication strategies and the outcomes of those situations.

✨Tip Number 4

Showcase your ability to work collaboratively by highlighting any past experiences where you’ve successfully partnered with cross-functional teams. Emphasising your communication skills and teamwork will resonate well with the team’s inclusive culture.

We think you need these skills to ace the Site Reliability Engineer interview

Site Reliability Engineering
Reliability Systems Thinking
Data-Driven Decision Making
Financial Services Knowledge
Programming Languages
Deploy and Release Services
Automation
Troubleshooting
Software Development Lifecycle Tools
Mathematical and Statistical Modelling
Observability Tools (Grafana, Prometheus, OpenTelemetry)
Public Cloud Environments (AWS, GCP)
Infrastructure as Code (Terraform)
Strong Communication Skills
Stakeholder Engagement

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights relevant experience in site reliability engineering and showcases your knowledge of reliability systems thinking. Include specific examples of how you've improved system reliability or performance in previous roles.

Craft a Compelling Cover Letter: In your cover letter, express your enthusiasm for the role and the company. Discuss how your skills align with the job description, particularly your experience with observability tools and cloud environments like AWS and GCP.

Showcase Technical Skills: Clearly outline your technical expertise in programming languages, automation, and troubleshooting. Mention any experience you have with Infrastructure as Code tools like Terraform, as well as your familiarity with monitoring tools such as Grafana and Prometheus.

Highlight Communication Abilities: Since the role involves significant stakeholder interaction, emphasise your strong communication skills. Provide examples of how you've effectively communicated incident updates or collaborated with teams to achieve common goals.

How to prepare for a job interview at NatWest Group

✨Showcase Your Technical Expertise

Be prepared to discuss your experience with reliability systems and site reliability engineering. Highlight specific projects where you've implemented SLOs, SLIs, or error budgets, and be ready to explain the impact of your work on system reliability.
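
If it helps to rehearse the arithmetic behind SLOs, SLIs, and error budgets before the interview, the sketch below shows one common way to relate them for a request-based availability target. It is only an illustration: the function name, the field names, and the example figures are hypothetical and are not taken from the job description or from any NatWest Group system.

```python
def error_budget_report(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarise a request-based availability SLO (hypothetical helper).

    slo_target      -- target success ratio, e.g. 0.999 for "three nines"
    total_requests  -- requests served in the measurement window
    failed_requests -- requests that did not meet the success criteria
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    # SLI: the measured fraction of successful requests.
    sli = (total_requests - failed_requests) / total_requests

    # Error budget: how many failures the SLO tolerates in this window.
    allowed_failures = (1 - slo_target) * total_requests

    # Share of the budget still unspent; negative means the SLO is breached.
    budget_remaining = 1 - failed_requests / allowed_failures

    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_remaining": budget_remaining,
        "slo_met": sli >= slo_target,
    }


if __name__ == "__main__":
    # Illustrative figures only: 1,000,000 requests, 250 failures, 99.9% target.
    report = error_budget_report(0.999, 1_000_000, 250)
    for name, value in report.items():
        print(f"{name}: {value}")
```

With these made-up figures the target allows roughly 1,000 failed requests, so 250 failures would leave about three quarters of the budget unspent. Being able to walk through a calculation like this makes it easier to explain how an error budget informs the balance between releasing changes and protecting reliability.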

✨Demonstrate Problem-Solving Skills

Expect scenario-based questions that assess your ability to handle incidents and improve system performance. Use examples from your past experiences to illustrate how you approached challenges and what solutions you implemented.

✨Communicate Clearly and Effectively

Strong communication skills are crucial for this role. Practice articulating complex technical concepts in a way that is understandable to non-technical stakeholders. Be ready to discuss how you would keep teams informed during incidents.

✨Emphasise Collaboration and Innovation

This position requires working closely with various teams. Share examples of how you've collaborated with engineers and product owners in the past. Also, be prepared to discuss any innovative ideas you have for improving processes or systems.
