Site Reliability Engineer in London

London | Full-Time | £48,000 to £72,000 per year (est.) | Hybrid

At a Glance

Tasks: Transform the SDLC environment with a focus on reliability, automation, and performance.
Company: Join Natobotics, a forward-thinking tech company in London.
Benefits: Enjoy a hybrid work model, competitive pay, and opportunities for growth.
Why this job: Make a real impact by enhancing system reliability and driving innovation.
Qualifications: 15+ years of experience in SRE, cloud platforms, and automation tools.
Other info: Be part of a culture that values collaboration and continuous improvement.

The predicted salary is between £48,000 and £72,000 per year.

Location: London. Work Mode: Hybrid. Contract Role.

Experience Level: 15+ Years.

In this engineering-focused role, the Site Reliability Engineer is responsible for transforming the SDLC environment, with a focus on system reliability, automation, and performance in non-production settings.

Responsibilities

Automate environment lifecycle: Develop Infrastructure as Code (IaC) to automate provisioning, teardown, and configuration of test environments, integrating them with the CI/CD pipeline.
Establish service level objectives (SLOs): Define and measure service level indicators (SLIs) for test environments, such as availability and provisioning time, and set SLO targets against them.
Monitor environment health and performance: Use observability tools like Prometheus and Grafana to track the health of test environments, identify bottlenecks, and resolve issues proactively.
Manage incident response: Lead the incident management process for test environment issues, conducting blameless post-mortems to understand the root causes and implement lasting fixes.
Minimize toil: Automate manual, repetitive tasks associated with test environments to free up engineering time for more strategic work.
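
To illustrate the SLO responsibility above, here is a minimal Python sketch of two ratio-style SLIs for a test environment: availability (successful health checks over total checks) and provisioning time (the share of provisioning runs that finish within a threshold). The function names, the 600-second threshold, and the sample figures are hypothetical and are not taken from the job description or from Natobotics' tooling.

```python
from typing import Sequence


def ratio_sli(good: int, total: int) -> float:
    """Return the fraction of good events, a common shape for an SLI."""
    if total <= 0:
        raise ValueError("total must be positive")
    return good / total


def provisioning_sli(durations_s: Sequence[float], threshold_s: float) -> float:
    """Fraction of provisioning runs that finished within threshold_s seconds."""
    fast = sum(1 for d in durations_s if d <= threshold_s)
    return ratio_sli(fast, len(durations_s))


def meets_slo(sli: float, target: float) -> bool:
    """An SLO is met when the measured SLI is at or above its target."""
    return sli >= target


if __name__ == "__main__":
    # Hypothetical sample data: 995 of 1000 health checks passed,
    # and four provisioning runs took 300, 500, 700 and 900 seconds.
    availability = ratio_sli(good=995, total=1000)
    provisioning = provisioning_sli([300, 500, 700, 900], threshold_s=600)
    print("availability SLO met:", meets_slo(availability, 0.99))
    print("provisioning SLO met:", meets_slo(provisioning, 0.90))
```

In practice the raw counts would more likely come from an observability stack such as Prometheus than from hard-coded lists; the sketch only shows the arithmetic.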

Strategic and cultural responsibilities

Drive continuous improvement: Analyze environment performance data, incident reports, and post-mortems to identify opportunities for continuous improvement and innovation.
Balance reliability and speed: Use an "error budget" for test environments. If environments are highly reliable, teams can use the budget for quicker feature development. If reliability is low, the focus shifts to improving stability.
Instil a reliability culture: Promote a blameless culture around test environment incidents, encouraging shared ownership and collaboration between development, QA, and SRE teams.
Capacity planning: Anticipate the future resource needs of test environments by analysing usage patterns and project forecasts. Ensure the infrastructure can scale to meet demand.
Advance test data management: Work with Test Data Managers to ensure that test data is not only readily available but also consistent, compliant, and automatically provisioned with the environments.
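
The error budget idea above can be sketched in a few lines of Python. This is an illustrative calculation only: the function names and the decision rule (any remaining budget favours feature work) are assumptions made for the example, not a policy stated in the job description.

```python
def error_budget_remaining(slo_target: float, good: int, total: int) -> float:
    """Return the unspent share of the error budget (1.0 = untouched, <= 0 = exhausted).

    slo_target is a fraction such as 0.99; good and total are event counts
    (for example, successful and total environment health checks).
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total <= 0:
        raise ValueError("total must be positive")
    allowed_failures = (1 - slo_target) * total  # failures the SLO tolerates
    actual_failures = total - good
    return 1 - actual_failures / allowed_failures


def next_focus(remaining: float) -> str:
    """Hypothetical rule: spend remaining budget on speed, otherwise on stability."""
    return "feature development" if remaining > 0 else "stability work"


if __name__ == "__main__":
    # Hypothetical figures: a 99% target over 1000 checks tolerates about 10 failures.
    remaining = error_budget_remaining(slo_target=0.99, good=995, total=1000)
    print(f"budget remaining: {remaining:.0%}, focus: {next_focus(remaining)}")
```

A real policy would usually add thresholds and a time window (for example, a rolling 30 days) rather than switching on a single boundary.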

Technical Skills

Expertise in tooling: Proficiency with monitoring and logging tools (e.g., Prometheus, Splunk, Grafana), CI/CD platforms (e.g., Jenkins, GitLab CI), and configuration management and IaC tools (e.g., Ansible, Terraform).
Cloud infrastructure knowledge: Deep understanding of cloud platforms like AWS, including experience with containerization technologies (Docker, Kubernetes) and serverless computing.
Scripting and programming: Strong scripting skills in languages such as Python or Bash to automate environment management tasks.
Systems and networking knowledge: Solid understanding of Linux systems, networking concepts, and database management.

Soft Skills

Leadership and influence: The ability to champion SRE practices and influence technical and business stakeholders across different teams.
Problem-solving: Strong analytical and debugging skills for investigating and resolving complex environment issues under pressure.
Communication: Excellent communication and collaboration skills to bridge the gap between development, QA, and operations teams.
Adaptability: A proactive and adaptable mindset to keep pace with evolving technology and development methodologies.

Employment and Location

Seniority level: Mid-Senior level
Employment type: Contract
Location: London, England, United Kingdom

Employer: Natobotics

At Natobotics, we pride ourselves on fostering a dynamic and inclusive work culture that empowers our employees to thrive. As a Site Reliability Engineer in London, you will benefit from a hybrid work model, competitive compensation, and opportunities for professional growth through continuous learning and innovation. Join us to be part of a collaborative team that values reliability and encourages a blameless culture, ensuring that your contributions make a meaningful impact.

Contact Details:

Natobotics Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role in London

✨Network Like a Pro

Get out there and connect with folks in the industry! Attend meetups, webinars, or even local tech events. The more people you know, the better your chances of landing that Site Reliability Engineer role.

✨Show Off Your Skills

Don’t just talk about your experience; demonstrate it! Create a portfolio showcasing your projects, especially those involving IaC, CI/CD, and monitoring tools like Prometheus and Grafana. This will make you stand out to potential employers.

✨Ace the Interview

Prepare for technical interviews by brushing up on your problem-solving skills and understanding of cloud infrastructure. Be ready to discuss your past experiences with incident management and how you’ve driven continuous improvement in your previous roles.

✨Apply Through Our Website

Make sure to apply directly through our website for the best chance at getting noticed. We love seeing candidates who are proactive and genuinely interested in joining our team at Natobotics!

We think you need these skills to ace the Site Reliability Engineer role in London

Infrastructure as Code (IaC)

Service Level Objectives (SLOs)

Observability tools (Prometheus, Grafana)

Incident management

Continuous improvement

Capacity planning

Test data management

Monitoring and logging tools (Splunk)

CI/CD platforms (Jenkins, GitLab CI)

Configuration management and IaC tools (Ansible, Terraform)

Cloud infrastructure (AWS)

Containerization technologies (Docker, Kubernetes)

Scripting skills (Python, Bash)

Linux systems knowledge

Problem-solving

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Site Reliability Engineer role. Highlight your experience with automation, cloud infrastructure, and any relevant tools like Prometheus or Terraform. We want to see how your skills match what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about SRE and how you can contribute to our team. Be sure to mention specific projects or experiences that relate to the responsibilities listed in the job description.

Showcase Your Problem-Solving Skills: In your application, don’t forget to highlight your problem-solving abilities. Share examples of how you've tackled complex issues in past roles, especially those related to system reliability and performance. We love seeing how you think on your feet!

Apply Through Our Website: We encourage you to apply through our website for the best chance of getting noticed. It’s super easy, and you’ll be able to keep track of your application status. Plus, we’re excited to see your application come through!

How to prepare for a job interview at Natobotics

✨Know Your Tech Inside Out

Make sure you brush up on your knowledge of the tools mentioned in the job description, like Prometheus, Grafana, and CI/CD platforms. Be ready to discuss how you've used these tools in past roles and how they can be applied to improve system reliability.

✨Showcase Your Problem-Solving Skills

Prepare to share specific examples of how you've tackled complex issues in previous positions. Use the STAR method (Situation, Task, Action, Result) to structure your answers, focusing on your analytical skills and how you resolved incidents effectively.

✨Emphasise Automation Experience

Since automation is key for this role, come prepared with examples of how you've automated processes in the past. Discuss your experience with Infrastructure as Code (IaC) and any scripting languages you've used, like Python or Bash, to streamline operations.

✨Cultivate a Blameless Culture Mindset

Be ready to talk about your approach to incident management and how you promote a blameless culture within teams. Highlight your experience in conducting post-mortems and how you encourage collaboration between development, QA, and SRE teams to foster a reliable environment.
