Site Reliability Engineer
Site Reliability Engineer

Site Reliability Engineer

Slough Full-Time 48000 - 72000 £ / year (est.) Home office (partial)
Go Premium
C

At a Glance

  • Tasks: Support digital services and enhance cloud infrastructure performance.
  • Company: Join a leading tech team focused on innovative digital solutions.
  • Benefits: Enjoy flexible working options and a collaborative culture.
  • Why this job: Be at the forefront of technology, driving impactful change in a dynamic environment.
  • Qualifications: Strong Azure experience and automation skills are essential.
  • Other info: Ideal for self-motivated individuals who thrive under pressure.

The predicted salary is between 48000 - 72000 £ per year.

This role is critical to the success of the digital services strategy, cloud infrastructure, and overall data platform. Sitting within the broader Digital Client technology team, this hire will focus on improving release and support processes, and enhancing infrastructure performance, supportability, scalability, and cost efficiency. This person will also serve as the key contact for escalations, major incidents, and release support during European business hours.

Key Responsibilities

  • Act as L3 escalation contact for critical production issues and major incidents
  • Lead or support release cycles and ensure production readiness
  • Collaborate with stakeholders, including Front Office and executive teams
  • Drive remediation of audit findings and ensure alignment with enterprise security standards
  • Ensure operational readiness of digital products (resiliency, observability, etc.)
  • Collaborate with quality engineering to meet non-functional requirements
  • Maintain compliance with enterprise tech strategy and protection mandates

Skills & Experience

  • Strong experience in Microsoft Azure
  • Experience developing automation and orchestration using Azure Python SDK, Terraform, and GitHub Runners
  • Expertise in automating serverless PaaS solutions (Azure App Services, Databricks, Cosmos DB, SQL, Data Lake, HDInsight)
  • Proficient in writing Azure Resource Manager (ARM) templates and Terraform scripts
  • Understanding of cloud security (RBAC, audit logging, authentication)
  • Experience troubleshooting network and application access to Azure resources (NSGs, routing)
  • Familiar with deploying databases and infrastructure via CI/CD pipelines
  • Knowledge of Azure RBAC, Active Directory, and Ping integration
  • Solid grasp of cloud provisioning, disaster recovery, performance monitoring, and business continuity

Soft Skills

  • Strong leadership and communication skills
  • Critical thinking and problem-solving under pressure
  • Self-motivated, works well independently and as part of a team
  • Familiarity with Agile/Scrum environments
C

Contact Detail:

Caspian One Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Site Reliability Engineer

✨Tip Number 1

Familiarise yourself with Microsoft Azure and its services, especially those mentioned in the job description. Having hands-on experience or projects that showcase your skills in Azure will make you stand out during discussions.

✨Tip Number 2

Prepare to discuss your experience with automation and orchestration tools like Terraform and the Azure Python SDK. Be ready to share specific examples of how you've used these tools to improve processes or solve problems in previous roles.

✨Tip Number 3

Brush up on your knowledge of cloud security practices, particularly around RBAC and audit logging. Being able to articulate your understanding of these concepts will demonstrate your readiness for the responsibilities outlined in the role.

✨Tip Number 4

Showcase your soft skills, especially leadership and communication. Think of scenarios where you've successfully led a team or communicated complex technical issues to non-technical stakeholders, as these experiences will be valuable in this role.

We think you need these skills to ace Site Reliability Engineer

Microsoft Azure
Automation and Orchestration
Azure Python SDK
Terraform
GitHub Runners
Serverless PaaS Solutions
Azure App Services
Databricks
Cosmos DB
SQL
Data Lake
HDInsight
Azure Resource Manager (ARM) Templates
Cloud Security
RBAC
Audit Logging
Authentication
Network Troubleshooting
Application Access
CI/CD Pipelines
Active Directory
Ping Integration
Cloud Provisioning
Disaster Recovery
Performance Monitoring
Business Continuity
Leadership Skills
Communication Skills
Critical Thinking
Problem-Solving
Self-Motivated
Agile/Scrum Familiarity

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience with Microsoft Azure and automation tools like Terraform and Python SDK. Use specific examples that demonstrate your expertise in cloud infrastructure and incident management.

Craft a Compelling Cover Letter: In your cover letter, explain why you are passionate about Site Reliability Engineering. Mention your experience with release cycles and how you've improved operational readiness in previous roles. Be sure to align your skills with the key responsibilities outlined in the job description.

Showcase Relevant Projects: If you have worked on projects involving Azure PaaS solutions or CI/CD pipelines, include these in your application. Describe your role and the impact of your contributions, especially in terms of performance monitoring and disaster recovery.

Highlight Soft Skills: Don't forget to mention your soft skills, such as leadership and communication abilities. Provide examples of how you've successfully collaborated with teams or led initiatives under pressure, as these are crucial for the role.

How to prepare for a job interview at Caspian One

✨Showcase Your Technical Skills

Be prepared to discuss your experience with Microsoft Azure and the specific tools mentioned in the job description, such as Terraform and Python SDK. Highlight any projects where you've successfully implemented automation or orchestration.

✨Demonstrate Problem-Solving Abilities

Expect to face scenario-based questions that test your critical thinking and troubleshooting skills. Prepare examples of past incidents you've resolved, particularly those involving cloud security or network access issues.

✨Emphasise Collaboration and Communication

Since this role involves working closely with various stakeholders, be ready to discuss how you’ve effectively communicated technical information to non-technical teams. Share experiences where your leadership made a difference during major incidents.

✨Understand the Company’s Digital Strategy

Research the company's digital services strategy and be ready to discuss how your skills align with their goals. Showing that you understand their mission will demonstrate your genuine interest in the role and the company.

Site Reliability Engineer
Caspian One
Location: Slough
Go Premium

Land your dream job quicker with Premium

You’re marked as a top applicant with our partner companies
Individual CV and cover letter feedback including tailoring to specific job roles
Be among the first applications for new jobs with our AI application
1:1 support and career advice from our career coaches
Go Premium

Money-back if you don't land a job in 6-months

C
  • Site Reliability Engineer

    Slough
    Full-Time
    48000 - 72000 £ / year (est.)
  • C

    Caspian One

    50-100
Similar positions in other companies
UK’s top job board for Gen Z
discover-jobs-cta
Discover now
>