Site Reliability Engineer

Full-Time · £36,000 - £60,000 per year (est.) · No home office possible

At a Glance

  • Tasks: Elevate system reliability and performance for cutting-edge trading platforms.
  • Company: Dynamic multi-strategy investment firm with a focus on data-driven solutions.
  • Benefits: Competitive salary, flexible work options, and opportunities for professional growth.
  • Why this job: Play a key role in modernising operational capabilities and enhancing performance.
  • Qualifications: 5+ years in SRE or related roles; expertise in Kubernetes and observability tools.
  • Other info: Join a collaborative team and make a real impact in a fast-paced environment.

The predicted salary is between £36,000 and £60,000 per year.

Our client is a multi-strategy investment firm operating highly sophisticated, data-driven trading platforms, and we are seeking a Site Reliability Engineer to help elevate the reliability and performance of its core systems. In this role, you will define standards, influence best practices across the Core Data Services team, and work closely with the DevOps and Cloud teams to drive adoption and consistency. You will take a hands-on approach across both cloud and on-prem environments, ensuring the reliability, performance, and scalability of the firm's trading systems and production platforms. This is an opportunity to play a pivotal role in modernising operational capabilities and improving performance across diverse environments.

Key Responsibilities

  • Define, promote, and embed SRE best practices to build a scalable and resilient infrastructure foundation.
  • Enhance and extend observability and monitoring capabilities using Prometheus, Grafana, Loki, and Tempo to deliver comprehensive visibility into system and application performance.
  • Take part in the on-call rotation (approximately one week per month) to ensure continuous operational support.
  • Establish and evolve reliability standards for applications running within Kubernetes, fine-tuning configurations for optimal performance, cost efficiency, and durability.
  • Build automation and internal tools to streamline deployment processes, improve health checks, and strengthen recovery and failover routines.

Requirements

  • 5+ years of experience in SRE or related roles supporting complex, distributed systems.
  • Bachelor's degree in computer science, engineering, information systems, or equivalent practical experience.
  • Expertise with observability stacks including Prometheus, Grafana, Loki, and Tempo/OTEL.
  • Strong proficiency with Kubernetes orchestration and Docker containerization.
  • Hands-on experience with cloud environments (AWS preferred) and on-premises infrastructure.
  • Skilled in Python, Bash, or Go for automation and tool development.
  • Solid understanding of CI/CD practices, agile workflows, and DevOps principles.
  • High initiative, strong attention to detail, and a passion for reliability engineering.
  • Excellent communication skills, including the ability to simplify and explain technical concepts to a wide range of stakeholders.
  • Experience working with databases (PostgreSQL, Redis, Snowflake), messaging technologies (Kafka, Solace), and/or workflow orchestration tools such as Airflow.

Site Reliability Engineer employer: Selby Jennings

Our client is an exceptional employer, offering a dynamic work culture that fosters innovation and collaboration among talented professionals. With a strong focus on employee growth, they provide ample opportunities for skill development and career advancement, particularly in the fast-paced world of data-driven trading platforms. Located in a vibrant area, employees enjoy a supportive environment that values work-life balance and encourages a hands-on approach to modernising operational capabilities.

Contact Details:

Selby Jennings Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role

✨Tip Number 1

Network like a pro! Reach out to folks in the industry on LinkedIn or at meetups. We can’t stress enough how important it is to make connections; you never know who might have the inside scoop on job openings.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to SRE and cloud environments. This gives potential employers a taste of what you can do and sets you apart from the crowd.

✨Tip Number 3

Prepare for interviews by practising common SRE scenarios and questions. We recommend doing mock interviews with friends or using online platforms. The more comfortable you are, the better you’ll perform when it counts!

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, we love seeing candidates who take that extra step to connect directly with us.

We think you need these skills to ace the Site Reliability Engineer role

Site Reliability Engineering (SRE)
Observability Stacks (Prometheus, Grafana, Loki, Tempo)
Kubernetes Orchestration
Docker Containerization
Cloud Environments (AWS preferred)
Python
Bash
Go
CI/CD Practices
Agile Workflows
DevOps Principles
Database Management (PostgreSQL, Redis, Snowflake)
Messaging Technologies (Kafka, Solace)
Workflow Orchestration Tools (Airflow)
Communication Skills

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Site Reliability Engineer role. Highlight your experience with SRE practices, cloud environments, and any relevant tools like Prometheus or Kubernetes. We want to see how your skills align with what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about reliability engineering and how you can contribute to our team. Keep it concise but impactful – we love a good story that showcases your journey.

Showcase Your Projects: If you've worked on any projects that demonstrate your expertise in observability stacks or automation, make sure to mention them. We appreciate hands-on experience, so share specific examples of how you've tackled challenges in previous roles.

Apply Through Our Website: We encourage you to apply through our website for a smoother process. It helps us keep track of applications and ensures you get the attention you deserve. Plus, it’s super easy – just follow the prompts and let us know why you’d be a great fit!

How to prepare for a job interview at Selby Jennings

✨Know Your SRE Fundamentals

Make sure you brush up on the core principles of Site Reliability Engineering. Understand how to define and promote best practices, especially in relation to building scalable and resilient infrastructures. Be ready to discuss your experience with observability tools like Prometheus and Grafana, as well as your approach to enhancing system performance.

✨Showcase Your Hands-On Experience

Prepare to share specific examples from your past roles where you've taken a hands-on approach in cloud and on-prem environments. Highlight your work with Kubernetes and Docker, and be ready to explain how you've optimised configurations for performance and cost efficiency.

✨Demonstrate Your Automation Skills

Since automation is key in this role, come prepared to discuss your experience with scripting languages like Python, Bash, or Go. Think of instances where you've built internal tools or streamlined deployment processes, and be ready to explain the impact of your contributions.

✨Communicate Clearly and Confidently

Strong communication skills are essential, so practice explaining complex technical concepts in simple terms. Be prepared to engage with a variety of stakeholders and demonstrate your ability to convey information effectively, especially when discussing reliability standards and CI/CD practices.
