Lead Site Reliability Engineer Tech - Development · London

London · Full-Time · £80,000 - £100,000 / year (est.) · No home office possible
Dare

At a Glance

  • Tasks: Ensure stability and performance of trading platforms while leading incident response and optimising systems.
  • Company: Dynamic energy trading company at the forefront of technology and data science.
  • Benefits: Competitive salary, health insurance, 38 days holiday, and a vibrant work culture.
  • Other info: Mentorship opportunities and excellent career growth in a collaborative environment.
  • Why this job: Join a team of ambitious individuals and make a real impact in fast-paced trading environments.
  • Qualifications: Extensive SRE experience, strong programming skills, and knowledge of low-latency systems.

The predicted salary is between £80,000 and £100,000 per year.

We are an energy trading company generating liquidity across global commodities markets. We combine deep trading expertise with proprietary technology and the power of data science to be best in class. Our understanding of volatile, data-intensive markets is a key part of our edge.

At Dare, you will be joining a team of ambitious individuals who challenge themselves and each other. We have a culture of empowering exceptional people to become the best version of themselves.

As a Lead Site Reliability Engineer, you will play a critical role in ensuring the stability, scalability, and performance of mission‑critical, low‑latency trading platforms. You’ll work closely with traders, quantitative analysts, and engineers in a fast‑paced environment where precision and speed are essential. This role combines deep technical expertise with leadership responsibility. You will own the reliability strategy while remaining hands‑on with production systems and complex distributed architectures.

You will define and drive reliability practices for latency‑sensitive trading infrastructure, establish and enforce service level objectives, and lead incident response across live trading environments. You’ll focus on optimising system performance and latency, while collaborating with stakeholders to balance reliability, execution speed, and operational risk. Shaping technical direction, you will actively contribute to debugging, automation, and system design, while mentoring engineers to build a high‑performing and resilient engineering culture.

What you’ll be doing:

  • Ensure real‑time trading systems remain stable and performant, proactively monitoring, diagnosing, and resolving issues impacting trading or market connectivity.
  • Lead production incident response as the first line of defence, driving live troubleshooting, root‑cause analysis, and long‑term remediation.
  • Define and own the reliability strategy, including service level objectives, service level indicators, and error budgets for critical trading systems.
  • Collaborate with trading, engineering, and infrastructure teams on capacity planning, upgrades, and low/zero‑downtime migrations.
  • Drive automation across operational workflows using Python, Bash, and SQL to reduce manual intervention.
  • Continuously optimise systems and networks, leveraging deep operating system, networking, and performance expertise.
  • Manage and mentor engineers across London and offshore teams, promoting engineering best practices.
  • Act as a senior escalation point during high‑severity incidents.
  • Participate in and lead on‑call rotations, including nights to cover ICE market opening hours.
  • Support releases, maintenance, and trading events outside standard hours including weekends.

What you’ll bring:

  • Extensive experience in Site Reliability Engineering (SRE), DevOps, or Production Support Engineering.
  • Experience within trading, hedge funds, or financial services, ideally close to front‑office systems.
  • Strong understanding of low‑latency, highly distributed trading systems.
  • Deep knowledge of cloud platforms (AWS, GCP, or Azure).
  • Deep expertise in Linux/UNIX environments and command‑line tooling.
  • Advanced understanding of application‑level networking (TCP/IP, UDP).
  • Strong programming/scripting skills (Python, Bash) with SQL proficiency.
  • Experience with CI/CD pipelines and infrastructure‑as‑code (Terraform, Kubernetes).
  • Proven experience in incident management, root‑cause analysis, and system optimisation.
  • Experience managing large‑scale infrastructure, including capacity planning and migrations.
  • Ability to leverage AI to develop and deliver solutions at rapid velocity.

Desirable:

  • Experience in a market‑making environment.
  • Strong operating system level performance tuning expertise.
  • Exposure to exchange connectivity and market data systems.
  • Understanding of financial markets and trading workflows.

Benefits & perks:

  • Competitive salary
  • Vitality health insurance and dental cover
  • 38 days of holiday (including bank holidays)

At Dare, we pride ourselves on being an exceptional employer, offering a dynamic work environment in the heart of London where innovation meets expertise. Our culture fosters collaboration and personal growth, empowering our employees to excel in their roles while enjoying competitive benefits such as comprehensive health insurance and generous holiday allowances. Join us to be part of a forward-thinking team that values your contributions and supports your professional development in the fast-paced world of energy trading.

Contact details:

Dare Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Lead Site Reliability Engineer role

Tip Number 1

Network like a pro! Get out there and connect with folks in the industry. Attend meetups, webinars, or even just grab a coffee with someone who works in trading or tech. You never know who might have the inside scoop on job openings!

Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to site reliability engineering. This gives potential employers a taste of what you can do and sets you apart from the crowd.

Tip Number 3

Prepare for interviews by brushing up on your technical knowledge and problem-solving skills. Practice common SRE scenarios and be ready to discuss how you've handled incidents in the past. Confidence is key!

Tip Number 4

Don't forget to apply through our website! We love seeing applications directly from candidates who are genuinely interested in joining our team. Plus, it shows you're proactive and keen on being part of our culture.

We think you need these skills to ace the Lead Site Reliability Engineer role

Site Reliability Engineering (SRE)
DevOps
Production Support Engineering
Low-Latency Trading Systems
Cloud Platforms (AWS, GCP, Azure)
Linux/UNIX Environments
Command-Line Tooling
Application-Level Networking (TCP/IP, UDP)
Programming/Scripting (Python, Bash)
SQL Proficiency
CI/CD Pipelines
Infrastructure-as-Code (Terraform, Kubernetes)
Incident Management
Root-Cause Analysis
System Optimisation

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Lead Site Reliability Engineer role. Highlight your experience with low-latency trading systems and any relevant cloud platforms you've worked with. We want to see how your skills align with our needs!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about this role and how your background in SRE or DevOps makes you a perfect fit for our team. Let us know what excites you about working in a fast-paced trading environment.

Showcase Your Technical Skills: Don’t forget to showcase your technical expertise! Mention your programming skills in Python and Bash, as well as your experience with CI/CD pipelines. We love seeing candidates who can drive automation and optimise systems.

Apply Through Our Website: We encourage you to apply through our website for a smoother application process. It helps us keep track of your application and ensures you don’t miss out on any important updates. We can’t wait to hear from you!

How to prepare for a job interview at Dare

Know Your Tech Inside Out

Make sure you brush up on your technical skills, especially around low-latency trading systems and cloud platforms like AWS or GCP. Be ready to discuss your experience with Python, Bash, and SQL, as well as any CI/CD tools you've used. The more confident you are in your technical knowledge, the better you'll perform.

Understand the Company Culture

Dare values ambition and empowerment, so show them you're a team player who thrives in a fast-paced environment. Research their approach to trading and technology, and be prepared to discuss how you can contribute to their culture of excellence and innovation.

Prepare for Incident Management Scenarios

Since you'll be leading incident response, think of examples from your past experiences where you successfully managed high-severity incidents. Be ready to explain your troubleshooting process and how you conducted root-cause analysis to prevent future issues.

Showcase Your Leadership Skills

As a Lead Site Reliability Engineer, you'll need to mentor others and drive best practices. Prepare to discuss your leadership style and provide examples of how you've guided teams in the past. Highlight any experience you have in managing offshore teams or collaborating across departments.
