At a Glance
- Tasks: Lead the SRE function and drive operational excellence in production systems.
- Company: Join a forward-thinking financial institution transforming its tech landscape.
- Benefits: Competitive salary, flexible working options, and opportunities for professional growth.
- Why this job: Be at the forefront of innovation in banking technology and make a real impact.
- Qualifications: Proven SRE experience and strong skills in monitoring tools.
- Other info: Dynamic role with mentorship opportunities and a focus on AI-driven solutions.
The predicted salary is between £80,000 and £100,000 per year.
Our client is transforming their production support function into a full Site Reliability Engineering (SRE) model. We are looking for a hands-on SRE Lead to establish and lead the SRE function, ensuring operational excellence across production systems.
Key Responsibilities:
- Lead the SRE function across the engineering organisation and drive operational excellence across production systems.
- Define and implement the observability and monitoring strategy, including dashboards, alerting, SLOs, SLAs, and error budgets.
- Establish comprehensive monitoring coverage to ensure visibility into system health, infrastructure, and business-critical workflows.
- Drive adoption of AI-driven tools and automation for proactive system troubleshooting, incident triage, and root cause analysis.
- Lead and mentor a team of SRE Engineers embedded within engineering teams.
- Manage incident response processes, including on-call management and post-incident reviews.
- Collaborate with product and engineering teams to build reliability and observability into new systems.
- Monitor UI behaviour and end-to-end system performance, not just infrastructure metrics.
Essential Skills & Experience:
- Proven experience as an SRE Lead or Senior SRE in large-scale, high-availability production environments.
- Strong experience with observability and monitoring tools such as Datadog, Grafana, Prometheus, PagerDuty, or similar.
- Experience managing incident response, on-call processes, and post-incident reviews.
- Strong understanding of operational tooling for data ingestion and calculation pipelines, with the ability to detect anomalies in system behaviour.
- Ability to provide technical leadership and influence engineering stakeholders.
Nice to Have:
- Experience with financial data pipelines, index calculation, or capital markets systems.
- Exposure to AI/ML-based tools for anomaly detection and automated troubleshooting.
- Experience monitoring application-layer and UI behaviour, beyond infrastructure metrics.
- Experience building SRE practices in a greenfield or transformation environment.
SRE Lead (Banking/Financial) in London. Employer: Ascendion
Contact Detail:
Ascendion Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the SRE Lead (Banking/Financial) role in London
✨Tip Number 1
Network like a pro! Reach out to your connections in the banking and financial sectors. Attend meetups, webinars, or industry events where you can chat with folks who might know about openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a portfolio or a GitHub repository showcasing your SRE projects, especially those involving observability tools like Datadog or Grafana. This gives potential employers a taste of what you can bring to the table.
✨Tip Number 3
Prepare for interviews by brushing up on incident management scenarios. Be ready to discuss how you've handled on-call situations or post-incident reviews in the past. Real-life examples will make you stand out!
✨Tip Number 4
Don’t forget to apply through our website! We’ve got loads of opportunities that might be perfect for you. Plus, it’s a great way to ensure your application gets seen by the right people.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that match the SRE Lead role. Highlight your hands-on experience with observability tools and incident management, as these are key for us.
Craft a Compelling Cover Letter: Use your cover letter to tell us why you're passionate about Site Reliability Engineering. Share specific examples of how you've driven operational excellence in past roles, especially in high-availability environments.
Showcase Your Technical Skills: Don’t just list your skills; demonstrate them! Mention specific tools like Datadog or Grafana that you’ve used, and explain how they contributed to system reliability in your previous jobs.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role without any hiccups!
How to prepare for a job interview at Ascendion
✨Know Your SRE Fundamentals
Make sure you brush up on the core principles of Site Reliability Engineering. Understand key concepts like SLOs, SLAs, and error budgets, as well as how observability tools like Datadog and Grafana work. Being able to discuss these topics confidently will show that you're not just familiar with the theory but can apply it in practice.
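If it helps to make the error budget idea concrete before an interview, here is a minimal sketch of the arithmetic. It assumes a simple time-based availability SLO over a rolling window; the function names and the 30-day default are illustrative only and do not come from the job description or from any particular tool.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over the window.

    slo_target is a fraction, e.g. 0.999 for "three nines" (an assumed example).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 21.6 minutes of downtime, about half the budget is left.
    print(round(budget_remaining(0.999, 21.6), 2))
```

Being able to walk through a calculation like this, and to explain what the team does when the remaining budget approaches zero, is one way to show you can apply the concept rather than just define it.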
✨Showcase Your Hands-On Experience
Prepare to share specific examples from your past roles where you've led SRE initiatives or managed incident responses. Highlight any experience with AI-driven tools for troubleshooting and how you've implemented monitoring strategies. Real-world examples will help demonstrate your capability to lead the SRE function effectively.
✨Emphasise Collaboration Skills
Since the role involves working closely with product and engineering teams, be ready to discuss how you've successfully collaborated in the past. Talk about how you’ve built reliability into systems and mentored other engineers. This will show that you can not only lead but also foster a team-oriented environment.
✨Prepare for Technical Questions
Expect technical questions that assess your understanding of operational tooling and data pipelines. Brush up on how to detect anomalies in system behaviour and be prepared to discuss your approach to incident management. Practising these scenarios will help you feel more confident during the interview.