At a Glance
- Tasks: Lead the establishment of SRE foundations and improve system reliability across platforms.
- Company: Join a forward-thinking tech company focused on operational excellence and innovation.
- Benefits: Enjoy healthcare, retirement planning, paid volunteering days, and wellbeing initiatives.
- Other info: Work in a diverse team with opportunities for personal and professional growth.
- Why this job: Be a technical leader shaping the future of reliability in a collaborative environment.
- Qualifications: 10+ years in SRE or related roles with strong AWS and Kubernetes experience.
The predicted salary is between €80,000 and €100,000 per year.
We are evolving our Site Reliability Engineering capabilities to strengthen reliability, observability, security, and operational excellence across our Markets and Risk Intelligence division. As a Technical Lead SRE, you will be a senior, hands-on technical leader who helps shape the foundations of reliability across both new and existing platforms. You will collaborate with Architecture, Engineering, Security, and Platform teams to ensure reliability is built into systems from day one. While this is not a people-management or shift-based role, you will work closely with global teams and may occasionally be called upon for major incidents or critical issues. This position requires a highly proactive, hard-working expert with strong leadership presence and ownership of platform reliability outcomes.
Key Responsibilities
- Lead the establishment of SRE foundations for new projects: building environments, monitoring, alerting, and ensuring operational readiness from day one.
- Collaborate with Architecture and Engineering teams to embed reliability, scalability, security, and observability into system design.
- Define, implement, and champion observability standards, tooling, and guidelines across metrics, logs, traces, and SLIs/SLOs.
- Design and evolve monitoring and alerting solutions that improve visibility, reduce toil, and strengthen system health.
- Continuously drive reliability improvements across our environments through incident reduction, performance tuning, and building resilient patterns.
- Partner with Security teams to ensure our platforms meet compliance, security, and risk-management expectations.
- Lead seamless handovers from project delivery into BAU SRE operations by ensuring documentation, readiness, and strong operational practices.
- Influence architectural and design decisions through data-driven cloud cost optimization and efficiency initiatives.
- Be a technical leader and mentor supporting engineers, shaping engineering standards, and fostering a culture of learning and development.
Person Specification
- Strong technical authority with the ability to influence design and operational decisions.
- Highly collaborative, comfortable working across architecture, engineering, security, and operations teams.
- Calm and methodical under pressure, especially during incidents and critical issues.
- Pragmatic problem-solver who balances reliability, security, cost, and delivery speed.
- Clear communicator, able to explain complex technical concepts to diverse audiences.
Education
- Bachelor’s Degree in Computer Science or a related field.
Required Skills And Experience
- 10+ years of hands-on technical experience in SRE, Platform Engineering, Infrastructure, or related roles.
- Strong experience with AWS, including services such as EKS, ECS, EC2, networking, IAM, and managed services.
- Deep hands-on experience with Kubernetes and containerised platforms.
- Strong background in Linux systems administration.
- Proven experience designing and operating observability platforms, including monitoring, logging, and alerting.
- Hands-on experience with Datadog for metrics, logs, APM, and alerting.
- Strong understanding of SRE principles, including SLOs, error budgets, incident management, and reliability engineering.
- Experience working closely with architecture and engineering teams on system design and delivery.
- Solid understanding of cloud security principles and experience collaborating with security teams.
- Experience with cloud cost optimisation strategies and tooling.
- Hands-on experience integrating AI with observability stacks (Prometheus, Grafana, ELK, OpenTelemetry) for proactive issue detection.
Good To Have Skills
- Experience or working knowledge of Microsoft Azure.
- Experience supporting multi-cloud or hybrid environments.
- Exposure to Infrastructure as Code (e.g., Terraform, CloudFormation).
- Experience in large-scale, complex, or regulated environments.
- Knowledge of vector databases and RAG architectures for building internal SRE knowledge assistants.
- Knowledge of Generative AI and LLM platforms (e.g., Claude, Amazon Bedrock).
Career Stage
Senior Associate
Benefits
LSEG offers a range of tailored benefits and support, including healthcare, retirement planning, paid volunteering days and wellbeing initiatives.
Equal Opportunity Statement
LSEG is a proud equal opportunities employer. We do not discriminate on the basis of race, religion, colour, national origin, gender, sexual orientation, gender identity, gender expression, age, marital status, veteran status, pregnancy or disability, or any other basis protected under applicable law. We can reasonably accommodate applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs.
Lead Site Reliability Engineer in Nottingham. Employer: LSEG
At LSEG, we pride ourselves on being an exceptional employer, offering a dynamic work culture that fosters collaboration and innovation. As a Lead Site Reliability Engineer, you will have the opportunity to shape the future of our platforms while benefiting from tailored support such as healthcare, retirement planning, and wellbeing initiatives. Our commitment to employee growth and development, combined with our inclusive environment, makes LSEG a rewarding place to advance your career in a meaningful way.
StudySmarter Expert Advice 🤫
We think this is how you could land the Lead Site Reliability Engineer role in Nottingham
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects and contributions. This is a great way to demonstrate your hands-on experience and technical prowess to potential employers.
✨Tip Number 3
Prepare for interviews by brushing up on your technical knowledge and soft skills. Practice common SRE scenarios and be ready to discuss how you've tackled challenges in the past. Confidence is key!
✨Tip Number 4
Don't forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you're genuinely interested in joining our team at StudySmarter.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Lead Site Reliability Engineer role. Highlight your hands-on technical experience, especially with AWS and Kubernetes, to show us you’re the right fit.
Craft a Compelling Cover Letter: Use your cover letter to tell us why you’re passionate about SRE and how your background makes you a great candidate. Be sure to mention any collaborative projects you've worked on with architecture and engineering teams.
Showcase Your Problem-Solving Skills: In your application, give examples of how you've tackled complex issues in past roles. We love seeing pragmatic problem-solvers who can balance reliability, security, and delivery speed.
Apply Through Our Website: Don’t forget to submit your application through our website! It’s the best way for us to receive your details and get the ball rolling on your journey with StudySmarter.
How to prepare for a job interview at LSEG
✨Know Your SRE Principles
Make sure you brush up on your understanding of SRE principles, especially SLOs, error budgets, and incident management. Being able to discuss these concepts confidently will show that you’re not just familiar with the theory but can apply it practically.
✨Showcase Your Technical Skills
Prepare to discuss your hands-on experience with AWS, Kubernetes, and observability platforms like Datadog. Bring specific examples of how you've designed and operated these systems, as well as any challenges you faced and how you overcame them.
✨Collaborate and Communicate
Since this role involves working closely with various teams, be ready to demonstrate your collaborative skills. Share examples of past projects where you worked across architecture, engineering, and security teams, highlighting how you communicated complex technical concepts effectively.
✨Prepare for Incident Scenarios
Expect to be asked about how you handle pressure during incidents. Think of a few scenarios where you successfully managed critical issues, focusing on your calmness, methodical approach, and problem-solving skills. This will help illustrate your leadership presence in high-stress situations.