Capacity and Performance Reliability Manager

Full-Time, £48,000 to £84,000 per year (est.), no home office possible

At a Glance

  • Tasks: Manage capacity and performance for trading platforms, ensuring reliability and efficiency.
  • Company: Join the London Metal Exchange, a global leader in industrial metals trading.
  • Benefits: Enjoy a competitive salary, flexible working hours, and opportunities for professional growth.
  • Why this job: Be at the forefront of trading technology and make a real impact on global markets.
  • Qualifications: Degree or 5+ years in performance and capacity management; ITIL certification preferred.
  • Other info: Diverse and inclusive workplace with a focus on collaboration and innovation.

The predicted salary is between £48,000 and £84,000 per year.

The London Metal Exchange (LME) is the world centre for industrial metals trading. Most of the world’s non-ferrous futures business is conducted on the LME’s three trading platforms, totalling $18 trillion, 178 million lots and 4 billion tonnes, with a market open interest high of 1.8 million lots in 2024. The metals community uses the LME, an HKEX Group company, as a venue to transfer or take on price risk, as a physical market of last resort and as the provider of transparent global reference prices.

Overall Purpose of Role

Capacity Management at the LME is a key function, linked to strict regulatory compliance requirements, that actively manages multiple environments. With a large virtual estate encompassing multiple VMware clusters and OpenShift Container Platform (OCP), the Capacity and Performance Reliability Engineer is key to ensuring the stability of the platforms. The role is responsible for ensuring the reliability, availability, and performance of all infrastructure and services, proactively identifying and mitigating risks, and driving continuous improvement in operational resilience and service quality. This includes maintenance of the capacity management tool suite, capacity reporting, trend analysis and forecasting, ad-hoc performance investigations, demand management, and governance of the relevant processes and policies. The Capacity and Performance Reliability Engineer must have extensive knowledge of trading technologies and the operation of a trading venue, with the ability to incorporate business metrics and knowledge into the technical metrics from the LME core systems.

Responsibilities

  • Capacity Planning & Performance Management: Use historical data and predictive analytics to forecast demand and plan capacity for all environments (virtual, containerised, and physical). Perform stress testing, scenario modelling, and performance tuning to ensure systems can handle peak loads. Automate scaling, resource allocation, and infrastructure provisioning using Infrastructure as Code (IaC) and cloud-native tools. Maintain and enhance the Capacity Management tool suite (e.g., Athene, Grafana), ensuring zero data loss and maximum automation.
  • Collaboration & Continuous Improvement: Work closely with development, operations, and business teams to embed reliability and capacity considerations into system design and delivery. Promote best practices in automation, observability, and incident management. Present findings, reports, and recommendations to business heads, service managers, and technical teams. Build relationships with internal and external stakeholders, including architects, testing teams, service managers, project sponsors, and third-party suppliers.
  • Metrics, Reporting & Governance: Produce regular service and infrastructure capacity plans, reliability reports, and recommendations for action. Own and manage the Capacity Management Recommendations tracker. Report on reliability metrics, incidents, and system health to senior management. Ensure compliance with regulatory requirements and internal governance standards.
  • Reliability Engineering & System Health: Develop, implement, and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for critical services. Design and manage monitoring, alerting, and observability solutions to detect and prevent failures. Lead incident response, conduct blameless post-incident reviews, and drive corrective actions to prevent recurrence. Champion a reliability-focused, automation-first culture across teams.

Qualifications

  • Educated to degree standard and/or 5+ years of performance and capacity experience.
  • ITIL Foundation Certification.
  • Currently holds or has previously held a similar position.
  • Experience in reliability engineering, site reliability engineering (SRE), or similar roles is highly desirable.

Experience & Knowledge

  • At least 5 years’ experience in performance, capacity, or reliability management within a business-critical global banking, financial services, or technology environment.
  • In-depth knowledge of trading technologies, system performance, and their linkage to business metrics.
  • Proven experience with capacity forecasting, modelling, and analysis techniques.
  • Strong analytical skills for transforming machine data into actionable insights.
  • Experience managing relationships at all levels, from technical specialists to non-technical business representatives.
  • Proficiency with monitoring and automation tools (e.g., Athene, Grafana, Prometheus, Datadog, Terraform, Kubernetes, CI/CD pipelines).
  • Significant SQL knowledge and high-level expertise in Excel.
  • Ability to code in programming and query languages (e.g., Visual Basic, MS SQL, Python).
  • Understanding of APIs and automation scripting.
  • Knowledge of cloud architecture, containers, orchestration, and agile and CI/CD practices is desirable.

Skills & Core Competencies

  • Demonstrated ability to deliver innovative solutions supporting business and service operations.
  • Excellent communication skills, with the ability to prepare and present clear, concise, and effective reports for senior management.
  • Highly numerate, with strong statistical analysis and system modelling techniques.
  • Experience in business and service capacity management, reliability engineering, and performance optimisation.
  • Comprehensive understanding of queueing theory and system modelling.
  • Collaborative, improvement-oriented mindset with a passion for data and technology.
  • Ability to work independently or as part of a team, taking pride in individual and team deliverables.
  • Flexible yet structured approach to problem-solving, with the ability to analyse complex problems and identify suitable solutions.
  • Well-organised, self-motivated, and enthusiastic about reliability and capacity management.

The LME is committed to creating a diverse environment and is proud to be an equal opportunity employer. In recruiting for our teams, we welcome the unique contributions that you can bring in terms of education, ethnicity, race, sex, gender identity, expression and reassignment, nation of origin, age, languages spoken, colour, religion, disability, sexual orientation and beliefs. In doing so, we want every LME employee to feel our commitment to showing respect for all and encouraging open collaboration and communication.

Capacity and Performance Reliability Manager employer: London Metal Exchange

The London Metal Exchange (LME) is an exceptional employer, offering a dynamic work environment in the heart of London, where innovation and collaboration thrive. With a strong commitment to employee growth, the LME provides extensive training opportunities and fosters a culture of continuous improvement, ensuring that every team member can contribute meaningfully to the world of industrial metals trading. Additionally, the LME values diversity and inclusion, creating a workplace where unique perspectives are celebrated and respected.

Contact Details:

London Metal Exchange Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Capacity and Performance Reliability Manager

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, especially those already at the LME or similar companies. A friendly chat can open doors and give you insights that job descriptions just can't.

✨Tip Number 2

Prepare for interviews by diving deep into the company’s culture and values. The LME is all about collaboration and innovation, so think of examples from your past that showcase these traits. We want to see how you fit into our vibe!

✨Tip Number 3

Show off your skills with real-world examples! When discussing your experience, relate it back to the specific responsibilities mentioned in the job description. This will help us see how you can contribute to capacity management and performance reliability.

✨Tip Number 4

Don’t forget to follow up after your interview! A quick thank-you note can leave a lasting impression and shows your enthusiasm for the role. Plus, it keeps you on our radar as we make decisions.

We think you need these skills to ace Capacity and Performance Reliability Manager

Capacity Management
Performance Management
Predictive Analytics
Infrastructure as Code (IaC)
Cloud-Native Tools
Monitoring and Automation Tools
SQL
Excel
Programming and Query Languages (e.g., Visual Basic, MS SQL, Python)
APIs and Automation Scripting
Reliability Engineering
Incident Management
Statistical Analysis
System Modelling Techniques
Collaboration Skills

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the role of Capacity and Performance Reliability Manager. Highlight your experience in performance management, reliability engineering, and any relevant technologies you've worked with. We want to see how your skills align with what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about this role and how your background makes you a perfect fit. Don’t forget to mention your understanding of trading technologies and capacity management – we love that stuff!

Showcase Your Achievements: When detailing your experience, focus on specific achievements rather than just listing duties. Use metrics where possible to demonstrate your impact, like improvements in system performance or successful capacity forecasts. We appreciate numbers that tell a story!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re keen and know how to navigate our digital space – a key skill for this role!

How to prepare for a job interview at London Metal Exchange

✨Know Your Tech Inside Out

Make sure you brush up on your knowledge of trading technologies and performance management tools like Athene and Grafana. Be ready to discuss how you've used these tools in past roles, especially in relation to capacity forecasting and reliability engineering.

✨Showcase Your Analytical Skills

Prepare to demonstrate your analytical prowess by discussing specific examples where you've transformed machine data into actionable insights. Think about how you've used predictive analytics for capacity planning and be ready to share your thought process.

✨Emphasise Collaboration

The role requires working closely with various teams, so highlight your experience in cross-functional collaboration. Share examples of how you've built relationships with technical and non-technical stakeholders to drive improvements in service quality.

✨Prepare for Scenario Questions

Expect scenario-based questions that test your problem-solving skills. Think about past incidents you've managed, how you led post-incident reviews, and the corrective actions you implemented. This will show your ability to handle real-world challenges effectively.
