Site Reliability Engineer in Slough

Slough | Full-Time | 70000 - 90000 € / year (est.) | No home office possible

At a Glance

  • Tasks: Automate and optimise processes using AI for the Market Risk Platform.
  • Company: Join a leading financial tech firm in London.
  • Benefits: Competitive salary, career growth, and a dynamic work environment.
  • Other info: Opportunity to work with cutting-edge technology in a collaborative team.
  • Why this job: Make a real impact by eliminating operational toil and enhancing efficiency.
  • Qualifications: Senior SRE experience, strong Python skills, and process optimisation expertise.

The predicted salary is between 70000 and 90000 € per year.

We need an experienced SRE to focus predominantly on automation, optimization, and process re-engineering using AI for the Market Risk Platform. Success is measured by capacity created (toil eliminated, fewer manual steps, faster recovery, safer and faster changes), not by acting as the primary BAU support resource.

Primary Objectives:

  • Eliminate operational toil and recurring manual work through durable automation.
  • Re-engineer support/change processes to reduce handoffs, approval friction, and rerun complexity.
  • Industrialize reliability operations so existing SREs spend less time firefighting and more time engineering.

Key Responsibilities (Automation & Process first):

  • Automation Engineering (Core): Build production-grade automation in Python (tools, services, workflows) to remove repetitive work: environment checks, dependency validation, automated reruns/reprocessing, safe restarts, drift detection, remediation actions, and standardized operational tasks. Create self-service capabilities for common requests (guardrailed, auditable, repeatable). Implement “automation with safety”: idempotency, dry-run modes, approval gates where needed, rollback/undo strategies, and clear audit trails.
  • Process Re-engineering (Core): Map current operational processes (incident/problem/change, release readiness, rerun/recovery, access/entitlements, environment onboarding) and redesign them to remove waste and reduce cycle time. Standardize runbooks/playbooks into executable workflows, and reduce tribal knowledge via templates, checklists, and automated pre-flight controls. Define and track operational KPIs (toil hours removed, alert volume reduction, MTTR improvements, change failure rate reduction, rerun time reduction).
  • Agentic AI: Design and implement agentic workflows that take action using tools/runbooks (e.g., diagnostics, evidence gathering, correlation, guided remediation, change-risk checks, automated rerun orchestration). Put strong controls in place: scoped permissions, deterministic fallbacks, human-in-the-loop approvals for risky actions, evaluation harnesses, and measurable outcomes. Productionize with monitoring, logging, and post-incident learnings feeding back into the agent/tooling.
  • Observability (enablement for automation)

Required Skills & Experience:

  • Senior SRE experience on distributed systems and batch/intraday workloads in a production environment.
  • Strong Python.
  • Provable agentic AI experience showing tool integration, guardrails, evaluation approach, and measurable impact (toil reduction, MTTR reduction, alert reduction, etc.).
  • Demonstrated process optimization ability (removing steps/handoffs, standardizing workflows, implementing lightweight controls with metrics).
  • Strong Linux and troubleshooting fundamentals across application/system/network layers.
  • Experience working across mixed estates (on-prem VMs + cloud, with some Kubernetes exposure for operational monitoring/reruns).

Differentiators:

  • Exposure to Banking/Finance Market Risk Domains.
  • Familiarity with the Athena ecosystem or similar (SecDB, Quartz).

Site Reliability Engineer in Slough employer: Cubestech Ltd

Join a forward-thinking company in London that prioritises innovation and employee empowerment, making it an exceptional employer for Site Reliability Engineers. With a strong focus on automation and process optimisation, you will have the opportunity to work with cutting-edge AI technologies while enjoying a collaborative work culture that fosters professional growth and development. The company offers competitive benefits and a commitment to reducing operational toil, ensuring that your contributions lead to meaningful impact in the fast-paced finance sector.

Contact Details:

Cubestech Ltd Recruiting Team

StudySmarter Expert Advice🤫

We think this is how you could land the Site Reliability Engineer role in Slough

Tip Number 1

Network, network, network! Get out there and connect with folks in the industry. Attend meetups, webinars, or even just grab a coffee with someone already working as an SRE. You never know who might have the inside scoop on job openings!

Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your Python projects and automation tools. This gives potential employers a taste of what you can do and sets you apart from the crowd.

Tip Number 3

When you land that interview, come prepared with questions about their current processes and how they handle operational toil. This shows you're genuinely interested and ready to dive into process re-engineering right away.

Tip Number 4

Don’t forget to apply through our website! We’re always on the lookout for talented SREs who can help us optimise and automate. Plus, it’s a great way to ensure your application gets seen by the right people.

We think you need these skills to ace the Site Reliability Engineer role in Slough

Automation Engineering
Python
Agentic AI
Process Re-engineering
Incident Management
Change Management
Runbook Standardisation

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the SRE role. Highlight your experience with Python, automation, and process re-engineering. We want to see how your skills align with our needs, so don’t be shy about showcasing relevant projects!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about the SRE role and how your background makes you a perfect fit. We love seeing enthusiasm and a clear understanding of what we do at StudySmarter.

Showcase Your Achievements: When detailing your experience, focus on measurable achievements. Did you reduce toil hours or improve MTTR? Quantify your impact! We appreciate candidates who can demonstrate their contributions in a tangible way.

Apply Through Our Website: We encourage you to apply through our website for a smoother application process. It helps us keep track of your application and ensures you don’t miss out on any important updates from us. Good luck!

How to prepare for a job interview at Cubestech Ltd

Know Your Automation Inside Out

Make sure you can talk confidently about your experience with automation in Python. Be ready to discuss specific projects where you've eliminated operational toil and how you approached process re-engineering. Highlight any measurable impacts you've achieved, like reduced MTTR or alert volume.

Showcase Your Agentic AI Experience

Prepare examples of how you've implemented agentic workflows in previous roles. Discuss the tools you've integrated and the safety measures you've put in place, such as idempotency and rollback strategies. This will demonstrate your ability to design reliable systems that require minimal manual intervention.

Understand the Market Risk Landscape

Familiarise yourself with the banking and finance market risk domains. If you have experience with the Athena ecosystem or similar platforms, be sure to mention it. Showing that you understand the context of the role will set you apart from other candidates.

Be Ready for Technical Challenges

Expect technical questions that test your knowledge of distributed systems and troubleshooting across application, system, and network layers. Brush up on your Linux skills and be prepared to solve problems on the spot. This will showcase your hands-on experience and problem-solving abilities.