SRE / AIOps Engineer in London


London | Full-Time | 60000 - 80000 € / year (est.) | No home office possible

At a Glance

  • Tasks: Build AI-driven operations and automation tools from scratch in a dynamic environment.
  • Company: Join Era4, a mission-driven start-up transforming energy sites into modern data centres.
  • Benefits: Enjoy hybrid work, competitive salary, and opportunities for professional growth.
  • Other info: Be part of a diverse team committed to innovation and operational excellence.
  • Why this job: Make a real impact on national infrastructure while working with cutting-edge technology.
  • Qualifications: Strong Python skills and experience with observability platforms required.

The predicted salary is between 60000 - 80000 € per year.

Era4 develops, owns and operates AI infrastructure across the UK, powered by renewable energy. Converting legacy industrial and energy sites into modern data‑centre facilities, Era4 is combining brownfield regeneration opportunities with cleaner, efficient, scalable compute capacity for healthcare, research, finance, enterprise, and public‑sector organisations.

This is a greenfield role, building a modern Agentic approach to Client and Infrastructure Operations.

Role Summary: We are seeking Automation & AIOps Engineers who sit at the intersection of Site Reliability Engineering and modern AI‑driven operations. Embedded within Era4's engineering‑led Operations Centre, this role exists to build a modern AI Platform Operations function from scratch, designing tooling and agentic workflows with no legacy to deal with.

Runbook Automation & Agent Development:

  • Build agentic, executable workflows capable of triaging, diagnosing, and where appropriate autonomously remediating known failure patterns.
  • Build and maintain LLM‑backed agents targeting the observability stack, ITSM platform, and infrastructure APIs (e.g. DCIM, IPAM, hypervisor layers).
  • Develop auditable, Client‑focused automations for Client interactions and workflows, with appropriate controls.
  • Develop safe, auditable automation with appropriate controls for higher‑risk platform actions.

Operational Tooling & Self‑Service Enablement:

  • Build internal tooling that empowers engineers and service desk analysts: CLI utilities, ChatOps integrations (Slack/Teams bots), status dashboards, and self‑service automation hooks.
  • Reduce dependency on DevSecOps and engineering teams for routine operational tasks through automation.
  • Maintain and contribute to a library of automation assets, agent prompts, and runbook‑as‑code artefacts, version‑controlled and peer‑reviewed.
  • Develop the automation layer around monitoring and event management: alert suppression logic, enrichment pipelines, correlation rules, and alert‑to‑ticket integrations.
  • Continuously tune signal‑to‑noise ratios across monitoring tooling (Prometheus, Mimir, Grafana, or equivalent) to improve situational awareness.
  • Design and implement event correlation and deduplication logic to reduce alert storms and improve incident context.
  • Identify common Operational patterns and tasks as candidates for automation; maintain and prioritise a toil reduction backlog.
  • Participate in post‑incident reviews and translate findings into updated automation, runbooks, or agent logic.
  • Contribute to the evolution of Era4's operational standards, tooling architecture, and agent framework.
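
To make the event deduplication item above more concrete, here is a minimal Python sketch of one common approach: fingerprint each alert by its labels and suppress repeats inside a time window. It is an illustration only, not Era4's tooling. The alert shape (`labels`, `timestamp`), the `Deduplicator` class, and the 300-second window are all assumptions made for the example.

```python
import hashlib
import json


def fingerprint(labels: dict) -> str:
    """Stable hash of an alert's identifying labels (key order does not matter)."""
    canonical = json.dumps(labels, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Deduplicator:
    """Suppress alerts whose fingerprint was already forwarded within window_s seconds.

    Hypothetical helper for illustration; real systems usually also persist state.
    """

    def __init__(self, window_s: float = 300.0):
        self.window_s = window_s
        self._last_forwarded = {}  # fingerprint -> timestamp of last forwarded alert

    def should_forward(self, alert: dict) -> bool:
        key = fingerprint(alert["labels"])
        now = alert["timestamp"]
        last = self._last_forwarded.get(key)
        if last is not None and now - last < self.window_s:
            return False  # duplicate inside the window: suppress
        self._last_forwarded[key] = now
        return True


# Assumed sample data: timestamps are seconds since an arbitrary start.
dedup = Deduplicator(window_s=300)
alerts = [
    {"labels": {"alertname": "GpuTempHigh", "host": "node-1"}, "timestamp": 0},
    {"labels": {"host": "node-1", "alertname": "GpuTempHigh"}, "timestamp": 60},
    {"labels": {"alertname": "GpuTempHigh", "host": "node-2"}, "timestamp": 90},
    {"labels": {"alertname": "GpuTempHigh", "host": "node-1"}, "timestamp": 400},
]
forwarded = [a for a in alerts if dedup.should_forward(a)]
```

In this sketch the second alert is suppressed as a repeat of the first, while the alert from a different host and the one arriving after the window has elapsed are both forwarded.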

Technical – Core Element:

  • Strong Python development skills, including scripting for automation, API integration, and data processing.
  • Hands‑on experience with observability and monitoring platforms: Prometheus, Grafana, Mimir, or equivalent.
  • Experience integrating with ITSM platforms (ServiceNow, Halo, Jira Service Management, or similar) via API.
  • Solid understanding of event‑driven architectures, message queues, and webhook‑based automation patterns.
  • Strong understanding of managing GPU infrastructure in production, including its key signals and metrics and the automation of related workflows.
  • Familiarity with Infrastructure‑as‑Code principles and cloud‑native environments (Kubernetes, Terraform, or similar).

Technical – Agent & AI:

  • Demonstrable experience building LLM‑powered agents or automation using frameworks such as LangChain, LlamaIndex, the Anthropic SDK, OpenAI function calling, or comparable tooling.
  • Understanding of agentic design patterns: tool use, structured output, human‑in‑the‑loop controls, and chain‑of‑thought reasoning for operational tasks.
  • Comfort operating in an API‑first environment, integrating agents with infrastructure APIs, DCIM, IPAM, and hypervisor control planes.

Operational:

  • Prior experience in an SRE, Senior Operations, or Platform Engineering environment, with exposure to on‑call operations and incident management processes.
  • Experience in converting narrative runbooks into executable automation or codified decision trees.
  • Understanding of ITIL‑aligned incident and change management principles and ITSM tooling.

One or more would be an advantage:

  • Exposure to data centre or colocation operations, particularly high‑density compute or GPU infrastructure environments.
  • Experience with ChatOps tooling: building Slack or Microsoft Teams bots for operational workflows.
  • Familiarity with DCIM platforms and telemetry pipelines (power, thermal, network).
  • Knowledge of OpenTelemetry, distributed tracing, or log aggregation platforms (Loki, ELK, Splunk).
  • Contributions to open‑source observability or automation tooling.
  • Experience in a start‑up or scale‑up environment where tooling is built from scratch.

Why Join Era4: You’ll be joining a mission‑driven start‑up building critical national infrastructure, where operational excellence directly enables growth. This role offers high visibility with leadership, real autonomy, and the chance to shape how a next‑generation company operates at scale.

Diversity & Inclusion: Era4 is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Executive & Operations | United Kingdom - Hybrid (occasional visits to London office)

SRE / AIOps Engineer in London employer: Carbon3ai Limited.

At Era4, we pride ourselves on being a forward-thinking employer that champions innovation and sustainability in the tech industry. Our hybrid work culture fosters collaboration and flexibility, allowing you to thrive while contributing to meaningful projects that transform legacy sites into cutting-edge AI infrastructure. With ample opportunities for professional growth and a commitment to diversity and inclusion, joining our team means being part of a mission-driven start-up that values your contributions and empowers you to shape the future of operations.


Contact Detail:

Carbon3ai Limited. Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the SRE / AIOps Engineer role in London

Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with Era4 employees on LinkedIn. A friendly chat can sometimes lead to opportunities that aren’t even advertised!

Tip Number 2

Show off your skills! Create a portfolio or GitHub repo showcasing your projects, especially those related to automation and AI. This gives us a glimpse of what you can do and sets you apart from the crowd.

Tip Number 3

Prepare for the interview by understanding Era4’s mission and values. Be ready to discuss how your experience aligns with building modern AI-driven operations. We love candidates who are genuinely excited about our work!

Tip Number 4

Don’t hesitate to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows us you’re serious about joining the Era4 team and contributing to our greenfield projects.

We think you need these skills to ace the SRE / AIOps Engineer role in London

Python Development
API Integration
Data Processing
Observability Platforms (Prometheus, Grafana, Mimir)
ITSM Platform Integration (ServiceNow, Halo, Jira Service Management)
Event-Driven Architectures
Message Queues

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the SRE / AIOps Engineer role. Highlight your Python skills, automation experience, and any relevant projects you've worked on. We want to see how your background aligns with our mission at Era4!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for AI-driven operations and how you can contribute to building our modern Agentic approach. Keep it concise but impactful – we love a good story!

Showcase Your Projects: If you've got any personal or professional projects that demonstrate your skills in automation or AI, don’t hold back! Include links or descriptions in your application. We’re keen to see what you’ve built and how it relates to our work.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re serious about joining the Era4 team!

How to prepare for a job interview at Carbon3ai Limited.

Know Your Tech Inside Out

Make sure you brush up on your Python skills and get familiar with observability tools like Prometheus and Grafana. Be ready to discuss how you've used these technologies in past projects, especially in automation and API integration.

Showcase Your Automation Experience

Prepare examples of how you've built automation workflows or LLM-powered agents. Highlight any experience with frameworks like LangChain or OpenAI function calling, and be ready to explain your thought process behind designing agentic workflows.

Understand the Role's Impact

Era4 is all about building a modern AI Platform Operations function. Be prepared to discuss how your work can contribute to operational excellence and how you envision reducing toil through automation. Show that you understand the bigger picture!

Ask Insightful Questions

Prepare thoughtful questions about Era4's operational standards and tooling architecture. This shows your genuine interest in the role and helps you gauge if the company culture aligns with your values, especially regarding diversity and inclusion.