Platform Operations Manager (DevOps & Site Reliability Engineering)

Full-Time | Up to £75,000 / year | Hybrid working

At a Glance

  • Tasks: Lead platform operations and ensure the reliability of our tech systems 24/7.
  • Company: Join Care ADHD, a pioneering HealthTech start-up transforming ADHD care.
  • Benefits: Enjoy hybrid working, 25 days leave, and a £500 home office setup stipend.
  • Other info: Be part of a diverse team committed to innovation and inclusivity.
  • Why this job: Make a real impact on healthcare technology and improve patient outcomes.
  • Qualifications: 8+ years in DevOps or SRE with strong leadership skills.

Location: Hybrid - working from our Canary Wharf office 2-3 times per week

Reports to: Director of Engineering

Salary: Up to £75,000

Manages: Platform Operations & DevOps Team (UK & India)

At Care ADHD, our mission is to transform ADHD care through innovation, data, and technology — delivering accessible, patient-centred services that improve outcomes for individuals, clinicians, and healthcare providers. We believe that high-quality data and meaningful insight are essential to improving clinical services, understanding patient journeys, and ensuring that care is delivered efficiently and effectively.

The Role

We are looking for an experienced and hands‑on Platform Operations Lead to own the reliability, availability, performance, and operational stability of Care ADHD’s technology platforms. This role combines DevOps, Site Reliability Engineering (SRE), cloud infrastructure, platform operations, and technical leadership — ensuring that our systems are securely deployed, highly available, scalable, and operational 24/7/365. You will lead platform operations across both the UK and India, working closely with engineering, QA, security, and product teams to ensure our infrastructure and deployment capabilities support a fast‑moving and high‑quality engineering organisation. This is a highly technical leadership role requiring someone who is equally comfortable defining operational strategy, improving engineering practices, and being hands‑on with cloud infrastructure, automation, monitoring, incident response, and reliability engineering.

What You’ll Be Doing

Key Responsibilities:

  • Platform Reliability & Operations
    • Own the operational health, availability, and reliability of all production and non-production environments
    • Ensure platforms are monitored, maintained, and operational 24/7/365
    • Lead platform incident management, root cause analysis, and service recovery processes
    • Establish and improve operational readiness, resilience, and disaster recovery capabilities
    • Define and manage SLAs, SLOs, and operational performance metrics
    • Ensure high levels of platform uptime, stability, scalability, and security
    • Design, build, and maintain cloud infrastructure primarily within AWS
    • Lead infrastructure automation and Infrastructure as Code initiatives using Terraform or AWS CDK
    • Design and optimise CI/CD pipelines to support efficient, secure, and reliable software delivery
    • Improve deployment automation, release management, and environment consistency
    • Support engineering teams with platform tooling, deployment strategies, and operational best practices
    • Drive improvements in deployment reliability, infrastructure scalability, platform security, cost optimisation, and operational efficiency
  • Site Reliability Engineering (SRE)
    • Implement and maintain observability solutions, including monitoring, logging, alerting, and tracing
    • Develop proactive approaches to incident prevention and operational resilience
    • Lead reliability engineering practices, including capacity planning, performance monitoring, fault tolerance, and high availability design
    • Reduce operational toil through automation and self‑service tooling
    • Establish strong incident response and post‑incident review processes
  • Leadership & Team Management
    • Lead and mentor platform operations and DevOps engineers across the UK and India
    • Build a collaborative, accountable, and high‑performing operational culture
    • Allocate and coordinate operational resources across projects and platform priorities
    • Work closely with the Director of Engineering to align platform strategy with product and engineering delivery goals
    • Collaborate with engineering leads, QA, security, and product teams to support platform and release readiness
  • Security, Compliance & Governance
    • Ensure infrastructure and operational processes follow security best practices
    • Support compliance with GDPR and healthcare‑related operational standards
    • Help implement operational governance, access controls, and infrastructure security policies
    • Work closely with security and engineering teams to manage vulnerabilities and operational risk
  • Technology Environment
    • AWS cloud infrastructure
    • Kubernetes and containerised services
    • Serverless platforms (AWS Lambda, API Gateway)
    • Node.js / TypeScript applications
    • PostgreSQL and cloud‑native databases
    • Terraform / AWS CDK
    • CI/CD pipelines and deployment automation
    • Monitoring and observability tooling
    • Microservices and event‑driven architectures

What We’re Looking For

  • Experience
    • 8+ years of experience in DevOps, Platform Engineering, SRE, or Infrastructure Engineering
    • Proven experience leading operational or platform engineering teams
    • Strong experience managing distributed or offshore technical teams
    • Experience supporting business‑critical production systems with high availability requirements
    • Experience operating cloud‑native platforms in AWS environments
  • Technical Skills
    • Strong hands‑on experience with:
      • AWS cloud infrastructure and services
      • CI/CD pipeline design and automation
      • Infrastructure as Code (Terraform or AWS CDK)
      • Kubernetes and container orchestration
      • Monitoring, logging, and observability platforms
      • Incident management and operational support
      • Linux systems administration and networking fundamentals
    • Strong understanding of:
      • Site Reliability Engineering principles
      • High availability and disaster recovery design
      • Platform scalability and resilience
      • Security and operational governance
      • Performance optimisation and capacity planning
    • Experience with tools such as:
      • Terraform
      • GitHub Actions / GitLab CI / Jenkins
      • CloudWatch
      • PagerDuty or similar incident management tooling
  • Operational Leadership
    • Strong ownership mindset with the ability to lead operational stability and platform reliability across the organisation.
  • Communication
    • Excellent communication and stakeholder management skills, particularly across distributed engineering teams.
  • Problem Solving
    • Calm and effective under pressure with strong incident management and troubleshooting capabilities.
  • Collaboration
    • Works effectively across engineering, product, QA, and security teams to support reliable platform delivery.

What Success Looks Like

  • Stable, secure, and highly available platforms operating 24/7/365
  • Reliable and efficient deployment and release processes
  • Strong monitoring, observability, and incident management practices
  • Reduced downtime, operational risk, and deployment failures
  • High‑performing platform operations teams across the UK and India
  • Engineering teams enabled through strong platform tooling and operational support

Why Join Care ADHD

This is an opportunity to play a critical role in building and operating the platform infrastructure behind a growing digital healthcare organisation focused on improving ADHD care and patient outcomes. You’ll have significant influence over platform reliability, operational strategy, engineering enablement, and cloud infrastructure, helping ensure the technology powering our services is secure, scalable, and always available.

What You Can Expect From Us

  • Hybrid working - work from our Canary Wharf office 2-3 times per week
  • 25 days annual leave (plus UK public holidays)
  • Team get‑togethers
  • A paid day off on your birthday
  • Office equipment when you join
  • £500 stipend to set up your home office*
  • Pension contribution
  • Be part of one of the UK’s most ambitious HealthTech start‑ups

At Care ADHD, we’re committed to building a diverse and inclusive environment. We encourage applications from candidates of all backgrounds, especially those from historically marginalised communities, as we work together to create a more equitable future.

Platform Operations Manager (DevOps & Site Reliability Engineering) employer: RGIT Australia

At Care ADHD, we pride ourselves on being an exceptional employer that champions innovation and inclusivity in the healthcare sector. Our hybrid working model allows for flexibility while fostering a collaborative culture at our Canary Wharf office, where you can thrive alongside a diverse team dedicated to transforming ADHD care. With ample opportunities for professional growth, competitive benefits including 25 days of annual leave, and a commitment to employee well-being, joining us means being part of a mission-driven organisation that values your contributions and supports your career development.

Contact Details:

RGIT Australia Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Platform Operations Manager (DevOps & Site Reliability Engineering) role

Join the IT Consultancy Buzz

Get involved in local or virtual IT consultancy meetups and forums. This is where we can rub shoulders with industry professionals, get insights into what RGIT Australia values, and even spot unadvertised opportunities. Don't miss out on these chances to make a name for ourselves in the IT world!

Show Off Your Skills

Create a personal project or case study relevant to the challenges RGIT Australia might face. Use platforms like GitHub or Medium to share your findings. This not only demonstrates our consulting skills but shows a proactive attitude, making us stand out from the crowd when applying for that full-time gig.

Leverage LinkedIn for Connections

Follow and engage with the relevant thought leaders and influencers in IT consultancy on LinkedIn. Share insightful content and join discussions to gain visibility. A well-placed comment or shared article could catch the attention of someone at RGIT Australia!

Direct Apply to RGIT Australia

Let's not forget to apply directly through the RGIT Australia website! Tailor your application to showcase our understanding of their consulting style and how we can contribute to their projects. A personalised approach can make a huge difference in landing that full-time position!

We think you need these skills to ace the Platform Operations Manager (DevOps & Site Reliability Engineering) role

DevOps
Site Reliability Engineering (SRE)
Cloud Infrastructure (AWS)
Infrastructure as Code (Terraform or AWS CDK)
CI/CD Pipeline Design and Automation
Kubernetes and Container Orchestration
Monitoring and Observability Tools

Some tips for your application 🫡

Showcase Your Problem-Solving Skills: In IT consulting, it's all about problem-solving, so make sure your CV highlights your analytical skills and any relevant projects you've tackled. Mention specific technologies or methodologies you've used to resolve issues or improve processes; this shows you can think critically and deliver results, which is vital for us at RGIT Australia.

Highlight Relevant Certifications: Certifications like ITIL, PMP, or even specific tech stack qualifications can really make you stand out. Make sure to include these in your CV, as they not only demonstrate your expertise but also your commitment to staying current in the field. We love seeing candidates who are proactive about their professional development!

Tailor Your Cover Letter: Your cover letter is your chance to connect personally with us at RGIT Australia. Share stories about your experiences in IT consulting, and how they shaped your desire to join our team. Mention why you’re excited about this particular role, and how you see yourself contributing to our projects.

Keep It Clear and Concise: We're all busy, so make sure your application is easy to read. Use bullet points for key achievements, and don’t overload us with jargon. A clean, professional layout goes a long way. Remember, the clearer your application, the more likely we are to invite you in for an interview!

How to prepare for a job interview at RGIT Australia

Brush Up on Your Technical Skills

For an IT consulting role, be ready to demonstrate your technical prowess. You might face questions on systems integration, cloud technologies, or even troubleshooting specific software. If you have experience with tools like AWS, Azure, or even specific programming languages, make sure you can talk about them fluently.

Showcase Your Problem-Solving Approach

IT consulting is all about solving problems for clients. Think about how you can illustrate your approach to a past challenge using the STAR method (Situation, Task, Action, Result). It's a great way to show how you tackle complex issues and come up with effective solutions.

Know the Business Impact of IT Solutions

When discussing your experiences, focus not just on the tech solutions you implemented, but also on their business impact. Employers want to see that you can connect IT with organisational goals. Prep examples that highlight how your tech contributions improved efficiency or reduced costs for past clients or projects.

Prepare for Behavioural Questions

Since IT consulting often involves teamwork and client interactions, expect behavioural questions that assess your interpersonal skills. Be prepared with examples that demonstrate your adaptability, communication skills, and how you handle client feedback. Before the interview, think of situations where you worked closely with clients to create effective IT strategies or changes.