Senior Site Reliability Engineer (DevTools) in London

London · Full-Time · £60,000 - £80,000 / year (est.) · Home office (partial)
Nebius

At a Glance

  • Tasks: Maintain and grow systems, improve services, and support users in a fast-paced environment.
  • Company: Join Nebius, a leader in cloud infrastructure for the AI economy.
  • Benefits: Competitive pay, career growth, flexibility, and a collaborative culture.
  • Other info: Fast-moving environment with real ownership and opportunities to shape the future of AI.
  • Why this job: Make a meaningful impact on innovative AI projects with a talented team.
  • Qualifications: SRE and SWE experience, coding skills in Java/Kotlin, Go, Python, and Ruby.

The predicted salary is between £60,000 and £80,000 per year.

About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure. Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI. Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.

The Role: We're an SRE team within DevTools, looking for someone ready to help maintain and grow our systems. We run 25k builds a day in TeamCity, store 100 TB of artifacts in Artifactory, and work with a massive monorepo in GitLab — comparable in scale to what you'd find at FAANG companies. We modify GitLab and build our own TeamCity plugins to give users a product that meets their needs. We're also experimenting with AI — we have our own RAG setup and are figuring out how to operate in the age of agents. Our goal: understand users' problems and requests, define metrics that capture the problem, improve those metrics, and verify that the user's problem is actually gone.

Your responsibilities will include:

  • Improving services based on user feedback
  • Building fault-tolerant, self-healing architecture
  • Finding ways to speed up our systems and reduce user friction
  • Modifying well-known closed- and open-source solutions
  • Supporting our users

We expect you to have:

  • A combination of SRE and SWE experience (for us that's a 50/50 split). Our code is in Java/Kotlin, Go, Python, and Ruby
  • An understanding of what's happening under the hood in Unix-like systems and the JVM
  • A passion for improving the user experience
  • The ability to adapt quickly on the fly in a fast-changing environment

It will be an added bonus if you have:

  • Experience in Platform Engineering
  • Experience operating GitLab (or another VCS) and TeamCity (or another CI system)
  • Experience with Spring and operating Java monoliths

Benefits & Perks:

  • Competitive compensation
  • Career growth and learning opportunities
  • Flexibility and work-life balance
  • Collaborative and innovative culture
  • Opportunity to work on impactful AI projects
  • International environment and talented teams

What's it like to work at Nebius:

  • Fast moving
  • Bold thinking
  • Constant growth
  • Meaningful impact
  • Trust and real ownership
  • Opportunity to shape the future of AI

Equal Opportunity Statement: Nebius is an equal opportunity employer. We are committed to fostering an inclusive and diverse workplace and to providing equal employment opportunities in all aspects of employment. We do not discriminate on the basis of race, color, religion, sex (including pregnancy), national origin, ancestry, age, disability, genetic information, marital status, veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by applicable law. Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire. If you need accommodations during the application process, please let us know.

Senior Site Reliability Engineer (DevTools) in London employer: Nebius

At Nebius, we pride ourselves on being an exceptional employer, offering competitive compensation and a collaborative culture that fosters innovation and growth. Our Amsterdam headquarters provides a vibrant international environment where talented teams work on impactful AI projects, ensuring flexibility and work-life balance for all employees. Join us to shape the future of AI while enjoying meaningful career development opportunities in a fast-paced and supportive setting.

Contact Details:

Nebius Recruitment Team

StudySmarter Expert Advice 🤫

We think this is how you could land Senior Site Reliability Engineer (DevTools) in London

Tip Number 1

Network like a pro! Reach out to current or former employees at Nebius on LinkedIn. A friendly chat can give you insider info and maybe even a referral, which can really boost your chances.

Tip Number 2

Prepare for those coding interviews! Brush up on your Java, Go, and Python skills. Practice common SRE scenarios and be ready to showcase how you’ve tackled user problems in the past.

Tip Number 3

Show your passion for user experience! During interviews, share specific examples of how you've improved systems based on user feedback. This will demonstrate that you’re not just about the tech, but also about making life easier for users.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining the Nebius team.

We think you need these skills to ace Senior Site Reliability Engineer (DevTools) in London

Site Reliability Engineering (SRE)
Software Engineering (SWE)
Java
Kotlin
Go
Python
Ruby

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that match the Senior Site Reliability Engineer role. Highlight your SRE and SWE experience, especially with Java/Kotlin, Go, Python, and Ruby. We want to see how you can contribute to our team!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for improving user experience and how you've tackled challenges in fast-paced environments. Let us know why you're excited about working with Nebius and our mission in the AI cloud space.

Showcase Your Problem-Solving Skills: In your application, give examples of how you've improved systems based on user feedback or built fault-tolerant architectures. We love seeing real-world applications of your skills, so don't hold back on the details!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss any important updates. Plus, it shows us you’re keen to join our team!

How to prepare for a job interview at Nebius

Know Your Tech Stack

Make sure you’re familiar with the technologies mentioned in the job description, like Java, Kotlin, Go, Python, and Ruby. Brush up on your knowledge of Unix-like systems and the JVM, as these will likely come up during technical discussions.

Showcase Your Problem-Solving Skills

Prepare to discuss specific examples where you've improved user experience or built fault-tolerant systems. Nebius values engineers who can adapt quickly, so be ready to share how you've tackled challenges in fast-paced environments.

Understand the User Perspective

Since the role focuses on improving services based on user feedback, think about how you’ve gathered and acted on user insights in previous roles. Be prepared to discuss metrics you’ve defined and how you verified improvements.

Practice Coding Interviews

As coding interviews are part of the process, practice common algorithms and data structures in the languages relevant to the role. Use platforms like LeetCode or HackerRank to sharpen your skills and get comfortable with coding under pressure.