Service Reliability Eng – Kings Cross, London

Full-Time | £36,000 - £60,000 / year (est.) | No home office possible

At a Glance

  • Tasks: Ensure the reliability and performance of critical systems that connect artists and fans.
  • Company: Join Universal Music, the world's leading music company with a passion for innovation.
  • Benefits: Inclusive culture, career growth opportunities, and a chance to work in the music industry.
  • Why this job: Make a real impact on global music services while working with cutting-edge technology.
  • Qualifications: Experience in systems administration and proficiency in programming languages like Python or Java.
  • Other info: Dynamic team environment with a commitment to diversity and inclusion.

The predicted salary is between £36,000 and £60,000 per year.

Music is Universal. It’s the passionate and dedicated team at Universal Music who help make us the world’s leading music company. From A&R to finance, legal to digital, sales to marketing, Universal Music is the place to grow and develop your career within a truly commercial and innovative business that leads in everything it does.

Everyone is welcome to apply for our roles, and we are determined to ensure that no applicant or employee receives less favourable treatment because of gender, race, disability, sexual orientation, religion, belief, age, marital status, background, pregnancy, or caring responsibilities. We also recognise the importance of diversity of thought within our teams and are fully committed to embracing the talents of people with autism, dyslexia, ADHD, and other forms of neurocognitive variation. We will always seek to make appropriate adjustments to recruitment, workplaces, and work processes to be fully inclusive to people with different needs and working styles. If you need us to make any reasonable adjustments for you from application onwards, including alternatives to the online form or to disclose a neurocognitive condition, please email UniversalMusicCareers@umusic.com.

Job Summary: We are UMG, the Universal Music Group. We are the world’s leading music company. In everything we do, we are committed to artistry, innovation and entrepreneurship. We own and operate a broad array of businesses engaged in recorded music, music publishing, merchandising, and audiovisual content in more than 60 countries. We identify and develop recording artists and songwriters, and we produce, distribute and promote the most critically acclaimed and commercially successful music to delight and entertain fans around the world.

As a key member of our Global Technical Operations team, you will be responsible for the reliability, scalability, and performance of the critical systems that power a global enterprise. By blending a software engineering mindset with operational expertise, you will engineer solutions that improve system reliability, automate complex processes, and reduce manual toil. You will be an essential partner to our development, infrastructure, and security teams, driving a culture of resilience and continuous improvement across the organization. As a Site Reliability Engineer, you won't just be supporting systems; you'll be ensuring the services that connect artists and fans around the globe are always on.

Job Functions:

Key Responsibilities:
  • System Reliability & Performance: Design, build, and maintain the availability, scalability, and performance of critical services. Develop and maintain robust monitoring, alerting, and observability systems (e.g., using AWS CloudWatch, Dynatrace) to ensure rapid issue detection and resolution. Monitor infrastructure capacity and performance, providing analysis and suggestions for service delivery improvement.
  • Automation & Efficiency: Drive the automation of repetitive operational tasks, including infrastructure provisioning, deployments, and scaling. Create and maintain scripts and custom code to support and enhance our operational toolset. Support and optimize CI/CD pipelines to improve deployment speed and reliability.
  • Incident Management & Collaboration: Participate in an on-call rotation to troubleshoot and mitigate production incidents. Lead post-incident reviews and root cause analyses to implement lasting solutions. Partner with engineering and IT stakeholders to embed SRE best practices (SLOs, error budgets) into the design and development lifecycle.

Job Requirements:

Required Experience & Skills:
  • A strong background in systems administration (Linux/Windows) in a large-scale environment.
  • Proficiency in at least one programming language (e.g., Python, Go, Java).
  • Hands-on experience with a major cloud platform (AWS, GCP, or Azure), with a high preference for AWS.
  • Solid understanding of networking, containers (Docker, Kubernetes), and Infrastructure as Code (e.g., Terraform, Ansible).
  • Experience with modern monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Splunk, Dynatrace).
  • Proven analytical and problem-solving abilities with experience in a high-pressure environment.
  • Excellent communication skills and the ability to foster a collaborative team environment.

Preferred Experience & Skills:
  • Bachelor's degree in an IT-related field.
  • Experience managing large-scale, distributed systems for a global organization.
  • Familiarity with IT governance standards like ITIL.
  • Direct experience with ServiceNow for IT service management.
  • Knowledge of chaos engineering, resilience testing, and advanced capacity planning.

Just So You Know… The company presents this job description as a guide to the major areas and duties for which the jobholder is accountable. However, the business operates in an environment that demands change and the jobholder's specific responsibilities and activities will vary and develop. Therefore, the job description should be seen as indicative and not as a permanent, definitive, and exhaustive statement.

Employer: Universal Music Group

At Universal Music Group, we pride ourselves on being an inclusive and innovative employer, offering a vibrant work culture that fosters creativity and collaboration. Located in the heart of Kings Cross, London, our team enjoys access to diverse career growth opportunities within the dynamic music industry, alongside comprehensive benefits that support both personal and professional development. Join us to be part of a passionate community dedicated to connecting artists and fans worldwide while embracing diversity and neurocognitive variation.

Contact Details:

Universal Music Group Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Service Reliability Eng – Kings Cross, London

Tip Number 1

Network like a pro! Reach out to people in the industry, especially those at Universal Music. A friendly chat can open doors that applications alone can't.

Tip Number 2

Prepare for interviews by researching the company culture and values. Show how your skills align with their mission of artistry and innovation. We want to see your passion!

Tip Number 3

Practice your technical skills! Brush up on your programming languages and cloud platforms. Being able to demonstrate your expertise in real-time can really set you apart.

Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you're serious about joining the Universal Music family.

We think you need these skills to ace Service Reliability Eng – Kings Cross, London

Systems Administration (Linux/Windows)
Programming (Python, Go, Java)
Cloud Platform Experience (AWS, GCP, Azure)
Networking Knowledge
Containerisation (Docker, Kubernetes)
Infrastructure as Code (Terraform, Ansible)
Monitoring and Observability Tools (Prometheus, Grafana, Datadog, Splunk, Dynatrace)
Analytical Skills
Problem-Solving Skills
Communication Skills
Collaboration Skills
Incident Management
CI/CD Pipeline Support
ServiceNow Experience
IT Governance Standards (ITIL)

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the Service Reliability Engineer role. Highlight your relevant experience with systems administration, programming languages, and cloud platforms like AWS. We want to see how your skills align with what we do!

Showcase Your Problem-Solving Skills: In your application, don’t just list your technical skills; share examples of how you've tackled challenges in high-pressure environments. We love seeing candidates who can think on their feet and come up with innovative solutions!

Be Yourself: We value diversity and individuality at Universal Music. Don’t hesitate to let your personality shine through in your application. Share your passion for music and technology, and how you can contribute to our vibrant team culture.

Apply Through Our Website: For the best chance of success, make sure to apply directly through our website. This way, your application will be seen by the right people, and you’ll be one step closer to joining our amazing team at Universal Music!

How to prepare for a job interview at Universal Music Group

Know Your Tech Inside Out

Make sure you brush up on your systems administration skills, especially with Linux and Windows. Be ready to discuss your experience with cloud platforms like AWS, and don’t forget to highlight any hands-on work you've done with monitoring tools like Dynatrace or Prometheus.

Showcase Your Problem-Solving Skills

Prepare to share specific examples of how you've tackled complex issues in high-pressure environments. Think about incidents you've managed and how you led post-incident reviews. This will demonstrate your analytical abilities and your knack for continuous improvement.

Emphasise Collaboration

Since this role involves working closely with engineering and IT teams, be ready to talk about your experience in fostering a collaborative environment. Share instances where you’ve successfully partnered with others to implement SRE best practices or improve service delivery.

Be Ready for Technical Questions

Expect technical questions that test your knowledge of programming languages like Python or Go, as well as your understanding of containers and Infrastructure as Code. Practise explaining these concepts clearly, as communication is key in this role.
