Senior Site Reliability Engineer in London

London | Full-Time | £60,000 - £80,000 / year (est.) | No home office possible
ClickUp

At a Glance

  • Tasks: Enhance the reliability and performance of our innovative cloud-based infrastructure.
  • Company: Join ClickUp, a leader in redefining productivity with cutting-edge AI technology.
  • Benefits: Competitive salary, inclusive culture, and opportunities for professional growth.
  • Other info: Dynamic environment with a focus on innovation and career development.
  • Why this job: Be part of a bold team shaping the future of work and tackling complex challenges.
  • Qualifications: Strong software engineering skills and experience with cloud infrastructure.

The predicted salary is between £60,000 and £80,000 per year.

At ClickUp, we’re not just building software. We’re architecting the future of work! In a world overwhelmed by work sprawl, we saw a better way. That’s why we created the first truly converged AI workspace, unifying tasks, docs, chat, calendar, and enterprise search, all supercharged by context-driven AI, empowering millions of teams to break free from silos, reclaim their time, and unlock new levels of productivity. Join us and be part of a bold, innovative team that’s redefining what’s possible!

We are looking for driven and innovative software engineers with strong site reliability engineering (SRE) discipline, or an interest in this area, to help us make ClickUp the “one app to rule them all.” As an SRE at ClickUp, your primary role will be to improve the stability, availability and reliability of the globally distributed, cloud-based infrastructure that powers our app for thousands of users daily. If you are a rockstar engineer with an entrepreneurial, fast-paced mindset who is ready to own, drive and tackle some of the most complex problems out there, we would love to hear from you!

What you’ll do:

  • Build a deep understanding of how ClickUp’s systems behave, scale, interact and fail, and use that insight to identify risks and opportunities for remediation.
  • Own, drive and improve the incident management process across the engineering org and participate in the team’s follow‑the‑sun model.
  • Define SLOs and SLIs for all of our services and introduce error budgeting.
  • Own and improve our observability on all of our services.
  • Build software solutions to enable reliability and operability of large‑scale distributed systems handling petabytes of data.
  • Build tools and automation to eliminate toil and reduce operational overhead.
  • Create frameworks, processes and best practices to be used across ClickUp Engineering.
  • Automate critical portions of ClickUp engineering processes to minimize risk and maximize the speed of innovation.
  • Manage capacity and performance to help scale our infrastructure on both public and private clouds around the world.

What we’re looking for:

  • Software engineering: Strong software engineers with an operational, infrastructural or SRE mentality who can design and build systems for platform and infrastructure layers.
  • Cloud experience: Production experience in a major cloud environment, including CI/CD deployments, managed services, bootstrapping and provisioning services via infrastructure‑as‑code (IaC) systems, automation and operations.
  • Infrastructure management: Experience managing production‑grade infrastructure with IaC tools or configuration management tools.
  • Operating systems: Strong knowledge of *nix-based operating systems, their internals and advanced troubleshooting commands.
  • Compute: Experience working with VMs, containers and container orchestration systems.
  • Database: Experience with RDBMS and NoSQL storage solutions in production, and familiarity with running and inspecting queries.
  • Observability: Experience with logging, monitoring and alerting tools, setting up monitors and alerts for production services, and understanding concepts such as SLOs and SLIs.

Bonus points:

  • CloudFormation/CDK, ECS, Elastic Beanstalk
  • PostgreSQL, DynamoDB, Aurora
  • TypeScript or any JavaScript-based framework

Unsure if you meet all the qualifications of this job description but are deeply excited about the role? We hire based on ambition, grit, and a passion for improving the way people work. If you think ClickUp is the company for you, we encourage you to apply!

At ClickUp, we assess every candidate based on the potential impact they can have. We hire the best people for the job and support each person’s journey to build their boldest career. ClickUp is an Equal Opportunity Employer, and qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.

Senior Site Reliability Engineer in London employer: ClickUp

At ClickUp, we pride ourselves on fostering a dynamic and innovative work culture that empowers our employees to thrive. As a Senior Site Reliability Engineer, you'll be part of a forward-thinking team dedicated to redefining productivity through cutting-edge technology, with ample opportunities for professional growth and development. Our commitment to diversity and inclusion ensures that every voice is heard, making ClickUp an exceptional place to build a meaningful career.

Contact Detail:

ClickUp Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Site Reliability Engineer role in London

✨Tip Number 1

Get to know ClickUp inside out! Familiarise yourself with our products and how they work. This will not only help you in interviews but also show your genuine interest in the role.

✨Tip Number 2

Network like a pro! Connect with current employees on LinkedIn or attend industry events. A friendly chat can sometimes lead to insider tips or even a referral!

✨Tip Number 3

Prepare for technical challenges! Brush up on your SRE skills and be ready to tackle real-world problems during interviews. Practice makes perfect, so don’t skip this step!

✨Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re serious about joining our innovative team!

We think you need these skills to ace the Senior Site Reliability Engineer role in London

Site Reliability Engineering (SRE)
Cloud Experience
CI/CD Deployments
Infrastructure as Code (IaC)
Production-Grade Infrastructure Management
Linux-based Operating Systems
VMs and Containers
Container Orchestration Systems
RDBMS and NoSQL Databases
Observability Tools
Logging, Monitoring, and Alerting
SLOs and SLIs
Automation
Problem-Solving Skills
Capacity and Performance Management

Some tips for your application 🫡

Show Your Passion: When writing your application, let your enthusiasm for the role shine through! We want to see that you’re not just ticking boxes but genuinely excited about the opportunity to help redefine the future of work with us.

Tailor Your Experience: Make sure to highlight your relevant experience in site reliability engineering and cloud environments. We love seeing how your skills align with our needs, so don’t be shy about showcasing your achievements!

Be Clear and Concise: Keep your application straightforward and to the point. We appreciate clarity, so avoid jargon and focus on what makes you a great fit for the role. Remember, less is often more!

Apply Through Our Website: We encourage you to apply directly through our careers portal. It’s the best way to ensure your application gets into the right hands and shows us you’re serious about joining our innovative team!

How to prepare for a job interview at ClickUp

✨Know Your Stuff

Before the interview, dive deep into ClickUp’s systems and how they operate. Familiarise yourself with their cloud infrastructure, SLOs, and SLIs. Being able to discuss specific examples of how you’ve improved system reliability or managed incidents will show you’re the right fit.

✨Showcase Your Problem-Solving Skills

Prepare to discuss complex problems you've tackled in previous roles. Think about situations where you had to improve stability or reduce operational overhead. Use the STAR method (Situation, Task, Action, Result) to structure your answers clearly.

✨Demonstrate Your Cloud Experience

Make sure you can talk confidently about your experience with major cloud environments. Be ready to discuss CI/CD deployments, IaC tools, and any automation processes you've implemented. This is crucial for a role focused on large-scale distributed systems.

✨Ask Insightful Questions

At the end of the interview, don’t shy away from asking questions. Inquire about ClickUp’s approach to incident management or how they measure success in their SRE team. This shows your genuine interest in the role and helps you assess if it’s the right fit for you.
