Site Reliability Engineer in London

London | Full-Time | £28,800 - £48,000 / year (est.) | No home office possible

At a Glance

  • Tasks: Manage and improve our platform while solving reliability issues and automating tasks.
  • Company: Join Tyk, a leading API Management platform with a global impact.
  • Benefits: Unlimited paid holidays, flexible hours, employee share scheme, and generous parental leave.
  • Why this job: Be part of a mission to connect every system in the world with cutting-edge technology.
  • Qualifications: Experience with Kubernetes, AWS, Linux, and strong collaboration skills required.
  • Other info: Dynamic remote-first culture with opportunities for personal growth and continuous improvement.

The predicted salary is between £28,800 and £48,000 per year.

Who are Tyk, and what do we do? The Tyk API Management platform is helping to drive the connected world and power new products and services. We’re changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across various industries.

Founded in 2015, with offices in London (UK), London (Ontario), Atlanta and Singapore, we have many thousands of users of our B2B platform across the globe. Our mission is to connect every system in the world by building an API Management platform.

Total flexibility, default remote, radical responsibility: We offer unlimited paid holidays and remote working from anywhere in the world for everyone. Tyk was founded on the principle of offering flexibility and autonomy to our employees, allowing them to achieve their best results.

The role: We’re looking for a Site Reliability Engineer to manage, maintain, improve and support our platform. You will be curious by nature and always looking for ways to improve, as we will look to you for new ideas, solutions and metrics on how we can improve the platform. You will also be the first line of incident management for our clients and will help define our response going forward.

Here’s what you’ll be responsible for:

  • Maintaining the global Tyk Cloud within the SLAs, SLIs and SLOs you will help to define
  • Identifying reliability issues and working together with your squad to solve them
  • Identifying and introducing new metrics and building relevant dashboards
  • Participating in the on-call rotation
  • Working with your squad to expand multi-region and multi-cloud reach of the platform
  • Documenting operational knowledge
  • Conducting post-incident analysis
  • Automating common tasks
  • Being a key shaper of, and contributor to, our continuous improvement agenda
  • Ensuring the reliability of our new global Tyk Cloud platform
  • Automating operations and support
  • Writing and maintaining documentation on SRE processes and policies
  • Recommending and implementing ways of driving operational efficiency
  • Assisting in penetration testing for Cloud through liaising with our provider
  • Incident management

Here’s what we’re looking for:

  • Strong collaboration skills
  • Launching and operating production scale Kubernetes clusters
  • Designing and operating infrastructure on AWS and other providers
  • Operating MongoDB (or other document database) clusters
  • Operating Redis (or other key-value storage) clusters
  • Administering Linux servers
  • Maintaining distributed software
  • Operating Prometheus and Grafana
  • Operating logging collection and analysis systems
  • Participating in the on-call rotation (16:00 – 04:00 UTC)

Skills:

  • Kubernetes & containers (advanced)
  • AWS / EKS (advanced)
  • Linux (advanced)
  • Terraform and IaC in general (proficient)
  • Helm (proficient)
  • Go and/or Python (familiar)
  • MongoDB (or similar)
  • Redis (or similar)
  • Monitoring – Prometheus, Grafana, Thanos (familiar)
  • Grasp of networking concepts (subnets, routing, peering, load balancing, NAT, etc.)
  • Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP)
  • Proactive, energetic, innovative and change oriented

Nice to have:

  • GCP or Azure
  • Bare metal infrastructure engineering
  • API management experience
  • Large scale distributed storage management
  • Familiarity with Rancher
  • CKA/CKAD/CKS
  • Creating and delivering production software in Go

Here’s why you should join us:

  • Everyone has unlimited paid holiday.
  • We have total flexibility in hours, as we believe creativity flows better when our people are given freedom to decide when they are most productive.
  • Employee share scheme
  • Generous maternity and paternity leave
  • Company retreats
  • We value authenticity, respect, responsibility, independence, honesty, diversity and inclusion.

Tyk is an equal opportunities employer and we are determined to ensure that no applicant or employee receives less favourable treatment on the grounds of gender, age, disability, religion, belief, sexual orientation, marital status, or race.

Site Reliability Engineer in London employer: Tyk Technologies

At Tyk, we pride ourselves on being an exceptional employer that champions flexibility and autonomy, allowing our Site Reliability Engineers to thrive in a remote-first environment. With unlimited paid holidays, generous parental leave, and a culture that embraces creativity and continuous improvement, we empower our employees to shape the future of our API Management platform while enjoying a supportive and inclusive work atmosphere. Join us in our mission to connect every system in the world and be part of a diverse team that values authenticity and innovation.

Contact Detail:

Tyk Technologies Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role in London

✨Tip Number 1

Network like a pro! Reach out to current or former employees at Tyk on LinkedIn. A friendly chat can give you insider info and maybe even a referral, which can really boost your chances.

✨Tip Number 2

Prepare for the interview by diving deep into Tyk's products and services. Show us that you understand how our API Management platform works and how it impacts various industries. This will impress us and show your genuine interest!

✨Tip Number 3

Practice your problem-solving skills! As a Site Reliability Engineer, you'll face real-time challenges. Brush up on your technical skills and be ready to tackle hypothetical scenarios during the interview.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re serious about joining our team at Tyk!

We think you need these skills to ace the Site Reliability Engineer role in London

Kubernetes
AWS / EKS
Linux Administration
Terraform
Infrastructure as Code (IaC)
Helm
Go Programming
Python Programming
MongoDB
Redis
Prometheus
Grafana
Networking Concepts
Incident Management
Automation

Some tips for your application 🫡

Show Your Passion: When writing your application, let your enthusiasm for the role shine through! We want to see how excited you are about becoming a Site Reliability Engineer at Tyk and how you can contribute to our mission of connecting every system in the world.

Tailor Your CV: Make sure to customise your CV to highlight relevant experience and skills that match the job description. We love seeing how your background aligns with what we’re looking for, so don’t hold back on showcasing your expertise in Kubernetes, AWS, and all things SRE!

Be Clear and Concise: Keep your application straightforward and to the point. We appreciate clarity, so make sure your writing is easy to read and free from jargon. This will help us understand your qualifications and how you can fit into our team.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen on joining our awesome team at Tyk!

How to prepare for a job interview at Tyk Technologies

✨Know Your Tech Inside Out

As a Site Reliability Engineer, you'll need to be well-versed in Kubernetes, AWS, and Linux. Brush up on your knowledge of these technologies and be ready to discuss your experience with them. Prepare specific examples of how you've used these tools to solve problems or improve systems.

✨Show Your Problem-Solving Skills

Tyk values innovation and continuous improvement. Be prepared to share instances where you've identified reliability issues and implemented solutions. Think about metrics you've introduced or dashboards you've built that enhanced operational efficiency.

✨Emphasise Collaboration

Since this role involves working closely with a squad, highlight your collaboration skills. Share examples of successful teamwork, especially in incident management or when expanding multi-region capabilities. Show that you can communicate effectively with both technical and non-technical team members.

✨Be Ready for Scenario Questions

Expect scenario-based questions that test your incident management skills. Prepare to discuss how you would handle specific incidents, including your approach to post-incident analysis and documentation. This will demonstrate your proactive mindset and commitment to continuous improvement.
