Site Reliability Engineer

Full-Time, £36,000 - £60,000 / year (est.), No home office possible

At a Glance

  • Tasks: Manage and improve our platform while solving reliability issues and automating tasks.
  • Company: Join Tyk, a leading API Management platform with a global impact.
  • Benefits: Unlimited paid holidays, flexible hours, employee share scheme, and generous parental leave.
  • Why this job: Be part of a dynamic team shaping the future of technology and innovation.
  • Qualifications: Experience with Kubernetes, AWS, and strong collaboration skills required.
  • Other info: Embrace a culture of continuous improvement and creativity in a remote-first environment.

The predicted salary is between £36,000 and £60,000 per year.

Who are Tyk, and what do we do? The Tyk API Management platform is helping to drive the connected world and power new products and services. We’re changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across various industries including retail, finance, telecoms, healthcare, and media. Founded in 2015 with offices in London – UK, London – Ontario, Atlanta and Singapore, we have many thousands of users of our B2B platform across the globe.

Our Mission: Tyk is on a mission to connect every system in the world. We’ve started by building an API Management platform. We offer unlimited paid holidays and remote working from anywhere in the world for everyone. Tyk was founded on the principle of offering flexibility and autonomy to our employees, allowing them to achieve their best results.

The role: We’re looking for a Site Reliability Engineer to manage, maintain, improve and support our platform. You will be curious by nature and always looking for ways to improve, as we will look to you for new ideas, solutions and metrics on how we can improve the platform. You will also be the first line of incident management for our clients and will help define our response going forward.

Here’s what you’ll be responsible for:

  • Maintaining global Tyk Cloud within SL(A/I/O)s you will help to define
  • Identifying reliability issues and working together with your squad to solve them
  • Identifying and introducing new metrics and building relevant dashboards
  • Participating in the on-call rotation
  • Working with your squad to expand multi-region and multi-cloud reach of the platform
  • Documenting operational knowledge
  • Conducting post-incident analysis
  • Automating common tasks
  • Shaping and contributing to our continuous improvement agenda
  • Reliability of our new global Tyk Cloud platform
  • Automation of operations and support
  • Writing and maintaining documentation on SRE processes and policies
  • Recommending and implementing ways of driving operational efficiency
  • Assisting in penetration testing for Cloud through liaising with our provider
  • Incident management

Here’s what we’re looking for:

  • Strong collaboration skills
  • Launching and operating production scale Kubernetes clusters
  • Designing and operating infrastructure on AWS and other providers
  • Operating MongoDB (or other document database) clusters
  • Operating Redis (or other key-value storage) clusters
  • Administering Linux servers
  • Maintaining distributed software
  • Operating Prometheus and Grafana
  • Operating logging collection and analysis systems
  • Participating in the on-call rotation (16:00 – 04:00 UTC)

Skills:

  • Kubernetes & containers (advanced)
  • AWS / EKS (advanced)
  • Linux (advanced)
  • Terraform and IaC in general (proficient)
  • Helm (proficient)
  • Go and/or Python (familiar)
  • MongoDB (or similar)
  • Redis (or similar)
  • Monitoring – Prometheus, Grafana, Thanos (familiar)
  • Grasp of networking concepts (subnets, routing, peering, load balancing, NAT, etc.)
  • Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP)
  • Proactive, energetic, innovative and change-oriented

Nice to have:

  • GCP or Azure
  • Bare metal infrastructure engineering
  • API management experience
  • Large scale distributed storage management
  • Familiarity with Rancher
  • CKA/CKAD/CKS
  • Creating and delivering production software in Go

Here’s why you should join us:

  • Everyone has unlimited paid holiday.
  • Total flexibility in hours.
  • Employee share scheme.
  • Generous maternity and paternity leave.
  • Company retreats.

We all share the same vision: we value authenticity, respect, responsibility, independence, honesty, diversity and inclusion and, most importantly, treating others how you wish to be treated. We look for like-minded people who bring their personalities to work every day, strive to achieve their personal goals and are willing to challenge the way we do things.

Tyk is an equal opportunities employer and we are determined to ensure that no applicant or employee receives less favourable treatment on the grounds of gender, age, disability, religion, belief, sexual orientation, marital status, or race.

Site Reliability Engineer employer: Tyk Technologies

At Tyk, we pride ourselves on being an exceptional employer, offering unlimited paid holidays and the flexibility to work remotely from anywhere in the world. Our inclusive work culture fosters creativity and innovation, empowering employees to take ownership of their roles while providing ample opportunities for personal and professional growth. Join us in our mission to connect every system in the world, and be part of a diverse team that values authenticity, respect, and continuous improvement.

Contact Detail:

Tyk Technologies Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role

✨Tip Number 1

Network like a pro! Reach out to current or former employees at Tyk on LinkedIn. A friendly chat can give you insider info and maybe even a referral, which can really boost your chances.

✨Tip Number 2

Prepare for the interview by brushing up on your SRE skills. Be ready to discuss your experience with Kubernetes, AWS, and incident management. Show us how you’ve tackled challenges in the past!

✨Tip Number 3

Don’t just wait for the job to come to you! Apply through our website and keep an eye on new openings. The sooner you apply, the better your chances of landing that dream role with us.

✨Tip Number 4

Show your passion for continuous improvement! Share examples of how you've automated tasks or improved processes in previous roles. We love candidates who are proactive and innovative!

We think you need these skills to ace the Site Reliability Engineer role

Kubernetes
AWS
EKS
Linux Administration
Terraform
Infrastructure as Code (IaC)
Helm
Go
Python
MongoDB
Redis
Prometheus
Grafana
Networking Concepts
Incident Management

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the Site Reliability Engineer role. Highlight your experience with Kubernetes, AWS, and any relevant projects that showcase your problem-solving skills. We want to see how you can contribute to our mission!

Show Your Curiosity: In your application, let us know about your curiosity and eagerness to improve systems. Share examples of how you've identified issues and implemented solutions in past roles. We love candidates who are proactive and innovative!

Be Authentic: Don’t be afraid to show your personality! At Tyk, we value authenticity and want to get to know the real you. Share your passions and what drives you in the tech world. This helps us see if you’re a good fit for our culture.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re serious about joining our team at Tyk!

How to prepare for a job interview at Tyk Technologies

✨Know Your Tech Stack

Make sure you’re well-versed in the technologies mentioned in the job description, especially Kubernetes, AWS, and Linux. Brush up on your knowledge of MongoDB and Redis too, as these are crucial for the role. Being able to discuss your experience with these tools will show that you're ready to hit the ground running.

✨Show Your Problem-Solving Skills

Prepare to discuss specific incidents where you've identified reliability issues and how you resolved them. Tyk values continuous improvement, so think of examples where you’ve automated tasks or improved processes. This will demonstrate your proactive approach and innovative mindset.

✨Understand Tyk's Mission

Familiarise yourself with Tyk’s mission to connect every system in the world. Be ready to share your thoughts on how API management plays a role in this vision. Showing that you align with their goals will help you stand out as a candidate who truly understands the company’s purpose.

✨Ask Insightful Questions

Prepare thoughtful questions about Tyk’s platform, team dynamics, and future projects. This not only shows your interest in the role but also your eagerness to contribute to their continuous improvement agenda. Asking about their approach to incident management or automation can spark a great conversation.
