Site Reliability Engineer in London

London · Full-Time · £36,000 - £60,000 / year (est.) · No working from home possible

At a Glance

Tasks: Maintain cloud infrastructure and manage incidents for seamless operations.
Company: WALT Labs, a leading managed service provider in cloud technology.
Benefits: 20 holiday days, private health insurance, and career growth opportunities.
Other info: On-site role in Kings Cross, London, with a collaborative work environment.
Why this job: Join a dynamic team and work with cutting-edge Google Cloud Platform technologies.
Qualifications: 8-10 years experience in cloud management and strong GCP knowledge required.

The predicted salary is between £36,000 and £60,000 per year.

Company Description

WALT Labs, a leading managed service provider, is dedicated to empowering businesses by harnessing the power of cloud technology. Our team specializes in delivering customized solutions tailored to meet the unique needs of our clients, driving growth and operational efficiency across industries. From supporting small businesses with seamless data migration to enabling large corporations to manage complex infrastructure projects, we provide exceptional service while staying at the forefront of cloud technology advancements.

Role Description

This is a full-time role based in Kings Cross, London, with a minimum of three days a week on site. We are seeking a skilled Site Reliability Engineer with a strong focus on Google Cloud Platform (GCP) to join our dynamic team. In this role, you’ll be responsible for maintaining cloud infrastructure, managing incidents, and ensuring seamless operations for our clients. You’ll use tools like incident.io and JIRA to manage and resolve support requests efficiently.

Responsibilities

Serve as L2 on-call escalation point for complex technical issues requiring advanced troubleshooting
Lead response to critical incidents, coordinating multiple teams and ensuring effective communication
Provide expert-level support for GCP services including advanced networking, security, and architecture
Perform advanced Google Workspace administration including domain management, security policies, and integration
Use incident.io to manage escalated incidents, major incidents, and coordinate war room activities
Optimize support workflows in JIRA, creating automation rules and improving ticket routing
Monitor and tune infrastructure performance using advanced Grafana queries and custom metrics
Lead technical projects including migrations, upgrades, and new service implementations
Create comprehensive documentation including architectural diagrams, runbooks, and best practices guides
Achieve minimum 50% billable hours through complex Cloud Assist/Managed Cloud customers and consulting engagements
Mentor Cloud Support Engineers and juniors through formal and informal training sessions
Identify and implement process improvements to increase efficiency and reduce resolution time
Conduct thorough root cause analysis for recurring issues and implement permanent fixes
Present technical solutions and recommendations to customer stakeholders and management
Design and implement monitoring strategies for complex multi-cloud environments
Develop automation scripts and tools to improve team efficiency and reduce manual work
Participate in pre-sales activities providing technical expertise for solution design
Review and approve changes to production environments following change management procedures
Lead knowledge sharing sessions and technical deep-dives for the team
Coordinate with vendor support for complex issues requiring manufacturer assistance
Maintain expertise in multiple GCP services and stay current with new feature releases
Participate in the business-hours escalation rotation

Qualifications

3-5 years' experience with Google Cloud Platform
Minimum 2 Google Cloud Professional certifications
Advanced Kubernetes knowledge and troubleshooting
Proficient in Infrastructure as Code (Terraform)
Strong scripting abilities (Python, Go, Bash)
Expert with monitoring tools (Grafana, Datadog)
Experience leading incident response
Excellent communication and mentoring skills
Proven track record of process improvement
Ability to manage multiple priorities effectively
Strong customer service orientation

Benefits

20 holiday days + bank holidays (earn 1.5 days every 3 years)
Private health insurance

Site Reliability Engineer in London employer: WALT Labs

WALT Labs is an exceptional employer that fosters a collaborative and innovative work culture in the heart of Kings Cross, London. With a strong commitment to employee growth, we offer extensive training opportunities and support for professional development, particularly in cloud technologies like Google Cloud Platform. Our competitive benefits package, including private health insurance and generous holiday allowances, ensures that our team members are well-supported both personally and professionally.

Contact Details:

WALT Labs Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Site Reliability Engineer role in London

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with current employees at WALT Labs. A friendly chat can sometimes lead to opportunities that aren’t even advertised!

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to Google Cloud Platform. This gives you a chance to demonstrate your expertise beyond just a CV.

✨Tip Number 3

Prepare for the interview by brushing up on common SRE scenarios and incident management processes. Practise explaining how you’ve tackled challenges in the past, especially using tools like incident.io and JIRA.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining the WALT Labs team!

We think you need these skills to ace the Site Reliability Engineer role in London

Google Cloud Platform (GCP)

Incident Management

JIRA

Observability Tools (Grafana)

Troubleshooting Skills

Cloud Security Best Practices

Performance Optimisation

Scripting (Python, Terraform)

Communication Skills

Customer Service Skills

Task Prioritisation

Technical Support

Documentation Skills

Collaboration Skills

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Site Reliability Engineer role. Highlight your experience with Google Cloud Platform and any relevant tools like incident.io and JIRA. We want to see how your skills match what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about cloud technology and how your background makes you a great fit for our team. We love seeing enthusiasm and personality in applications.

Showcase Your Problem-Solving Skills: In your application, don’t forget to mention specific examples of how you've tackled challenges in cloud environments. We’re all about strong troubleshooting skills, so let us know how you’ve made a difference in past roles!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re serious about joining our awesome team at WALT Labs!

How to prepare for a job interview at WALT Labs

✨Know Your GCP Inside Out

Make sure you brush up on your Google Cloud Platform knowledge. Be ready to discuss specific services you've used and how they relate to the role. Prepare examples of how you've managed cloud infrastructure or resolved incidents in the past.

✨Familiarise Yourself with Incident Management Tools

Since you'll be using tools like incident.io and JIRA, it’s a good idea to get comfortable with them before the interview. Think of scenarios where you’ve effectively used these tools to manage incidents or support requests, and be ready to share those experiences.

✨Show Off Your Troubleshooting Skills

Prepare to discuss your approach to troubleshooting in cloud environments. Have a couple of real-life examples ready that showcase your problem-solving skills, especially under pressure. This will demonstrate your ability to handle the challenges of the role.

✨Communicate Clearly and Confidently

Excellent communication is key, especially when dealing with clients and stakeholders. Practice explaining complex technical concepts in simple terms. This will not only help you during the interview but also show that you can provide great customer service.
