Founding Cloud SRE - AI/ML Platform & GPU Compute in London

London | Full-Time | 60000 - 80000 € / year (est.) | No home office possible

At a Glance

  • Tasks: Shape the reliability of large-scale AI systems and GPU compute infrastructure.
  • Company: Join Icehouseventures, a pioneering company in AI and cloud technology.
  • Benefits: Enjoy a hybrid work model, competitive salary, and opportunities for growth.
  • Other info: Be part of a founding team with exciting challenges and career advancement.
  • Why this job: Make a real impact on cutting-edge AI platforms and cloud infrastructure.
  • Qualifications: Experience in site reliability engineering and a passion for AI/ML technologies.

The predicted salary is between 60000 - 80000 € per year.

Icehouseventures is seeking a Staff Cloud Site Reliability Engineer to shape the reliability of large-scale AI systems and GPU compute infrastructure. This founding role involves building and scaling reliability foundations for the AI cloud platform and ensuring cloud infrastructure resilience.

Responsibilities include:

  • Operationalizing SLOs
  • Improving incident response
  • Creating automation for operations

The position offers a hybrid work model, encouraging collaboration in the London office while allowing remote work.

Founding Cloud SRE - AI/ML Platform & GPU Compute in London employer: Icehouseventures

Icehouseventures is an exceptional employer, offering a dynamic work environment that fosters innovation and collaboration in the heart of London. With a strong focus on employee growth, we provide opportunities for professional development and the chance to work on cutting-edge AI and GPU technologies. Our hybrid work model promotes a healthy work-life balance, making it an ideal place for those seeking meaningful and rewarding employment.

Contact Detail:

Icehouseventures Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Founding Cloud SRE - AI/ML Platform & GPU Compute in London

✨ Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.

✨ Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects related to cloud infrastructure and AI systems. This gives you a chance to demonstrate your expertise and passion beyond just a CV.

✨ Tip Number 3

Prepare for those interviews! Research common SRE interview questions and practise your responses. Be ready to discuss your experience with operationalising SLOs and improving incident response – it's all about showing how you can add value.

✨ Tip Number 4

Don't forget to apply through our website! We love seeing applications directly from candidates who are excited about joining us. It shows initiative and helps us get to know you better right from the start.

We think you need these skills to ace Founding Cloud SRE - AI/ML Platform & GPU Compute in London

Site Reliability Engineering
Cloud Infrastructure Management
AI Systems Reliability
GPU Compute Infrastructure
Operationalising SLOs
Incident Response Improvement
Automation for Operations

Some tips for your application 🫡

Tailor Your CV: Make sure your CV speaks directly to the role of Founding Cloud SRE. Highlight your experience with AI systems and GPU compute infrastructure, and don't forget to mention any relevant SLOs you've operationalised!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for cloud reliability and how you can contribute to building a resilient AI cloud platform. Be genuine and let your personality come through.

Showcase Your Problem-Solving Skills: In your application, give examples of how you've improved incident response or created automation in past roles. We love seeing how you tackle challenges head-on!

Apply Through Our Website: We encourage you to apply directly through our website. It's the best way for us to see your application and get to know you better. Plus, it shows you're keen on joining our team!

How to prepare for a job interview at Icehouseventures

✨ Know Your Cloud Basics

Make sure you brush up on your cloud computing fundamentals. Understand the key concepts around AI systems and GPU compute infrastructure, as this role is all about shaping reliability in these areas. Being able to discuss how you would operationalise SLOs will definitely impress.

✨ Showcase Your Incident Response Skills

Prepare examples from your past experiences where you've improved incident response times or handled critical outages. This role requires a proactive approach, so be ready to share specific strategies you've implemented to enhance reliability.

✨ Emphasise Automation Experience

Automation is key in this position, so highlight any tools or scripts you've developed to streamline operations. Discuss how these automations have positively impacted your previous teams, and be prepared to talk about your approach to creating efficient workflows.

✨ Be Ready for Collaboration

Since this role offers a hybrid work model, demonstrate your ability to collaborate effectively both in-person and remotely. Share examples of how you've successfully worked with teams in different settings, and express your enthusiasm for contributing to a collaborative culture.