At a Glance
- Tasks: Lead the reliability of AI platforms and optimise infrastructure for high-performance computing.
- Company: Join IsoLabs, a pioneering tech company focused on advancing human health through AI.
- Benefits: Competitive salary, flexible work schedule, and opportunities for professional growth.
- Why this job: Make a real impact in AI by ensuring platform reliability and scalability.
- Qualifications: Experience with large-scale AI workloads and cloud computing, especially GCP.
- Other info: Collaborative environment with a commitment to diversity and inclusion.
The predicted salary is between £80,000 and £100,000 per year.
About Iso
Isomorphic Labs (IsoLabs) launched in 2021 to advance human health by building AI models that accelerate scientific discovery.
Your Impact
You will play a pivotal role in ensuring the reliability and scalability of the foundations making our AI work possible.
What You Will Do
- Own the end-to-end strategy for platform reliability, focusing on accelerator (GPU/TPU) infrastructure and workload orchestration.
- Lead reliability work for our global job scheduler, designing and implementing a robust “test harness” to validate infrastructure upgrades.
- Architect and optimize next-generation inference services to address scaling limits and maintain high-throughput performance.
- Overhaul logging and monitoring systems to provide proactive alerting and telemetry that identifies failures before they impact research.
- Improve internal CI/CD stability, reducing failure rates and speeding feedback loops for the engineering organization.
- Contribute to core technical decisions on tooling and architecture while partnering with science, product, and operations teams.
Skills and Qualifications
- Proven experience architecting and managing large-scale AI/ML workloads in production.
- Expertise in cloud compute design, specifically within Google Cloud Platform (GCP).
- Significant experience deploying and managing complex workloads within Kubernetes (GKE).
- Professional familiarity with NVIDIA GPU generations and high-performance compute.
- Strong programming skills and a “reliability-first” approach to software development.
Nice to Have
- Career spanning both ML software engineering and infrastructure SRE roles.
- Experience leading multidisciplinary projects and navigating complex stakeholder requirements.
- Familiarity with workload scheduling, ML efficiency research, and hardware benchmarking.
- Experience with Google TPU generations and specialized ML-driven R&D cycles.
We require you to be able to come into the office three days a week (currently Tuesday, Wednesday, and one other day depending on your team).
We are committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy or related condition (including breastfeeding) or any other basis protected by applicable law. If you have a disability or additional need that requires accommodation, please let us know. When you submit an application, your data will be processed in line with our privacy policy.
Senior Software Engineer, ML Platform (Stability & Infrastructure)
Employer: Isomorphic Labs
Contact Detail:
Isomorphic Labs Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Senior Software Engineer, ML Platform (Stability & Infrastructure) role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with people on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to AI/ML. This gives potential employers a taste of what you can do and sets you apart from the crowd.
✨Tip Number 3
Prepare for interviews by practising common technical questions and scenarios relevant to the role. We recommend doing mock interviews with friends or using online platforms to get comfortable with the process.
✨Tip Number 4
Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team at IsoLabs.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Senior Software Engineer role. Highlight your expertise in AI/ML workloads, cloud compute design, and any relevant projects you've led.
Craft a Compelling Cover Letter: Use your cover letter to tell us why you're passionate about reliability and scalability in AI platforms. Share specific examples of how you've tackled similar challenges in the past to show us what you can bring to the team.
Showcase Your Technical Skills: Don’t shy away from detailing your programming skills and experience with tools like Kubernetes and GCP. We want to see how your technical background makes you a great fit for our infrastructure needs.
Apply Through Our Website: We encourage you to submit your application through our website. It’s the best way for us to receive your details and ensures you’re considered for the role. Plus, it’s super easy!
How to prepare for a job interview at Isomorphic Labs
✨Know Your Tech Inside Out
Make sure you’re well-versed in the technologies mentioned in the job description, especially around AI/ML workloads and cloud computing. Brush up on your knowledge of Google Cloud Platform and Kubernetes, as these will likely come up during technical discussions.
✨Showcase Your Problem-Solving Skills
Prepare to discuss specific challenges you've faced in previous roles, particularly around reliability and scalability. Use the STAR method (Situation, Task, Action, Result) to structure your answers and highlight how you tackled complex issues.
✨Understand Their Mission
IsoLabs is all about advancing human health through AI. Familiarise yourself with their projects and values. Being able to articulate how your experience aligns with their mission will show that you’re genuinely interested and invested in the role.
✨Ask Insightful Questions
Prepare thoughtful questions that demonstrate your interest in the role and the company. Inquire about their current challenges with platform reliability or how they envision the future of their ML infrastructure. This not only shows your enthusiasm but also helps you gauge if the company is the right fit for you.