ML Engineer - Infrastructure

London, Full-Time, £43,200 - £72,000 / year (est.), no home office possible

At a Glance

  • Tasks: Design and maintain ML cloud infrastructure, focusing on automation and performance.
  • Company: Convergence is revolutionising AI integration in daily life with innovative technology.
  • Benefits: Enjoy a competitive salary, professional growth opportunities, and a collaborative work environment.
  • Why this job: Join us to tackle exciting challenges and shape the future of human-AI collaboration.
  • Qualifications: 3+ years in ML infrastructure, strong Python skills, and experience with GCP and Slurm.
  • Other info: Be part of a well-funded startup making a significant impact in the AI space.

The predicted salary is between £43,200 and £72,000 per year.

At Convergence, we're transforming the way AI integrates into our daily lives. Our team is developing the next generation of AI agents that don't just process information but take actions, learn from experience, and collaborate with humans. By introducing Large Meta Learning Models (LMLMs) that integrate memory as a core component, we're enabling AI to improve continuously through user feedback and acquire new skills during real-time use. We believe in freeing individuals and businesses from mundane, repetitive tasks, allowing them to focus on the innovative and creative work that truly matters. Our personalised AI assistant, Proxy, collaborates with users to enhance productivity and creativity. With $12 million in pre-seed funding from Balderton Capital, Salesforce Ventures, and Shopify Ventures, we're poised to make a significant impact in the AI space. Join us in shaping the future of human-AI collaboration and be part of our mission to transform the AI landscape.

Responsibilities

  • Design, implement, and maintain our ML-focused cloud infrastructure on GCP using Infrastructure as Code (Terraform)
  • Build and manage HPC clusters with Slurm for distributed ML workloads, focusing on GPU/TPU utilization and job scheduling
  • Develop and maintain ML pipeline automation tools and ML-specific CI/CD workflows in Python
  • Design and optimize data storage solutions for ML datasets, model artifacts, and feature stores
  • Implement comprehensive monitoring, logging, and alerting solutions for ML model performance and infrastructure health
  • Collaborate with ML engineers and data scientists to provide robust infrastructure for model training and deployment
  • Lead and implement security best practices for ML systems, including model security and data protection

Requirements

  • 3+ years of experience in ML infrastructure or ML platform engineering
  • Strong proficiency in Python for ML pipeline automation and tooling
  • Extensive experience with Slurm cluster management for large-scale ML workloads
  • Proven track record with Terraform and Infrastructure as Code for ML environments
  • Solid understanding of GCP's ML-specific services (Vertex AI, AI Platform, etc.)
  • Experience with distributed training systems and model serving infrastructure
  • Experience with ML observability tools and performance monitoring
  • Excellent problem-solving skills with a focus on ML system reliability and optimization

Bonus Qualifications

  • Knowledge of ML-specific orchestration tools (e.g., MLflow, Ray)
  • Experience with high-performance computing for ML training
  • Contributions to ML infrastructure-related open-source projects
  • Experience with GPU/TPU cluster management and optimization
  • Background in ML operations (MLOps) or AI reliability engineering
  • Familiarity with vector databases and efficient embedding storage/retrieval

Why Join Us?

  • Be at the cutting edge of AI and LLM technology
  • Work on challenging problems that impact users' daily lives
  • Collaborative and innovative work environment
  • Opportunities for professional growth and learning
  • Competitive salary and benefits package

ML Engineer - Infrastructure employer: Convergence

At Convergence, we are at the forefront of AI innovation, creating a collaborative and dynamic work environment that empowers our employees to tackle challenging problems and make a real impact on users' lives. With a strong focus on professional growth, competitive salaries, and a supportive culture, we offer unique opportunities for those looking to advance their careers in the rapidly evolving field of machine learning. Join us in London and be part of a team that is shaping the future of human-AI collaboration.

Contact Details:

Convergence Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the ML Engineer - Infrastructure role

✨Tip Number 1

Familiarise yourself with GCP's ML-specific services, especially Vertex AI and AI Platform. Understanding these tools will not only help you in interviews but also demonstrate your commitment to the role.

✨Tip Number 2

Showcase your experience with Infrastructure as Code, particularly Terraform. Be ready to discuss specific projects where you've implemented this, as it's a key requirement for the position.

✨Tip Number 3

Connect with current employees or alumni from Convergence on LinkedIn. Engaging with them can provide insights into the company culture and expectations, which can be invaluable during your interview.

✨Tip Number 4

Prepare to discuss your problem-solving approach, especially in relation to ML system reliability and optimisation. Having concrete examples ready will help you stand out as a candidate who can tackle real-world challenges.

We think you need these skills to ace the ML Engineer - Infrastructure role

Python Programming
Terraform
Slurm Cluster Management
GCP (Google Cloud Platform)
ML Pipeline Automation
CI/CD Workflows
Data Storage Solutions
Monitoring and Logging Solutions
Security Best Practices for ML Systems
Distributed Training Systems
Model Serving Infrastructure
ML Observability Tools
Problem-Solving Skills
High-Performance Computing
Knowledge of ML Orchestration Tools
GPU/TPU Cluster Management
Background in MLOps or AI Reliability Engineering
Familiarity with Vector Databases

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights relevant experience in ML infrastructure and platform engineering. Emphasise your proficiency in Python, Terraform, and Slurm cluster management, as these are key requirements for the role.

Craft a Compelling Cover Letter: In your cover letter, express your passion for AI and how your skills align with Convergence's mission. Mention specific projects or experiences that demonstrate your ability to design and maintain ML-focused cloud infrastructure.

Showcase Relevant Projects: If you have worked on any open-source projects or personal projects related to ML infrastructure, include them in your application. This will showcase your hands-on experience and commitment to the field.

Highlight Problem-Solving Skills: In your application, provide examples of how you've tackled challenges in ML system reliability and optimisation. This will demonstrate your problem-solving abilities, which are crucial for the role.

How to prepare for a job interview at Convergence

✨Showcase Your Technical Skills

Be prepared to discuss your experience with Python, Terraform, and Slurm in detail. Bring examples of past projects where you've designed or maintained ML infrastructure, and be ready to explain the challenges you faced and how you overcame them.

✨Understand GCP Services

Familiarise yourself with Google Cloud Platform's ML-specific services like Vertex AI and AI Platform. Demonstrating a solid understanding of these tools will show that you're ready to hit the ground running and can effectively contribute to the team's goals.

✨Discuss Collaboration

Since the role involves working closely with ML engineers and data scientists, be ready to talk about your collaborative experiences. Share examples of how you've worked in teams to build robust infrastructure and how you've contributed to successful project outcomes.

✨Emphasise Problem-Solving Abilities

Prepare to discuss specific instances where you've tackled complex problems related to ML system reliability and optimisation. Highlight your analytical thinking and how you approach troubleshooting in high-pressure situations.
