Staff Machine Learning Engineer (London Area)

London Full-Time No home office possible

We’re teaming up with one of the leading names in AI, known for pushing the boundaries of what’s possible with large-scale generative models and next-gen cloud infrastructure. We’re offering a rare opportunity to step into a Staff Machine Learning Engineer role and play a key part in shaping the platforms powering millions of users across the globe.

You’ll be joining a team of exceptional researchers and engineers, all passionate about advancing the field and delivering world-class AI experiences.

Location: London, Oxford Street (Hybrid; onsite in London once a week)

Rate: £1000 – £1200 per day – Outside IR35

Start date: ASAP, 12-month contract

What you’ll be doing

  • Leading the development of scalable, reliable systems for training and fine-tuning transformer-based models
  • Optimising inference pipelines for real-time applications, aiming for low latency and high throughput
  • Exploring and applying advanced fine-tuning methods like LoRA, prefix-tuning, and adapters
  • Tuning performance across GPUs and systems using tools like DeepSpeed, Triton, TensorRT, and even custom kernels
  • Working closely with research, platform, and product teams to deliver new features and enhance the developer experience
  • Identifying and resolving bottlenecks through profiling, benchmarking, and performance tuning
  • Helping to define best practices for building, testing, and maintaining production ML services and APIs
  • Mentoring other engineers and helping to foster a culture of technical excellence and innovation

What our client is looking for

  • 7+ years of experience building and deploying large-scale ML systems in production
  • Strong Python and PyTorch skills, with a deep understanding of transformers, LLMs, and multimodal models
  • Hands-on experience with distributed training frameworks like DeepSpeed or FSDP
  • Solid background in GPU programming (CUDA, ROCm) and inference optimisation
  • Practical experience with parameter-efficient fine-tuning techniques in real-world applications
  • Familiarity with container orchestration tools (Kubernetes, Kubeflow) and cloud-native environments
  • Knowledge of serving frameworks like Triton, vLLM, or similar
  • Clean, maintainable coding style and a strong testing discipline
  • Great communication skills and a collaborative mindset

For more information, contact Cam Dalziel at cameron.dalziel@aspirerecruitmentgroup.com, or apply today.

Contact Details:

Aspire Technology Recruiting Team
