We’re teaming up with one of the leading names in AI, known for pushing the boundaries of what’s possible with large-scale generative models and next-gen cloud infrastructure. We’re offering a rare opportunity to step into a Staff Machine Learning Engineer role and play a key part in shaping the platforms powering millions of users across the globe.
You’ll be joining a team of exceptional researchers and engineers, all passionate about advancing the field and delivering world-class AI experiences.
Location: London, Oxford Street (Hybrid; onsite in London once a week)
Rate: £1000 – £1200 per day – Outside IR35
Start date: ASAP, 12-month contract
What you’ll be doing
- Leading the development of scalable, reliable systems for training and fine-tuning transformer-based models
- Optimising inference pipelines for real-time applications, aiming for low latency and high throughput
- Exploring and applying advanced fine-tuning methods like LoRA, prefix-tuning, and adapters
- Tuning performance across GPUs and systems using tools like DeepSpeed, Triton, TensorRT, and even custom kernels
- Working closely with research, platform, and product teams to deliver new features and enhance the developer experience
- Identifying and resolving bottlenecks through profiling, benchmarking, and performance tuning
- Helping to define best practices for building, testing, and maintaining production ML services and APIs
- Mentoring other engineers and helping to foster a culture of technical excellence and innovation
What our client is looking for
- 7+ years of experience building and deploying large-scale ML systems in production
- Strong Python and PyTorch skills, with a deep understanding of transformers, LLMs, and multimodal models
- Hands-on experience with distributed training frameworks like DeepSpeed or FSDP
- Solid background in GPU programming (CUDA, ROCm) and inference optimisation
- Practical experience with parameter-efficient fine-tuning techniques in real-world applications
- Familiarity with container orchestration tools (Kubernetes, Kubeflow) and cloud-native environments
- Knowledge of serving frameworks like Triton, vLLM, or similar
- Clean, maintainable coding style and a strong testing discipline
- Great communication skills and a collaborative mindset
For more information, contact Cam Dalziel at cameron.dalziel@aspirerecruitmentgroup.com, or apply today.
Contact Details:
Aspire Technology Recruiting Team