Research Engineer, Model Inference & Serving - London

London · Full-Time · £60,000-£80,000 / year (est.) · Home office (partial)

At a Glance

  • Tasks: Build and optimise AI inference systems for cutting-edge multimodal models.
  • Company: Join a pioneering AI company pushing the boundaries of superintelligence.
  • Benefits: Competitive salary, hybrid work, and opportunities for professional growth.
  • Other info: Exciting career development opportunities and a multicultural team atmosphere.
  • Why this job: Shape the future of AI while collaborating with top talent in a dynamic environment.
  • Qualifications: Strong software engineering skills and experience with deep learning frameworks.

The predicted salary is between £60,000 and £80,000 per year.

About H: H exists to push the boundaries of superintelligence with agentic AI. By automating complex, multi-step tasks typically performed by humans, AI agents will help unlock full human potential. H is hiring the world's best AI talent, seeking those who are dedicated as much to building safely and responsibly as to advancing disruptive agentic capabilities. We promote a mindset of openness, learning, and collaboration, where everyone has something to contribute.

About the Team: The Inference team builds and operates the systems that serve H's foundational models in production. We focus on multimodal inference and serving for Computer Use Agents, optimizing across both the inference engine layer (e.g., vLLM, SGLang) and the model serving layer (e.g., disaggregated inference, intelligent routing). Agentic inference brings constraints around context length, multimodality, and tool calls, which we address by co-designing with the Models team on training-time choices and with the agent teams on how models are deployed. We operate at the intersection of research and production, translating cutting-edge inference techniques into the systems that power H's next generation of agents. We are looking for strong engineers excited about inference to join the team and help shape the systems behind superintelligent AI.

Key Responsibilities:

  • Build and operate the inference stack that serves H's multimodal agentic models
  • Improve latency, throughput, and cost of model serving across the stack
  • Research and implement inference techniques tailored to agent workloads
  • Co-design with the Models team on training-time decisions that affect inference
  • Collaborate with cross-functional teams to integrate inference into agentic AI products
  • Evaluate inference, serving, and hardware platforms, and communicate findings to stakeholders
  • Stay current with advancements in inference, model serving, and accelerator technology

Requirements:

Technical skills:

  • Strong software engineering track record
  • Proficient in Python and at least one systems language (Rust, C++, or Go)
  • Hands-on experience with deep learning frameworks (PyTorch, JAX), preferably in an industry setting
  • Solid distributed systems fundamentals
  • Experience working in a modern cloud environment and with production ML infrastructure (Kubernetes, etc.)
  • Working knowledge of modern ML, including transformers and multimodal architectures

Research skills:

  • Research engagement: an advanced degree with research output, or publications at top-tier AI or systems venues (e.g., NeurIPS, ICML, MLSys, OSDI), research internships, or substantive open-source contributions

Soft skills:

  • Excellent communication and presentation skills
  • Strong collaboration and teamwork skills
  • Passion for inference and AI

Preferred qualifications:

  • Startup experience
  • Hands-on experience with inference frameworks (vLLM, SGLang, TensorRT-LLM)
  • Writing or modifying GPU kernels (CUDA, Triton, etc.)
  • Edge or on-device inference experience (llama.cpp, MLX, ONNX Runtime, etc.)
  • Experience with quantization, speculative decoding, disaggregated inference, or KV-cache compression
  • Experience with multimodal models and/or agentic systems

Location: Paris or London. This role is hybrid, and you are expected to be in the office 3 days a week on average. Please expect some travel between offices on a reasonable cadence (e.g., every 4-6 weeks).

What We Offer:

  • Join the exciting journey of shaping the future of AI
  • Collaborate with a fun, dynamic and multicultural team, working alongside world-class AI talent in a highly collaborative environment
  • Enjoy a competitive salary
  • Unlock opportunities for professional growth, continuous learning, and career development

Research Engineer, Model Inference & Serving - London employer: H Company

At H, we are committed to pushing the boundaries of superintelligence with agentic AI, making us an exceptional employer for those passionate about advancing AI responsibly. Our London office fosters a vibrant and collaborative work culture, where you will have the opportunity to work alongside world-class talent, engage in continuous learning, and contribute to groundbreaking projects that shape the future of AI. With a competitive salary and a focus on professional growth, H is the ideal place for innovative minds eager to make a meaningful impact in the field of AI.

Contact Details:

H Company Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Research Engineer, Model Inference & Serving - London role

✨Tip Number 1

Network like a pro! Reach out to folks in the AI and tech community, especially those who work at H or similar companies. Attend meetups, webinars, or conferences to make connections that could lead to job opportunities.

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those related to inference and AI. This can be a game-changer when it comes to standing out during interviews.

✨Tip Number 3

Prepare for technical interviews by brushing up on your coding skills and understanding of deep learning frameworks. Practice common algorithms and system design questions to impress the interviewers.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining the team.

We think you need these skills to ace the Research Engineer, Model Inference & Serving - London role

Software Engineering
Python
Rust
C++
Go
Deep Learning Frameworks
PyTorch
JAX
Distributed Systems
Cloud Environment
Production ML Infrastructure
Kubernetes
Inference Techniques
Communication Skills
Collaboration Skills

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Research Engineer role. Highlight your software engineering track record, especially in Python and any systems languages you know. We want to see how your background fits into our mission!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for inference and AI, and explain why you're excited about joining our team. Be sure to mention any relevant research or projects that showcase your expertise.

Showcase Your Research Experience: If you've got publications or research internships under your belt, flaunt them! We love seeing candidates who have engaged with cutting-edge research, so make sure to include any relevant work that demonstrates your capabilities.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen on being part of the journey at H!

How to prepare for a job interview at H Company

✨Know Your Tech Inside Out

Make sure you’re well-versed in the technical skills listed in the job description. Brush up on Python, deep learning frameworks like PyTorch or JAX, and any systems languages you’ve worked with. Be ready to discuss your past projects and how they relate to inference techniques.

✨Showcase Your Research Experience

If you have an advanced degree or publications, be prepared to talk about them! Highlight any research internships or open-source contributions that demonstrate your engagement with cutting-edge AI topics. This will show your passion for the field and your ability to contribute to their innovative environment.

✨Collaboration is Key

Since the role involves working with cross-functional teams, think of examples where you successfully collaborated with others. Whether it’s co-designing with teams or integrating systems, showing that you can work well with diverse groups will set you apart.

✨Stay Current and Curious

Keep yourself updated on the latest advancements in inference and model serving. Mention any recent articles, papers, or technologies you’ve explored. This not only shows your enthusiasm but also your commitment to continuous learning, which aligns perfectly with their company culture.
