Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM) in Broughton

Broughton Full-Time £60,000 - £80,000 / year (est.) No working from home possible
Project People

At a Glance

  • Tasks: Architect and optimise cutting-edge AI infrastructure for large-scale data centres.
  • Company: Leading tech firm at the forefront of AI and systems research.
  • Benefits: Competitive salary, generous benefits, and opportunities for professional growth.
  • Other info: Collaborative environment with potential for impactful research and publications.
  • Why this job: Join a dynamic team shaping the future of AI technology and infrastructure.
  • Qualifications: Degree in Computer Science or related field; experience with LLM frameworks preferred.

The predicted salary is between £60,000 and £80,000 per year.

Permanent position in Edinburgh City Centre (on-site 5 days a week), within walking distance of local transport links.

Salary: Competitive and negotiable, with a generous benefits package.

In an era where Large Language Models (LLMs) are rebuilding the foundational software stack, our client is at the forefront of reshaping how large-scale models are trained, served, and deployed. Operating at the intersection of advanced systems research and industrial-scale engineering, their Edinburgh-based team is driving new AI Infrastructure & Agentic Serving architectures.

This role is a unique opportunity to help define next-generation large-scale data centres and AI infrastructure systems, turning innovative system designs into deployable, real-world technologies.

We are seeking Systems Research Engineers with a deep passion for computer systems, distributed AI infrastructure, and performance optimization. These roles are ideal for recent PhD graduates or exceptional BSc/MSc engineers looking to build research-driven experience in Operating Systems, Distributed Systems, AI Model Serving, and Machine Learning infrastructure. You will work closely with architects to prototype and optimize the next generation of global AI clusters.

What you will be doing:

  • Distributed Systems Research & Development: Architect, implement, and evaluate distributed system components for emerging AI and data-centric workloads. Drive modular design and scalability across GPU and NPU clusters, building highly efficient serving and scheduling systems.
  • Performance Optimization & Profiling: Conduct in-depth profiling and performance tuning of large-scale inference and data pipelines, focusing on KV cache management, heterogeneous memory scheduling, and high-throughput inference serving using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed systems.
  • Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving across distributed environments. Research and prototype new techniques for cache sharing, data locality, and resource orchestration and scheduling within AI clusters.
  • Research & Publications: Translate innovative research ideas into publishable contributions at leading venues (e.g., OSDI, NSDI, EuroSys, SoCC, MLSys, NeurIPS, ICML, ICLR) while driving internal adoption of novel methods and architectures.
  • Cross-Team Collaboration: Communicate technical insights, research progress, and evaluation outcomes effectively to multidisciplinary stakeholders and global research teams.

What we are looking for:

  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or related field.
  • Fresh PhD graduates in systems, distributed computing, or large-scale AI infrastructure are also welcome.
  • At least 2 years of experience with LLM inference/serving framework optimization (vLLM/Ray Serve/TensorRT-LLM/PyTorch).
  • Hands-on experience with distributed KV cache optimization.
  • Familiarity with GPUs and how they execute LLMs.
  • Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI infrastructure.
  • Solid grounding in systems research methodology, distributed algorithms, and profiling tools.
  • Proficiency in C/C++, with additional experience in Python for research prototyping.
  • Team-oriented mindset with effective technical communication skills.

If this sounds like a role you can take hold of, we would love to hear from you! To apply for this role, please send your CV to Maggie Kwong. Great journeys start here. Apply now!

Employer: Project People

Join a pioneering team in Edinburgh that is redefining AI infrastructure and large-scale model deployment. As a Systems Research Engineer, you will benefit from a competitive salary and a generous benefits package, while working in a collaborative environment that fosters innovation and professional growth. With opportunities to publish your research and engage with multidisciplinary teams, this role offers a unique chance to contribute to cutting-edge technology in a vibrant city centre location.

Contact Details:

Project People Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM) role in Broughton

Tip Number 1

Network like a pro! Reach out to folks in the industry on LinkedIn or at local meetups. A friendly chat can sometimes lead to opportunities that aren’t even advertised yet.

Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects related to distributed systems and AI infrastructure. This gives potential employers a taste of what you can do.

Tip Number 3

Prepare for interviews by brushing up on your technical knowledge and problem-solving skills. Practice common interview questions related to LLM optimisation and distributed systems to boost your confidence.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love hearing from passionate candidates like you!

We think you need these skills to ace the Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM) role in Broughton

Distributed Systems Research & Development
Performance Optimization & Profiling
Scalable Model Serving Infrastructure
Research & Publications
Cross-Team Collaboration
LLM Inference / Serving Framework Optimization
Distributed KV Cache Optimization

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Systems Research Engineer role. Highlight your experience with LLMs, distributed systems, and any relevant projects you've worked on. We want to see how you fit into our vision!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about AI infrastructure and how your background makes you a great fit for our team. Keep it engaging and personal – we love to see your personality come through!

Showcase Your Research Experience: If you've got research experience, especially in systems or distributed computing, make sure to highlight it! Mention any publications or projects that demonstrate your ability to innovate and contribute to cutting-edge technology. We value research-driven minds!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets to us quickly and efficiently. Plus, it shows you’re keen on joining our team at StudySmarter. Don’t miss out on this opportunity!

How to prepare for a job interview at Project People

Know Your Tech Inside Out

Make sure you’re well-versed in the technologies mentioned in the job description, like vLLM, TensorRT-LLM, and distributed systems. Brush up on your knowledge of performance optimisation techniques and be ready to discuss how you've applied them in past projects.

Showcase Your Research Skills

Since this role involves translating research into practical applications, prepare to talk about any relevant research you've conducted. Highlight your experience with publications or presentations at conferences, as this will demonstrate your ability to contribute to the team’s innovative goals.

Prepare for Technical Questions

Expect technical questions that assess your understanding of distributed systems and AI infrastructure. Practice explaining complex concepts clearly and concisely, as effective communication is key when collaborating with multidisciplinary teams.

Ask Insightful Questions

At the end of the interview, don’t shy away from asking questions about the company’s current projects or future directions in AI infrastructure. This shows your genuine interest in the role and helps you gauge if it’s the right fit for you.