Artificial Intelligence Engineer - Distributed Inference

Birmingham · Full-Time · £48,000 – £84,000 / year (est.) · No home office possible

At a Glance

  • Tasks: Join us to design and optimize high-performance AI systems for distributed inference.
  • Company: Danucore is pioneering accessible, efficient, and ethical AI technology for humanity.
  • Benefits: Enjoy competitive pay, cutting-edge resources, and support for your innovative projects.
  • Why this job: Be part of a passionate team shaping the future of AI with positive societal impact.
  • Qualifications: Experience with AI model optimization and distributed systems is essential; curiosity and passion are key.
  • Other info: Showcase your skills through public work like GitHub or blogs in your application.

The predicted salary is between £48,000 and £84,000 per year.

AI Engineer – Distributed Inference Specialist 🚀

Do you want to be a spectator or a player as the world races to develop AGI? 🤔

Are you ready to be a pioneer of AI?

Why join us?

At Danucore, we are on the hunt for BRILLIANT MINDS to join a team of visionaries and innovators dedicated to building distributed supercomputers and AI systems that are:

Faster ⚡️ – from building and deploying AI datacentres at speed to optimising the AI workloads that run on them, we want to be the fastest.

Cheaper 💸 – AI should be accessible to all. We lower the cost of AI deployment with careful hardware choices and software systems that ensure efficient resource utilisation.

Kinder 🙏 – Our systems are designed to benefit humanity. We do not allow our systems to be used in military, gambling, or pornography applications.

Greener 🌱 – We optimise energy consumption with an integrated hardware and software solution that leverages renewable energy and heat recovery, all running under energy-aware orchestration.

Cleverer 🧠 – We develop agentic AI systems to make our platform intelligent and constantly improving.

Help us build systems that keep the power of frontier AI accessible and give users sovereignty over their AI systems 💪

Join us in ensuring that the most transformative technology in human history remains in the hands of humanity itself. Let's make AI development transparent, accessible, and aligned with the interests of humanity, not just the profits of a few. ⚡

About the Role 🎯

This role is for those obsessed with pushing the boundaries of AI model performance.

We're looking for someone who gets excited about shaving milliseconds off inference time, about every percentage point of GPU utilisation gained, and about how many watts were consumed to achieve it. ⚡️

You'll work directly with cutting-edge models, from LLMs to multimodal systems, and with large GPU clusters, finding innovative ways to make them run faster, more efficiently, and more accessibly on diverse hardware setups. 🛠️

What We're Looking For 🔍

In team members:

  • Passion for AI: A strong desire to influence the future of technology and its societal impact. 🤜🤛
  • Willingness to Learn: We're looking for future experts with curious minds and a growth mindset. 🧠
  • Open-Mindedness: Ready to challenge the norm and think outside the box? 😊

and for the role:

  • Evidence of deploying and optimising AI models on multi-GPU and multi-node systems 🖥️🖥️
  • Good working knowledge of leading AI runtimes: PyTorch, vLLM, TensorRT, ONNX Runtime, llama.cpp 🏃‍♂️⏱️
  • Experience with distributed inference engines and schedulers: Ray Serve, Triton Inference Server, vLLM, SLURM 🌐
  • Knowledge of AI compilers: OpenXLA, torch.compile, OpenAI's Triton, MLIR, Mojo, TVM, MLC-LLM ⚙️
  • Good working knowledge of inter-process communication: message queues, MPI, NCCL, gRPC 📡
  • Good working knowledge of high-performance networking: RDMA, RoCE, InfiniBand, NVIDIA GPUDirect, NVLink, NVIDIA DOCA, Magnum IO, DPDK, SPDK 📟
  • Experience with model quantisation, pruning, and sparsity techniques for performance optimisation 📊
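To give a flavour of the quantisation work mentioned above, here is a minimal sketch of symmetric int8 weight quantisation in plain Python. It is purely illustrative; in practice this is done with tooling such as PyTorch, TensorRT, or llama.cpp, and all names below are our own.

```python
# Symmetric int8 quantisation: map floats in [-max|w|, +max|w|]
# onto integers in [-127, 127] with a single scale factor.

def quantize_int8(weights):
    # One scale for the whole tensor; guard against all-zero weights.
    scale = (max(abs(w) for w in weights) or 1.0) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
# Each dequantised value lies within half a quantisation step of the original.
```

The memory saving is the point: each weight shrinks from 4 bytes (float32) to 1 byte, at the cost of bounded rounding error.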

And bonus points if you have:

  • a homelab, blog, or a collection of Git repos showcasing your talents and interests 🧑‍💻👩‍💻
  • contributions to open-source projects or publications in the field of AI/ML systems optimisation 📚📝

Let us know in your cover letter which of the above you have worked with and which are relevant! ✨

Key Responsibilities 📋

  • Design and implement high-performance distributed inference systems for running large language models and multimodal AI models at scale 👷
  • Optimise model serving infrastructure for maximum throughput, minimal latency, and optimal power efficiency ⚡
  • Develop and maintain deployment pipelines for efficient model serving and monitoring in production 🔄
  • Research and implement cutting-edge techniques in model optimisation, including pruning, quantisation, and sparsity methods 🧑‍🔬
  • Design, build, and configure experimental hardware setups for model serving and optimisation 🛠️
  • Design and implement robust testing frameworks to ensure reliable model serving ✅
  • Collaborate with the team to build and improve our distributed inference platform, making it more accessible and efficient for users 🤝
  • Monitor, optimise, and document system performance metrics, including latency, throughput, power consumption, and benchmark scores 📝
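As a small taste of the metrics work in the last responsibility, here is a sketch of summarising per-request latencies into the percentile figures typically tracked for a serving endpoint. The sample numbers and the nearest-rank method are illustrative, not Danucore's own tooling.

```python
# Summarise per-request latencies (milliseconds) into p50/p95/p99.

def percentile(samples, p):
    """Nearest-rank percentile: smallest value >= p% of the samples."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12, 15, 11, 90, 14, 13, 16, 250, 12, 15]
summary = {p: percentile(latencies_ms, p) for p in (50, 95, 99)}
# Rough single-worker throughput estimate in requests per second.
throughput_rps = len(latencies_ms) / (sum(latencies_ms) / 1000)
```

Note how the two slow outliers dominate p95/p99 while barely moving the median, which is why tail latency, not the average, is what serving teams optimise.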

How Can We Tempt You?

Exceptional Financial Package: Enjoy a competitive compensation structure, including an enticing EMI share-option scheme that rewards your brilliance. 💰

Enviable Compute Power: Gain access to a vast array of cutting-edge computing resources to bring your ideas to life!

Support for Your Vision: We believe the brightest minds often have their own innovative projects. Let's collaborate! Share your ideas and work with our team and support network to make them happen! 🌟

Make an Impact: Join a passionate team dedicated to creating positive change in the world. The future is ours to shape, and together we can ensure it's for the better.

Dynamic Start-Up Culture: Dive in from day one! Experience the thrill of a start-up environment where you can roll up your sleeves and make a real difference right away. 🚀

How to Apply 📬

Email your cover letter and CV to jobs@danucore.com with the subject “AI Engineer – Distributed Inference”.

In your cover letter, please include details of:

  • which technologies mentioned in this job advert you have experience with and can add value through 💡
  • links to any public work, e.g. GitHub profile, blogs, or papers

Artificial Intelligence Engineer - Distributed Inference employer: Danucore

At Danucore, we pride ourselves on being an exceptional employer that fosters a dynamic start-up culture where innovation thrives. Our commitment to employee growth is evident through access to cutting-edge computing resources and support for personal projects, allowing you to make a meaningful impact in the AI landscape. Join us in a collaborative environment that values passion, creativity, and a shared vision for a kinder, greener, and more accessible future in technology.

Contact Detail:

Danucore Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Artificial Intelligence Engineer – Distributed Inference role:

✨Tip Number 1

Familiarize yourself with the specific AI runtimes mentioned in the job description, such as PyTorch and TensorRT. Having hands-on experience or projects that showcase your skills with these tools can set you apart from other candidates.

✨Tip Number 2

Engage with the AI community by contributing to open-source projects or writing about your experiences on platforms like GitHub or personal blogs. This not only demonstrates your passion for AI but also shows your willingness to learn and collaborate.

✨Tip Number 3

Prepare to discuss your experience with distributed inference engines like Ray Serve or Triton Inference Server during the interview. Be ready to share specific examples of how you've optimized model performance in multi-GPU setups.

✨Tip Number 4

Showcase any experimental hardware setups you've built or worked with. This practical experience can highlight your ability to design and implement high-performance systems, which is crucial for this role.

We think you need these skills to ace the Artificial Intelligence Engineer – Distributed Inference role:

  • Experience with deploying and optimizing AI models in multi-GPU and multi-node systems
  • Proficiency in leading AI runtimes: PyTorch, vLLM, TensorRT, ONNX Runtime, llama.cpp
  • Familiarity with distributed inference engines: Ray Serve, Triton Inference Server, vLLM, SLURM
  • Knowledge of AI compilers: OpenXLA, torch.compile, OpenAI's Triton, MLIR, Mojo, TVM, MLC-LLM
  • Understanding of inter-process communication: message queues, MPI, NCCL, gRPC
  • Expertise in high-performance networking: RDMA, RoCE, InfiniBand, NVIDIA GPUDirect, NVLink, NVIDIA DOCA, Magnum IO, DPDK, SPDK
  • Experience with model quantization, pruning, and sparsity techniques for performance optimization
  • Strong analytical skills to monitor and optimize system performance metrics
  • Ability to design and implement robust testing frameworks for model serving
  • Passion for AI and a willingness to learn and innovate

Some tips for your application 🫑

Tailor Your Cover Letter: Make sure to customize your cover letter to highlight your experience with the specific technologies mentioned in the job description, such as PyTorch, Ray Serve, or model quantization techniques. Show how your skills align with the company's mission and values.

Showcase Relevant Experience: In your CV, emphasize any previous roles or projects where you deployed and optimized AI models in multi-GPU and multi-node systems. Include metrics or results that demonstrate your impact, such as improvements in inference time or resource utilization.

Include Links to Your Work: Don’t forget to add links to your public work, like your GitHub profile, blogs, or any publications related to AI/ML systems optimization. This will give the hiring team a better understanding of your skills and interests.

Express Your Passion for AI: In both your cover letter and CV, convey your enthusiasm for AI and its societal impact. Mention any personal projects, contributions to open-source, or innovative ideas you have that align with the company's vision of making AI accessible and beneficial for humanity.

How to prepare for a job interview at Danucore

✨Show Your Passion for AI

Make sure to express your enthusiasm for artificial intelligence and its potential impact on society. Share specific examples of projects or experiences that demonstrate your commitment to pushing the boundaries of AI technology.

✨Highlight Relevant Experience

Prepare to discuss your experience with deploying and optimizing AI models, especially in multi-GPU and multi-node systems. Be ready to provide concrete examples of how you've improved performance metrics like latency and throughput.

✨Demonstrate Your Technical Knowledge

Familiarize yourself with the key technologies mentioned in the job description, such as PyTorch, Ray Serve, and model optimization techniques. Be prepared to discuss how you've used these tools in past projects and how they can be applied to the role.

✨Ask Insightful Questions

Prepare thoughtful questions about the company's vision for AI and how they plan to make their systems more accessible and efficient. This shows your interest in the role and helps you understand if the company aligns with your values.

Application deadline: 2027-02-05