Senior AI Inference Engineer (llama.cpp specialist) - 100% Remote in London

London | Full-Time | £36,000 - £60,000 / year (est.) | Home office possible

At a Glance

  • Tasks: Develop and optimise AI inference engines for edge devices using C++.
  • Company: Join Tether, a leader in digital finance and blockchain technology.
  • Benefits: 100% remote work, collaborate with global talent, and be part of a fintech revolution.
  • Why this job: Make a real impact in AI and fintech while working on cutting-edge technology.
  • Qualifications: Strong C++ skills, experience with llama.cpp, and a background in AI or Machine Learning.
  • Other info: Dynamic team culture with opportunities for growth and innovation.

The predicted salary is between £36,000 and £60,000 per year.

Join Tether and shape the future of digital finance. At Tether, we’re not just building products; we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of blockchain technology, Tether enables you to store, send, and receive digital tokens instantly, securely, and globally, all at a fraction of the cost. Transparency is the bedrock of everything we do, ensuring trust in every transaction.

Innovate with Tether:

  • Tether Finance: Our innovative product suite features the world’s most trusted stablecoin, USDT, relied upon by hundreds of millions worldwide, alongside pioneering digital asset tokenization services.
  • Tether Power: Driving sustainable growth, our energy solutions optimize excess power for Bitcoin mining using eco-friendly practices in state-of-the-art, geo-diverse facilities.
  • Tether Data: Fueling breakthroughs in AI and peer-to-peer technology, we reduce infrastructure costs and enhance global communications with cutting-edge solutions like KEET, our flagship app that redefines secure and private data sharing.
  • Tether Education: Democratizing access to top-tier digital learning, we empower individuals to thrive in the digital and gig economies, driving global growth and opportunity.
  • Tether Evolution: At the intersection of technology and human potential, we are pushing the boundaries of what is possible, crafting a future where innovation and human capabilities merge in powerful, unprecedented ways.

Why join us? Our team is a global talent powerhouse, working remotely from every corner of the world. If you’re passionate about making a mark in the fintech space, this is your opportunity to collaborate with some of the brightest minds, pushing boundaries and setting new standards. We’ve grown fast, stayed lean, and secured our place as a leader in the industry. If you have excellent English communication skills and are ready to contribute to the most innovative platform on the planet, Tether is the place for you. Are you ready to be part of the future?

About the job: You will work on the C++ layer that powers local AI, porting and enhancing inference engines like llama.cpp, ONNX and similar, to run efficiently on edge devices. Your focus is on the runtime: making models load faster, run leaner, and perform well across different hardware. You will ensure that the inference layer is stable, optimized, and ready for integration with the rest of the stack. This role is for engineers who want to work close to the metal, enabling private and fast on-device AI without relying on cloud infrastructure.

Responsibilities:

  • Deploy machine learning models to edge devices using llama.cpp, ggml, and ONNX.
  • Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments.
  • Integrate AI features into existing products, enriching them with the latest advancements in machine learning.

Qualifications:

  • Excellent programming skills in C++; experience in JavaScript is a bonus.
  • Strong experience with the llama.cpp and ggml inference engines, which facilitate the deployment of models to specific GPU architectures.
  • Good understanding of deep learning concepts and model architectures.
  • Experience with transformers and LLMs.
  • Demonstrated ability to rapidly assimilate new technologies and techniques.
  • A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D.

Important information for candidates: Recruitment scams have become increasingly common. To protect yourself, please keep the following in mind when applying for roles:

  • Apply only through our official channels. We do not use third-party platforms or agencies for recruitment unless clearly stated. All open roles are listed on our official careers page.
  • Verify the recruiter’s identity. All our recruiters have verified LinkedIn profiles. If you’re unsure, you can confirm their identity by checking their profile or contacting us through our website.
  • Be cautious of unusual communication methods. We do not conduct interviews over WhatsApp, Telegram, or SMS. All communication is done through official company emails and platforms.
  • Double-check email addresses. All communication from us will come from emails ending in @tether.to or @tether.io.
  • We will never request payment or financial details. If someone asks for personal financial information or payment at any point during the hiring process, it is a scam. Please report it immediately.
  • When in doubt, feel free to reach out through our official website.

Employer: Tether Operations Limited

At Tether, we pride ourselves on being a leading innovator in the fintech space, offering a fully remote work environment that fosters collaboration among a diverse team of global talent. Our commitment to transparency and cutting-edge technology not only empowers our employees but also provides ample opportunities for professional growth and development, making it an exciting place to shape the future of digital finance.

Contact Details:

Tether Operations Limited Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Senior AI Inference Engineer (llama.cpp specialist) - 100% Remote in London

✨Tip Number 1

Network like a pro! Reach out to folks in the fintech and AI space on LinkedIn. Join relevant groups, attend virtual meetups, and don’t be shy about sliding into DMs. You never know who might have the inside scoop on job openings!

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those involving C++ and AI inference engines like llama.cpp. This gives potential employers a taste of what you can do and sets you apart from the crowd.

✨Tip Number 3

Prepare for interviews by brushing up on common technical questions related to AI and machine learning. Practice coding challenges and be ready to discuss your past projects in detail. Confidence is key, so own your expertise!

✨Tip Number 4

Apply directly through our website! It’s the safest way to ensure your application gets seen. Plus, it shows you’re serious about joining Tether and being part of our innovative journey in digital finance.

We think you need these skills to ace Senior AI Inference Engineer (llama.cpp specialist) - 100% Remote in London

C++ Programming
llama.cpp
ggml Inference Engines
ONNX
Machine Learning Deployment
Deep Learning Concepts
Transformers
Large Language Models (LLMs)
Collaboration with Researchers
Model Training and Transitioning
Performance Optimisation
Edge Device Integration
Rapid Technology Assimilation
English Communication Skills

Some tips for your application 🫡

Show Your Passion: When writing your application, let your enthusiasm for AI and fintech shine through! We want to see how excited you are about the role and how you can contribute to our mission at Tether.

Tailor Your CV: Make sure your CV is tailored to highlight your experience with C++, llama.cpp, and any relevant projects. We love seeing how your skills align with what we’re doing, so don’t hold back!

Craft a Compelling Cover Letter: Your cover letter is your chance to tell us why you’re the perfect fit for this role. Share specific examples of your work with AI inference engines and how you’ve tackled challenges in the past.

Apply Through Our Website: Remember, the best way to apply is through our official careers page. This ensures your application gets to us directly and helps you avoid any recruitment scams. We can’t wait to hear from you!

How to prepare for a job interview at Tether Operations Limited

✨Know Your Tech Inside Out

Make sure you’re well-versed in C++ and the specific frameworks mentioned, like llama.cpp and ONNX. Brush up on your understanding of deep learning concepts and model architectures, as these will likely come up during technical discussions.

✨Showcase Your Problem-Solving Skills

Prepare to discuss past projects where you’ve optimised AI models or worked closely with researchers. Be ready to explain how you tackled challenges and what impact your solutions had on performance and efficiency.

✨Communicate Clearly and Confidently

Since excellent English communication skills are a must, practice articulating your thoughts clearly. You might be asked to explain complex technical concepts, so being able to simplify your explanations will impress the interviewers.

✨Research Tether and Its Innovations

Familiarise yourself with Tether’s products and their role in the fintech space. Understanding their mission and how your role as a Senior AI Inference Engineer fits into their vision will show your genuine interest and enthusiasm for the position.
