AI Research Engineer (Model Serving & Inference)

Full-Time | £30,000 - £46,000 / year (est.) | No working from home possible

At a Glance

  • Tasks: Drive innovation in AI model serving and inference architectures for advanced systems.
  • Company: Join Tether, a leader in digital finance and blockchain technology.
  • Benefits: Work remotely with a global team and enjoy a dynamic work culture.
  • Other info: Ideal for those passionate about AI and looking to make a real-world impact.
  • Why this job: Be part of a fintech revolution, collaborating with top minds to push boundaries.
  • Qualifications: Degree in Computer Science or related field; PhD preferred with AI R&D experience.

The predicted salary is between £30,000 and £46,000 per year.

Join Tether and Shape the Future of Digital Finance
At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of blockchain technology, Tether enables you to store, send, and receive digital tokens instantly, securely, and globally, all at a fraction of the cost. Transparency is the bedrock of everything we do, ensuring trust in every transaction.
Innovate with Tether
Tether Finance: Our innovative product suite features the world’s most trusted stablecoin, USDT, relied upon by hundreds of millions worldwide, alongside pioneering digital asset tokenization services.
But that’s just the beginning:
Tether Power: Driving sustainable growth, our energy solutions optimize excess power for Bitcoin mining using eco-friendly practices in state-of-the-art, geo-diverse facilities.
Tether Data: Fueling breakthroughs in AI and peer-to-peer technology, we reduce infrastructure costs and enhance global communications with cutting-edge solutions like KEET, our flagship app that redefines secure and private data sharing.
Tether Education: Democratizing access to top-tier digital learning, we empower individuals to thrive in the digital and gig economies, driving global growth and opportunity.
Tether Evolution: At the intersection of technology and human potential, we are pushing the boundaries of what is possible, crafting a future where innovation and human capabilities merge in powerful, unprecedented ways.
Why Join Us?
Our team is a global talent powerhouse, working remotely from every corner of the world. If you’re passionate about making a mark in the fintech space, this is your opportunity to collaborate with some of the brightest minds, pushing boundaries and setting new standards. We’ve grown fast, stayed lean, and secured our place as a leader in the industry.
If you have excellent English communication skills and are ready to contribute to the most innovative platform on the planet, Tether is the place for you.
Are you ready to be part of the future?
About the job:
As a member of our AI model team, you will drive innovation in model serving and inference architectures for advanced AI systems. Your work will focus on optimizing model deployment and inference strategies to deliver highly responsive, efficient, and scalable performance across real-world applications. You will work on a wide spectrum of systems, ranging from resource-efficient models designed for limited hardware environments to complex, multi-modal architectures that integrate data such as text, images, and audio.
We expect you to have deep expertise in designing and optimizing model serving pipelines and inference frameworks as well as a strong background in advanced model architectures. You will adopt a hands-on, research-driven approach to develop, test, and implement novel serving strategies and inference algorithms. Your responsibilities include engineering robust inference pipelines, establishing comprehensive performance metrics, and identifying and resolving bottlenecks in production environments. The ultimate goal is to enable high-throughput, low-latency, low-memory footprint, and scalable AI performance that delivers tangible value in dynamic, real-world scenarios.
Responsibilities:

  • Design and deploy state-of-the-art model serving architectures that deliver high throughput and low latency while optimizing memory usage. Ensure these pipelines run efficiently across diverse environments, including resource-constrained devices and edge platforms.
  • Establish clear performance targets such as reduced latency, improved token response, and minimized memory footprint.
  • Build, run, and monitor controlled inference tests in both simulated and live production environments. Track key performance indicators such as response latency, throughput, memory consumption, and error rates, with special attention to metrics specific to resource-constrained devices. Document iterative results and compare outcomes against established benchmarks to validate performance across platforms.
  • Identify and prepare high-quality test datasets and simulation scenarios tailored to real-world deployment challenges, specifically those encountered on low-resource devices. Set measurable criteria to ensure that these resources effectively evaluate model performance, latency, and memory utilization under various operational conditions.
  • Analyze computational efficiency and diagnose bottlenecks in the serving pipeline by monitoring both processing and memory metrics. Address issues such as suboptimal batch processing, network delays, and high memory usage to optimize the serving infrastructure for scalability and reliability on resource-constrained systems.
  • Work closely with cross-functional teams to integrate optimized serving and inference frameworks into production pipelines designed for edge and on-device applications. Define clear success metrics, such as improved real-world performance, low error rates, robust scalability, and optimal memory usage, and ensure continuous monitoring and iterative refinement for sustained improvement.

Qualifications:

  • A degree in Computer Science or a related field. Ideally a PhD in NLP, Machine Learning, or a related field, complemented by a solid track record in AI R&D (with good publications in A* conferences).
  • Proven experience in low-level kernel optimizations and inference optimization on mobile devices is essential. Your contributions should have led to measurable improvements in inference latency, throughput, and memory footprint for domain-specific applications, particularly on resource-constrained devices and edge platforms.
  • A deep understanding of modern model serving architectures and inference optimization techniques is required. This includes state-of-the-art methods for achieving low-latency, high-throughput performance, and efficient memory management in diverse, resource-constrained deployment scenarios.
  • Strong expertise in writing CPU and GPU kernels for mobile devices (i.e., smartphones) is a must, as is a deep understanding of model serving frameworks and engines. Practical experience in developing and deploying end-to-end inference pipelines, from optimizing models for efficient serving to integrating these solutions on resource-constrained devices, is required.
  • Demonstrated ability to apply empirical research to overcome challenges in model serving, such as latency optimization, computational bottlenecks, and memory constraints. You should be proficient in designing robust evaluation frameworks and iterating on optimization strategies to continuously push the boundaries of inference performance and system efficiency.

Seniority level: Not Applicable

Employment type: Full-time

Job function: Information Technology

Industries: Technology, Information and Internet

AI Research Engineer (Model Serving & Inference) employer: Tether.io

At Tether, we are at the forefront of the digital finance revolution, offering a dynamic work environment that fosters innovation and collaboration among a global team. Our commitment to transparency and cutting-edge technology not only empowers our employees but also provides ample opportunities for professional growth in the rapidly evolving fintech landscape. Join us in shaping the future while enjoying the flexibility of remote work and the chance to make a significant impact in the world of AI and blockchain.

Contact Details:

Tether.io Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the AI Research Engineer (Model Serving & Inference) role

Tip Number 1

Familiarise yourself with the latest advancements in AI model serving and inference architectures. Being well-versed in current trends and technologies will not only boost your confidence during discussions but also demonstrate your genuine interest in the role.

Tip Number 2

Network with professionals in the fintech and AI sectors. Engaging with industry experts can provide valuable insights and potentially lead to referrals, which can significantly enhance your chances of landing an interview.

Tip Number 3

Prepare to discuss specific projects where you've optimised model serving pipelines or improved inference performance. Having concrete examples ready will showcase your hands-on experience and problem-solving skills, making you a more attractive candidate.

Tip Number 4

Stay updated on Tether's latest innovations and contributions to the fintech space. Understanding their products and mission will allow you to tailor your conversations and show how your skills align with their goals, making you stand out as a candidate.

We think you need these skills to ace the AI Research Engineer (Model Serving & Inference) role

Deep Learning
Model Serving Architectures
Inference Optimization
Low-Level Kernel Optimizations
Mobile Device Development
Performance Metrics Analysis
Resource-Constrained Systems

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights relevant experience in AI model serving and inference. Focus on specific projects or roles where you've optimised model deployment or worked with advanced architectures.

Craft a Compelling Cover Letter: In your cover letter, express your passion for fintech and how your skills align with Tether's mission. Mention any specific technologies or methodologies you’ve used that relate to the job description.

Showcase Your Research Experience: If you have publications or research experience, especially in NLP or Machine Learning, be sure to include this. Highlight any contributions that led to improvements in inference latency or memory usage.

Prepare for Technical Questions: Anticipate technical questions related to model serving architectures and inference optimisation. Be ready to discuss your hands-on experience with CPU and GPU kernels, as well as any challenges you've overcome in previous roles.

How to prepare for a job interview at Tether.io

Showcase Your Technical Expertise

Make sure to highlight your experience with model serving architectures and inference optimisation techniques. Be prepared to discuss specific projects where you've successfully implemented low-latency, high-throughput solutions, especially on resource-constrained devices.

Prepare for Practical Assessments

Tether may include practical assessments in the interview process. Brush up on your skills related to writing CPU and GPU kernels, as well as developing end-to-end inference pipelines. Practising coding challenges can help you feel more confident.

Understand Tether's Mission

Familiarise yourself with Tether's products and their impact on digital finance. Being able to articulate how your role as an AI Research Engineer contributes to their mission will demonstrate your genuine interest in the company and its goals.

Ask Insightful Questions

Prepare thoughtful questions about Tether's approach to AI and model serving. This not only shows your enthusiasm but also helps you gauge if the company's culture and values align with your own. Consider asking about their future projects or challenges they face in AI deployment.