Overview
Summary: Are you an expert AI scientist and engineer who loves to build large-scale systems that use AI to enable deep personalisation and multimodal reasoning? We are looking for a distinguished AI engineer who has worked with FAANG companies and has 20+ years of experience in building large-scale search and recommendation systems. The role offers a unique opportunity at the intersection of applied AI research, large-scale systems, and AI-native product development.
This is a fully remote role, five days a week.
You Will Thrive in This Role If
- You’re passionate about advancing the frontier of reinforcement learning and language model research.
- You’re a proactive self-starter who takes full ownership of ideas and drives them from concept to completion.
- You prioritize principled methods, carefully designed experiments, and results that are reliable and reproducible over time.
- You thrive in fast-moving, technically challenging environments where rapid iteration and adaptability are essential.
- You’re confident navigating and improving large-scale machine learning codebases.
- You have a strong, foundational understanding of machine learning theory and its practical applications.
Key Responsibilities
- You will be responsible for building a generative AI-based multimodal product to enable voice conversational agents across industries.
- You will generate differentiated value through applied research into agentic reasoning, delivering fine-tuned multimodal models, AI alignment to reduce hallucinations, and large-scale reasoning infrastructure to reduce cost.
- You will also be responsible for building reasoning infrastructure for multimodal small language models and large action models.
- You will also be responsible for building an AI-driven UI that personalizes based on the user's intent and preferences to minimize UI-based friction.
- You will also ensure that the reasoning inference engine is secured against prompt poisoning.
- You will be required to build a large-scale inferencing setup, cloud or on-prem, leveraging vLLM and efficient post-training pruning and quantization techniques.
- You will be responsible for mentoring junior team members.
Required Skills
- 20+ years of experience building search, monetisation, and recommendation systems, with an in-depth understanding of their design and hands-on experience building them, preferably at Meta, Microsoft, Google, or Amazon.
- Proven track record as a founder, early technical leader, or principal engineer on an AI-first product with a base of 500+ million daily active users.
- In-depth understanding of deep learning and model architectures:
  - LLMs & Multimodal Models: GPT, LLaMA, Mistral, T5, CLIP, SAM, Stable Diffusion
  - Pretraining & Fine-tuning: MLM, Causal LM, Contrastive Learning, LoRA, QLoRA
  - Agentic AI & Autonomy: LangChain, LangGraph, SemanticKernel, AutoGPT, CrewAI, OpenAI Function Calling
- Hands-on experience in LLM training, optimization, and scaling:
  - Distributed Training: DeepSpeed (ZeRO), FSDP, Megatron-LM, JAX Multi-host
  - Memory & Compute Optimization: Gradient Checkpointing, Activation Offloading
  - Hardware Acceleration: A100/H100, TPU v4, Habana Gaudi2, Slurm, Kubernetes
- Hands-on experience in inference acceleration and model compression:
  - Deployment & Optimization: OpenVINO, TensorRT, ONNX Runtime, vLLM
  - Compression: GPTQ, AWQ, Pruning, Knowledge Distillation
  - Low-latency Retrieval: FAISS, Annoy, Milvus for RAG & autonomous agents
- Deep expertise in Generative AI, including diffusion models for image and video generation, vision-language models, and multimodal semantic understanding models such as multimodal transformers
- Experience with fine-tuning techniques (such as LoRAs and adapters), developing evaluation frameworks (including human-in-the-loop quality assessments), and working with large language models (LLMs) for tasks like tool integration and summarization
- Experience with multimodal machine learning (vision, language, audio)
- Data engineering, evaluation, and alignment:
  - Dataset Engineering: Web-scale data curation (The Pile, LAION), Tokenization (SentencePiece, BPE)
  - RLHF & Safety: TRL, PPO, Detoxification, Adversarial Red Teaming
  - Benchmarking: HELM, MMLU, Perplexity, Calibration Metrics
- Strong applied research track record, as evidenced by publications in tier-1 conferences on web mining, AI, ML, or deep learning.
- Past speaking experience at influential conferences and general recognition as an AI expert in the field.
- 10+ years’ experience in designing complex enterprise-grade and large-scale architectures.
- Expertise in Go, Java, and Rust.
- Experience with Cloudflare and AWS.
- Experience with Dialogflow CX, Docker, or CI/CD pipelines
- Effective cross-functional collaborator with creative and technical teams
- Ability to influence and mentor without direct authority
- Commitment to quality, ethics, and brand safety in AI solutions
- Data-driven, with experience defining and tracking success metrics
- Passion for continuous learning and thought leadership in AI
- Able to distill complex AI concepts for technical and executive audiences alike
Bonus Skills
- Experience in training foundation models.
- Experience with React and Redux.
Contact Detail:
Cubedigico UK Ltd Recruiting Team