Senior Research Scientist - Multimodal Agents in London

London Full-Time £36,000 - £60,000 / year (est.) No home office possible

At a Glance

  • Tasks: Drive research in multimodal agents and develop innovative AI systems for design.
  • Company: Join Canva, a leading design platform with a vibrant culture.
  • Benefits: Equity packages, flexible leave, and a wellbeing allowance to support your lifestyle.
  • Why this job: Make a real impact in AI while collaborating with passionate teams.
  • Qualifications: Experience in reinforcement learning and agentic systems is essential.
  • Other info: Dynamic work environment with opportunities for personal and professional growth.

The predicted salary is between £36,000 and £60,000 per year.

Join the team redefining how the world experiences design.

Where and How You Can Work

The buzzing Canva London campus features several buildings around beautiful leafy Hoxton Square in Shoreditch. While our global headquarters is in Sydney, Australia, London is our HQ for Europe, with all kinds of teams based here, plus event spaces to gather our team and communities. You’ll experience a warm welcome from our Vibe team at front of house, amazing home-cooked food from our Head Chef and a variety of workspaces to hang out with your teammates or get solo work done. That said, we trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals, so you have a choice in where and how you work.

At Canva, our mission is to empower the world to design. We’re building AI that feels magical and lands real impact for millions of people – helping anyone create with confidence. We’re looking for a senior research scientist who lives and breathes reinforcement learning and agentic systems to push the frontier of reasoning, tool use, and reliability – and ship it to users.

About The Team

We are a cutting‐edge post‐training team developing new multimodal agentic systems. We explore multimodal agentic architectures and work across multimodal modelling, post‐training, and design agents. We build scalable training and evaluation loops and partner closely with product and platform teams to turn breakthroughs into delightful product features. We are looking for someone with experience in post‐training and reinforcement learning (RL).

About The Role

You’ll drive research directions and play a leading, hands‐on role across the agent stack: reward design and policy optimization, planning, memory, and tool orchestration, dataset construction, and post‐training, including the development of novel post‐training approaches. You’ll design tight experiments, iterate quickly, and land trustworthy conclusions. Most importantly, you’ll help convert research into reliable, safe, and high‐quality product experiences.

What You’ll Be Doing In This Role

  • Develop agent systems (planning, multimodal tool use, retrieval, novel training approaches, modeling ablations) for real tasks in design, vision, and language.
  • Scale post‐training and RL across distributed systems (PyTorch) with efficient data loaders, tracing/telemetry, stable training of mixture‐of‐experts (MoE) architectures, and reproducible pipelines; profile, debug, and optimize.
  • Contribute to the research agenda for RL/agentic systems aligned with Canva’s product goals; identify high‐leverage bets and retire dead ends quickly.
  • Build reward models and learning loops: RLHF/RLAIF, preference modeling, DPO/IPO‐style objectives, offline/online RL, curriculum learning, and credit assignment for multi‐step reasoning.
  • Develop simulation and sandbox tasks that surface failure modes (planning errors, tool‐use brittleness, hallucination, unsafe actions) and turn them into measurable targets.
  • Help align on rigorous evaluation for agents (task success, reliability, latency, safety, regressions). Stand up offline suites and online A/B tests; favor simple, controlled experiments that generalize.
  • Collaborate and ship: work shoulder‐to‐shoulder with product, design, safety, and platform to land research as reliable features—then iterate.
  • Share and elevate: mentor teammates, present findings internally, and contribute back to the community when it helps the field and our users.

You’re Likely a Match If You Have

  • Depth in implementing and post‐training LLMs/VLMs/diffusion models, with a track record of shipped research or publications in agents/RL.
  • Experience modifying and adapting open‐source models.
  • Strong experience with experimental design: tight baselines, clean ablations, reproducibility, and clear, data‐backed conclusions.
  • Fluency in Python and PyTorch; you’re comfortable in large ML codebases and can profile, debug, and optimize training and inference.
  • Practical experience building agent loops (planning, tool invocation, retrieval, memory) and evaluating multi‐step reasoning quality.
  • Hands‐on experience with policy optimization, reward modeling, and preference learning (e.g., RLHF/RLAIF, DPO/IPO, actor‐critic/PPO, offline RL).
  • Experience with large‐scale training (distributed training, experiment tracking, evaluation harnesses) and cloud multimodal tooling.
  • Experience with RL for MoE architectures.

Nice to Have

  • Experience with video and audio modelling.
  • Experience with multi‐agent settings.
  • Strength in alignment and safety evaluations, including red‐teaming and risk mitigation for tool‐using agents.
  • Contributions to open‐source, benchmarks, or shared evaluation suites for agents.

What’s in it for you?

Achieving our crazy big goals motivates us to work hard – and we do – but you’ll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a range of benefits to set you up for every success in and outside of work.

  • Equity packages – we want our success to be yours too.
  • Inclusive parental leave policy that supports all parents & carers.
  • An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more.
  • Flexible leave options that empower you to be a force for good, take time to recharge and support you personally.

Other Stuff To Know

We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.

We celebrate all types of skills and backgrounds at Canva, so even if you don’t feel like your skills quite match what’s listed above, we still want to hear from you!

Please note that interviews are conducted virtually.

Employer: Canva

Canva is an exceptional employer that fosters a vibrant and inclusive work culture at its London campus, where creativity thrives amidst beautiful surroundings. Employees enjoy a range of benefits including equity packages, flexible leave options, and a supportive parental leave policy, all designed to promote personal well-being and professional growth. With opportunities to collaborate on cutting-edge AI research and a commitment to employee development, Canva empowers its team to achieve meaningful impact while enjoying a fulfilling work-life balance.

Contact Detail:

Canva Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Research Scientist - Multimodal Agents role in London

✨Tip Number 1

Network like a pro! Reach out to current or former employees at Canva on LinkedIn. A friendly chat can give you insider info and maybe even a referral, which can really boost your chances.

✨Tip Number 2

Prepare for the interview by diving deep into Canva's mission and values. Show how your experience in reinforcement learning and agentic systems aligns with what Canva does. Tailor your examples to highlight your relevant skills!

✨Tip Number 3

Practice makes perfect! Set up mock interviews with friends or use online platforms. Focus on articulating your thought process when tackling complex problems, especially around multimodal agents and RL.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in joining our team at Canva.

We think you need these skills to ace the Senior Research Scientist - Multimodal Agents role in London

  • Reinforcement Learning (RL)
  • Agentic Systems
  • Multimodal Modelling
  • Post-Training Techniques
  • Experimental Design
  • Python
  • PyTorch
  • Policy Optimization
  • Reward Modelling
  • Preference Learning
  • Distributed Systems
  • Data Analysis
  • Collaboration Skills
  • Mentoring

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the Senior Research Scientist role. Highlight your experience with reinforcement learning and agentic systems, as this is what we’re really looking for!

Showcase Your Projects: Include specific examples of your past work that relate to multimodal modelling and post-training. We love seeing how you’ve tackled challenges and turned research into real-world applications.

Be Clear and Concise: When writing your application, keep it straightforward. Use clear language to explain your skills and experiences, so we can easily see how you fit into our team at Canva.

Apply Through Our Website: Don’t forget to submit your application through our official website! It’s the best way for us to receive your details and get you in the loop for this exciting opportunity.

How to prepare for a job interview at Canva

✨Know Your Stuff

Make sure you brush up on reinforcement learning and agentic systems. Familiarise yourself with the latest research and developments in these areas, as well as how they relate to Canva's mission. Being able to discuss your past experiences and how they align with the role will show that you're genuinely interested.

✨Prepare for Technical Questions

Expect to dive deep into technical discussions about post-training, policy optimisation, and reward modelling. Practise explaining complex concepts clearly and concisely, as you'll need to demonstrate your expertise in Python and PyTorch. Consider doing mock interviews with peers to sharpen your responses.

✨Show Your Collaborative Spirit

Canva values teamwork, so be ready to share examples of how you've successfully collaborated with product, design, and safety teams in the past. Highlight any experiences where you turned research into practical applications, as this will resonate well with their focus on delivering reliable features.

✨Ask Insightful Questions

Prepare thoughtful questions that show your interest in Canva's projects and culture. Inquire about their approach to scaling multimodal agentic systems or how they evaluate the success of their agents. This not only demonstrates your enthusiasm but also helps you gauge if the company is the right fit for you.
