At a Glance
- Tasks: Drive research in multimodal agent systems and develop innovative AI solutions.
- Company: Join Canva, a leading design platform with a vibrant culture.
- Benefits: Equity packages, flexible leave, and a supportive parental leave policy.
- Why this job: Make a real impact by shaping the future of AI in design.
- Qualifications: Experience in reinforcement learning and agentic systems is essential.
- Other info: Collaborative environment with opportunities for personal and professional growth.
The predicted salary is between £48,000 and £72,000 per year.
Company Description
Join the team redefining how the world experiences design. We know job hunting can be a little time-consuming and you’re probably keen to find out what’s on offer, so we’ll get straight to the point.
Where and how you can work
The buzzing Canva London campus features several buildings around beautiful leafy Hoxton Square in Shoreditch. While our global headquarters is in Sydney, Australia, London is our HQ for Europe, with all kinds of teams based here, plus event spaces to gather our team and communities. You’ll experience a warm welcome from our Vibe team at front of house, amazing home-cooked food from our Head Chef and a variety of workspaces to hang out with your teammates or get solo work done. That said, we trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals, so you have a choice in where and how you work.
Job Description
At Canva, our mission is to empower the world to design. We’re building AI that feels magical and lands real impact for millions of people – helping anyone create with confidence. We’re looking for a senior research scientist who lives and breathes reinforcement learning and agentic systems to push the frontier of reasoning, tool use, and reliability – and ship it to users.
About the team
We are a cutting-edge post-training team developing new multimodal agentic systems. We work across multimodal modelling, post-training, and design agents: we explore multimodal agentic architectures, build scalable training and evaluation loops, and partner closely with product and platform teams to turn breakthroughs into delightful product features. We are looking for a person with experience in post-training and reinforcement learning (RL) to join our team.
About the role
You’ll drive research directions and play a leading role in hands‑on work across the agent stack: reward design and policy optimization, planning, memory, and tool orchestration, dataset construction, and post-training, including the development of novel post-training approaches. You’ll design tight experiments, iterate quickly, and land trustworthy conclusions. Most importantly, you’ll help convert research into reliable, safe, and high‑quality product experiences.
What you’ll be doing in this role
- Develop agent systems (planning, multimodal tool use, retrieval, novel training approaches, modeling ablations) for real tasks in design, vision, and language.
- Scale post-training and RL across distributed systems (PyTorch) with efficient data loaders, tracing/telemetry, stable training of mixture-of-experts (MoE) architectures, and reproducible pipelines; profile, debug, and optimize.
- Contribute to the research agenda for RL/agentic systems aligned with Canva’s product goals; identify high‑leverage bets and retire dead ends quickly.
- Build reward models and learning loops: RLHF/RLAIF, preference modeling, DPO/IPO‑style objectives, offline/online RL, curriculum learning, and credit assignment for multi‑step reasoning.
- Develop simulation and sandbox tasks that surface failure modes (planning errors, tool‑use brittleness, hallucination, unsafe actions) and turn them into measurable targets.
- Help align on rigorous evaluation for agents (task success, reliability, latency, safety, regressions).
- Stand up offline suites and online A/B tests; favor simple, controlled experiments that generalize.
- Collaborate and ship: work shoulder‑to‑shoulder with product, design, safety, and platform to land research as reliable features—then iterate.
- Share and elevate: mentor teammates, present findings internally, and contribute back to the community when it helps the field and our users.
You’re likely a match if you have:
- Depth in implementing and post-training LLMs/VLMs/Diffusion models, with a track record of shipped research or publications in agents/RL.
- Experience modifying and adapting open-source models.
- Strong experience with experimental design: tight baselines, clean ablations, reproducibility, and clear, data‑backed conclusions.
- Fluency in Python and PyTorch; you’re comfortable in large ML codebases and can profile, debug, and optimize training and inference.
- Practical experience building agent loops (planning, tool invocation, retrieval, memory) and evaluating multi‑step reasoning quality.
- Hands‑on experience with policy optimization, reward modeling, and preference learning (e.g., RLHF/RLAIF, DPO/IPO, actor‑critic/PPO, offline RL).
- Experience with large‑scale training (distributed training, experiment tracking, evaluation harnesses) and cloud multimodal tooling.
- Experience with RL for MoE architectures.
Nice to have:
- Experience with video and audio modelling.
- Experience with multi‑agent settings.
- Strength in alignment and safety evaluations, including red‑teaming and risk mitigation for tool‑using agents.
- Contributions to open‑source, benchmarks, or shared evaluation suites for agents.
Additional Information
What’s in it for you? Achieving our crazy big goals motivates us to work hard – and we do – but you’ll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a range of benefits to set you up for every success in and outside of work. Here’s a taste of what’s on offer:
- Equity packages – we want our success to be yours too.
- Inclusive parental leave policy that supports all parents & carers.
- An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more.
- Flexible leave options that empower you to be a force for good, take time to recharge, and support you personally.
Check out lifeatcanva.com for more info.
Other stuff to know
We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process. We celebrate all types of skills and backgrounds at Canva so even if you don’t feel like your skills quite match what’s listed above – we still want to hear from you! Please note that interviews are conducted virtually.
Senior Research Scientist - Multimodal Agents employer: Canva
Contact Detail:
Canva Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Senior Research Scientist - Multimodal Agents role
✨Tip Number 1
Get your networking game on! Reach out to current or former employees at Canva through LinkedIn. A friendly chat can give you insider info and maybe even a referral, which can really boost your chances.
✨Tip Number 2
Prepare for the interview like it’s a big project. Research Canva’s latest innovations in AI and design. Show us you’re not just passionate about the role but also about what we do as a company!
✨Tip Number 3
Practice makes perfect! Run through common interview questions with a mate or in front of the mirror. The more comfortable you are talking about your experience and skills, the better you'll come across.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re serious about joining the Canva family.
Some tips for your application 🫡
Be Yourself: When you're writing your application, let your personality shine through! We want to get to know the real you, so don’t be afraid to show your passion for research and design.
Tailor Your Application: Make sure to customise your application to highlight your experience with reinforcement learning and agentic systems. Show us how your skills align with what we're looking for in a Senior Research Scientist!
Showcase Your Achievements: Don’t just list your past roles; share specific projects or research that demonstrate your expertise. We love seeing concrete examples of how you've made an impact in your previous positions.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands and helps us keep track of all the amazing talent out there!
How to prepare for a job interview at Canva
✨Know Your Stuff
Make sure you brush up on reinforcement learning and agentic systems. Familiarise yourself with the latest research and developments in these areas, as well as how they relate to Canva's mission. Being able to discuss specific projects or papers will show your passion and expertise.
✨Showcase Your Experience
Prepare to talk about your hands-on experience with post-training and RL. Have examples ready that demonstrate your ability to design experiments, build agent systems, and optimise models. This is your chance to shine, so make it count!
✨Collaborate and Communicate
Canva values teamwork, so be ready to discuss how you've collaborated with cross-functional teams in the past. Highlight any experiences where you worked closely with product, design, or safety teams to turn research into practical applications.
✨Ask Insightful Questions
Prepare thoughtful questions about Canva's approach to multimodal agentic architectures and their future goals. This not only shows your interest in the role but also helps you gauge if the company culture aligns with your values.