Applied Scientist Intern: Audio Visual Question Answering

Cambridge, Full-Time, No home office possible
Microsoft

At a Glance

  • Tasks: Develop and test innovative algorithms for Audio Visual Question Answering in meeting scenarios.
  • Company: Join Microsoft Teams, a leader in modern work and collaboration.
  • Benefits: Gain hands-on experience, mentorship, and opportunities for publication in top AI conferences.
  • Why this job: Be at the forefront of AI innovation and shape the future of communication.
  • Qualifications: PhD or MSc candidate in relevant fields with coding and analytical skills.
  • Other info: Inclusive culture with strong focus on learning and career growth.

Job Description

Overview

Microsoft Teams is the hub for teamwork that integrates all the people, content, and tools your team needs to be more engaged and effective. It is core to Microsoft’s modern work, modern life and modern education value proposition. We are reinventing the way people communicate and work together across the globe.

We are looking to hire a PhD (or published MSc) candidate for a 12-week internship (ideally from February 2026) to join CMD Labs – an applied science team within Microsoft Teams – to work on the next generation of AI-supported meeting experiences.

The intern will be fully onboarded onto our current science and production code base and be expected to investigate, propose, implement and test new algorithms and approaches in this area, solving problems of direct relevance to the product. The intern will also be expected to present results internally at the end of the internship and write up the work for publication at a leading academic AI conference (e.g. ICML, NeurIPS, ACL, CVPR, Interspeech).

You will partner with research, product and engineering teams to invent and deliver the future for Microsoft Teams, Microsoft Copilot and other AI products.

This role is based in Cambridge (United Kingdom).

Our culture is inclusive and collaborative; our team members come from diverse backgrounds, are respectful to one another and achieve impact by building on each other’s strengths and skills. We focus our energy on AI projects that are likely to have high impact on our products and bring high value to our customers. Our team has a strong sense of bias for action and accountability and provides its members with many opportunities for learning and career growth.

Responsibilities

  • Conduct experiments, create and validate metrics, and develop candidate algorithms for effective Audio Visual Question Answering in meeting-room scenarios.
  • Collaborate closely with CMD Labs researchers and engineers to leverage existing assets and datasets, and to ensure results can be integrated back into the product.
  • Embody Microsoft culture and values.

Qualifications

Required

  • Currently enrolled in a PhD program (or published candidate in MSc program) in Computer Science, Electrical or Computer Engineering, Statistics, or a related field.
  • Practical experience in training or fine-tuning transformer models or LLMs, e.g. using text, audio and/or images.
  • Practical Python coding experience leveraging PyTorch or a similar framework.
  • Excellent analytical, coding, communication, and collaborative skills.

Preferred

  • Field of research and publications directly related to multimodal AI, e.g. computer vision and audio modelling, with an emphasis on live / real-time applications.
  • Experience in model quantization, pruning or distillation.
  • Experience working in the domain of live speech processing and conversational AI.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

Applied Scientist Intern: Audio Visual Question Answering employer: Microsoft

Microsoft is an exceptional employer, offering a vibrant and inclusive work culture that fosters collaboration and innovation. As an intern in Cambridge, you'll have the unique opportunity to work on cutting-edge AI projects within a supportive team, while benefiting from extensive learning and career growth opportunities. With a strong emphasis on accountability and impact, you'll be empowered to contribute meaningfully to the future of Microsoft Teams and other AI products.

Contact Detail:

Microsoft Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Applied Scientist Intern: Audio Visual Question Answering

✨Tip Number 1

Network like a pro! Reach out to current or former Microsoft employees on LinkedIn. Ask them about their experiences and any tips they might have for landing an internship with CMD Labs. Personal connections can make a huge difference!

✨Tip Number 2

Prepare for technical interviews by brushing up on your coding skills and algorithms. Use platforms like LeetCode or HackerRank to practice. We want you to feel confident when discussing your experience with Python and AI models!

✨Tip Number 3

Showcase your projects! If you've worked on relevant AI projects, make sure to highlight them in your conversations. Discuss the challenges you faced and how you overcame them. This will demonstrate your problem-solving skills and passion for the field.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re serious about joining the team at Microsoft Teams and contributing to innovative AI solutions.

We think you need these skills to ace Applied Scientist Intern: Audio Visual Question Answering

PhD in Computer Science or related field
Practical experience in training transformer models
Fine-tuning LLMs
Python coding experience
Experience with PyTorch or similar frameworks
Analytical skills
Communication skills
Collaborative skills
Research experience in multimodal AI
Knowledge of computer vision
Experience in audio modelling
Model quantization
Pruning or distillation techniques
Live speech processing
Conversational AI

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Applied Scientist Intern role. Highlight your relevant experience in AI, coding skills, and any projects that showcase your ability to work with multimodal data. We want to see how you fit into our team!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about AI and how your background aligns with our mission at Microsoft Teams. Be genuine and let your personality come through – we love a good story!

Showcase Your Projects: If you've worked on any interesting projects related to audio-visual question answering or AI, make sure to mention them. Include links to your GitHub or any publications if applicable. We’re keen to see what you’ve been up to!

Apply Through Our Website: Don’t forget to apply through our website! It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy – just follow the prompts and you’ll be all set!

How to prepare for a job interview at Microsoft

✨Know Your Algorithms

Make sure you brush up on the latest algorithms related to Audio Visual Question Answering. Be ready to discuss your experience with training and fine-tuning models, especially transformers and LLMs. This will show that you’re not just familiar with theory but can apply it practically.

✨Showcase Your Coding Skills

Since practical Python coding experience is a must, prepare to demonstrate your skills using PyTorch or similar frameworks. You might be asked to solve a coding problem on the spot, so practice common tasks and be ready to explain your thought process as you code.

✨Collaborative Mindset

Microsoft values collaboration, so be prepared to discuss how you've worked with others in past projects. Share examples of how you’ve partnered with researchers or engineers to achieve results, and highlight your ability to leverage existing datasets and assets.

✨Embody Microsoft’s Culture

Familiarise yourself with Microsoft’s culture and values. During the interview, reflect these values in your answers. Show that you appreciate diversity and inclusivity, and express your enthusiasm for contributing to high-impact AI projects that benefit customers.
