Sr Research Engineer, Computer Vision

London | Full-Time | £48,000 - £72,000 / year (est.) | No home office possible

At a Glance

Tasks: Design and build cutting-edge computer vision systems for real-world applications.
Company: Join Autodesk, a leader in innovative software solutions.
Benefits: Flexible work options, competitive salary, and opportunities for professional growth.
Why this job: Make a real impact with your skills in AI and computer vision.
Qualifications: Bachelor's degree in a relevant field and 4+ years of experience in computer vision.
Other info: Collaborative culture focused on diversity and belonging.

The predicted salary is between £48,000 and £72,000 per year.

Location: Flexible / Hybrid / Remote (team-dependent)

About the Role

We are hiring a Senior Software Engineer focused on Computer Vision and Multimodal AI to build robust perception and understanding systems used across multiple teams and product areas. You will develop end-to-end pipelines that transform images and video into structured, reliable observations by combining modern vision models with multimodal reasoning and contextual signals (for example: domain metadata, documents, and sensor inputs). This role blends applied research with strong software engineering: rapid iteration, rigorous evaluation, and production-minded implementation for cloud-scale batch processing and interactive workflows.

Key Responsibilities

Design, build, and improve multi-stage computer vision pipelines that may include segmentation, detection, tracking, and VLM-based analysis, producing structured outputs (entities, attributes, actions/events, confidence, provenance).
Build systems that handle real-world variability in visual inputs (for example: low resolution, poor lighting, motion blur, cluttered scenes, inconsistent capture devices).
Work with diverse media types such as photos, video, timelapse, 360 video, and RGB-D when available.
Fuse visual evidence with contextual inputs such as metadata, documents, and sensor streams to improve recognition quality and reduce ambiguity.
Evaluate and integrate state-of-the-art vision and vision-language foundation models, including open-vocabulary recognition, grounded perception, segmentation, and multimodal reasoning.
Apply fine-tuning or adaptation approaches when needed; partner with ML teams on training, data strategy, and infrastructure best practices.
Define measurable acceptance criteria and benchmarking for accuracy, robustness, latency/cost, and reliability across datasets and domains.
Build scalable cloud workflows for batch processing and integrate outputs with APIs and downstream consumers.
Improve operational performance and cost via batching, caching, model selection, and pipeline observability.
Write maintainable code, contribute to design docs, code reviews, shared libraries, and cross-team technical decisions.

Minimum Qualifications

Bachelor's degree in Computer Science, Electrical Engineering, Robotics, or a related field (or equivalent practical experience).
4+ years of experience building computer vision systems using Python.
Strong experience with deep learning for computer vision (detection, segmentation, and/or video understanding) using modern frameworks such as PyTorch.
Experience taking ML prototypes into reliable pipelines, including evaluation, monitoring, and failure analysis.
Experience building or integrating ML systems into cloud or backend workflows (batch processing and/or services).
Strong collaboration and communication skills; ability to work across teams and stakeholders.

Preferred Qualifications

Experience with vision-language models (VLMs) and multimodal systems (for example: grounded vision, open-vocabulary recognition, retrieval-augmented multimodal reasoning).
Experience with multimodal fusion (combining imagery/video with metadata, documents, and sensor signals).
Experience with video pipelines (tracking, temporal aggregation, long-video processing).
Experience with real-world datasets, including data curation, labelling strategy, augmentation, and quality control under limited data constraints.
Experience developing reusable platform components adopted across multiple teams.

What Success Looks Like

Delivered an end-to-end system that ingests real-world image/video inputs and outputs a structured, queryable set of observations (objects plus activities/events), with clear accuracy and reliability metrics.
Demonstrated robustness to common visual failure modes (lighting, occlusion, clutter, camera variation) and measurable improvements when contextual signals are available.
Built a modular pipeline architecture (segmentation/detection/VLM reasoning components) that can be reused and extended across domains and teams.
Maintained strong engineering quality: reproducible experiments, documented decisions, maintainable code, and dependable integrations.

Keywords (for candidate matching)

Computer Vision, Deep Learning, PyTorch, Object Detection, Segmentation, Tracking, Video Understanding, Vision-Language Models (VLM), Multimodal AI, Open-Vocabulary, Grounding, Sensor Fusion, Data Curation, Model Evaluation, Benchmarking, Cloud ML Pipelines, Batch Processing, MLOps, Observability.

Sr Research Engineer, Computer Vision in London employer: Merantix

At Autodesk, we are committed to fostering a vibrant work culture that encourages innovation and collaboration, making us an exceptional employer for those in the field of Computer Vision and Multimodal AI. Our flexible work arrangements, combined with opportunities for professional growth and meaningful contributions to transformative projects, empower our employees to thrive while shaping a better world. Join us to be part of a diverse team where your skills will not only be valued but also have a lasting impact.

Contact Details:

Merantix Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Sr Research Engineer, Computer Vision role in London

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with people on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those related to computer vision and multimodal AI. This gives potential employers a taste of what you can do and sets you apart from the crowd.

✨Tip Number 3

Prepare for interviews by brushing up on common technical questions and coding challenges. Practice explaining your thought process clearly, as communication is key when working across teams.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team.

We think you need these skills to ace the Sr Research Engineer, Computer Vision role in London

Computer Vision

Multimodal AI

Deep Learning

Python

PyTorch

Object Detection

Segmentation

Tracking

Video Understanding

Vision-Language Models (VLM)

Multimodal Fusion

Cloud ML Pipelines

Batch Processing

Data Curation

Model Evaluation

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience with computer vision and deep learning. Use keywords from the job description to show that you’re a perfect fit for the role.

Showcase Your Projects: Include specific examples of projects where you've built or improved computer vision systems. This is your chance to shine, so don’t hold back on the details!

Craft a Compelling Cover Letter: Your cover letter should tell us why you're passionate about computer vision and how your skills align with our needs. Keep it engaging and personal – we want to get to know you!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands and shows your enthusiasm for joining our team!

How to prepare for a job interview at Merantix

✨Know Your Tech Inside Out

Make sure you’re well-versed in the latest computer vision technologies and frameworks, especially PyTorch. Brush up on your deep learning concepts, as you'll likely be asked to discuss how you've applied them in real-world scenarios.

✨Showcase Your Projects

Prepare to talk about specific projects where you've built or integrated computer vision systems. Highlight the challenges you faced, how you overcame them, and the measurable outcomes of your work. This will demonstrate your hands-on experience and problem-solving skills.

✨Understand the Role's Requirements

Familiarise yourself with the key responsibilities listed in the job description. Be ready to discuss how your previous experiences align with building multi-stage pipelines and handling real-world variability in visual inputs. Tailor your responses to show you understand what success looks like in this role.

✨Ask Insightful Questions

Prepare thoughtful questions about the team’s current projects, challenges they face, and how they measure success. This shows your genuine interest in the role and helps you gauge if the company culture aligns with your values.
