Sr Research Engineer, Computer Vision

Full-Time, £48,000 - £84,000 / year (est.), no home office possible

At a Glance

  • Tasks: Design and build cutting-edge computer vision systems for real-world applications.
  • Company: Join Autodesk, a leader in innovative software solutions.
  • Benefits: Flexible work options, competitive salary, and opportunities for professional growth.
  • Why this job: Make a real impact by transforming how we understand visual data.
  • Qualifications: Bachelor's degree in a relevant field and 4+ years of experience in computer vision.
  • Other info: Collaborative culture focused on diversity and belonging.

The predicted salary is between £48,000 and £84,000 per year.

We are hiring a Senior Software Engineer focused on Computer Vision and Multimodal AI to build robust perception and understanding systems used across multiple teams and product areas. You will develop end-to-end pipelines that transform images and video into structured, reliable observations by combining modern vision models with multimodal reasoning and contextual signals (for example: domain metadata, documents, and sensor inputs). This role blends applied research with strong software engineering: rapid iteration, rigorous evaluation, and production-minded implementation for cloud-scale batch processing and interactive workflows.

Key Responsibilities

  • Design, build, and improve multi-stage computer vision pipelines that may include segmentation, detection, tracking, and VLM-based analysis, producing structured outputs (entities, attributes, actions/events, confidence, provenance).
  • Build systems that handle real-world variability in visual inputs (for example: low resolution, poor lighting, motion blur, cluttered scenes, inconsistent capture devices).
  • Work with diverse media types such as photos, video, timelapse, 360 video, and RGB-D when available.
  • Fuse visual evidence with contextual inputs such as metadata, documents, and sensor streams to improve recognition quality and reduce ambiguity.
  • Evaluate and integrate state-of-the-art vision and vision-language foundation models, including open-vocabulary recognition, grounded perception, segmentation, and multimodal reasoning.
  • Apply fine-tuning or adaptation approaches when needed; partner with ML teams on training, data strategy, and infrastructure best practices.
  • Define measurable acceptance criteria and benchmarking for accuracy, robustness, latency/cost, and reliability across datasets and domains.
  • Build scalable cloud workflows for batch processing and integrate outputs with APIs and downstream consumers.
  • Improve operational performance and cost via batching, caching, model selection, and pipeline observability.
  • Write maintainable code and contribute to design docs, code reviews, shared libraries, and cross-team technical decisions.

Minimum Qualifications

  • Bachelor’s degree in Computer Science, Electrical Engineering, Robotics, or a related field (or equivalent practical experience).
  • 4+ years of experience building computer vision systems using Python.
  • Strong experience with deep learning for computer vision (detection, segmentation, and/or video understanding) using modern frameworks such as PyTorch.
  • Experience taking ML prototypes into reliable pipelines, including evaluation, monitoring, and failure analysis.
  • Experience building or integrating ML systems into cloud or backend workflows (batch processing and/or services).
  • Strong collaboration and communication skills; ability to work across teams and stakeholders.

Preferred Qualifications

  • Experience with vision-language models (VLMs) and multimodal systems (for example: grounded vision, open-vocabulary recognition, retrieval-augmented multimodal reasoning).
  • Experience with multimodal fusion (combining imagery/video with metadata, documents, and sensor signals).
  • Experience with video pipelines (tracking, temporal aggregation, long-video processing).
  • Experience with real-world datasets, including data curation, labelling strategy, augmentation, and quality control under limited data constraints.
  • Experience developing reusable platform components adopted across multiple teams.

What Success Looks Like

  • Delivered an end-to-end system that ingests real-world image/video inputs and outputs a structured, queryable set of observations (objects plus activities/events), with clear accuracy and reliability metrics.
  • Demonstrated robustness to common visual failure modes (lighting, occlusion, clutter, camera variation) and measurable improvements when contextual signals are available.
  • Built a modular pipeline architecture (segmentation/detection/VLM reasoning components) that can be reused and extended across domains and teams.
  • Maintained strong engineering quality: reproducible experiments, documented decisions, maintainable code, and dependable integrations.

Keywords (for candidate matching)

  • Computer Vision
  • Deep Learning
  • PyTorch
  • Object Detection
  • Segmentation
  • Tracking
  • Video Understanding
  • Vision-Language Models (VLM)
  • Multimodal AI
  • Open-Vocabulary
  • Grounding
  • Sensor Fusion
  • Data Curation
  • Model Evaluation
  • Benchmarking
  • Cloud ML Pipelines
  • Batch Processing
  • MLOps
  • Observability

Sr Research Engineer, Computer Vision employer: Merantix

At Autodesk, we are committed to fostering a dynamic and inclusive work environment where innovation thrives. As a Senior Research Engineer in Computer Vision, you will have the opportunity to engage in cutting-edge projects that not only challenge your skills but also contribute to meaningful advancements in technology. Our flexible work culture supports a healthy work-life balance, while our emphasis on professional development ensures that you can grow your career alongside a team of passionate experts dedicated to making a positive impact on the world.

Contact Details:

Merantix Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Sr Research Engineer, Computer Vision role

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with people on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those related to computer vision and multimodal AI. This gives potential employers a taste of what you can do and sets you apart from the crowd.

✨Tip Number 3

Prepare for interviews by brushing up on common technical questions and coding challenges. Practice explaining your thought process clearly, as communication is key when working across teams.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team at StudySmarter.

We think you need these skills to ace the Sr Research Engineer, Computer Vision role

  • Computer Vision
  • Multimodal AI
  • Deep Learning
  • Python
  • PyTorch
  • Object Detection
  • Segmentation
  • Tracking
  • Video Understanding
  • Vision-Language Models (VLM)
  • Cloud ML Pipelines
  • Batch Processing
  • Data Curation
  • Model Evaluation
  • Benchmarking

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the role. Highlight your experience with computer vision and multimodal AI, and show us how your skills align with what we're looking for.

Showcase Your Projects: Include specific examples of projects you've worked on that relate to the job. We want to see your hands-on experience with building computer vision systems and any cool stuff you've done with deep learning.

Be Clear and Concise: When writing your application, keep it straightforward. Use clear language and avoid jargon where possible. We appreciate a well-structured application that gets straight to the point!

Apply Through Our Website: Don't forget to submit your application through our official website! It’s the best way for us to receive your details and ensures you’re considered for the role.

How to prepare for a job interview at Merantix

✨Know Your Tech Inside Out

Make sure you’re well-versed in the latest computer vision technologies and frameworks, especially PyTorch. Brush up on your deep learning concepts, as you'll likely be asked to discuss how you've applied them in real-world scenarios.

✨Showcase Your Problem-Solving Skills

Prepare to discuss specific challenges you've faced in building computer vision systems. Be ready to explain how you tackled issues like low-resolution images or poor lighting, and highlight any innovative solutions you implemented.

✨Demonstrate Collaboration

Since this role involves working across teams, think of examples where you successfully collaborated with others. Be prepared to share how you communicated technical concepts to non-technical stakeholders and how you integrated feedback into your projects.

✨Prepare for Technical Questions

Expect to dive deep into your experience with multimodal AI and vision-language models. Review key concepts and be ready to discuss how you would approach building scalable cloud workflows or integrating outputs with APIs.
