Research Engineer - Contextual Bandits & RL

Full-Time · £36,000 - £60,000 / year (est.) · No home office possible

At a Glance

  • Tasks: Develop decision-making models for hyper-personalisation in retail using advanced machine learning techniques.
  • Company: VC-backed startup focused on transforming retail through hyper-personalised experiences.
  • Benefits: Competitive pay, equity options, and a dynamic work environment with real impact.
  • Other info: Collaborative culture with opportunities to tackle cutting-edge ML challenges.
  • Why this job: Shape the future of retail and make a difference from day one.
  • Qualifications: 3-5+ years in machine learning, strong Python skills, and experience with contextual bandits or RL.

The predicted salary is between £36,000 and £60,000 per year.

We are a VC-backed startup focused on hyper-personalisation, currently in stealth. Inspired by the latest in recommender systems, we leverage transformers and graph learning alongside decision‑making models to build the most engaging customer experiences for in‑store retail. Our mission is to change retail forever through hyper‑personalised experiences that are both simple and beautiful.

About the Role - Offline Contextual Bandits and RL for Hyper-personalisation

We are looking for a Research Engineer to build decision‑making models for in‑store hyper‑personalisation, with an initial focus on learning from logged human interaction data in an offline setting. You will work closely with domain experts and engineers to develop contextual bandit and reinforcement learning approaches that can support both single‑step decisions and multi‑step customer journeys, with the potential to enable online learning over time.

Key Responsibilities

  • Develop and productionise offline contextual bandit and offline RL methods that learn from logged interaction data.
  • Build rigorous off‑policy evaluation (OPE) and counterfactual validation to measure candidate policies offline and compare approaches reliably.
  • Formulate and model both single‑step decisions (contextual bandits) and multi‑step decision processes (sequential / RL-style settings) based on real retail interactions.
  • Advance representation learning for decision‑making, including using transformers and GNNs where appropriate for behavioural, relational, and sequential data.
  • Translate research ideas into robust systems: dataset design, modelling, evaluation, deployment, monitoring, and iteration.
  • Collaborate cross‑functionally to turn ambiguous product goals into concrete ML objectives, experiments, and deliverables.

Essential Qualifications

  • 3 to 5+ years applying machine learning research in production settings.
  • MSc in Computer Science, Machine Learning, or a closely related field (or equivalent experience).
  • Strong foundations in machine learning and deep learning, including experience with at least one of: contextual bandits, reinforcement learning, counterfactual learning, ranking, or recommender systems.
  • Excellent Python skills and experience developing and debugging production‑level code.
  • Ability to reason about evaluation methodology and failure modes when learning from logged interaction data.

Desired Skills (Bonus Points)

  • Demonstrated experience with offline policy learning and evaluation methods (for example, IPS-style estimators and doubly‑robust approaches, plus uncertainty estimation).
  • Familiarity with bandit algorithms and exploration strategies, with interest in enabling online learning when the product is ready.
  • Experience with recommenders and ranking (candidate generation, reranking, slates).
  • Experience building data pipelines and improving data quality in modern ML environments.
  • PhD in a relevant field.

What We Offer

  • Opportunity to build technology that will transform millions of shopping experiences.
  • Real ownership and impact in shaping product and company direction.
  • A dynamic, collaborative work environment with cutting‑edge ML challenges.
  • Competitive compensation and equity in a rapidly growing company.

If you’re excited by the idea of shaping the future of retail and eager to make a real impact from day one, we’d love to hear from you.

Research Engineer - Contextual Bandits & RL employer: algo1

Join a pioneering VC-backed startup at the forefront of hyper-personalisation in retail, where you will have the opportunity to develop cutting-edge decision-making models that transform customer experiences. Our dynamic and collaborative work culture fosters innovation and real ownership, allowing you to make a significant impact from day one while enjoying competitive compensation and equity in a rapidly growing company.

Contact Details:

algo1 Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Research Engineer - Contextual Bandits & RL role

✨Tip Number 1

Network like a pro! Reach out to people in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your projects related to contextual bandits and reinforcement learning. This will give you an edge and demonstrate your hands-on experience to potential employers.

✨Tip Number 3

Prepare for interviews by brushing up on your technical knowledge and problem-solving skills. Practice explaining your past projects and how they relate to hyper-personalisation and decision-making models.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our mission to transform retail.

We think you need these skills to ace Research Engineer - Contextual Bandits & RL

Machine Learning
Deep Learning
Contextual Bandits
Reinforcement Learning
Counterfactual Learning
Ranking
Recommender Systems
Python Programming
Off-Policy Evaluation
Data Pipeline Development
Evaluation Methodology
Transformers
Graph Neural Networks (GNNs)
Data Quality Improvement
Cross-Functional Collaboration

Some tips for your application 🫡

Show Your Passion for Hyper-Personalisation: When writing your application, let us see your enthusiasm for hyper-personalisation and how it can transform retail. Share any relevant experiences or projects that highlight your interest in this area.

Tailor Your CV and Cover Letter: Make sure to customise your CV and cover letter to reflect the specific skills and experiences mentioned in the job description. We want to see how your background aligns with our mission and the role of Research Engineer.

Highlight Your Technical Skills: Don’t forget to showcase your technical expertise, especially in machine learning, contextual bandits, and reinforcement learning. Be specific about your experience with Python and any relevant projects you've worked on.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for this exciting opportunity to shape the future of retail!

How to prepare for a job interview at algo1

✨Know Your Stuff

Make sure you brush up on your machine learning fundamentals, especially contextual bandits and reinforcement learning. Be ready to discuss your past projects and how they relate to the role. This shows you’re not just familiar with the theory but can also apply it in real-world scenarios.

✨Showcase Your Problem-Solving Skills

Prepare to tackle some practical problems during the interview. Think about how you would approach building decision-making models or evaluating policies. Being able to articulate your thought process will impress the interviewers and demonstrate your analytical skills.

✨Collaborative Spirit

Since the role involves working closely with domain experts and engineers, highlight your teamwork experiences. Share examples of how you’ve successfully collaborated on projects, turning ambiguous goals into concrete objectives. This will show that you can thrive in a cross-functional environment.

✨Ask Insightful Questions

Prepare thoughtful questions about the company’s vision for hyper-personalisation and how they plan to implement their technology. This not only shows your interest in the role but also gives you a chance to assess if the company aligns with your career goals.
