At a Glance
- Tasks: Build and operate data pipelines, ensuring quality and traceability across systems.
- Company: Join Cape.io, a leader in innovative advertising technology.
- Benefits: Competitive salary, learning opportunities, and a collaborative culture.
- Other info: Work in a diverse team with a clear path for career progression.
- Why this job: Shape the future of data in a dynamic environment with real impact.
- Qualifications: 3-5 years in data engineering, strong Python and SQL skills required.
The predicted salary is between £60,000 and £80,000 per year.
What you’ll do:
- Build and operate the data pipelines that pull data from Cape.io’s departmental databases, product platforms, and third-party systems (NetSuite, HubSpot, Zendesk) into a unified platform.
- Establish the data quality and deduplication patterns that make the platform trustworthy: entity resolution across overlapping source systems, and clear provenance for every dataset.
- Set up lineage and cataloguing so that any dataset consumed downstream is traceable and documented.
- Partner with our AI/ML engineers to shape how data is structured, versioned, and served to AI agents and models.
- Partner with product engineers to model and integrate data from Cape.io’s product platforms.
- Use AI-assisted tooling (LLM code generation, schema mapping, validation) as part of how you work. We expect this, not just permit it.
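To make the deduplication responsibility above concrete, here is a minimal, hypothetical sketch of entity resolution across overlapping source systems. The record shapes, field names, and the blocking-key rule are illustrative assumptions, not Cape.io's actual schema or matching logic:

```python
# Hypothetical sketch: naive entity resolution across two source systems.
# Field names ("name", "source") and the suffix list are illustrative only.

def normalise(name: str) -> str:
    """Crude blocking key: lowercase, strip punctuation and legal suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch == " ")
    tokens = [t for t in cleaned.split() if t not in {"ltd", "inc", "bv", "gmbh"}]
    return " ".join(tokens)

def resolve(records: list[dict]) -> dict[str, dict]:
    """Merge records sharing a blocking key, keeping provenance per source."""
    entities: dict[str, dict] = {}
    for rec in records:
        key = normalise(rec["name"])
        entity = entities.setdefault(key, {"name": rec["name"], "sources": []})
        entity["sources"].append(rec["source"])
    return entities

# The same company appearing in a CRM and an ERP collapses to one entity,
# with provenance showing which systems contributed it.
crm = {"name": "Acme Ltd", "source": "hubspot"}
erp = {"name": "ACME LTD.", "source": "netsuite"}
merged = resolve([crm, erp])
```

In practice this key-based blocking would be one stage of a larger pipeline, with fuzzier matching and survivorship rules on top, but it shows the shape of the problem.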
What you’ll bring:
- 3–5 years building production data pipelines. They don't need to have been at massive scale, but they do need to be systems that real people depended on.
- Hands-on experience across the full modern data stack: orchestration (Airflow, Dagster, or Prefect), transformation (dbt), and warehousing (Snowflake, BigQuery, or Databricks). You don’t need to be an expert in all three, but you need to have worked with each.
- Solid Python and SQL. You write code other people will need to read.
- Experience pulling data from SaaS and business systems (NetSuite, HubSpot, Zendesk, Salesforce, or equivalent) via APIs, connectors, or event streams.
- Hands-on experience with AWS or GCP. We use both.
- A demonstrable habit of using AI tools (Cursor, Claude, Copilot, or similar) in your day-to-day engineering work.
- A visible learning footprint we can look at: GitHub, a blog, a talk, or projects you can point us to.
- Eligible to work in the UK or the Netherlands. We do not sponsor visas.
- Strong written and verbal communication. You’ll be holding the central data layer together, which means explaining it to people who don’t think in pipelines.
- (bonus) Media, advertising, or adtech experience, especially the fragmented data typical of media operations across markets.
- (bonus) MDM, data lineage, or cataloguing tooling (OpenLineage, DataHub, Atlan, or similar) used in a serious role.
- (bonus) Understanding of how AI systems consume data: vector databases, embeddings, feature stores, and RAG patterns.
- (bonus) Real-time or event-driven architectures (Kafka, Pub/Sub, streaming pipelines).
- (bonus) Familiarity with media-specific systems (traffic, clearance and compliance, creative asset management, campaign management).
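The "pulling data from SaaS and business systems" requirement usually comes down to cursor-based pagination, the pattern most SaaS APIs (HubSpot, Zendesk, and similar) follow. A minimal sketch, with the `fetch_page` callable and response shape as illustrative assumptions rather than any vendor's real API:

```python
# Hypothetical sketch: drain a cursor-paginated API into a stream of records.
# The "results"/"next_cursor" response shape is an assumption for illustration.
from typing import Callable, Iterator, Optional

def extract_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Follow 'next_cursor' until the API signals there are no more pages."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["results"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break

# Fake two-page API standing in for a real HTTP client.
pages = {None: {"results": [{"id": 1}, {"id": 2}], "next_cursor": "p2"},
         "p2": {"results": [{"id": 3}], "next_cursor": None}}
rows = list(extract_all(lambda c: pages[c]))
```

Keeping the pagination loop separate from the HTTP client, as here, makes the extraction logic testable without hitting a live API.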
About the data platform:
The challenge: Cape.io operates in 100+ countries. Our data lives across departmental databases, product platforms (creative automation, compliance-and-clearance, distribution), and third-party systems like NetSuite, HubSpot, and Zendesk. Right now, nobody owns unifying it.
Our solution: We’re building the first proper data platform at Cape.io from the ground up — one that ingests data from dozens of sources, deduplicates and governs it, keeps a clear provenance chain, and makes the result accessible to the AI agents, analytics workflows, and downstream teams that depend on it.
Your impact: You’ll be the only person on the central data layer, partnering with our AI/ML engineers on how the data is consumed and with product engineers on the product-side data. This is an individual contributor role today. If the team grows, it has a credible path into a lead position. The media and advertising industry is characterised by exceptionally noisy, fragmented, and inconsistent data across systems and markets — experience with that complexity is a bonus.
Our DNA: how we work
- Move smart: Hesitation kills momentum. We combine speed with intelligence, turning complex challenges into automated solutions. Every experiment teaches us, every failure makes us stronger.
- Define tomorrow: We’re not here to meet expectations; we’re here to reset them. Our mission demands unprecedented quality, security, and scale. We push the boundaries of what’s possible.
- Be better together: Success multiplies when shared. We step up, dig in, and make things better for each other. Through open collaboration and fierce support, we turn individual potential into collective achievement.
- Change what’s possible: Creativity drives humanity forward. Our technology exists to amplify it, protect it, and let it soar. We remove barriers between great minds and great work.
What you’ll get:
- Ownership of the first proper data platform at Cape.io from day one.
- Direct collaboration with the Head of AI Operations and Group CTO on how data fuels what we’re building in AI.
- AI-assisted tooling as the default in your workflow, with budget to match.
- Competitive salary and benefits package.
- A culture of learning and development with a clear path for progression.
- A global, diverse, and collaborative team.
- The opportunity to work on a market-leading product that is defining the future of advertising.
Please note this role can be based in London, Amsterdam, or Tilburg, with two days per week in your chosen office. You must be eligible to work in the UK or the Netherlands, as we cannot sponsor visas.
Cape.io is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We believe that the best teams are made up of people with different backgrounds, perspectives, and skills. We encourage you to apply even if you don’t check every single box!
Employer: Cape.io (formerly Peach)
Contact: Cape.io (formerly Peach) Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Data Engineer role in London
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your data pipelines, projects, or any cool stuff you've built. This is your chance to demonstrate your hands-on experience and make a lasting impression.
✨Tip Number 3
Prepare for interviews by brushing up on your Python and SQL skills. Be ready to discuss how you've tackled real-world data challenges and how you can contribute to building Cape.io's data platform.
✨Tip Number 4
Apply through our website! It’s the best way to ensure your application gets seen. Plus, we love seeing candidates who take the initiative to engage directly with us.
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience with data pipelines and the tools mentioned in the job description. We want to see how your skills align with what we’re looking for!
Showcase Your Projects: Don’t forget to include links to your GitHub or any relevant projects that demonstrate your hands-on experience. We love seeing your work in action, so let us know what you’ve been up to!
Be Clear and Concise: When writing your application, keep it straightforward and to the point. We appreciate strong written communication, so make sure your application is easy to read and understand.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands, and we can’t wait to hear from you!
How to prepare for a job interview at Cape.io (formerly Peach)
✨Know Your Data Stack
Make sure you’re familiar with the modern data stack mentioned in the job description. Brush up on your experience with orchestration tools like Airflow or Dagster, and be ready to discuss how you've used them in past projects. This will show that you understand the technical requirements and can hit the ground running.
✨Showcase Your Coding Skills
Since solid Python and SQL skills are a must, prepare to demonstrate your coding abilities. Bring examples of your work, whether it’s from GitHub or personal projects. Be ready to explain your thought process and how you ensure your code is readable for others.
✨Communicate Clearly
As the person holding the central data layer together, you’ll need to explain complex concepts to non-technical team members. Practice articulating your past experiences in a way that’s easy to understand. Use analogies or simple terms to break down complicated ideas, showing that you can bridge the gap between tech and non-tech teams.
✨Embrace AI Tools
The role expects you to use AI-assisted tooling regularly. Be prepared to discuss how you’ve integrated AI tools into your workflow. Share specific examples of how these tools have improved your efficiency or the quality of your work, demonstrating that you’re not just familiar with them but actively using them to enhance your engineering processes.