Senior Data Engineer (Data Quality, PySpark, Databricks)

Leeds, Full-Time, £55,000 - £65,000 / year (est.), No home office possible

At a Glance

  • Tasks: Ensure data quality and optimise transformation pipelines using PySpark and Databricks.
  • Company: Join Smoove, a leading tech provider in the property sector, simplifying home ownership since 2003.
  • Benefits: Enjoy remote work flexibility with team meet-ups for collaboration and a competitive salary of £65,000 - £75,000.
  • Why this job: Be part of a mission-driven team that values diversity and innovation in the data space.
  • Qualifications: Extensive experience with PySpark, Databricks, and GDPR compliance is essential.
  • Other info: Collaborate with analysts and engineers to ensure trusted datasets for impactful decision-making.

The predicted salary is between £55,000 and £65,000 per year.

Hi, we’re Smoove, part of the PEXA Group. Our vision is to simplify and revolutionise the home moving and ownership experience for everyone. We are on a mission to deliver products and services that remove the pain, frustration, uncertainty, friction and stress that the current process creates. We are a leading provider of tech in the property sector - founded in 2003, our product focus has been our conveyancer two-sided marketplace, connecting consumers with a range of quality conveyancers to choose from at competitive prices via our easy-to-use tech platform. We are now building out our ecosystem so consumers can benefit from our services either via their Estate Agent or their Mortgage Broker, through smarter conveyancing platforms, making the home buying or selling process easier, quicker, safer and more transparent.

Why join Smoove? Great question! We pride ourselves on attracting, developing and retaining a diverse range of people in an equally diverse range of roles and specialisms – who together achieve outstanding results. Our transparent approach and open-door policy make Smoove a great place to work and as our business expands, we are looking for ambitious, talented people to join us.

We are looking for a technically proficient Senior Data Engineer to join our growing Data team. Your primary focus will be on ensuring data quality, stability, and reliability, from the moment data arrives in its rawest form to when it is used in decision-making dashboards and customer-facing reports. You will optimise the transformation pipeline from start to finish, guaranteeing that datasets are robust, tested, secure, and business-ready.

Our data platform is built using Databricks, with data pipelines written in PySpark and orchestrated using Airflow. You will be expected to challenge and improve current transformations, ensuring they meet our performance, scalability, and data governance needs. This includes working with complex, nested data structures and ensuring they are reliably parsed and transformed. Experience in managing sensitive data (PII) and implementing GDPR policies is required.

You’ll work closely with analysts, engineers, and business stakeholders to ensure that datasets are not only accurate but also trusted. You will collaborate with product and engineering teams to incorporate data from new products into our core business datasets, ensuring that these new sources meet our data standards and are quickly usable for business intelligence. You’ll help put controls in place, such as access policies, metadata layers, and automated data quality checks, to ensure long-term stability. Experience with a data governance platform like Alation is desirable.

While the role is predominantly remote / home based, the team meets up 20-25 days per year for meaningful collaboration in either Leeds or Thame.

Key Responsibilities

  • Ensure end-to-end data quality, from raw ingested data to business-ready datasets
  • Optimise PySpark-based data transformation logic for performance and reliability
  • Build scalable and maintainable pipelines in Databricks and Airflow
  • Implement and uphold GDPR-compliant processes around PII data
  • Collaborate with stakeholders to define what "business-ready" means, and confidently sign off datasets as fit for consumption
  • Put testing strategies in place to detect data issues early and often
  • Contribute to access management, metadata management, and wider data governance practices
  • Help shape our approach to reliable data delivery for internal and external customers

Skills & Experience Required

  • Extensive hands-on experience with PySpark, including performance optimisation
  • Deep working knowledge of Databricks (development, architecture, and operations)
  • Proven experience working with Airflow for orchestration
  • Proven track record in managing and securing PII data, with GDPR compliance in mind
  • Experience in data governance processes; Alation experience preferred, but similar tools welcome
  • Strong SQL skills and experience optimising complex queries
  • Strong experience in handling and transforming semi-structured data
  • High competency in programming, with a focus on clean, efficient, and production-quality code
  • Demonstrated ability to work with stakeholders to understand data needs and guide the validation and delivery process
  • Experience implementing and maintaining data quality tests and monitoring solutions
  • Strong verbal and written communication skills
  • Ability to think holistically about data reliability and how it serves business decisions

£65,000 - £75,000 a year

Sound like you? We at Smoove are ready, so apply today.

Senior Data Engineer (Data Quality, PySpark, Databricks) employer: PEXA Group Limited

At Smoove, we foster a dynamic and inclusive work environment that prioritises employee development and collaboration. With a strong focus on innovation in the property tech sector, our team enjoys flexible remote working arrangements while also benefiting from regular in-person meet-ups for meaningful collaboration. Join us to be part of a mission-driven company that values transparency, diversity, and the growth of its employees.

Contact Details:

PEXA Group Limited Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Senior Data Engineer (Data Quality, PySpark, Databricks)

✨Tip Number 1

Familiarise yourself with the specific technologies mentioned in the job description, such as PySpark and Databricks. Consider building a small project or contributing to an open-source project that uses these tools to demonstrate your hands-on experience.

✨Tip Number 2

Engage with the data engineering community online, particularly around topics like data quality and GDPR compliance. Participating in forums or attending webinars can help you gain insights and make connections that could be beneficial during the interview process.

✨Tip Number 3

Prepare to discuss your previous experiences with data governance and managing sensitive data. Be ready to share specific examples of how you've implemented data quality checks and ensured compliance with regulations like GDPR.

✨Tip Number 4

Research Smoove and its mission to understand their products and services better. Tailoring your conversation during interviews to align with their goals will show your genuine interest in the company and how you can contribute to their vision.

We think you need these skills to ace Senior Data Engineer (Data Quality, PySpark, Databricks)

Extensive hands-on experience with PySpark
Performance optimisation in PySpark
Deep working knowledge of Databricks
Experience with Airflow for orchestration
Proven track record in managing and securing PII data
GDPR compliance knowledge
Experience in data governance processes
Strong SQL skills
Optimising complex SQL queries
Handling and transforming semi-structured data
High competency in programming
Ability to write clean, efficient, production-quality code
Stakeholder engagement and communication
Implementing and maintaining data quality tests
Monitoring data quality solutions
Holistic thinking about data reliability

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience with PySpark, Databricks, and Airflow. Use specific examples that demonstrate your ability to optimise data transformation logic and manage sensitive data in compliance with GDPR.

Craft a Compelling Cover Letter: In your cover letter, express your passion for data quality and how it aligns with Smoove's mission to simplify the home moving experience. Mention your collaborative skills and how you can contribute to the team’s goals.

Showcase Relevant Projects: If you have worked on projects involving data governance or have experience with tools like Alation, be sure to include these in your application. Highlight any specific challenges you faced and how you overcame them.

Proofread Your Application: Before submitting, carefully proofread your application for any spelling or grammatical errors. A polished application reflects your attention to detail, which is crucial for a role focused on data quality.

How to prepare for a job interview at PEXA Group Limited

✨Showcase Your Technical Skills

Be prepared to discuss your hands-on experience with PySpark and Databricks. Bring examples of how you've optimised data transformation logic and built scalable pipelines, as this will demonstrate your technical proficiency.

✨Understand Data Governance

Familiarise yourself with data governance practices, especially around GDPR compliance and managing PII data. Be ready to explain how you have implemented these processes in previous roles, as this is crucial for the position.

✨Communicate Effectively

Strong verbal and written communication skills are essential. Practice explaining complex data concepts in simple terms, as you'll need to collaborate with various stakeholders to ensure datasets are business-ready.

✨Prepare for Problem-Solving Questions

Expect questions that assess your ability to troubleshoot data quality issues. Think of specific challenges you've faced in the past and how you resolved them, as this will highlight your analytical thinking and problem-solving skills.
