At a Glance
- Tasks: Ensure data quality and optimise transformation pipelines using PySpark and Databricks.
- Company: Join Smoove, a leading tech provider in the property sector with a mission to simplify home ownership.
- Benefits: Competitive salary, remote work flexibility, and opportunities for professional growth.
- Other info: Collaborative team environment with regular meet-ups for meaningful connections.
- Why this job: Make a real impact on data-driven decisions while working with innovative technologies.
- Qualifications: Extensive experience with PySpark, Databricks, and data governance practices.
The predicted salary is between £65,000 and £75,000 per year.
Hi, we’re Smoove, part of the PEXA Group. Our vision is to simplify and revolutionise the home moving and ownership experience for everyone. We are on a mission to deliver products and services that remove the pain, frustration, uncertainty, friction and stress that the current process creates.
We are a leading provider of tech in the property sector. Founded in 2003, we have focused our product on a two-sided conveyancing marketplace, connecting consumers with a range of quality conveyancers to choose from at competitive prices via our easy-to-use tech platform. We are now building out our ecosystem so consumers can benefit from our services either via their Estate Agent or their Mortgage Broker, through smarter conveyancing platforms, making the home buying or selling process easier, quicker, safer and more transparent.
Why join Smoove? Great question! We pride ourselves on attracting, developing and retaining a diverse range of people in an equally diverse range of roles and specialisms – who together achieve outstanding results. Our transparent approach and open-door policy make Smoove a great place to work and as our business expands, we are looking for ambitious, talented people to join us.
We are looking for a technically proficient Senior Data Engineer to join our growing Data team. Your primary focus will be on ensuring data quality, stability, and reliability — from the moment data arrives in its rawest form to when it is used in decision-making dashboards and customer-facing reports. You will optimise the transformation pipeline from start to finish, guaranteeing that datasets are robust, tested, secure, and business-ready.
Our data platform is built using Databricks, with data pipelines written in PySpark and orchestrated using Airflow. You will be expected to challenge and improve current transformations, ensuring they meet our performance, scalability, and data governance needs. This includes work with complex, nested data structures, ensuring they are reliably parsed and transformed. Experience in managing sensitive data (PII) and implementing GDPR policies is required.
You’ll work closely with analysts, engineers, and business stakeholders to ensure that datasets are not only accurate but also trusted. You will collaborate with product and engineering teams to incorporate data from new products into our core business datasets, ensuring that these new sources meet our data standards and are quickly usable for business intelligence.
You’ll help put controls in place, such as access policies, metadata layers, and automated data quality checks, to ensure long-term stability. Experience with a data governance platform like Alation is desirable. While the role is predominantly remote and home-based, the team meets up for 20-25 days per year for meaningful collaboration in either Leeds or Thame.
Key Responsibilities
- Ensure end-to-end data quality, from raw ingested data to business-ready datasets
- Optimise PySpark-based data transformation logic for performance and reliability
- Build scalable and maintainable pipelines in Databricks and Airflow
- Implement and uphold GDPR-compliant processes around PII data
- Collaborate with stakeholders to define what 'business-ready' means, and confidently sign off datasets as fit for consumption
- Put testing strategies in place to detect data issues early and often
- Contribute to access management, metadata management, and wider data governance practices
- Help shape our approach to reliable data delivery for internal and external customers
Skills & Experience Required
- Extensive hands-on experience with PySpark, including performance optimisation
- Deep working knowledge of Databricks (development, architecture, and operations)
- Proven experience working with Airflow for orchestration
- Proven track record in managing and securing PII data, with GDPR compliance in mind
- Experience in data governance processes; Alation experience preferred, but similar tools welcome
- Strong SQL skills and experience optimising complex queries
- Strong experience in handling and transforming semi-structured data
- High competency in programming, with a focus on clean, efficient, and production-quality code
- Demonstrated ability to work with stakeholders to understand data needs and guide the validation and delivery process
- Experience implementing and maintaining data quality tests and monitoring solutions
- Strong verbal and written communication skills
- Ability to think holistically about data reliability and how it serves business decisions
£65,000 - £75,000 a year
Sound like you? We at Smoove are ready, so apply today.
Remote Senior Data Engineer (Data Quality, PySpark, Databricks) in Exeter employer: PEXA Group
At Smoove, we foster a dynamic and inclusive work environment that prioritises employee development and collaboration. As a remote-first company with regular in-person meet-ups, we offer our Senior Data Engineers the opportunity to work on cutting-edge technology while enjoying a supportive culture that values transparency and innovation. Join us to be part of a mission-driven team dedicated to transforming the home moving experience, with competitive salaries and a commitment to your professional growth.
StudySmarter Expert Advice🤫
We think this is how you could land the Remote Senior Data Engineer (Data Quality, PySpark, Databricks) role in Exeter
✨Get Involved in Data Science Meetups
Tap into local data science meetups or workshops to connect with fellow enthusiasts and professionals. These events are goldmines for networking, and sometimes even lead directly to job openings at companies like PEXA Group!
✨Show Off Your Projects
Start building a public portfolio showcasing your data science projects on platforms like GitHub or personal websites. Highlight unique analyses or models you've developed. This not only demonstrates your skills but also gets your name out there for roles like Remote Senior Data Engineer (Data Quality, PySpark, Databricks) at PEXA Group.
✨Leverage Professional Networks
Join professional bodies related to data science, like the Data Science Society or similar organisations. Getting involved can lead to mentorship opportunities and insider knowledge about full-time positions at companies like PEXA Group.
✨Apply Directly through Our Website
When you find a suitable opening like Remote Senior Data Engineer (Data Quality, PySpark, Databricks) at PEXA Group, make sure to apply directly through our website. It gives you an edge and shows you're keen to join our team. Plus, who doesn’t love a direct application? It’s easier than navigating through job boards!
Some tips for your application 🫡
Show Off Your Projects: In the world of data science, your projects can speak volumes about your skills. Make sure to showcase a few key projects in your CV or portfolio, especially those that highlight your ability to work with data sets, build models, or use relevant tools like Python, R, or SQL. Don’t forget to include links to any GitHub repositories if applicable!
Quantify Your Achievements: Employers love numbers! When drafting your CV, highlight your achievements with quantifiable results. For instance, mention how your data analysis led to a certain percentage increase in efficiency or revenue at a previous job or project. These details can really make your application pop!
Craft a Tailored Cover Letter: For a full-time role at PEXA Group, your cover letter should reflect your passion for data science and your excitement about the specific projects or values of the company. Dive into why you’re a good fit, how your skills align with their needs, and any unique perspectives you can bring to the team.
Stand Out with Relevant Courses and Certifications: Although experience talks, relevant courses or certifications can be your ticket to impressing hiring managers at PEXA Group. Mention any standout courses you've completed that equipped you with essential skills, such as machine learning certifications or data visualisation courses. This shows your commitment to continuously developing your skills in the field!
How to prepare for a job interview at PEXA Group
✨Brush Up on Your Statistics
For a data science role, you need to seriously sharpen your statistics skills. Get ready to tackle technical questions on probability distributions, hypothesis testing, and regression analysis. These are often the bread and butter of data science interviews, so don't just skim over them!
✨Showcase Your Projects
Prepare a killer portfolio showcasing your data science projects. Include details about the datasets used, the tools and techniques applied, and the impact of your findings. If you can walk the interviewers through a particularly challenging project or a cool visualisation that had real-world implications, it’ll really make you stand out!
✨Get Comfortable with Python and R
Most data science positions require proficiency in programming languages like Python and R. Practise with common libraries like pandas, NumPy, and scikit-learn, and be ready for live coding exercises or algorithm questions. Showing off your coding chops can really impress the interviewers at PEXA Group!
✨Prepare for Case Studies
Expect to encounter real-world case studies during the interview. You might be asked how you’d approach a data problem or analyse a dataset to extract insights. It's essential to think out loud and demonstrate your problem-solving process so that the interviewer can see your logical thinking in action.