PySpark Data Engineer

England · Full-Time · £90,000 - £126,000 / year (est.) · No home office possible

At a Glance

  • Tasks: Design and maintain scalable data pipelines using PySpark and Python.
  • Company: Join a cutting-edge team focused on building a modern data lake.
  • Benefits: Enjoy remote work with occasional travel to London and competitive pay.
  • Why this job: Shape a high-impact platform from scratch in a dynamic, collaborative environment.
  • Qualifications: Expertise in PySpark, Python, and familiarity with Databricks and Azure DevOps required.
  • Other info: 6-month contract with potential for long-term extension; active SC clearance needed.

The predicted salary is between £90,000 and £126,000 per year.

We are seeking a PySpark Data Engineer to support the development of a modern, scalable data lake for a new strategic programme. This is a greenfield initiative to replace fragmented legacy reporting solutions, offering the opportunity to shape a long-term, high-impact platform from the ground up.

Key Responsibilities:

  • Design, build, and maintain scalable data pipelines using PySpark 3/4 and Python 3.
  • Contribute to the creation of a unified data lake following medallion architecture principles.
  • Leverage Databricks and Delta Lake (Parquet format) for efficient, reliable data processing.
  • Apply BDD testing practices using Python Behave and ensure code quality with Python Coverage.
  • Collaborate with cross-functional teams and participate in Agile delivery workflows.
  • Manage configurations and workflows using YAML, Git, and Azure DevOps.
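As an indication of what the YAML-driven, medallion-style setup in the bullets above might look like, here is a sketch of a pipeline configuration (every key, path, and table name here is hypothetical and invented for illustration, not taken from the actual programme):

```yaml
# Hypothetical pipeline configuration for a medallion-style data lake.
# Layer names follow the bronze/silver/gold convention; all paths,
# table names, and options are illustrative only.
pipeline:
  name: customer_ingest
  layers:
    bronze:
      source: abfss://landing@datalake.dfs.core.windows.net/customers/
      format: json               # raw data landed as-is
      target: bronze.customers
    silver:
      source: bronze.customers
      transformations:
        - deduplicate_on: customer_id
        - drop_nulls: [customer_id, created_at]
      target: silver.customers   # cleaned, conformed Delta table
    gold:
      source: silver.customers
      aggregation: daily_active_customers
      target: gold.customer_metrics
  write:
    format: delta                # Delta Lake tables stored as Parquet
    mode: merge
```

In a setup like this, a generic PySpark job would read the configuration and apply each layer's transformations in turn, so pipeline behaviour is changed through version-controlled YAML rather than code edits.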

Required Skills & Experience:

  • Proven expertise in PySpark 3/4 and Python 3 for large-scale data engineering.
  • Hands-on experience with Databricks, Delta Lake, and medallion architecture.
  • Familiarity with Python Behave for Behaviour Driven Development.
  • Strong understanding of YAML, code quality tools (e.g. Python Coverage), and CI/CD pipelines.
  • Knowledge of Azure DevOps and Git best practices.
  • Active SC clearance is essential; applicants without it cannot be considered.
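To make the BDD and code-quality bullets concrete, here is a plain-Python sketch of the kind of data-quality rule a Behave scenario would drive, structured Given/When/Then (an illustrative example only; the function, field, and record names are hypothetical, and in a real project the steps would live in a Behave `features/steps` module):

```python
# Illustrative data-quality helpers; names are hypothetical, not from
# the actual programme.
from datetime import datetime

REQUIRED_FIELDS = ("customer_id", "created_at")

def is_valid_record(record: dict) -> bool:
    """A record is valid when every required field is present and non-null."""
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)

def deduplicate(records: list[dict], key: str = "customer_id") -> list[dict]:
    """Keep the first occurrence of each key, preserving input order."""
    seen, result = set(), []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            result.append(record)
    return result

# Given a batch containing a duplicate and an invalid record...
batch = [
    {"customer_id": 1, "created_at": datetime(2024, 1, 1)},
    {"customer_id": 1, "created_at": datetime(2024, 1, 2)},  # duplicate id
    {"customer_id": 2, "created_at": None},                  # missing field
]
# ...when we validate and deduplicate...
clean = deduplicate([r for r in batch if is_valid_record(r)])
# ...then exactly one record survives.
assert [r["customer_id"] for r in clean] == [1]
```

The same assertions translate naturally into Gherkin steps, which is what makes rules like these a good fit for Behave-driven testing and for measuring with Python Coverage.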

Contract Details:

  • 6-month initial contract with long-term extension potential (multi-year programme).
  • Inside IR35.

This is an excellent opportunity to join a high-profile programme at its inception and help build a critical data platform from the ground up. If you are a mission-driven engineer with a passion for scalable data solutions and secure environments, we would love to hear from you.

PySpark Data Engineer employer: iO Associates - UK/EU

Join a forward-thinking company that values innovation and collaboration, offering a dynamic work culture where your contributions directly impact the development of a cutting-edge data platform. With opportunities for professional growth and the flexibility of remote work combined with occasional travel to London, this role provides a unique chance to be part of a greenfield initiative that shapes the future of data engineering. Enjoy competitive compensation and the satisfaction of working on high-impact projects in a supportive environment.

Contact Detail:

iO Associates - UK/EU Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the PySpark Data Engineer role

✨Tip Number 1

Familiarise yourself with the latest features of PySpark 3/4 and Python 3. Being able to discuss recent updates or enhancements during your conversations can demonstrate your commitment to staying current in the field.

✨Tip Number 2

Showcase your experience with Databricks and Delta Lake by preparing examples of past projects where you implemented these technologies. This will help you illustrate your hands-on expertise and problem-solving skills.

✨Tip Number 3

Brush up on your knowledge of Behaviour Driven Development (BDD) and Python Behave. Be ready to discuss how you've applied these practices in previous roles, as this will highlight your ability to ensure code quality and collaboration.

✨Tip Number 4

Since active SC clearance is essential for this role, make sure to mention your clearance status early in discussions. If you have any relevant experience working in secure environments, be prepared to share those insights.

We think you need these skills to ace the PySpark Data Engineer role

Proficiency in PySpark 3/4
Strong Python 3 programming skills
Experience with Databricks and Delta Lake
Understanding of medallion architecture principles
Familiarity with Behaviour Driven Development (BDD) using Python Behave
Knowledge of YAML for configuration management
Experience with code quality tools such as Python Coverage
Understanding of CI/CD pipelines
Proficient in Azure DevOps
Familiarity with Git best practices
Active SC clearance
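As a rough indication of how the Azure DevOps, Behave, and Coverage skills above fit together in practice, a minimal `azure-pipelines.yml` might look like the following (a sketch under the assumption of a standard Python project layout; the `features/` path and coverage threshold are hypothetical):

```yaml
# Minimal, illustrative azure-pipelines.yml; paths, thresholds, and
# display names are hypothetical and would be adapted to the real repo.
trigger:
  branches:
    include: [main]

pool:
  vmImage: ubuntu-latest

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.11'

  - script: pip install pyspark behave coverage
    displayName: Install dependencies

  - script: coverage run -m behave features/
    displayName: Run BDD suite under coverage

  - script: coverage report --fail-under=80
    displayName: Enforce coverage threshold
```

Being able to walk through a pipeline like this, step by step, is a quick way to demonstrate the CI/CD and Git-workflow experience the listing asks for.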

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience with PySpark, Python, and any relevant tools like Databricks and Delta Lake. Use specific examples to demonstrate your expertise in building scalable data pipelines and working with medallion architecture.

Craft a Compelling Cover Letter: In your cover letter, express your enthusiasm for the role and the opportunity to contribute to a greenfield initiative. Mention your familiarity with Agile workflows and how your skills align with the company's goals for developing a modern data lake.

Showcase Relevant Projects: If you have worked on similar projects, include them in your application. Describe your role, the technologies used, and the impact of your contributions. This will help demonstrate your hands-on experience and problem-solving abilities.

Highlight Security Clearance: Since active SC clearance is essential for this position, make sure to clearly state your current security clearance status in your application. This will ensure that your application is considered without any delays.

How to prepare for a job interview at iO Associates - UK/EU

✨Showcase Your Technical Skills

Be prepared to discuss your experience with PySpark and Python in detail. Highlight specific projects where you've designed and built data pipelines, and be ready to explain the challenges you faced and how you overcame them.

✨Understand the Medallion Architecture

Familiarise yourself with medallion architecture principles as they are crucial for this role. Be ready to discuss how you would implement these principles in a data lake environment and why they are important for scalable data solutions.

✨Demonstrate Collaboration Skills

Since the role involves working with cross-functional teams, prepare examples of how you've successfully collaborated in Agile environments. Discuss your experience with tools like Azure DevOps and Git, and how they facilitated teamwork in your previous projects.

✨Prepare for Behaviour Driven Development (BDD)

Brush up on BDD practices using Python Behave. Be ready to explain how you ensure code quality and testing in your projects, and provide examples of how you've implemented these practices in past roles.
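If you want a concrete talking point for this, a Behave feature file for a data pipeline might read like the following (an invented example; the feature, table, and column names are hypothetical):

```gherkin
# Hypothetical feature file (e.g. features/silver_dedup.feature);
# scenario wording and identifiers are invented for illustration.
Feature: Silver-layer deduplication
  As a data consumer
  I want duplicate bronze records removed in the silver layer
  So that downstream reports count each customer once

  Scenario: Duplicate customer rows are collapsed
    Given a bronze table with two rows sharing customer_id "42"
    When the silver transformation job runs
    Then the silver table contains exactly one row for customer_id "42"
```

Explaining how you would wire a scenario like this to PySpark step definitions shows both the BDD fluency and the pipeline knowledge the interviewers are likely to probe.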
