PySpark Data Engineer

Full-Time · £36,000 - £54,000 / year (est.) · Remote with occasional London travel

At a Glance

  • Tasks: Design and maintain scalable data pipelines using PySpark and Python.
  • Company: Join a high-profile social network project focused on modern data solutions.
  • Benefits: Remote work with occasional London travel; competitive pay up to £450/day.
  • Why this job: Shape a new data platform from scratch and make a real impact.
  • Qualifications: Expertise in PySpark, Python, Databricks, and CI/CD practices required.
  • Other info: 6-month contract with potential for long-term extension; active SC clearance needed.

The predicted salary is between £36,000 and £54,000 per year.

PySpark Data Engineer | up to £450/day (Inside IR35) | Remote with occasional London travel

We are seeking a PySpark Data Engineer to support the development of a modern, scalable data lake for a new strategic programme. This is a greenfield initiative to replace fragmented legacy reporting solutions, offering the opportunity to shape a long-term, high-impact platform from the ground up.

Key Responsibilities:
* Design, build, and maintain scalable data pipelines using PySpark 3/4 and Python 3 (see the sketch after this list).
* Contribute to the creation of a unified data lake following medallion architecture principles.
* Leverage Databricks and Delta Lake (Parquet format) for efficient, reliable data processing.
* Apply BDD testing practices using Python Behave and ensure code quality with Python Coverage.
* Collaborate with cross-functional teams and participate in Agile delivery workflows.
* Manage configurations and workflows using YAML, Git, and Azure DevOps.
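
To make these responsibilities concrete, here is a minimal sketch of a bronze-to-silver step under medallion principles, assuming a Databricks/Delta Lake environment. The paths, table layout, and column names (event_id, event_ts) are illustrative assumptions, not details from the programme itself.

```python
# Minimal bronze-to-silver sketch: read raw Delta data, clean it,
# and write a curated Delta table. Paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze layer: raw events landed as-is (illustrative path).
bronze = spark.read.format("delta").load("/lake/bronze/events")

# Silver layer: deduplicated, typed, and filtered for downstream use.
silver = (
    bronze.dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .filter(F.col("event_id").isNotNull())
)

silver.write.format("delta").mode("overwrite").save("/lake/silver/events")
```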

Required Skills & Experience:
* Proven expertise in PySpark 3/4 and Python 3 for large-scale data engineering.
* Hands-on experience with Databricks, Delta Lake, and medallion architecture.
* Familiarity with Python Behave for Behaviour Driven Development.
* Strong understanding of YAML, code quality tools (e.g. Python Coverage), and CI/CD pipelines (a short configuration sketch follows this list).
* Knowledge of Azure DevOps and Git best practices.
* Active SC clearance is essential – applicants without this cannot be considered.
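
As a small illustration of the YAML configuration point above, the following sketch loads pipeline settings with PyYAML; every key shown (source_path, dedupe_keys, and so on) is a hypothetical example, not taken from the programme.

```python
# Minimal sketch of YAML-driven pipeline configuration with PyYAML.
# The keys (source_path, dedupe_keys, ...) are illustrative only.
import yaml

CONFIG_TEXT = """
source_path: /lake/bronze/events
target_path: /lake/silver/events
dedupe_keys:
  - event_id
"""

config = yaml.safe_load(CONFIG_TEXT)
print(config["dedupe_keys"])  # -> ['event_id']
```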

Contract Details:
* 6-month initial contract with long-term extension potential (multi-year programme).
* Inside IR35.

This is an excellent opportunity to join a high-profile programme at its inception and help build a critical data platform from the ground up. If you are a mission-driven engineer with a passion for scalable data solutions and secure environments, we'd love to hear from you.


PySpark Data Engineer employer: iO Associates

Join a forward-thinking organisation that values innovation and collaboration, offering a dynamic work culture where your contributions directly impact the development of a cutting-edge data platform. With opportunities for professional growth and the chance to work on a high-profile greenfield project, this role not only provides competitive compensation but also the flexibility of remote work with occasional travel to London. Embrace the challenge of shaping a modern data lake while being part of a mission-driven team dedicated to excellence in data engineering.

Contact Detail:

iO Associates Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the PySpark Data Engineer role

✨Tip Number 1

Make sure you brush up on your PySpark and Python skills. Since the role requires proven expertise in these technologies, being able to demonstrate your knowledge through practical examples or projects can really set you apart.

✨Tip Number 2

Familiarise yourself with Databricks and Delta Lake, as well as the medallion architecture principles. Having a solid understanding of these concepts will not only help you in interviews but also show that you're ready to contribute from day one.

✨Tip Number 3

Since collaboration is key in this role, practice discussing your past experiences working in Agile teams. Be prepared to share how you've contributed to team success and how you handle cross-functional collaboration.

✨Tip Number 4

Ensure you have your active SC clearance sorted out before applying. This is a crucial requirement for the position, and having it in place will make your application much stronger.

We think you need these skills to ace the PySpark Data Engineer role

Proven expertise in PySpark 3/4
Strong proficiency in Python 3
Experience with Databricks
Knowledge of Delta Lake and Parquet format
Understanding of medallion architecture principles
Familiarity with Behaviour Driven Development (BDD) using Python Behave
Experience with code quality tools such as Python Coverage
Proficient in YAML for configuration management
Knowledge of CI/CD pipelines
Experience with Azure DevOps
Familiarity with Git best practices
Active SC clearance

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience with PySpark, Python, and any relevant data engineering projects. Use specific examples that demonstrate your skills in building scalable data pipelines and working with Databricks and Delta Lake.

Craft a Compelling Cover Letter: In your cover letter, express your enthusiasm for the role and the opportunity to work on a greenfield project. Mention your familiarity with medallion architecture and how your previous experiences align with the responsibilities outlined in the job description.

Showcase Relevant Projects: If you have worked on similar projects, include them in your application. Describe your role, the technologies used (like YAML, Git, and Azure DevOps), and the impact of your contributions. This will help demonstrate your hands-on experience.

Highlight Security Clearance: Since active SC clearance is essential for this position, make sure to clearly state your current security clearance status in your application. This will ensure that your application is considered right from the start.

How to prepare for a job interview at iO Associates

✨Showcase Your Technical Skills

Be prepared to discuss your experience with PySpark and Python in detail. Highlight specific projects where you've built scalable data pipelines, and be ready to explain the challenges you faced and how you overcame them.

✨Understand the Medallion Architecture

Familiarise yourself with the medallion architecture principles as they are crucial for this role. Be ready to discuss how you would implement these principles in a data lake environment and why they are important for data processing.

✨Demonstrate Agile Experience

Since collaboration with cross-functional teams is key, share examples of your experience working in Agile environments. Discuss how you’ve contributed to team workflows and any tools you’ve used, such as Azure DevOps or Git.

✨Prepare for Behaviour Driven Development (BDD)

Brush up on BDD practices using Python Behave. Be ready to explain how you ensure code quality and testing in your projects, and provide examples of how you've implemented these practices in past roles.
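
As a reference point, here is a minimal Behave sketch of the kind of BDD check discussed above. The feature wording and the in-memory stand-in for the real PySpark transformation are illustrative assumptions, not code from the programme.

```python
# Minimal Behave step definitions for a hypothetical dedup check.
#
# Paired feature file (Gherkin), e.g. features/dedup.feature:
#   Feature: Event deduplication
#     Scenario: Duplicate events are removed
#       Given a batch of events with duplicates
#       When the cleaning step runs
#       Then each event_id appears exactly once

from behave import given, when, then

@given("a batch of events with duplicates")
def step_given_batch(context):
    context.events = [{"event_id": 1}, {"event_id": 1}, {"event_id": 2}]

@when("the cleaning step runs")
def step_when_clean(context):
    # Stand-in for the real PySpark transformation under test.
    seen, context.cleaned = set(), []
    for event in context.events:
        if event["event_id"] not in seen:
            seen.add(event["event_id"])
            context.cleaned.append(event)

@then("each event_id appears exactly once")
def step_then_unique(context):
    ids = [event["event_id"] for event in context.cleaned]
    assert len(ids) == len(set(ids))
```

In practice, the When step would invoke the actual pipeline code, and Python Coverage would be run over the Behave suite to report how much of it the scenarios exercise.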
