Data Platform Reliability Engineer in Manchester

Manchester | Full-Time | £50,000 - £65,000 / year (est.) | Home office (partial)
ANS Group

At a Glance

  • Tasks: Ensure reliability and performance of data platforms while optimising customer data environments.
  • Company: Join a forward-thinking tech company focused on modern analytics solutions.
  • Benefits: Enjoy 25 days holiday, private health insurance, and flexible working options.
  • Other info: Dynamic role with opportunities for personal growth and development.
  • Why this job: Make a real impact by enhancing live production systems and supporting customers.
  • Qualifications: Experience with SQL, Spark, and production data platforms is essential.

The predicted salary is between £50,000 and £65,000 per year.

As a Data Engineer, you'll play a pivotal role in delivering a proactive, high-quality managed data service for customers using modern analytics platforms such as lakehouse, SQL, Spark and semantic models. You'll focus on ensuring the reliability, performance, data quality and overall optimisation of customer data environments that have been onboarded into our managed service. Working within ITIL‑aligned Incident, Change and Problem Management processes, you'll help keep customer platforms healthy, stable and fit for purpose. This is a hands‑on engineering role, ideal for someone who thrives on troubleshooting, performance tuning and managing controlled change. Rather than greenfield development, you'll be enhancing and supporting live production systems while guiding customers on best practice and improvements.

Responsibilities

  • Platform Monitoring & Incident Response: Monitor pipelines, dataflows, jobs and lakehouse workloads for failures or performance issues. Respond to alerts, diagnose root causes and restore service quickly. Fix issues across pipeline steps, data refresh, connectivity and authentication. Safely re‑run jobs to restore normal operation. Support D365 governance requirements.
  • Performance & Capacity Management: Monitor capacity usage, throttling and performance risks. Analyse performance of SQL, Spark, notebooks, Delta optimisation and semantic models. Implement optimisations such as query tuning, indexing, scheduling improvements and compute scaling.
  • Data Quality & Schema Management: Detect schema changes, datatype shifts, anomalies and missing/late data. Maintain and run data quality rules (duplicates, thresholds, data completeness). Investigate and resolve data quality issues.
  • Change Delivery & Continuous Improvement: Deliver customer‑requested changes through formal Change Management. Update pipelines, schemas, calculated fields, metadata and RLS roles. Optimise slow‑running workloads and provide impact/rollback assessments.
  • Connectivity, Security & Access: Troubleshoot linked services, runtime failures and network access issues. Provide guidance on gateway configuration and authentication. Support cloud gateway remediation and manage workspace/dataset/lake permissions.
  • Tooling, Documentation & Knowledge Management: Use telemetry, logs and diagnostics to investigate reliability and performance issues. Maintain data dictionaries, lineage documentation, runbooks and data flow diagrams. Ensure all changes are recorded in up‑to‑date support documentation.

Qualifications

  • Experience supporting and operating production data platforms.
  • Strong SQL skills, including optimisation and troubleshooting.
  • Spark‑based data processing and performance tuning experience.
  • Hands‑on work with pipelines/orchestration, lakehouse/warehouse architectures, and semantic models.
  • Familiarity with incident and change management processes.
  • Strong problem‑solving and root‑cause analysis abilities.
  • Clear, structured documentation skills.

Desirable

  • Experience with Microsoft Fabric, Azure data services or similar cloud analytics platforms.
  • DAX optimisation skills.
  • Understanding of capacity‑based analytics and throttling behaviour.
  • Experience in customer‑facing managed services.
  • Knowledge of data governance, lineage and data quality frameworks.

Soft Skills & Attributes

  • Customer‑focused with a strong service mindset.
  • Confident in high‑availability production environments.
  • Calm, methodical incident diagnosis.
  • Strong communicator and collaborative team player.
  • Proactive in spotting risks and improvements.
  • Able to balance reactive support with planned optimisation work.

Benefits

  • 25 days holiday, plus the option to purchase up to 5 additional days.
  • Birthday off and an extra celebration day.
  • 5 additional days of holiday when getting married.
  • 5 volunteer days.
  • Private health insurance.
  • Pension contribution match and 4 x life assurance.
  • Flexible working and work from anywhere for up to 30 days per year.
  • Maternity: 16 weeks full pay; paternity: 3 weeks full pay; adoption: 16 weeks full pay.
  • Company social events.
  • Electric car scheme.
  • 12 days of personal growth and development time.

Data Platform Reliability Engineer in Manchester employer: ANS Group

Join a forward-thinking company that prioritises employee well-being and professional growth, offering a dynamic work culture where innovation thrives. As a Data Platform Reliability Engineer, you'll benefit from flexible working arrangements, generous holiday allowances, and a commitment to personal development, all while contributing to cutting-edge data solutions in a supportive environment. With a focus on collaboration and continuous improvement, this role provides an excellent opportunity to enhance your skills and make a meaningful impact.

Contact Details:

ANS Group Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Data Platform Reliability Engineer role in Manchester

Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.

Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those involving SQL, Spark, or data quality management. This gives employers a taste of what you can do and sets you apart from the crowd.

Tip Number 3

Prepare for interviews by brushing up on common technical questions related to data platforms and incident management. Practice explaining your troubleshooting process clearly and confidently – it’s all about showing how you think on your feet!

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are genuinely interested in joining our team.

We think you need these skills to ace the Data Platform Reliability Engineer role in Manchester

SQL
Spark
Data Pipeline Management
Performance Tuning
Data Quality Management
Change Management
Root Cause Analysis

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Data Platform Reliability Engineer role. Highlight your experience with SQL, Spark, and any relevant cloud analytics platforms. We want to see how your skills match what we're looking for!

Showcase Your Problem-Solving Skills: In your application, give examples of how you've tackled performance issues or data quality challenges in the past. We love candidates who can demonstrate their troubleshooting abilities and proactive mindset!

Keep It Clear and Structured: When writing your cover letter, keep it clear and structured. Use bullet points if necessary to make it easy for us to read. We appreciate a well-organised application that gets straight to the point!

Apply Through Our Website: Don't forget to apply through our website! It's the best way for us to receive your application and ensures you're considered for the role. We can't wait to hear from you!

How to prepare for a job interview at ANS Group

Know Your Tech Inside Out

Make sure you brush up on your SQL and Spark skills before the interview. Be ready to discuss specific examples of how you've optimised queries or troubleshot performance issues in past roles. This will show that you’re not just familiar with the tools, but that you can use them effectively.

Understand Incident Management

Since this role involves ITIL-aligned processes, it’s crucial to understand incident, change, and problem management. Prepare to talk about your experience with these processes and how you’ve handled incidents in a production environment. Real-life examples will make your answers stand out.

Showcase Your Problem-Solving Skills

Be ready to demonstrate your analytical thinking and root-cause analysis abilities. Think of a few challenging situations you’ve faced in data platforms and how you resolved them. This will highlight your proactive approach and ability to keep systems healthy and stable.

Communicate Clearly and Collaboratively

As a Data Platform Reliability Engineer, you'll need to work closely with customers and teams. Practice articulating your thoughts clearly and be prepared to discuss how you’ve collaborated with others in previous roles. Strong communication skills are key to success in this position.