Data Curation Developer

London, Full-Time, £36,000 - £60,000 / year (est.), No home office possible
GlaxoSmithKline

At a Glance

  • Tasks: Curate and prepare data for R&D analysis to support healthcare innovations.
  • Company: GSK is a global biopharma company dedicated to improving health through science and technology.
  • Benefits: Enjoy hybrid working, competitive salary, bonuses, healthcare, and wellness programmes.
  • Why this job: Join a culture that values growth, inclusion, and making a real impact on global health.
  • Qualifications: BSc/MSc/PhD in relevant fields; experience with clinical data and data engineering tools required.
  • Other info: Applications close on July 13th, 2025. Be part of a team aiming to positively impact billions.

The predicted salary is between £36,000 and £60,000 per year.

This role focuses on the technical experience required to curate (e.g. pre-process, harmonize, wrangle and contextualise) data to produce high-quality data assets for R&D analysis. The aim is to support GSK’s Disease Area Strategies and other key R&D priority areas by making data analysis-ready, enabling efficient and effective decision-making across various therapeutic areas.

In this role you will:

  • Lead the development of business requirements for data curation through collaboration with R&D business and data platform teams.
  • Maintain strong connections with analytical groups and R&D Data Platform teams to ensure seamless data integration and usage.
  • Provide coaching and peer review to ensure that the team’s work reflects industry best practices for data curation activities, including data privacy and anonymization standards.
  • Deliver pre-packaged, curated datasets aligned to business requirements for analytics. This includes documenting a data specification that clearly describes the processing steps required to generate analysis-ready datasets, ensuring that provenance, lineage and privacy requirements are maintained.
  • Integrate diverse datasets (e.g., clinical trials, real-world data, omics) into a unified format for consistent analysis.
  • Ensure all datasets meet analysis-ready and privacy requirements by performing necessary data curation activities.
  • Write clean, readable code.
  • Ensure that deliverables are appropriately quality controlled and documented and, when required, can be handed over to the R&D Tech team for production pipeline implementation.

Basic Qualifications & Skills:

  • BSc/MSc/PhD (or equivalent) in Computer Science, Mathematics, Statistics, or related subject.
  • Proven experience of handling various modalities of scientific clinical data such as clinical trial data (including biomarkers), real world data (RWD), omics etc.
  • Proven ability to handle and process large structured, semi-structured, and unstructured datasets efficiently.
  • Expertise to translate business needs into technical data requirements and processes.
  • Proven ability to quantify, and provide insights into, the business impact and value created by data curation activities.
  • Agile mindset with the ability to deliver prototypes quickly and iterate improvements based on stakeholder feedback.
  • Experience with Python, Databricks, Delta Lake, PySpark, Pandas and other data engineering frameworks, and in applying them to produce datasets that comply with industry standards.
  • Strong communication skills.

Preferred Qualifications & Skills:

  • Experience in R.
  • Experience with industry data standards such as CDISC (ODM, CDASH, SDTM, ADaM), HL7 FHIR, OMOP (CDM), etc.
  • Experience with digital clinical trial protocols and the Unified Study Definition Model (USDM).
  • Experience in data modelling.

Closing Date for Applications – July 13th, 2025 (COB).

At GSK, we have bold ambitions for patients, aiming to positively impact the health of 2.5 billion people by the end of the decade. Our R&D focuses on discovering and delivering vaccines and medicines, combining our understanding of the immune system with cutting-edge technology to transform people’s lives.

GSK fosters a culture ambitious for patients, accountable for impact, and committed to doing the right thing, making sure that we focus our efforts on accelerating significant assets that meet patients’ needs and have the highest probability of success.

GSK is an Equal Opportunity Employer. This ensures that all qualified applicants will receive equal consideration for employment without regard to race, color, religion, sex (including pregnancy, gender identity, and sexual orientation), parental status, national origin, age, disability, genetic information (including family medical history), military service or any basis prohibited under federal, state or local law.

Data Curation Developer employer: GlaxoSmithKline

GSK is an exceptional employer that prioritises the growth and wellbeing of its employees, offering a competitive salary, annual bonuses, and comprehensive healthcare programmes. With a strong commitment to an inclusive work culture and flexible hybrid working options, GSK empowers its team members to thrive while contributing to meaningful R&D initiatives aimed at improving global health. Join us in a dynamic environment where your contributions directly impact patient outcomes and where you can continuously develop your skills and career.

Contact Details:

GlaxoSmithKline Recruiting Team

UKRecruitment.Adjustments@gsk.com

StudySmarter Expert Advice 🤫

We think this is how you could land the Data Curation Developer role

✨Tip Number 1

Familiarise yourself with the specific data curation processes mentioned in the job description, such as pre-processing, harmonising, and contextualising data. Understanding these concepts will help you articulate your experience and how it aligns with the role during interviews.

✨Tip Number 2

Network with professionals in the data curation and R&D fields. Engaging with industry peers can provide insights into the latest trends and practices, which you can leverage to demonstrate your knowledge and enthusiasm for the role.

✨Tip Number 3

Prepare to discuss your experience with the tools and technologies listed in the job description, such as Python, Databricks, and PySpark. Being able to share specific examples of how you've used these tools effectively will set you apart from other candidates.

✨Tip Number 4

Research GSK's current projects and initiatives in R&D. Showing that you understand their goals and how your skills can contribute to their mission will demonstrate your genuine interest in the company and the role.

We think you need these skills to ace the Data Curation Developer role

Data Curation
Data Pre-processing
Data Harmonisation
Data Wrangling
Data Contextualisation
Data Anonymisation
Experience with Clinical Trial Data
Real World Data (RWD) Handling
Omics Data Processing
Python Programming
Databricks
Delta Lake
PySpark
Pandas
Data Integration
Technical Documentation
Agile Methodologies
Strong Communication Skills
Business Requirement Analysis
Quality Control in Data Processing

Some tips for your application 🫡

Tailor Your Cover Letter: Make sure to customise your cover letter to highlight how your skills and experiences align with the specific requirements of the Data Curation Developer role. Use examples from your past work that demonstrate your ability to handle various modalities of scientific clinical data.

Highlight Technical Skills: In your CV, emphasise your technical expertise, particularly in Python, Databricks, and data engineering frameworks. Provide specific examples of projects where you successfully processed large datasets or developed analysis-ready datasets.

Showcase Collaboration Experience: Since the role involves collaboration with R&D teams, include instances in your application where you've worked effectively in a team setting. Highlight your communication skills and how you've translated business needs into technical requirements.

Demonstrate an Agile Mindset: Mention any experience you have with agile methodologies in your application. Discuss how you've delivered prototypes quickly and iterated based on feedback, as this is a key aspect of the role.

How to prepare for a job interview at GlaxoSmithKline

✨Showcase Your Technical Skills

Make sure to highlight your experience with data curation tools and programming languages like Python, Databricks, and PySpark. Be prepared to discuss specific projects where you've successfully handled large datasets and how you ensured they were analysis-ready.

✨Understand the Business Context

Familiarise yourself with GSK's Disease Area Strategies and how data curation supports R&D priorities. This will help you articulate how your skills can directly contribute to their goals during the interview.

✨Prepare for Scenario-Based Questions

Expect questions that assess your problem-solving abilities in real-world scenarios. Think of examples where you've had to wrangle or harmonise data under tight deadlines, and be ready to explain your thought process.

✨Emphasise Collaboration and Communication

Since the role involves working closely with R&D teams, demonstrate your ability to communicate complex technical concepts to non-technical stakeholders. Share examples of how you've successfully collaborated in cross-functional teams.
