Data Engineer

Full-Time, £39,636 per year (est.), hybrid working

At a Glance

  • Tasks: Optimise data pipelines and enhance data processing for vital biological resources.
  • Company: Join EMBL-EBI, a leader in biological data storage and analysis.
  • Benefits: Enjoy flexible working, generous leave, and private medical insurance.
  • Other info: Hybrid working options and a collaborative, inclusive culture await you.
  • Why this job: Make a real impact on global health and biodiversity through data engineering.
  • Qualifications: MSc in IT or bioinformatics, with strong SQL and Python skills.

The predicted salary is £39,636 per year.

About the Team

The Velankar team maintains macromolecular structure databases that form essential resources for biologists and other life scientists worldwide. PDBe is a founding partner of the Worldwide Protein Data Bank organisation, which maintains the Protein Data Bank (PDB), the global archive of 3D structural data on macromolecules. The PDBe team also develops the PDBe Knowledge Base (PDBe-KB) and the AlphaFold Protein Structure Database (AFDB). The PDBe team is international and interdisciplinary, and consists of expert data curators, bioinformaticians, scientific software developers and IT specialists.

Your role

We seek a skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will play a crucial role in optimising and enhancing our data pipelines, ensuring efficient data processing, storage and retrieval. You will work closely with cross-functional teams to analyse requirements, propose new data pipeline architectures, and implement solutions to improve performance and scalability.

The tasks for this post include the following:

  • Analyse existing data pipelines and identify areas for improvement, optimisation, and scalability.
  • Work closely with Bioinformaticians and annotators to integrate data pipelines with existing systems and applications.
  • Monitor data pipeline performance, troubleshoot issues, and implement solutions to ensure reliability and efficiency.
  • Stay current with industry trends and best practices in data engineering and recommend new technologies or tools to enhance data infrastructure.
  • Document data pipelines, processes, and workflows for internal reference and knowledge sharing.

Join us in shaping the future of structural biology data. In this role, you’ll use your IT skills and creative ideas to support and scale vital resources like the PDB, PDBe, PDBe-KB and AFDB—ensuring they remain robust, sustainable, and ready for tomorrow’s scientific challenges.

You have

  • MSc in computer science, IT or a related field, or in bioinformatics with demonstrated IT expertise.
  • Expertise in data modelling and advanced SQL.
  • Proficiency in Python programming.
  • Proficiency in ETL (Extract, Transform, Load) processes and tools for large-scale data processing.
  • Strong understanding of relational databases.
  • Hands-on experience across multiple RDBMS platforms:
    • PostgreSQL: Deep knowledge of PostgreSQL database architecture, performance tuning, partitioning strategies, indexing techniques, and query optimisation.
    • Oracle: Extensive experience with Oracle databases, including PL/SQL, Oracle-specific features, and performance optimisation.
    • MySQL/MariaDB: Familiarity with alternative RDBMS platforms for data migration and compatibility scenarios.
  • Experience with database migration.
  • Proven experience in migrating databases between different RDBMS platforms, specifically:
    • Oracle to PostgreSQL migration: Hands-on experience with Oracle to PostgreSQL migration projects, including understanding of compatibility layer (pg_proguard), data type mapping, stored procedure conversion, trigger migration, and handling Oracle-specific features in PostgreSQL.
    • Data migration best practices: Experience with migration tools such as Oracle Data Pump, GoldenGate, custom ETL scripts, and data validation strategies.
    • Migration planning: Ability to plan and execute migration projects, including downtime management, data consistency verification, and rollback strategies.
    • Cross-platform optimisation: Knowledge of leveraging PostgreSQL features to improve performance during migration scenarios.
  • Proficiency in data warehousing (Redshift, BigQuery).
  • Strong communication and collaboration skills, with the ability to work effectively in a team environment.
  • Proficiency in oral and written English.

You might also have

  • PhD in computer science, IT or a related field, or in bioinformatics with demonstrated IT expertise.
  • Experience in big data technologies and frameworks, such as Apache Spark, Hadoop or similar platforms.
  • Hands-on experience with CI/CD (GitLab CI/GitHub Actions).
  • Familiarity with Java.
  • Familiarity with Google Cloud Platform or AWS.
  • Familiarity with data modelling techniques for AI (Artificial Intelligence) and ML (Machine Learning) applications.
  • Familiarity with Neo4J or other graph databases is an added advantage.
  • Familiarity with data visualisation (Tableau, PowerBI).
  • Knowledge of, or affinity with, structural biology and bioinformatics.
  • Experience working in international teams.

Other helpful information

  • Hybrid Working: At EMBL-EBI we are pleased to offer hybrid working options for all our employees. A dedicated desk will be available every day, but our team work two days on site and three from home.
  • Interviews: We plan to hold first-round introductory technical meetings with selected candidates remotely, starting in mid-April 2026.
  • Contract length: Grant-based contract for 3 years.
  • Salary: Grade 5 monthly salary starting at £3,303 per month after tax but excluding pension and insurance contributions. Plus generous benefits.

Why join us

Do something meaningful. At EMBL-EBI you can apply your talent and passion to accelerate science and tackle some of humankind's greatest challenges. EMBL-EBI, part of the European Molecular Biology Laboratory, is a worldwide leader in the storage, analysis and dissemination of large biological datasets. We provide the global research community with access to publicly available databases and tools which are crucial for the advancement of healthcare, food security, and biodiversity.

Join a culture of innovation. We are located on the Wellcome Genome Campus, alongside other prominent research and biotech organisations, and surrounded by beautiful Cambridgeshire countryside. This is a highly collaborative and inclusive community where our employees enjoy a relaxed atmosphere. We are committed to ensuring our employees feel valued, supported and empowered to reach their professional potential.

Enjoy lots of benefits:

  • Financial incentives: Monthly family, child and non-resident allowances, annual salary review, pension scheme, death benefit, long-term care, accident-at-work and unemployment insurances.
  • Flexible working arrangements - including hybrid working patterns.
  • Private medical insurance for you and your immediate family (including all prescriptions and generous dental & optical cover).
  • Generous time off: 30 days annual leave per year, in addition to public holidays.
  • Relocation package including installation grant (if required).
  • Campus life: Free shuttle bus to and from work, on-site library, subsidised on-site gym and cafeteria, casual dress code, extensive sports and social club activities (on campus and remotely).
  • Family benefits: On-site nursery, 10 days of child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances.
  • Benefits for non-UK residents: Visa exemption, education grant for private schooling, financial support to travel back to your home country every second year and a monthly non-resident allowance.

For detailed information please visit our employee benefits page here.

What else you need to know

  • International applicants: We recruit internationally and successful candidates are offered visa exemptions. Please take a look at our International Applicants page for further information.
  • EMBL is a signatory of DORA. Find out how we apply DORA principles to our recruitment and performance assessment processes here.
  • Diversity and inclusion: At EMBL, we believe that diverse teams drive innovation and scientific excellence. We encourage applications from candidates of all genders, identities, nationalities and/or any other diverse backgrounds.

How to apply: To apply please submit a cover letter and a CV through our online system. Applications will close at 23:59 CET on the date shown below. We aim to provide a response within two weeks after the closing date.

Closing Date: 06/05/2026

Data Engineer employer: European Molecular Biology Laboratory

At EMBL-EBI, we pride ourselves on being an exceptional employer, offering a dynamic and inclusive work culture that fosters innovation and collaboration. Located on the picturesque Wellcome Genome Campus in Cambridgeshire, our team enjoys flexible hybrid working arrangements, generous benefits including private medical insurance and extensive annual leave, and ample opportunities for professional growth in the field of structural biology. Join us to make a meaningful impact in the scientific community while enjoying a supportive environment that values your contributions.

Contact Details:

European Molecular Biology Laboratory Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Data Engineer role

Tip Number 1

Network like a pro! Reach out to current employees at EMBL-EBI or similar organisations on LinkedIn. A friendly chat can give you insider info and might just get your foot in the door.

Tip Number 2

Prepare for those technical interviews! Brush up on your SQL, Python, and data pipeline skills. Practising common interview questions can help you feel more confident when it’s your turn to shine.

Tip Number 3

Show off your projects! If you've worked on any relevant data engineering projects, make sure to discuss them during interviews. Real-world examples can really set you apart from other candidates.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team.

We think you need these skills to ace the Data Engineer role

Data Modelling
Advanced SQL
Python Programming
ETL Processes
Relational Databases
PostgreSQL
Oracle PL/SQL

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Data Engineer role. Highlight your experience with data pipelines, SQL, and any relevant projects that showcase your skills. Show how you can contribute to the team!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about data engineering and how your background aligns with the mission of EMBL-EBI. Keep it engaging and personal.

Showcase Your Technical Skills: Don’t hold back on showcasing your technical expertise! Mention your proficiency in Python, ETL processes, and any experience with database migration. Employers love seeing candidates who are technically savvy and ready to tackle challenges.

Apply Through Our Website: Remember to apply through our website! It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy to do: just follow the prompts!

How to prepare for a job interview at European Molecular Biology Laboratory

Know Your Data Pipelines

Before the interview, brush up on your understanding of data pipelines. Be ready to discuss how you've optimised or enhanced data processing in previous roles. Think about specific examples where you identified bottlenecks and implemented solutions.
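If you want a concrete example to talk through, here is a minimal Python sketch of one way to find a bottleneck: time each pipeline stage and see which is slowest. The stage functions (`extract`, `transform`, `load`) and the helper `time_stages` are hypothetical placeholders for illustration, not part of any EMBL-EBI pipeline.

```python
import time

def time_stages(stages, data):
    """Run each (name, function) stage in order and record its duration in seconds."""
    timings = {}
    for name, func in stages:
        start = time.perf_counter()
        data = func(data)  # output of one stage feeds the next
        timings[name] = time.perf_counter() - start
    return data, timings

# Hypothetical toy stages standing in for real extract/transform/load steps.
def extract(_):
    return list(range(1000))

def transform(rows):
    return [r * r for r in rows]

def load(rows):
    return sum(rows)

stages = [("extract", extract), ("transform", transform), ("load", load)]
result, timings = time_stages(stages, None)

# The stage with the largest recorded time is the first candidate to optimise.
slowest = max(timings, key=timings.get)
print(f"slowest stage: {slowest}")
```

In a real pipeline you would more likely rely on database query plans and monitoring tools, but being able to explain this kind of measurement step shows how you approach optimisation.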

Showcase Your Technical Skills

Make sure to highlight your expertise in SQL, Python, and ETL processes during the interview. Prepare to answer technical questions or even solve problems on the spot. Practising coding challenges related to data engineering can give you a leg up.
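As a practice exercise, you could sketch a data consistency check of the kind used after a database migration: compare row counts and a content digest between a source and a target table. The example below is only an illustration under stated assumptions. It uses in-memory SQLite databases as stand-ins for the two systems, and the table name `entries` and the helpers `table_fingerprint` and `make_db` are made up for the exercise.

```python
import hashlib
import sqlite3

def table_fingerprint(conn, table, key):
    """Return (row_count, SHA-256 hex digest) for a table's rows ordered by key."""
    # Identifiers are interpolated into the SQL, so pass only trusted names.
    rows = conn.execute(f"SELECT * FROM {table} ORDER BY {key}").fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()

def make_db(rows):
    """Create an in-memory database with a hypothetical 'entries' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO entries VALUES (?, ?)", rows)
    return conn

source = make_db([(1, "alpha"), (2, "beta"), (3, "gamma")])
target_ok = make_db([(3, "gamma"), (1, "alpha"), (2, "beta")])   # same data, different insert order
target_bad = make_db([(1, "alpha"), (2, "BETA"), (3, "gamma")])  # one value changed

source_fp = table_fingerprint(source, "entries", "id")
matches_ok = source_fp == table_fingerprint(target_ok, "entries", "id")
matches_bad = source_fp == table_fingerprint(target_bad, "entries", "id")
print(matches_ok, matches_bad)
```

Between two different database systems, values would usually need normalising first (for example dates, numeric precision and empty strings versus NULL), because the same data can be returned in different Python types. Being ready to discuss those caveats is as useful as the code itself.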

Collaborate and Communicate

Since this role involves working closely with bioinformaticians and other teams, be prepared to discuss your collaboration experiences. Share examples of how you effectively communicated complex technical concepts to non-technical team members.

Stay Current with Industry Trends

Demonstrate your passion for data engineering by discussing recent trends or technologies you've been following. This could include advancements in big data frameworks or new tools that enhance data infrastructure. Showing that you're proactive about learning can set you apart.