Bioinformatics Data Engineer (RNA Resources) in Cambridge

Cambridge | Full-Time | £39,636 / year (est.) | Home office (partial)

At a Glance

  • Tasks: Develop and optimise data pipelines for RNA databases, enhancing performance and scalability.
  • Company: Join a leading bioinformatics team at EMBL-EBI, shaping RNA research globally.
  • Benefits: Hybrid working, competitive salary, and generous benefits package.
  • Other info: Exciting opportunities for career growth and outreach to the scientific community.
  • Why this job: Make a real impact in RNA science while collaborating with top experts.
  • Qualifications: Master’s degree in a relevant field and proficiency in Python and bioinformatics tools.

The predicted salary is £39,636 per year.

About the Team

Rfam and RNAcentral are key resources for RNA biology that serve tens of thousands of users every year and are widely cited in the scientific literature. We are recruiting a Bioinformatics Data Engineer to develop and maintain both databases, which are currently funded by the BBSRC and Wellcome. The RNA Resources team is part of the Sequence Families group led by Alex Bateman. You will report to the Project Leader for RNA Resources and work closely with an RNA bioinformatician, two full-stack software developers, and an Rfam biocurator.

Your role

As a Bioinformatics Data Engineer, you will run, maintain and optimise our data pipelines, ensuring efficient data processing, storage and retrieval for Rfam and RNAcentral. You will work closely with cross-functional teams to analyse requirements, propose new data pipeline architectures, and implement solutions to improve performance and scalability. The tasks will include:

  • Analysing existing data curation and data production pipelines and identifying areas for improvement, optimisation, and scalability.
  • Modernising and containerising Rfam curation pipelines, and implementing human-in-the-loop, AI-assisted agentic curation.
  • Developing and scaling LLM pipelines used in RNAcentral for literature summarisation and curation.
  • Developing scalable workflows for ncRNA annotation in genomes.
  • Documenting data pipelines, processes, and workflows for internal reference and knowledge sharing.
  • Participating in RNAcentral and Rfam data releases.

You will also be responsible for outreach to the scientific community through presentations at major conferences such as the RNA Society Annual Meeting and ISMB. Additionally, you will present at the RNAcentral consortium meetings and Scientific Advisory Board meetings, gathering regular feedback from community members. Finally, you are expected to keep up to date with the latest developments in RNA science to ensure the resources continue to provide our diverse users with valuable data and analysis. You should be passionate about RNA science and want to help the scientific community. RNAcentral and Rfam are widely used resources, and this role offers the opportunity to shape the work of many RNA researchers worldwide.

You have

  • Master’s level or equivalent qualification in a computational, biological or related scientific discipline.
  • Proficiency in Python and other relevant languages for bioinformatics tool development.
  • Experience with relational databases (PostgreSQL, MySQL) and SQL, including knowledge of database architecture, performance tuning, partitioning strategies, indexing techniques, and query optimisation.
  • Demonstrated track record of developing and maintaining production bioinformatics pipelines with workflow management systems such as Nextflow or Snakemake.
  • Experience building applications with LLMs and other AI technologies.
  • Familiarity with Docker or other containerisation technologies, such as Singularity.
  • Comfortable using Git/GitHub, Unix, and Bash.
  • Experience using AI-assisted coding tools.
  • Ability to apply best-practice software development methodologies.
  • Strong communication skills.

You may also have

  • Knowledge of RNA biology and/or demonstrable practical experience with Rfam, Infernal, R-scape and tools for secondary structure prediction.
  • Familiarity with gene annotation or genome feature representation.
  • Experience with high-performance computing environments such as Slurm.
  • Experience in planning and executing data migration projects, including downtime management, data consistency verification, and rollback strategies.
  • Experience with AI workflow libraries such as LangChain and LangGraph.
  • Experience with Kubernetes and cloud infrastructure platforms such as OpenStack.
  • Experience with the Rust programming language.

Other information

Hybrid Working: At EMBL-EBI we are pleased to offer hybrid working options for all our employees. You would be required to work 2 days from the office in Hinxton (currently this is Monday and Tuesday), with the flexibility to come on site more often if preferred.

Interviews

We plan to hold introductory meetings with selected candidates remotely, starting in early July. Following this, we plan to hold panel interviews remotely in mid-July.

Application instructions

To apply, please include both a CV and a tailored cover letter. Applications submitted without both documents will not be considered.

Contract

Contract length: 3 years (grant-based contract).

Salary

Salary: Grade 5, starting at £3,303 per month after tax but excluding pension and insurance contributions, plus generous benefits.

Bioinformatics Data Engineer (RNA Resources) in Cambridge employer: Sebibc

At EMBL-EBI, we pride ourselves on being an exceptional employer, offering a collaborative and innovative work culture that fosters professional growth and development. As a Bioinformatics Data Engineer in Hinxton, you will have the unique opportunity to contribute to vital RNA resources while enjoying hybrid working options, competitive salary, and generous benefits that support your well-being and career aspirations.

Contact Details:

Sebibc Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Bioinformatics Data Engineer (RNA Resources) role in Cambridge

Tip Number 1

Network like a pro! Reach out to people in the RNA and bioinformatics community on LinkedIn or at conferences. A friendly chat can lead to opportunities that aren’t even advertised yet.

Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those related to bioinformatics pipelines or AI technologies. This gives potential employers a taste of what you can do.

Tip Number 3

Prepare for interviews by brushing up on common bioinformatics questions and be ready to discuss your experience with tools like Nextflow or Snakemake. Confidence is key!

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, we love seeing candidates who are genuinely interested in our work.

We think you need these skills to ace the Bioinformatics Data Engineer (RNA Resources) role in Cambridge

Python
Bioinformatics Tool Development
Relational Databases (PostgreSQL, MySQL)
SQL
Workflow Management Systems (Nextflow, Snakemake)
LLM Pipelines Development
Docker

Some tips for your application 🫡

Tailor Your Cover Letter: Make sure to customise your cover letter for the Bioinformatics Data Engineer role. Highlight your experience with data pipelines and any relevant projects you’ve worked on that align with Rfam and RNAcentral’s goals.

Show Off Your Skills: In your CV, don’t just list your qualifications; show us how you’ve used your skills in Python, SQL, and bioinformatics tools. Include specific examples of projects or tasks where you made a significant impact.

Be Clear and Concise: Keep your application clear and to the point. Use bullet points for easy reading and make sure to address all the key requirements mentioned in the job description. We want to see your strengths shine through!

Apply Through Our Website: Don’t forget to submit your application through our website! It’s the best way for us to receive your documents and ensures you’re considered for the role. We can’t wait to see what you bring to the table!

How to prepare for a job interview at Sebibc

Know Your RNA

Make sure you brush up on your RNA biology knowledge. Familiarise yourself with Rfam and RNAcentral, as well as the latest developments in RNA science. This will not only help you answer questions confidently but also show your passion for the field.

Showcase Your Technical Skills

Be prepared to discuss your experience with Python, SQL, and bioinformatics tools. Have examples ready that demonstrate your proficiency in developing and maintaining production pipelines, especially using workflow management systems like Nextflow or Snakemake.

Prepare for Problem-Solving Questions

Expect questions that assess your ability to analyse and optimise data pipelines. Think of specific challenges you've faced in previous roles and how you approached them. This will highlight your problem-solving skills and your ability to work collaboratively with cross-functional teams.

Communicate Clearly

Strong communication skills are essential for this role. Practice explaining complex technical concepts in simple terms, as you may need to present your ideas to non-technical stakeholders. Being able to articulate your thoughts clearly will set you apart from other candidates.