At a Glance
- Tasks: Use data science to analyse and improve scholarly content and internal datasets.
- Company: PLOS is a nonprofit driving open science for meaningful change in research publishing.
- Benefits: Enjoy remote work, health insurance, paid vacation, and a 401k with employer match.
- Why this job: Join a mission-driven team, collaborate on innovative projects, and make a real impact in science.
- Qualifications: Master's degree in Data Science or related field preferred; experience in scientific publishing is a plus.
- Other info: This is a short-term, fully remote role with flexible working hours across time zones.
The predicted salary is between £78,000 and £108,000 per year.
The position is anticipated to be short-term and is not expected to extend beyond December 31, 2025. Please note that there is no guaranteed duration of the position, and your employment will be at will and for no fixed duration. As such, it can be terminated by you or the organization at any time, with or without notice or cause, for any reason not otherwise prohibited by law.
This position is fully remote/home based. Applications will be accepted from candidates based in the UK and the following US states: FL, MA, MD, NY, PA, TX, VA.
PLOS is a nonprofit organization on a mission to drive open science forward with measurable, meaningful change in research publishing, policy, and practice. We believe in a better future where science is open to all, for all.
Role Summary
Use data science to provide insight into the nature and structure of our data and content, both published content and internal data sets, and lead on developing models to improve processing, access, understanding and use of that data. Working closely with the Subject Matter Experts, Product Managers, Software Engineers and Product Designers, you will play a key role in improving understanding of our content and data, and in improving how we manage, process and use that data in support of PLOS’s goals. You will be tasked with the large-scale analysis of our broad and varied collection of scholarly content, which includes research articles and associated data sets, and line of business data and information. This will require working with structured and unstructured data, including a large corpus of scholarly articles, using programmatic techniques such as statistical analysis, natural language processing, information retrieval, and machine learning. You will also work with the rest of the team to turn your insights and software prototypes into production services that improve the utility of this data for both our end users and internal stakeholders.
Responsibilities
- Create and use machine learning models, statistical analysis, and natural language processing to improve scientific content workflows, enhance discoverability, and support Open Science initiatives.
- Collect, clean, and analyze large datasets of scientific content and related information from various sources, ensuring data quality and integrity.
- Build and test predictive models and machine learning algorithms for tasks such as entity extraction, workflow automation, and enhancing the understanding of scientific content.
- Visualize and present findings in a clear, concise, and compelling manner to both technical and non-technical audiences.
- Work as part of a cross-functional team, contributing insights, models and code and deploying production services that improve our use of data.
- Collaborate with editorial, marketing, and product teams, and with colleagues across PLOS, to understand data needs and translate business requirements into analytical solutions that enable new open science capabilities.
- Contribute to the development of data strategies and best practices within the organization and identify opportunities for workflow optimization and automation.
- Engage with the latest research and trends in data science, Open Science, and scholarly publishing, proactively identifying opportunities to apply innovative techniques and refine best practices.
- Consider the ethical implications of all data techniques as applied to our data, always ensuring that they are appropriate, take into account the potential for negative impact and do not bias research.
Qualifications
- Extensive experience in statistical modeling, machine learning, and data mining techniques, with a focus on applications in text analysis or scientific data, including knowledge of forecasting, A/B testing, entity extraction, and feature engineering.
- Proficiency in programming languages such as Python, R, and SQL, and data analysis libraries (e.g., Pandas, NumPy, SciPy, Tidyverse).
- Strong knowledge of machine learning frameworks and libraries (e.g., TensorFlow, PyTorch, scikit-learn, NLTK).
- Experience with NLP techniques, such as named entity recognition (NER), topic modeling, semantic similarity, and knowledge graph construction.
- Demonstrated ability to communicate complex technical findings clearly and effectively, both verbally and in writing, through reports and presentations to diverse audiences.
- Strong analytical and problem-solving skills, with a high degree of attention to detail and accuracy in handling scientific data.
- Experience working with large datasets and database systems, and ideally with scientific content repositories or publishing platforms.
- Familiarity with the scientific research environment, scholarly literature, and open science principles is an advantage.
- Ability to develop hypotheses based on quantitative and qualitative evidence.
- Experience with sound software development practices, such as Git and continuous integration (CI).
- Ability to work effectively both independently and collaboratively within a remote, agile team environment.
A Master's degree in a relevant field such as Data Science, Statistics, Computer Science, Bioinformatics, or a related quantitative discipline with a focus on scientific applications is preferred. Relevant work experience in a data science role within scientific publishing, research, or a related field is desirable.
Physical Requirements and Work Environment
Prolonged periods stationary at a desk and working on a computer. Some national and international travel will be required. Some flexibility to work across time zones.
The base salary range we’ve established for this position is (US) $105,000 - $145,000. PLOS also offers a comprehensive benefits package summarized below.
BENEFITS:
US: 401k with employer match. Employee sponsored health, dental and vision insurance (Dental and Vision 100% employer paid). Paid vacation, 12 public holidays and sick leave. Parental leave. Birthday and three winter holiday days off. Short-term and long-term disability insurance. 2 days paid time off for volunteering per year. Fully remote work environment with stipend on joining for home office.
Data Scientist (R&D Project) US-Remote employer: PLOS GmbH
Contact Detail:
PLOS GmbH Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land Data Scientist (R&D Project) US-Remote
✨Tip Number 1
Familiarise yourself with the latest trends in data science and Open Science. This will not only help you understand the role better but also allow you to engage in meaningful conversations during interviews, showcasing your passion for the field.
✨Tip Number 2
Network with professionals in the scientific publishing and data science communities. Attend relevant webinars or online meetups to connect with potential colleagues and learn more about the challenges they face, which can give you an edge in discussions.
✨Tip Number 3
Prepare to discuss specific projects where you've applied machine learning or NLP techniques. Be ready to explain your thought process, the challenges you faced, and how your contributions made a difference, as this will demonstrate your practical experience.
✨Tip Number 4
Showcase your ability to communicate complex data findings clearly. Practice explaining your past work to non-technical audiences, as this skill is crucial for collaborating with cross-functional teams at PLOS.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights relevant experience in data science, particularly in statistical modelling, machine learning, and natural language processing. Use keywords from the job description to demonstrate your fit for the role.
Craft a Compelling Cover Letter: In your cover letter, express your passion for open science and how your skills can contribute to PLOS's mission. Mention specific projects or experiences that align with the responsibilities outlined in the job description.
Showcase Your Technical Skills: Include a section in your application that details your proficiency in programming languages like Python and R, as well as any relevant libraries and frameworks. Provide examples of how you've applied these skills in previous roles.
Highlight Collaboration Experience: Since the role involves working with cross-functional teams, emphasise your experience collaborating with different departments. Share examples of how you’ve contributed to team projects and how you communicate complex findings to diverse audiences.
How to prepare for a job interview at PLOS GmbH
✨Showcase Your Technical Skills
Make sure to highlight your experience with statistical modelling, machine learning, and data mining techniques. Be prepared to discuss specific projects where you've applied these skills, especially in text analysis or scientific data.
✨Demonstrate Collaboration Experience
Since the role involves working closely with cross-functional teams, share examples of how you've successfully collaborated with others, such as product managers or software engineers, to achieve common goals.
✨Prepare for Data Visualisation Questions
Be ready to explain how you would visualise complex data findings for both technical and non-technical audiences. Consider bringing examples of past visualisations you've created to illustrate your points.
✨Understand Open Science Principles
Familiarise yourself with the principles of open science and be prepared to discuss how they relate to your work. Showing a genuine interest in the mission of the organisation can set you apart from other candidates.