Data Engineer - Research in London

London | Full-Time | £60,000 - £80,000 / year (est.) | No working from home possible
Gravity Engineering Services Pvt Ltd.

At a Glance

  • Tasks: Join a dynamic team to build and optimise data pipelines for cutting-edge machine learning models.
  • Company: Innovative research-focused company at the forefront of AI technology.
  • Benefits: Competitive salary, flexible working hours, and opportunities for professional growth.
  • Other info: Collaborative environment with a focus on innovation and career advancement.
  • Why this job: Contribute to groundbreaking projects in Generative AI and make a real impact.
  • Qualifications: Experience with large-scale data workloads and proficiency in Python and AWS.

The predicted salary is between £60,000 and £80,000 per year.

About the Role

We are looking for a talented Data Engineer with a focus on scaling efficient distributed workloads. You will work alongside a growing multidisciplinary team of research scientists and machine learning engineers to improve the efficiency of our models and scale them. In this role, you will contribute to groundbreaking projects such as training the largest open language models, and you will be responsible for ensuring data is collected, processed and utilised in the right way.

Responsibilities

  • Clean, normalize, and preprocess data in a scalable, parallelizable way to prepare it for ingestion into our machine learning model training pipelines while ensuring data quality.
  • Build and maintain highly scalable distributed workloads.
  • Build data pipelines to ingest and process data (e.g. images and text) for feeding into ML models.
  • Manage AWS resources.
  • Keep up to date with methods for improving data quality and curating data for image, video, and large language models (LLMs), among others.

Qualifications

  • Proven background in large-scale distributed workloads.
  • Experience with large-scale data loading for machine learning training runs.
  • Experience with cloud storage and file systems. AWS (S3) is strongly preferred, but we are open to other cloud platforms.
  • Experience with Python and PyTorch.
  • Experience with multiprocessing and multithreading in Python workloads.
  • Excellent communication skills to effectively collaborate with users, solve issues, and provide guidance.
  • Attention to detail and the ability to document processes and solutions effectively.
  • Strong interest in Generative AI.
  • Experience working on machine learning projects, ideally with some deep learning or computer vision knowledge.
  • Experience with the data loading stack (webdataset, torchdata, fsspec, AIStore) and parallel dataframe manipulation using PySpark or Ray is a plus.

Employer: Gravity Engineering Services Pvt Ltd.

Join a pioneering team at the forefront of AI research, where your contributions as a Data Engineer will directly impact the development of cutting-edge language models. Our collaborative work culture fosters innovation and growth, providing ample opportunities for professional development while working in a vibrant location that encourages creativity and teamwork. With a focus on employee well-being and a commitment to excellence, we offer a rewarding environment for those looking to make a meaningful difference in the tech landscape.

Contact Details:

Gravity Engineering Services Pvt Ltd. Recruitment Team

Key skills for the Data Engineer - Research role in London

Data Engineering
Distributed Workloads
Data Quality Management
Data Preprocessing
AWS (S3)
Python
PyTorch