Senior Data Engineer (AWS) — Metadata & Vector Embeddings

Full-Time | £80,000 - £100,000 / year (est.) | No working from home possible
Zensar Technologies

At a Glance

  • Tasks: Lead the design and implementation of advanced data engineering solutions on AWS.
  • Company: Join a cutting-edge tech firm focused on AI/ML innovations.
  • Benefits: Attractive salary, flexible working options, and opportunities for professional growth.
  • Other info: Dynamic team environment with a focus on innovation and collaboration.
  • Why this job: Be at the forefront of data engineering and shape the future of AI technologies.
  • Qualifications: 7+ years of hands-on AWS data engineering experience and strong Python skills.

The predicted salary is between £80,000 and £100,000 per year.

This role is the overall technical lead and architect. You will design the metadata schema, build the simulation onboarding pipeline, deploy the metadata embedding pipeline and the OpenSearch k-NN vector store, and author the data export format specification for the AI/ML use case. It is the deepest technical seat on the engagement.

Key responsibilities on this engagement:

  • Run the Sprint 1 architecture review of the existing UAT codebase (S3 + Glue + S3 Tables + OpenSearch + Athena) and deliver written gap findings.
  • Design the metadata schema, taxonomy, and field catalogue (Light, Brain, Power).
  • Tune data orchestration — Glue jobs, Athena queries, S3 Tables config, scheduling.
  • Lead the deep‑dive technical sessions with analysts on visualization requirements.
  • Build and validate the simulation data onboarding pipeline against real data — including the 30 GB‑per‑run acoustic spectra dataset.
  • Configure and validate the OpenSearch k-NN vector store and the Bedrock embedding pipeline.
  • Author the AI/ML data export format specification and the AI onboarding pattern document.
  • Co‑design the API middleware blueprint with the Cloud Infrastructure Architect.

Must Have:

  • Principal‑level hands‑on data engineering on AWS — 7+ years.
  • Deep production experience with S3, S3 Tables, Glue, Athena, and OpenSearch.
  • Built and shipped vector embedding workloads.
  • Strong metadata modelling and data taxonomy design experience for scientific data.
  • Comfort working with Parquet, JSON‑LD, and large binary scientific data formats (mesh, time‑series, spectra).
  • Python proficiency; PySpark / Glue job tuning experience.

Nice‑to‑have / differentiators:

  • Familiarity with surrogate model training data pipelines.
  • Experience with SageMaker Unified Studio or comparable governed data‑mesh tooling (in case integration is required).
  • Published or contributed to AWS data architecture patterns or blueprints.

Employer: Zensar Technologies

As a Senior Data Engineer (AWS) at our company, you will be part of a dynamic and innovative team that values technical excellence and collaboration. We offer a supportive work culture that encourages continuous learning and professional growth, with access to cutting-edge technologies and projects that make a real impact in the field of AI/ML. Located in a vibrant tech hub, our employees enjoy a flexible work environment, competitive benefits, and opportunities to engage in meaningful work that drives advancements in data engineering.

Contact Details:

Zensar Technologies Recruitment Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Data Engineer (AWS) role

Tip Number 1

Network like a pro! Reach out to your connections in the data engineering field, especially those who work with AWS. A friendly chat can lead to insider info about job openings that aren't even advertised yet.

Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those involving S3, Glue, and OpenSearch. This will give potential employers a taste of what you can do and set you apart from the crowd.

Tip Number 3

Prepare for technical interviews by brushing up on your Python and data orchestration skills. Practice common data engineering problems and be ready to discuss your past experiences with metadata modelling and vector embeddings.

Tip Number 4

Don't forget to apply through our website! We love seeing candidates who are genuinely interested in joining our team. Plus, it makes it easier for us to keep track of your application and get back to you quickly.

We think you need these skills to ace the Senior Data Engineer (AWS) role

AWS
Metadata Schema Design
Simulation Onboarding Pipeline Development
OpenSearch k-NN Vector Store Configuration
Data Export Format Specification for AI/ML
S3
Glue

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that match the Senior Data Engineer role. Highlight your hands-on experience with AWS, S3, Glue, and OpenSearch, as well as any relevant projects you've worked on.

Craft a Compelling Cover Letter: Use your cover letter to tell us why you're the perfect fit for this role. Share specific examples of your work with metadata schemas and data orchestration, and how they relate to the needs of the role at Zensar Technologies.

Showcase Your Technical Skills: Don’t shy away from showcasing your technical prowess! Mention your proficiency in Python, PySpark, and any experience with vector embedding workloads. We want to see your deep production experience shine through.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy!

How to prepare for a job interview at Zensar Technologies

Know Your Tech Inside Out

Make sure you’re well-versed in the technologies mentioned in the job description, especially AWS services like S3, Glue, and OpenSearch. Brush up on your Python and PySpark skills, as you'll likely be asked to demonstrate your technical expertise during the interview.

Prepare for Deep-Dive Discussions

Since this role involves leading technical sessions, be ready to discuss your past experiences in detail. Think of specific projects where you designed metadata schemas or built data pipelines, and be prepared to explain your thought process and the challenges you faced.

Showcase Your Problem-Solving Skills

Expect questions that assess your ability to identify gaps in existing systems. Prepare examples of how you've conducted architecture reviews or optimised data orchestration in previous roles. Highlight your analytical skills and how you approach problem-solving.

Familiarise Yourself with AI/ML Use Cases

Since the role involves authoring data export formats for AI/ML, brush up on relevant use cases and be ready to discuss how you would approach integrating data for these applications. Showing an understanding of the broader context will set you apart from other candidates.