At a Glance
- Tasks: Lead technical architecture and design data pipelines for AI/ML projects.
- Company: Innovative tech firm focused on cutting-edge data solutions.
- Benefits: Competitive salary, flexible work options, and opportunities for professional growth.
- Other info: Dynamic environment with opportunities to work on advanced technologies.
- Why this job: Join a team shaping the future of data engineering with impactful projects.
- Qualifications: 7+ years in data engineering on AWS with strong technical skills.
The predicted salary is between £80,000 and £100,000 per year.
Role summary: The overall technical lead and architect. Designs the metadata schema, builds the simulation onboarding pipeline, deploys the metadata embedding pipeline and OpenSearch k-NN vector store, and authors the data export format specification for the AI/ML use case. This role is the deepest technical seat on the engagement.
Key responsibilities
- Run the Sprint 1 architecture review of the existing UAT codebase (S3 + Glue + S3 Tables + OpenSearch + Athena) and deliver written gap findings.
- Design the metadata schema, taxonomy, and field catalogue (Light, Brain, Power).
- Tune data orchestration — Glue jobs, Athena queries, S3 Tables config, scheduling.
- Lead the deep-dive technical sessions with analysts on visualization requirements.
- Build and validate the simulation data onboarding pipeline against real data — including the 30 GB-per-run acoustic spectra dataset.
- Configure and validate the OpenSearch k-NN vector store and the Bedrock embedding pipeline.
- Author the AI/ML data export format specification and the AI onboarding pattern document.
- Co-design the API middleware blueprint with the Cloud Infrastructure Architect.
Must-have
- Principal-level hands-on data engineering on AWS — 7+ years.
- Deep production experience with S3, S3 Tables, Glue, Athena, and OpenSearch (including k-NN / vector search).
- Built and shipped vector embedding workloads.
- Strong metadata modelling and data taxonomy design experience for scientific or engineering domains.
- Comfort working with Parquet, JSON-LD, and large binary scientific data formats (mesh, time-series, spectra).
- Python proficiency; PySpark / Glue job tuning experience.
Nice-to-have / differentiators
- Prior simulation / CAE / HPC data lake experience (Ansys, Siemens NX, BETA CAE, OpenFOAM, etc.).
- Familiarity with surrogate model training data pipelines.
- Experience with SageMaker Unified Studio or comparable governed data-mesh tooling (in case integration is required).
- Multi-cloud data engineering (AWS and GCP) experience.
- Published or contributed to AWS data architecture patterns or blueprints.
Data Engineer in Cheltenham employer: Response Informatics
As a leading employer in the tech industry, we offer Data Engineers an exceptional opportunity to work at the forefront of data architecture and engineering. Our collaborative work culture fosters innovation and creativity, while our commitment to employee growth ensures that you will have access to continuous learning and development opportunities. Located in a vibrant tech hub, we provide a dynamic environment where your contributions directly impact cutting-edge AI/ML projects, making your work both meaningful and rewarding.
StudySmarter Expert Advice🤫
We think this is how you could land the Data Engineer role in Cheltenham
✨Tip Number 1
Network like a pro! Reach out to your connections in the data engineering field and let them know you're on the hunt for a role. Attend meetups or webinars related to AWS, OpenSearch, or data engineering to meet potential employers and learn about job openings.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects, especially those involving S3, Glue, and Athena. This will give you an edge during interviews and help demonstrate your hands-on experience with the technologies mentioned in the job description.
✨Tip Number 3
Prepare for technical interviews by brushing up on your Python and PySpark skills. Be ready to discuss your experience with metadata modelling and data taxonomy design, as these are crucial for the role. Practice coding challenges that focus on data engineering concepts.
✨Tip Number 4
Don't forget to apply through our website! We love seeing candidates who are genuinely interested in joining our team. Tailor your application to highlight your relevant experience with vector embedding workloads and any familiarity with simulation data lakes.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV is tailored to the Data Engineer role. Highlight your experience with AWS, S3, Glue, and OpenSearch, as these are key for us. Use specific examples that showcase your hands-on skills and achievements in data engineering.
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Tell us why you're passionate about data engineering and how your background aligns with our needs. Be sure to mention any relevant projects or experiences that demonstrate your expertise in metadata modelling and data taxonomy design.
Showcase Your Technical Skills: In your application, don't shy away from showcasing your technical skills. Mention your proficiency in Python, PySpark, and any experience with large scientific data formats. We love seeing candidates who can clearly articulate their technical capabilities!
Apply Through Our Website: We encourage you to apply through our website for the best chance of getting noticed. It helps us keep track of applications and ensures you’re considered for the role. Plus, it’s super easy to do!
How to prepare for a job interview at Response Informatics
✨Know Your Tech Inside Out
Make sure you’re well-versed in the technologies mentioned in the job description, especially AWS services like S3, Glue, and OpenSearch. Brush up on your Python skills and be ready to discuss your hands-on experience with vector embedding workloads.
✨Prepare for Technical Deep Dives
Expect to dive deep into technical discussions during the interview. Prepare to explain your past projects, particularly those involving metadata schema design and data orchestration. Be ready to showcase how you’ve tackled challenges in previous roles.
✨Showcase Your Problem-Solving Skills
Think of specific examples where you identified gaps in existing systems or processes. Be prepared to discuss how you approached these issues, particularly in relation to data pipelines and architecture reviews. This will demonstrate your analytical thinking and leadership capabilities.
✨Familiarise Yourself with the Company’s Domain
Research the company’s focus areas, especially if they relate to scientific or engineering domains. Understanding their specific needs can help you tailor your responses and show that you’re genuinely interested in how your skills can benefit them.