At a Glance
- Tasks: Design and maintain scalable data pipelines and optimise AWS-based data infrastructure.
- Company: Join a forward-thinking tech company at the forefront of AI innovation.
- Benefits: Enjoy competitive pay, flexible work options, and opportunities for professional growth.
- Other info: Collaborative environment with endless learning and career advancement opportunities.
- Why this job: Make an impact in the exciting world of GenAI and data engineering.
- Qualifications: Strong skills in PySpark, SQL, Python, and AWS services required.
The predicted salary is between £60,000 and £80,000 per year.
Your responsibilities:
- Design and maintain scalable data pipelines using PySpark, Python, and distributed computing frameworks to support high‑volume data processing.
- Architect and optimise AWS-based data and AI infrastructure, ensuring secure, performant, and cost‑efficient ingestion, transformation, and storage.
- Develop, fine-tune, benchmark, and evaluate GenAI/LLM models, including custom training and inference optimisation.
- Implement and maintain RAG pipelines, vector databases, and document-processing workflows for enterprise GenAI applications.
- Build reusable frameworks for prompt management, evaluation, and GenAI operations.
- Collaborate with cross-functional teams to integrate GenAI capabilities into production systems and ensure high-quality data, governance, and operational reliability.
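The RAG and vector-database work described above boils down to an embed, store, retrieve loop. Below is a minimal, self-contained sketch of the retrieval step only; the `embed` function is a deterministic bag-of-words stand-in invented for illustration, where a real pipeline would call a hosted embedding model (for example via Amazon Bedrock) and a proper vector database:

```python
import math

# Toy in-memory "vector store" illustrating the retrieval step of a RAG
# pipeline: chunks are embedded, stored, and the top-k most similar chunks
# are fetched for a query. embed() is a deterministic stand-in (a bucketed
# bag-of-words), NOT a real embedding model.

def embed(text: str, dims: int = 128) -> list[float]:
    """Stand-in embedding: bucket tokens by character-code sum, then normalise."""
    vec = [0.0] * dims
    for token in text.lower().split():
        vec[sum(ord(c) for c in token) % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    def __init__(self) -> None:
        self.rows: list[tuple[str, list[float]]] = []

    def add(self, chunk: str) -> None:
        self.rows.append((chunk, embed(chunk)))

    def top_k(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self.rows, key=lambda r: cosine(q, r[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

store = VectorStore()
for chunk in ["Glue jobs transform raw S3 data",
              "EMR runs distributed PySpark clusters",
              "Invoices are stored as PDF documents"]:
    store.add(chunk)

# The retrieved context would then be stitched into the LLM prompt.
context = store.top_k("how do we run PySpark at scale?", k=1)
```

In a production system the store would be a managed vector database and the chunks would come from the document-processing workflow; the loop's shape stays the same.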
Your Profile
Essential skills/knowledge/experience:
- Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
- Strong SQL expertise including star/snowflake schema design, indexing strategies, writing optimized queries, and implementing CDC, SCD Type 1/2/3 patterns for reliable data warehousing.
- Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
- Hands‑on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
- Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
- Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
- Experience working with structured and unstructured datasets (documents, logs, text, images).
- Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
- Understanding of model optimisation techniques (quantisation, distillation, inference optimisation).
- Strong capability to debug, tune, and optimise distributed systems and AI pipelines.
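As a concrete illustration of the SCD Type 2 pattern listed above (history is preserved by expiring the old dimension row and inserting a new current version, rather than overwriting in place as Type 1 does), here is a minimal pure-Python sketch. The column layout and the `apply_scd2` helper are invented for the example; a warehouse would typically express this as a SQL MERGE or a Delta Lake merge:

```python
from datetime import date

# Toy SCD Type 2 update: for each changed record, close out the current
# dimension row (set valid_to and clear the current flag) and append a
# new "current" version, so the full attribute history is queryable.

def apply_scd2(dim_rows, changes, as_of):
    """Apply change-capture records to a Type 2 dimension table.

    dim_rows: list of dicts with keys key, city, valid_from, valid_to, current
    changes:  list of (key, new_city) changed source records
    """
    for key, new_city in changes:
        for row in dim_rows:
            if row["key"] == key and row["current"] and row["city"] != new_city:
                # Expire the old version...
                row["valid_to"] = as_of
                row["current"] = False
                # ...and append the new current version.
                dim_rows.append({"key": key, "city": new_city,
                                 "valid_from": as_of, "valid_to": None,
                                 "current": True})
                break
    return dim_rows

dim = [{"key": 1, "city": "Leeds", "valid_from": date(2023, 1, 1),
        "valid_to": None, "current": True}]
dim = apply_scd2(dim, [(1, "London")], as_of=date(2024, 6, 1))
# dim now holds two rows for key 1: an expired Leeds row and a current London row.
```

Type 1 would simply overwrite `city`; Type 3 would keep a single `previous_city` column instead of full history.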
Desirable skills/knowledge/experience: PySpark, Python, SQL, AWS, GenAI
Data Engineer employer: Webologix Ltd/ INC
Contact Detail:
Webologix Ltd/ INC Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Data Engineer role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your data pipelines, models, and any projects you've worked on. This is your chance to demonstrate your expertise in PySpark, Python, and AWS – make it shine!
✨Tip Number 3
Prepare for interviews by brushing up on your technical knowledge and problem-solving skills. Practice coding challenges and be ready to discuss your experience with ETL/ELT pipelines and GenAI models. Confidence is key!
✨Tip Number 4
Don’t forget to apply through our website! We’re always on the lookout for talented Data Engineers. Keep an eye on our job listings and get your application in – we’d love to see what you can bring to the team!
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your experience with PySpark, Python, and AWS. We want to see how your skills match the role, so don’t be shy about showcasing your relevant projects and achievements!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about data engineering and how your background aligns with the needs of the role. Keep it engaging and personal!
Showcase Your Projects: If you've worked on any cool data pipelines or GenAI models, make sure to mention them! We love seeing practical examples of your work, especially if they demonstrate your problem-solving skills and creativity.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates from our team!
How to prepare for a job interview at Webologix Ltd/ INC
✨Know Your Tech Inside Out
Make sure you brush up on your PySpark, Python, and AWS skills. Be ready to discuss specific projects where you've designed data pipelines or optimised infrastructure. The more detailed examples you can provide, the better!
✨Showcase Your Problem-Solving Skills
Prepare to talk about challenges you've faced in data engineering, especially with large-scale ETL/ELT processes. Think of a couple of scenarios where you had to debug or optimise a system and how you approached it.
✨Understand GenAI and LLMs
Since this role involves working with GenAI models, be prepared to discuss your experience with model creation, fine-tuning, and evaluation. Bring examples of how you've implemented RAG architectures or worked with vector databases.
✨Collaboration is Key
This position requires working with cross-functional teams, so be ready to share experiences where you collaborated with others. Highlight how you ensured high-quality data and operational reliability in those projects.