At a Glance
- Tasks: Design and maintain scalable data pipelines and optimise AWS-based data infrastructure.
- Company: Join a forward-thinking tech company focused on GenAI innovation.
- Benefits: Competitive salary, flexible work options, and opportunities for professional growth.
- Other info: Dynamic team environment with excellent career advancement opportunities.
- Why this job: Be at the forefront of AI technology and make a real impact in data engineering.
- Qualifications: Strong experience with PySpark, SQL, and AWS services required.
The predicted salary is between £60,000 and £80,000 per year.
Your responsibilities:
- Design and maintain scalable data pipelines using PySpark, Python, and distributed computing frameworks to support high‑volume data processing.
- Architect and optimize AWS-based data and AI infrastructure, ensuring secure, performant, and cost‑efficient ingestion, transformation, and storage.
- Develop, fine-tune, benchmark, and evaluate GenAI/LLM models, including custom training and inference optimization.
- Implement and maintain RAG pipelines, vector databases, and document-processing workflows for enterprise GenAI applications.
- Build reusable frameworks for prompt management, evaluation, and GenAI operations.
- Collaborate with cross-functional teams to integrate GenAI capabilities into production systems and ensure high-quality data, governance, and operational reliability.
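To make the RAG-pipeline responsibility above concrete, here is a minimal sketch of the retrieval step: ranking documents by cosine similarity between embedding vectors. It is a toy, library-free illustration (the tiny 3-dimensional vectors and document names are invented for the example); a production pipeline would use a real embedding model and a vector database.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_store, top_k=2):
    """Return the top_k document ids most similar to the query vector.

    doc_store maps doc_id -> embedding vector; in a real RAG pipeline
    these embeddings would come from a vector database.
    """
    scored = sorted(
        doc_store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:top_k]]

# Toy 3-dimensional "embeddings" standing in for real model output.
docs = {
    "invoice_faq": [0.9, 0.1, 0.0],
    "hr_policy":   [0.0, 1.0, 0.1],
    "api_guide":   [0.1, 0.0, 0.95],
}
print(retrieve([1.0, 0.0, 0.1], docs, top_k=1))  # → ['invoice_faq']
```

The retrieved document text would then be stitched into the LLM prompt alongside the user's question, which is the "augmented generation" half of RAG.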
Your Profile
Essential skills/knowledge/experience:
- Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
- Strong SQL expertise including star/snowflake schema design, indexing strategies, writing optimized queries, and implementing CDC, SCD Type 1/2/3 patterns for reliable data warehousing.
- Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
- Hands‑on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
- Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
- Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
- Experience working with structured and unstructured datasets (documents, logs, text, images).
- Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
- Understanding of model optimization techniques (quantization, distillation, inference optimization).
- Strong capability to debug, tune, and optimize distributed systems and AI pipelines.
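The SCD Type 2 pattern named in the SQL requirement above (keeping full history by expiring old rows and appending new versions) can be sketched in plain Python. This is an illustrative toy, with invented column names; in practice the same logic would be a SQL `MERGE` or a Delta Lake merge:

```python
from datetime import date

def apply_scd2(dimension, incoming, today):
    """Apply SCD Type 2: expire changed rows and append new versions.

    dimension: list of dicts with keys
        key, attr, valid_from, valid_to, is_current
    incoming: dict mapping business key -> latest attribute value
    Returns the updated dimension table as a new list.
    """
    updated = []
    seen = set()
    for row in dimension:
        key = row["key"]
        if row["is_current"] and key in incoming and incoming[key] != row["attr"]:
            # Attribute changed: close out the old version...
            updated.append(dict(row, valid_to=today, is_current=False))
            # ...and open a new current version.
            updated.append({"key": key, "attr": incoming[key],
                            "valid_from": today, "valid_to": None,
                            "is_current": True})
        else:
            updated.append(row)
        seen.add(key)
    # Brand-new business keys are inserted as current rows.
    for key, attr in incoming.items():
        if key not in seen:
            updated.append({"key": key, "attr": attr,
                            "valid_from": today, "valid_to": None,
                            "is_current": True})
    return updated

dim = [{"key": "C1", "attr": "London", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
result = apply_scd2(dim, {"C1": "Leeds"}, date(2024, 6, 1))
# result now holds the expired London row plus a new current Leeds row.
```

Type 1 (overwrite in place) and Type 3 (keep a single previous-value column) differ only in how the change is recorded, not in the change-detection step.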
Desirable skills/knowledge/experience: PySpark, Python, SQL, AWS, GenAI
Data Engineer in City of London. Employer: Webologix Ltd/ INC
Contact Detail:
Webologix Ltd/ INC Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Data Engineer role in City of London
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your data pipelines, models, and any projects you've worked on. This is your chance to demonstrate your expertise in PySpark, Python, and AWS – make it shine!
✨Tip Number 3
Prepare for those interviews! Brush up on your SQL queries and be ready to discuss your experience with GenAI models. Practise common interview questions and think about how your skills align with the role of a Data Engineer.
✨Tip Number 4
Don't forget to apply through our website! We love seeing applications from passionate candidates like you. Plus, it’s a great way to ensure your application gets the attention it deserves.
We think you need these skills to ace the Data Engineer role in City of London
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your experience with PySpark, Python, and AWS. We want to see how your skills match the role, so don’t be shy about showcasing your relevant projects and achievements!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about data engineering and how your background aligns with our needs at StudySmarter. Keep it engaging and personal!
Showcase Your Projects: If you've worked on any cool data pipelines or GenAI models, make sure to mention them! We love seeing practical examples of your work, especially if they demonstrate your problem-solving skills and creativity.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates from our team!
How to prepare for a job interview at Webologix Ltd/ INC
✨Know Your Tech Inside Out
Make sure you brush up on your PySpark, Python, and AWS skills. Be ready to discuss specific projects where you've designed scalable data pipelines or optimised AWS infrastructure. The more detailed examples you can provide, the better!
✨Showcase Your Problem-Solving Skills
Prepare to talk about challenges you've faced in data engineering, especially with large-scale ETL/ELT processes. Think of a couple of scenarios where you had to debug or optimise a system, and be ready to explain your thought process.
✨Understand GenAI and LLMs
Since this role involves working with GenAI models, make sure you can discuss your experience with model creation, fine-tuning, and evaluation. Bring examples of how you've implemented RAG pipelines or worked with vector databases to the table.
✨Collaboration is Key
This position requires working with cross-functional teams, so be prepared to share experiences where you've collaborated effectively. Highlight how you ensured high-quality data and operational reliability in those projects.