At a Glance
- Tasks: Design and maintain scalable data pipelines and optimise AWS-based infrastructure for GenAI applications.
- Company: Join a forward-thinking tech company at the forefront of AI innovation.
- Benefits: Competitive salary, flexible working options, and opportunities for professional growth.
- Other info: Dynamic work environment with a focus on collaboration and innovation.
- Why this job: Be part of a team that shapes the future of AI with cutting-edge technology.
- Qualifications: Strong experience in PySpark, SQL, Python, and AWS services required.
The predicted salary is between £60,000 and £80,000 per year.
Your responsibilities:
- Design and maintain scalable data pipelines using PySpark, Python, and distributed computing frameworks to support high‑volume data processing.
- Architect and optimise AWS-based data and AI infrastructure, ensuring secure, performant, and cost‑efficient ingestion, transformation, and storage.
- Develop, fine-tune, benchmark, and evaluate GenAI/LLM models, including custom training and inference optimisation.
- Implement and maintain RAG pipelines, vector databases, and document-processing workflows for enterprise GenAI applications.
- Build reusable frameworks for prompt management, evaluation, and GenAI operations.
- Collaborate with cross-functional teams to integrate GenAI capabilities into production systems and ensure high-quality data, governance, and operational reliability.
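As a rough illustration of the retrieval step in the RAG pipelines mentioned above, here is a minimal sketch in plain Python. It is purely illustrative: a production pipeline would use learned embeddings (e.g. from a Bedrock-hosted model) and one of the vector stores named in this listing, and all names and documents below are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector. A real pipeline would
    # call an embedding model and persist vectors in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and return the top k,
    # which would then be passed to the LLM as grounding context.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "PySpark pipelines process high-volume data on EMR",
    "Vector databases store document embeddings for retrieval",
    "Lambda functions handle event-driven ingestion",
]
print(retrieve("vector database for document embeddings", docs, k=1))
```

The generation step would then prepend the retrieved snippets to the prompt, which is where the prompt-management and evaluation frameworks from the responsibilities above come in.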
Your Profile
Essential skills/knowledge/experience:
- Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
- Strong SQL expertise, including star/snowflake schema design, indexing strategies, writing optimised queries, and implementing CDC and SCD Type 1/2/3 patterns for reliable data warehousing.
- Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
- Hands‑on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
- Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
- Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
- Experience working with structured and unstructured datasets (documents, logs, text, images).
- Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
- Understanding of model optimisation techniques (quantisation, distillation, inference optimisation).
- Strong capability to debug, tune, and optimise distributed systems and AI pipelines.
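To make the SCD Type 2 pattern from the list above concrete, here is a minimal in-memory sketch in Python. In practice this would be a MERGE against a warehouse table (e.g. in Redshift or Delta Lake); the field names and sample data here are hypothetical.

```python
from datetime import date

def scd2_upsert(history: list[dict], key: str, new_value: str, as_of: date) -> list[dict]:
    # SCD Type 2: never overwrite history. Close the current row for this
    # key (set end_date, clear the current flag), then append a new current row.
    for row in history:
        if row["key"] == key and row["is_current"]:
            row["is_current"] = False
            row["end_date"] = as_of
    history.append({
        "key": key, "value": new_value,
        "start_date": as_of, "end_date": None, "is_current": True,
    })
    return history

dim = [{"key": "C1", "value": "London", "start_date": date(2023, 1, 1),
        "end_date": None, "is_current": True}]
dim = scd2_upsert(dim, "C1", "Manchester", date(2024, 6, 1))
# The old row is preserved with an end_date; the new row is flagged current.
print([(r["value"], r["is_current"]) for r in dim])
```

Type 1 would simply overwrite the value in place, and Type 3 would keep a single "previous value" column; Type 2, as sketched here, retains full history and is the variant most often paired with CDC feeds.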
Desirable skills/knowledge/experience: PySpark, Python, SQL, AWS, GenAI
Data Engineer (f/m) employer: Webologix Ltd/ INC
Contact Detail:
Webologix Ltd/ INC Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Data Engineer (f/m) role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry on LinkedIn or at meetups. We all know that sometimes it’s not just what you know, but who you know that can land you that dream job.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects, especially those involving PySpark and AWS. We love seeing real-world applications of your expertise, so make sure to highlight your best work.
✨Tip Number 3
Prepare for interviews by practising common data engineering questions. We recommend doing mock interviews with friends or using online platforms. The more comfortable you are, the better you'll perform!
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, we’re always on the lookout for talented individuals like you to join our team.
We think you need these skills to ace the Data Engineer (f/m) role
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that match the job description. Highlight your experience with PySpark, AWS, and data pipelines to show us you’re the right fit!
Craft a Compelling Cover Letter: Use your cover letter to tell us why you're passionate about data engineering and how your background aligns with our needs. Be specific about your experience with GenAI and LLM models!
Showcase Your Projects: If you've worked on relevant projects, don’t hold back! Include links or descriptions of your work with data processing, AWS services, or any GenAI applications to give us a taste of what you can do.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates from our team!
How to prepare for a job interview at Webologix Ltd/ INC
✨Know Your Tech Inside Out
Make sure you brush up on your PySpark, Python, and AWS skills. Be ready to discuss how you've designed scalable data pipelines or optimised cloud infrastructure in past projects. The more specific examples you can provide, the better!
✨Showcase Your Problem-Solving Skills
Prepare to talk about challenges you've faced in data engineering, especially with large-scale ETL/ELT processes. Think of a couple of scenarios where you had to debug or optimise a system, and explain your thought process clearly.
✨Familiarise Yourself with GenAI Concepts
Since the role involves working with GenAI models, make sure you understand the basics of model creation, fine-tuning, and evaluation. Bring examples of any relevant projects you've worked on, and be ready to discuss RAG architectures and vector databases.
✨Collaborate and Communicate
This role requires working with cross-functional teams, so be prepared to demonstrate your teamwork skills. Share experiences where you successfully collaborated with others to integrate data solutions or improve operational reliability.