Principal Data Architect

Full-Time | £80,000 - £100,000 per year (est.) | Home office (partial)

At a Glance

  • Tasks: Design and lead Chemify’s data architecture for a global network of automated laboratories.
  • Company: Join Chemify, a pioneering tech company revolutionising chemistry with AI and robotics.
  • Benefits: Enjoy competitive salary, hybrid work, and the chance to shape the future of science.
  • Other info: Be part of a rapidly scaling team tackling cutting-edge challenges in AI and chemistry.
  • Why this job: Make a real impact in digital chemistry and accelerate scientific discovery.
  • Qualifications: 8+ years as a Data Architect with strong Python and data engineering skills.

The predicted salary is between £80,000 and £100,000 per year.

Location: Glasgow or London (King’s Cross)

Workstyle: Hybrid

Reports to: CTO

About Chemify:

Chemify is revolutionising chemistry. We are creating a future where the synthesis of previously unimaginable molecules, drugs, and materials is instantly accessible. By combining AI, robotics, and the world’s largest continually expanding database of chemical programs, we are accelerating chemical discovery to improve quality of life and extend the reach of humanity.

Our newly opened Chemifarm facility in Glasgow operates a growing fleet of advanced robotic systems that automate synthesis, optimisation, and library generation. As we scale up to globally distributed facilities, we are undertaking a foundational transformation of how data integrates and scales across our platform.

The Role

We are looking for a Principal Data Architect to design and lead the evolution of Chemify’s data architecture into a performant, distributed, well-governed, and enterprise-ready data ecosystem. Your mission is to define and implement how data flows across our platforms and how it is stored, synchronized, governed, and shared, both within Chemify and with external partners, while complying with contractual, regulatory, and security constraints. This is a foundational role: your decisions will shape how Chemify scales from a single Chemifarm to a global network of automated laboratories that can safely collaborate with Enterprise customers and research partners.

If you enjoy solving complex technical challenges that blend system architecture, data engineering, and distributed systems, are a natural communicator, and are energized by working closely with scientists using cutting-edge technologies, then we’d love to welcome you to our team.

Key Responsibilities

  • AI-Native Data Strategy
    • Define the enterprise data architecture for scientific and operational data to ensure it is "ML-ready" from the moment of ingestion.
    • Establish a Data Lakehouse architecture on AWS to manage the massive scale of raw, unstructured "dark data" from robotic sensors (spectra, video, logs, etc.).
    • Lead the strategic design and implementation of a unified chemical data fabric that integrates molecular structures, retrosynthetic reaction networks, and high-frequency robotic telemetry.
    • Architect a versioned Feature Store that standardizes chemical ontologies and pre-computed molecular descriptors, ensuring a seamless, high-fidelity data loop between robotic laboratory execution and AI-driven discovery engines.
  • Advanced Relational & Semantic Modeling
    • Architect Graph Data Models representing complex chemical reaction networks, optimized for synthesis AI and automated manufacturing planning.
    • Lead the development of Foundational Semantic Ontologies that allow AI models to reason across disparate chemical data types.
    • Design Vector Database integrations (e.g., pgvector, Pinecone) to facilitate similarity searches across billions of chemical entities.
  • Industrial Telemetry & Edge Synchronization
    • Architect the ingestion of high-frequency robot and sensor telemetry using MQTT/Streaming patterns, ensuring zero loss of "negative data" (failed experiments) critical for model training.
    • Design a globally distributed data system that synchronizes local lab "Edge" data with global AI training clusters while maintaining consistency guarantees.
  • Governance & Enterprise Readiness
    • Own the data governance framework, specifically defining Data Tenancy and Partitioning models for Fortune 500 clients to ensure strict IP isolation.
    • Architect secure, compliant Data Sharing patterns for external research partners, translating legal/contractual constraints into technical controls.
    • Drive the data architecture roadmap toward SOC 2 and ISO 27001 readiness, focusing on auditability and access control for training data.

About You

You are an experienced Architect (e.g., TOGAF, AWS Certified Solutions Architect, or equivalent) with strong Python expertise in production data systems. You have a natural curiosity for complex scientific domains and thrive on creating lasting value by building modern data engineering solutions.

We expect you to bring:

  • BSc or equivalent experience, preferably in a Data Engineering-related field.
  • 8+ years of commercial Data Architect and Python experience.
  • Deep experience with PostgreSQL, ideally in AWS RDS.
  • Proven experience designing high-throughput telemetry / IoT / industrial data systems generating very large volumes of time-series data.
  • Hands-on understanding of stream ingestion patterns (MQTT).
  • Experience with graph or vector databases (Neo4j, Pinecone, pgvector) and modelling complex, highly relational domains.
  • Proven experience designing distributed data systems across multiple services, teams, or locations.
  • Demonstrable experience building impactful solutions with:
    • Data governance frameworks
    • Data tenancy and segregation models
    • Data consistency and replication patterns
    • Secure data sharing between organizations

Beneficial Skills

  • Prior involvement in SOC 2 or ISO 27001 compliance programmes, particularly from a data architecture perspective.
  • Exposure to scientific, chemical, or manufacturing data environments.
  • Familiarity with modern data stack components (e.g., data lakes, streaming, or batch/real-time pipelines).
  • Chemistry or AI Drug Discovery domain knowledge is a real differentiator for us.

Why Join Chemify?

  • Impact: You will help build the infrastructure that enables digital chemistry at scale — accelerating discovery, improving reproducibility, and unlocking new possibilities in science and medicine.
  • Autonomy: Reporting directly to the CTO, you will have meaningful influence over the technical direction and data strategy of a Series B deep-tech rocket ship.
  • Ambition: We are scaling rapidly, investing in world-class infrastructure, and tackling problems that sit at the frontier of robotics, AI, and chemistry. You will have the resources and mandate to build the right foundations for the future.

Principal Data Architect employer: Chemify

At Chemify, we pride ourselves on being an exceptional employer, offering a dynamic work culture that fosters innovation and collaboration in the heart of Glasgow or London. Our commitment to employee growth is evident through our investment in cutting-edge technology and the opportunity to shape the future of digital chemistry, all while enjoying a hybrid workstyle that promotes work-life balance. Join us to be part of a transformative journey where your contributions will directly impact scientific discovery and improve quality of life.

Contact Details:

Chemify Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Principal Data Architect role

✨Tip Number 1

Network like a pro! Get out there and connect with people in the industry. Attend meetups, conferences, or even online webinars. You never know who might have the inside scoop on job openings or can put in a good word for you.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to data architecture or AI/ML. This gives potential employers a taste of what you can do and sets you apart from the crowd.

✨Tip Number 3

Prepare for interviews by practising common questions and scenarios specific to data architecture. Think about how you would tackle challenges at Chemify, and be ready to discuss your thought process and solutions.

✨Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining Chemify and being part of our mission to revolutionise chemistry.

We think you need these skills to ace the Principal Data Architect role

Data Architecture
Python
Cloud Computing (AWS)
Distributed Systems
AI/ML Infrastructure
Data Governance
Data Lakehouse Architecture
Relational and Semantic Modelling
Graph Databases (e.g., Neo4j)
Vector Databases (e.g., pgvector, Pinecone)
Telemetry Ingestion (MQTT)
Data Sharing Compliance (SOC 2, ISO 27001)
Data Engineering
Problem Solving
Communication Skills

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Principal Data Architect role. Highlight your experience with Python, cloud technologies, and distributed systems. We want to see how your skills align with our mission at Chemify!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for data architecture and how it relates to our work in AI and chemistry. Let us know why you’re excited about joining Chemify and what unique perspective you bring.

Showcase Relevant Projects: Include examples of past projects that demonstrate your expertise in data governance, telemetry systems, or any relevant technologies. We love seeing real-world applications of your skills, so don’t hold back!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re serious about joining our team at Chemify!

How to prepare for a job interview at Chemify

✨Know Your Data Architecture Inside Out

Make sure you’re well-versed in data architecture principles, especially those relevant to AI and distributed systems. Be prepared to discuss your past experiences with designing data ecosystems and how they can apply to Chemify’s mission.

✨Showcase Your Problem-Solving Skills

Expect to face complex technical challenges during the interview. Prepare examples of how you've tackled similar issues in the past, particularly in high-throughput telemetry or IoT environments. This will demonstrate your ability to think critically and creatively.

✨Familiarise Yourself with Chemify's Vision

Research Chemify’s goals and the role of data in revolutionising chemistry. Understanding their mission will help you align your answers with their objectives and show that you’re genuinely interested in contributing to their success.

✨Prepare for Governance and Compliance Questions

Given the importance of data governance at Chemify, be ready to discuss frameworks like SOC 2 and ISO 27001. Highlight any relevant experience you have in ensuring data security and compliance, as this will be crucial for the role.
