Staff Software Engineer (Inference Infrastructure) in London

London Full-Time 70000 - 90000 € / year (est.) Home office (partial)
Deepstreamtech

At a Glance

  • Tasks: Build high-performance AI systems and deploy cutting-edge NLP models.
  • Company: Join a leading tech company shaping the future of AI.
  • Benefits: Competitive salary, flexible work options, and growth opportunities.
  • Other info: Collaborative team environment with exciting challenges and career advancement.
  • Why this job: Make a real impact in the AI space with innovative technology.
  • Qualifications: 5+ years in engineering, experience with Kubernetes and cloud platforms.

The predicted salary is between 70000 and 90000 € per year.

Requirements

  • 5+ years of engineering experience running production infrastructure at a large scale
  • Experience designing large, highly available distributed systems with Kubernetes, and running GPU workloads on those clusters
  • Experience with Kubernetes development, production coding, and support
  • Experience with GCP, Azure, AWS, OCI, and multi-cloud, on-prem, or hybrid serving
  • Experience in designing, deploying, supporting, and troubleshooting in complex Linux-based computing environments
  • Experience in compute/storage/network resource and cost management
  • Excellent collaboration and troubleshooting skills to build mission-critical systems, and ensure smooth operations and efficient teamwork
  • The grit and adaptability to solve complex technical challenges that evolve day to day
  • Familiarity with computational characteristics of accelerators (GPUs, TPUs, and/or custom accelerators), especially how they influence latency and throughput of inference
  • Strong understanding or working experience with distributed systems
  • Experience in Golang, C++ or other languages designed for high-performance scalable servers

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply!

What the job involves

We are looking for Members of Technical Staff to join the Model Serving team at Cohere. The team is responsible for developing, deploying, and operating the AI platform that delivers Cohere's large language models through easy-to-use API endpoints. In this role, you will work closely with many teams to deploy optimized NLP models to production in low-latency, high-throughput, and high-availability environments. You will also have the opportunity to work directly with customers and create customized deployments that meet their specific needs.

Staff Software Engineer (Inference Infrastructure) in London employer: Deepstreamtech

Cohere is an exceptional employer for those passionate about advancing AI technology, offering a dynamic work culture that fosters innovation and collaboration. With a focus on employee growth, we provide ample opportunities for professional development in a supportive environment, all while working on cutting-edge projects in the heart of a vibrant tech hub. Join us to be part of a team that values creativity and encourages you to tackle complex challenges in the rapidly evolving field of machine learning.

Contact Details:

Deepstreamtech Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Staff Software Engineer (Inference Infrastructure) role in London

Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with people on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.

Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those involving Kubernetes, distributed systems, or machine learning. This gives potential employers a taste of what you can do beyond your CV.

Tip Number 3

Prepare for technical interviews by brushing up on your coding skills in Golang or C++. Practice common algorithms and system design questions. We recommend using platforms like LeetCode or HackerRank to get in the zone.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in joining our team at Cohere.

We think you need these skills to ace the Staff Software Engineer (Inference Infrastructure) role in London

High-performance machine learning systems
Scalable and reliable infrastructure
Distributed systems design
Kubernetes
GPU workloads
GCP
Azure

Some tips for your application 🫡

Show Off Your Experience: Make sure to highlight your 5+ years of engineering experience, especially in running production infrastructure. We want to see how you've tackled large-scale systems and what you've learned along the way!

Tailor Your Application: Don’t just send a generic application! Take the time to align your skills with our requirements, like Kubernetes and cloud services. We love seeing how your unique background fits into our team.

Be Clear and Concise: When writing your application, keep it straightforward. We appreciate clarity, so make sure your points are easy to understand. This helps us see your thought process and technical skills more clearly.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy!

How to prepare for a job interview at Deepstreamtech

Know Your Tech Inside Out

Make sure you’re well-versed in the technologies mentioned in the job description, especially Kubernetes, GCP, and distributed systems. Brush up on your knowledge of high-performance machine learning systems and be ready to discuss your past experiences with these technologies.

Showcase Your Problem-Solving Skills

Prepare to share specific examples of complex technical challenges you've faced and how you tackled them. Highlight your adaptability and grit, as these qualities are crucial for the role. Use the STAR method (Situation, Task, Action, Result) to structure your responses.

Collaboration is Key

Since this role involves working closely with various teams, be ready to discuss your collaboration experiences. Talk about how you’ve worked with others to deploy models or troubleshoot issues, and emphasise your communication skills and teamwork.

Ask Insightful Questions

Prepare thoughtful questions about the team’s current projects, challenges they face, or their approach to deploying NLP models. This shows your genuine interest in the role and helps you gauge if the company is the right fit for you.