Site Reliability Engineer

Full-Time | £60,000 - £80,000 / year (est.) | No working from home possible
Alpaca

At a Glance

  • Tasks: Ensure our brokerage platform is reliable and operable while managing cloud infrastructure and databases.
  • Company: Join Alpaca, a leading fintech company transforming financial services globally.
  • Benefits: Enjoy competitive salary, stock options, health benefits, and a home-office setup stipend.
  • Other info: Be part of a diverse global team committed to innovation and open-source contributions.
  • Why this job: Make a real impact in a dynamic team while working with cutting-edge technology.
  • Qualifications: 4+ years in SRE or DevOps with strong PostgreSQL and Kubernetes experience.

The predicted salary is between £60,000 and £80,000 per year.

Alpaca is a US-headquartered self‑clearing broker‑dealer and brokerage infrastructure for stocks, ETFs, options, crypto, fixed income, 24/5 trading, and more. Our recent Series D funding round brought our total investment to over $320 million, fueling our ambitious vision. Amongst our subsidiaries, Alpaca is a licensed financial services company, serving hundreds of financial institutions across 40 countries with our institutional‑grade APIs. This includes broker‑dealers, investment advisors, wealth managers, hedge funds, and crypto exchanges, totalling over 9 million brokerage accounts. Our global team is a diverse group of experienced engineers, traders, and brokerage professionals who are working to achieve our mission of opening financial services to everyone on the planet. We're deeply committed to open‑source contributions and fostering a vibrant community, continuously enhancing our award‑winning, developer‑friendly API and the robust infrastructure behind it.

We're a dynamic team of 380+ globally distributed members who thrive working from our favourite places around the world, with teammates spanning the USA, Canada, Japan, Hungary, Nigeria, Brazil, the UK, and beyond! We're searching for passionate individuals eager to contribute to Alpaca's rapid growth. If you align with our core values—Stay Curious, Have Empathy, and Be Accountable—and are ready to make a significant impact, we encourage you to apply.

As a Site Reliability Engineer at Alpaca, you'll help keep our brokerage platform reliable, observable, and operable as we grow—working across our cloud infrastructure, Kubernetes platform, observability stack, messaging layer, and data layer. We're especially interested in candidates with strong PostgreSQL fundamentals who'd like to grow into deeper ownership of our database reliability posture: PostgreSQL sits on the trading‑critical path, and we want this person to spend a meaningful share of their time leveling it up while still being a well‑rounded SRE the rest of the week.

Things You Get To Do

  • Operate production day‑to‑day: on‑call, incident response, postmortems, and the follow‑ups that actually close the loop.
  • Own reliability practice: define and refine SLIs/SLOs and error budgets, and help product teams live within them.
  • Strengthen our observability across metrics, logs, traces, and alerting.
  • Ship infrastructure through code in a GitOps workflow—cloud resources and Kubernetes workloads alike.
  • Look after PostgreSQL: performance tuning, schema and migration review, online migrations on large tables, HA/DR, and CDC pipelines.
  • Mentor engineers on reliability and database fundamentals through code review, design review, and pairing.

Who You Are (Must‑Haves)

  • 4+ years in SRE, DevOps, Platform/Infrastructure, or backend engineering with significant production operations ownership.
  • Hands‑on experience operating production services on Kubernetes, and shipping infrastructure as code in a GitOps workflow.
  • Solid working knowledge of PostgreSQL in production—query plans, pg_stat_*, indexing and schema trade‑offs, and what a safe online migration looks like on a non‑trivial table.
  • Cloud networking fundamentals (VPCs, routing, L4/L7 load balancing, DNS, TLS) and comfort debugging cross‑service connectivity.
  • Comfortable with a modern observability stack and proficient with Linux at the operator level.
  • Practiced in incident response—calm under pressure, structured debugging, postmortems that drive change.
  • At least working proficiency in Go or Python, plus strong written and verbal communication.
  • Genuine interest in databases and in growing your PostgreSQL/DBA expertise.

Who You Might Be (Nice‑to‑Haves)

  • Deeper PostgreSQL experience: large clusters at OLTP load, online migrations on big tables, HA/DR ownership, connection pooling at scale, or change‑data‑capture pipelines.
  • Experience with typed SQL access layers in Go (e.g., pgx, gorm, sqlc).
  • Production experience with messaging systems at scale (e.g., RabbitMQ, Kafka, Redpanda).
  • Security & compliance experience in a regulated environment (SOC 2, secrets management, audit logging).
  • Familiarity with trading, brokerage, or other regulated fintech domains.

How We Take Care of You

  • Competitive Salary & Stock Options
  • Health Benefits
  • New Hire Home‑Office Setup: One‑time USD $500
  • Monthly Stipend: USD $150 per month via a Brex Card

Alpaca is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.

Site Reliability Engineer employer: Alpaca

At Alpaca, we pride ourselves on being an exceptional employer that champions a culture of curiosity, empathy, and accountability. Our global team enjoys competitive salaries, stock options, and comprehensive health benefits, alongside a generous home-office setup and monthly stipends to support remote work. With a strong commitment to employee growth and a vibrant, diverse work environment, we empower our Site Reliability Engineers to make a meaningful impact while advancing their skills in a cutting-edge fintech landscape.

Contact Details:

Alpaca Recruitment Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role

Tip Number 1

Network like a pro! Reach out to current or former employees on LinkedIn and ask about their experiences at Alpaca. A friendly chat can give you insider info and maybe even a referral!

Tip Number 2

Prepare for the technical interview by brushing up on your PostgreSQL skills and Kubernetes knowledge. We want to see you shine, so practice common SRE scenarios and incident responses to show off your expertise.

Tip Number 3

Show us your passion for open-source contributions! If you've worked on any projects or have ideas for improving our observability stack, be ready to share them during your interview. It’ll set you apart from the crowd.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re serious about joining our awesome team at Alpaca.

We think you need these skills to ace the Site Reliability Engineer role

Site Reliability Engineering
Kubernetes
PostgreSQL
GitOps
Cloud Networking
Incident Response
Linux

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Site Reliability Engineer role. Highlight your hands-on experience with Kubernetes and PostgreSQL, as these are key to what we’re looking for!

Craft a Compelling Cover Letter: Use your cover letter to tell us why you’re passionate about SRE and how you embody our core values—Stay Curious, Have Empathy, and Be Accountable. This is your chance to show us your personality!

Showcase Your Projects: If you've worked on any relevant projects, especially those involving cloud infrastructure or observability stacks, make sure to mention them. We love seeing practical examples of your work!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands and shows us you’re serious about joining our team!

How to prepare for a job interview at Alpaca

Know Your PostgreSQL Inside Out

Since the role heavily involves PostgreSQL, make sure you brush up on your knowledge of query plans, indexing, and schema trade-offs. Be ready to discuss your experience with online migrations and performance tuning, as these are crucial for the position.

Demonstrate Your SRE Skills

Prepare to talk about your hands-on experience with Kubernetes and incident response. Share specific examples of how you've managed production services and what strategies you've implemented to improve reliability and observability in past roles.

Show Your Coding Proficiency

Familiarise yourself with GitOps workflows and be prepared to discuss your coding experience in Go or Python. You might even want to bring a small code sample that showcases your ability to ship infrastructure as code, as this will highlight your technical skills.

Emphasise Your Soft Skills

Alpaca values empathy and accountability, so be ready to share examples of how you've worked collaboratively in teams. Discuss how you've mentored others or contributed to a positive team culture, as this aligns with their core values.