At a Glance
- Tasks: Design and maintain monitoring systems for the largest SIEM platform in the security industry.
- Company: Join CrowdStrike, a market leader in cybersecurity with a vibrant office culture.
- Benefits: Enjoy competitive pay, wellness programs, and professional development opportunities.
- Other info: Hybrid role based in Bucharest, offering excellent career growth and a supportive work environment.
- Why this job: Make a real impact on security by ensuring reliability and scalability of critical systems.
- Qualifications: 10+ years in software or platform engineering with expertise in large-scale distributed systems.
The predicted salary is between £80,000 and £100,000 per year.
About the Role
Our mission is to make all of our customers’ security-relevant data continuously available for automated detection and response, threat hunting, and other Falcon platform use cases. The systems behind NG‑SIEM (next‑generation security information and event management) are expanding to accommodate more than 100 PB of event and action data ingested every day, up to 10 years of retention, and tens of millions of queries per hour across large sections of the data stored for tens of thousands of customers. As a Senior Engineer II on the newly established NG‑SIEM EPICS (End‑to‑End Performance, Incident‑response, Cost, and Scaling) team, you will own the reliability and scalability of the security industry’s largest SIEM platform.
Location: Hybrid role based in Bucharest, Romania, with two to three days per week in the office.
What You’ll Do
- Design, build and maintain monitoring and synthetic test suites that provide deep visibility into the health of the entire NG‑SIEM pipeline.
- Engineer orchestrated scaling solutions that treat the NG‑SIEM pipeline as a unified system, proportionally increasing resources across all dependent components (Kafka, ingest pipelines, downstream services) to eliminate cascading bottleneck patterns.
- Serve as a subject‑matter expert during platform‑wide incidents (P2 and above), applying cross‑service knowledge to diagnose and resolve multi‑component failures.
- Participate in follow‑the‑sun on‑call rotations, providing incident commander coordination for critical platform‑wide events.
- Build and refine models for end‑to‑end capacity forecasting that account for all pipeline dimensions, including partner team dependencies.
- Develop tooling to continuously track and surface cost drivers across the platform.
- Transform manual standard operating procedures into automated remediation workflows – including pipeline‑wide scaling responses, CID rebalancing, and infrastructure healing – with the goal of resolving issues before customers are impacted.
- Partner with cell‑level teams, product engineering, GDI/3PI, and external stakeholders (e.g., CSM) to triage SLO breaches, drive problem management for large reliability efforts, and ensure consistent communication during incidents.
- Use your broad NG‑SIEM knowledge to identify and drive systemic improvements across teams, contributing to the platform’s long‑term resilience and efficiency.
What You’ll Need
- 10+ years of experience in software engineering, site reliability engineering, or platform engineering, with significant time spent on large‑scale distributed systems.
- Strong proficiency in at least one systems programming language (Go, Java, Rust, or C++) and one scripting language (Python, Bash).
- Deep experience with end‑to‑end observability – building monitoring pipelines, defining SLI/SLOs, and creating dashboards that drive actionable insights across multi‑service architectures.
- Demonstrated ability to diagnose and resolve complex incidents spanning multiple distributed components operating 24/7.
- Experience with coordinated capacity planning and scaling for systems with significant infrastructure footprints.
- Hands‑on experience with streaming platforms (Kafka or similar) and understanding of backpressure, partition management, and consumer group dynamics at scale.
- Familiarity with infrastructure‑as‑code, CI/CD pipelines, and automated deployment practices.
- A can‑do attitude – you thrive collaborating in a team and are not afraid of taking on responsibilities.
- Strong written and verbal communication skills – you will lead incident communications and produce post‑incident analyses that drive lasting improvements.
- Comfort working across time zones with globally distributed teams.
Bonus Points
- Experience in a similar reliability or platform engineering role at a hyperscaler (AWS, Azure, GCP) or large‑scale SaaS provider.
- Track record of building automated remediation and self‑healing infrastructure.
- Experience with cost modeling and unit economics for large compute and storage footprints.
- Familiarity with cloud‑native architectures and serverless computing paradigms.
- Hands‑on experience operating platforms processing over 1 trillion events per day or more than 10PB of data per day.
- Exposure to or experience with Log Management, cybersecurity products, or security operations workflows.
- Experience with disaster recovery planning and execution for multi‑region systems.
Benefits of Working at CrowdStrike
- Market leader in compensation and equity awards.
- Comprehensive physical and mental wellness programs.
- Competitive vacation and holidays to recharge.
- Paid parental and adoption leave.
- Professional development opportunities for all employees regardless of level or role.
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections.
- Vibrant office culture with world‑class amenities.
- Great Place to Work Certified™ across the globe.
CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy‑related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions—including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay‑offs, return from lay‑off, terminations and social/recreational programs—on valid job requirements.
Sr. Engineer II - EPICS, NG-SIEM (Hybrid) in London. Employer: CrowdStrike Holdings, Inc.
CrowdStrike is an exceptional employer, offering a vibrant work culture in Bucharest that prioritises employee well-being and professional growth. With competitive compensation, comprehensive wellness programmes, and a commitment to diversity and inclusion, employees are empowered to thrive both personally and professionally. The hybrid work model fosters flexibility while maintaining strong team collaboration, making it an ideal environment for those seeking meaningful and rewarding careers in the tech industry.
Contact Details:
CrowdStrike Holdings, Inc. Recruitment Team
StudySmarter Expert Advice🤫
We think this is how you could land Sr. Engineer II - EPICS, NG-SIEM (Hybrid) in London
✨Tip Number 1
Network like a pro! Reach out to folks in your industry on LinkedIn or at local meetups. You never know who might have the inside scoop on job openings or can put in a good word for you.
✨Tip Number 2
Prepare for those interviews! Research the company and its products, especially if they relate to security and data management. Be ready to discuss how your experience aligns with their mission and the role.
✨Tip Number 3
Show off your skills! If you’ve got projects or contributions that demonstrate your expertise in large-scale systems or automation, make sure to highlight them during your conversations.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team.
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience with large-scale distributed systems and the specific skills mentioned in the job description. We want to see how your background aligns with our mission at StudySmarter!
Showcase Your Technical Skills: Don’t hold back on showcasing your proficiency in programming languages like Go, Java, or Python. Include examples of projects where you’ve built monitoring pipelines or automated workflows, as this will resonate well with us.
Be Clear and Concise: When writing your application, keep it clear and to the point. Use bullet points for key achievements and avoid jargon unless it’s relevant. We appreciate straightforward communication that gets to the heart of your experience.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen on joining the StudySmarter team!
How to prepare for a job interview at CrowdStrike Holdings, Inc.
✨Know Your Tech Inside Out
Make sure you’re well-versed in the systems programming languages mentioned in the job description, like Go, Java, Rust, or C++. Brush up on your scripting skills too, especially in Python or Bash. Being able to discuss your experience with large-scale distributed systems and observability will show that you’re the right fit for the role.
✨Demonstrate Problem-Solving Skills
Prepare to share specific examples of how you've diagnosed and resolved complex incidents in the past. Think about times when you’ve had to coordinate across multiple teams during a platform-wide incident. This will highlight your ability to handle high-pressure situations and your collaborative spirit.
✨Showcase Your Automation Experience
Since the role involves transforming manual processes into automated workflows, be ready to discuss any relevant projects where you’ve implemented automation. Talk about your experience with CI/CD pipelines and infrastructure-as-code, as this will demonstrate your capability to improve efficiency and reliability.
✨Communicate Clearly and Confidently
Strong communication skills are key, especially since you'll be leading incident communications. Practice articulating your thoughts clearly and concisely. Prepare to explain technical concepts in a way that’s easy to understand, as this will be crucial when working with cross-functional teams.