At a Glance
- Tasks: Join our team to enhance system reliability and tackle exciting challenges in tech.
- Company: Blackstone, the world's largest alternative asset manager, focused on innovation.
- Benefits: Competitive salary, diverse opportunities, and a supportive work culture.
- Why this job: Make a real impact by improving systems and collaborating with talented professionals.
- Qualifications: Experience in coding, cloud services, and a passion for problem-solving.
- Other info: Dynamic environment with opportunities for growth and learning.
The predicted salary is between £36,000 and £60,000 per year.
Blackstone is the world’s largest alternative asset manager. We seek to create positive economic impact and long-term value for our investors, the companies we invest in, and the communities in which we work. Our $1.1 trillion in assets under management include investment vehicles focused on private equity, real estate, public debt and equity, infrastructure, life sciences, growth equity, opportunistic, non-investment grade credit, real assets and secondary funds, all on a global basis.
Blackstone’s Site Reliability Engineering team is responsible for improving the reliability of systems and services to meet the needs of the business. This is achieved through collaboration with the development and engineering teams to leverage SRE practices and principles. You’ll have the opportunity to identify and solve new problems as they arise, deploy and maintain observability systems and pipelines, mature the operations and support of services and platforms, and pursue emerging opportunities for efficiency and business value.
This position involves the selection, implementation, and maintenance of key observability tooling. It requires ongoing evaluation of the firm’s needs in observability, monitoring, alerting, resilience, and recovery. We work alongside service owners on design, implementation, and management of services for continuous improvement. We achieve the requisite reliability of services using clear definitions and measurable targets. We plan for and practice recovery from disaster scenarios and respond in real time to incidents. We guide the postmortem process in order to mitigate risks, prevent future disruptions, and improve the on-call experience. We aim to eliminate manual work, improve operational efficiency, and ensure high quality outputs in all that we do.
Key Responsibilities:
- Provide technical leadership in the understanding and adoption of SRE methodologies across the firm
- Incorporate observability standards into code and deployment pipelines
- Evolve the SRE standards adopted across all teams
- Partner with colleagues in various roles and reporting lines to improve service reliability and operational efficiency
- Assist developers and engineers directly and through AI assistants
- Implement instrumentation and provide comprehensive performance insights to service owners
- Ensure that monitoring and alerting reflect the reliability of services for users and enable effective on-call operations
- Implement strategic observability tools and work to control overhead in maintenance and cost
- Participate in on-call rotations and respond to system incidents to ensure service availability and minimize operational impact
- Use automation to manage, maintain, and scale SRE systems with minimal human intervention
- Foster a blameless culture while assisting in postmortem discussions and reporting
Qualifications:
- Ability to write automation scripts, as well as read and troubleshoot code (Python, C#, TypeScript, etc.)
- Effective use of coding assistants and chat models (Anthropic, OpenAI)
- Proficiency with public cloud providers (strong AWS experience required; Azure experience preferred)
- Configuration as code, infrastructure management, and CI/CD tooling (Terraform, Puppet, GitLab CI)
- Hands-on experience with Docker and container schedulers including AWS ECS & EKS
- Excellent troubleshooting skills for Linux, Windows, and Networking
- Experience with observability tools (Grafana, Prometheus, Splunk, etc.)
- Comfortable under pressure with incident management and collaborating during postmortems
- Excellent communication and organizational skills
- Curiosity and drive to improve systems and processes through a sense of shared ownership
The duties and responsibilities described here are not exhaustive and additional assignments, duties, or responsibilities may be required of this position. Assignments, duties, and responsibilities may be changed at any time, with or without notice, by Blackstone in its sole discretion.
Blackstone is committed to providing equal employment opportunities to all employees and applicants for employment without regard to race, color, creed, religion, sex, pregnancy, national origin, ancestry, citizenship status, age, marital or partnership status, sexual orientation, gender identity or expression, disability, genetic predisposition, veteran or military status, status as a victim of domestic violence, a sex offense or stalking, or any other class or status in accordance with applicable federal, state and local laws. This policy applies to all terms and conditions of employment, including but not limited to hiring, placement, promotion, termination, transfer, leave of absence, compensation, and training.
BXTI, Site Reliability Engineer - Data, Cloud & Developer Experience in London
Employer: The Blackstone Group L.P.
Contact Details:
The Blackstone Group L.P. Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land BXTI, Site Reliability Engineer - Data, Cloud & Developer Experience in London
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, especially those already at Blackstone. A friendly chat can open doors and give you insider info on what they're really looking for.
✨Tip Number 2
Show off your skills! If you’ve got a portfolio or GitHub with projects that highlight your SRE expertise, share it during interviews. It’s a great way to demonstrate your hands-on experience.
✨Tip Number 3
Prepare for the technical challenges! Brush up on your coding skills and be ready to tackle some real-world problems during interviews. Practice makes perfect, so don’t skip this step!
✨Tip Number 4
Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, you’ll find all the latest job openings there, so keep an eye out!
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter for the Site Reliability Engineer role. Highlight your experience with SRE methodologies, observability tools, and any relevant coding skills. We want to see how you fit into our team!
Showcase Your Technical Skills: Don’t hold back on showcasing your technical prowess! Mention your experience with Python, AWS, Docker, and any other relevant technologies. We love seeing candidates who can demonstrate their hands-on experience.
Be Clear and Concise: When writing your application, keep it clear and to the point. Use bullet points where necessary to make it easy for us to read through your qualifications and experiences. We appreciate a well-structured application!
Apply Through Our Website: Make sure to submit your application through our website. It’s the best way for us to receive your details and ensures you’re considered for the role. We can’t wait to see what you bring to the table!
How to prepare for a job interview at The Blackstone Group L.P.
✨Know Your SRE Principles
Before the interview, brush up on Site Reliability Engineering principles. Understand how observability, monitoring, and incident management play a role in improving system reliability. Be ready to discuss how you've applied these concepts in your previous roles.
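One principle worth being able to explain with numbers is the error budget: the amount of unreliability that a service-level objective (SLO) permits over a given window. The short Python sketch below is our own illustration and is not taken from the job description; the function name, the 99.9% target, and the request counts are all hypothetical.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent).

    Hypothetical helper for illustration only; slo_target is a fraction such as 0.999.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The SLO tolerates this many failed requests in the window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


# Assumed example figures: a 99.9% SLO over 1,000,000 requests tolerates
# about 1,000 failures, so 250 failures spends roughly a quarter of the budget.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"Error budget remaining: {remaining:.1%}")
```

A negative result would mean the service has failed more often than the SLO allows, which is the kind of signal teams often use to prioritise reliability work over new features.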
✨Showcase Your Technical Skills
Prepare to demonstrate your coding abilities, especially in Python, C#, or TypeScript. Bring examples of automation scripts you've written or projects where you've implemented CI/CD tooling. This will show your hands-on experience and problem-solving skills.
✨Familiarise Yourself with Tools
Get comfortable with the observability tools mentioned in the job description, like Grafana and Prometheus. If you have experience with AWS, Docker, or Terraform, be prepared to discuss specific instances where you've used these tools to enhance service reliability.
✨Communicate Effectively
During the interview, focus on clear communication. Explain your thought process when troubleshooting issues or during postmortem discussions. Highlight your ability to collaborate with cross-functional teams, as this is crucial for the role.