Site Reliability Engineer II in London

London | Full-Time | £42,000 - £84,000 / year (est.) | No working from home possible
Bank of America

At a Glance

  • Tasks: Join our team to enhance system reliability and automate processes for critical applications.
  • Company: Bank of America is dedicated to improving financial lives through responsible growth and community impact.
  • Benefits: Enjoy a supportive work culture, opportunities for growth, and wellness initiatives.
  • Other info: Be part of an on-call rotation to deepen your understanding of reliability challenges.
  • Why this job: Make a real impact while collaborating with tech teams in a dynamic environment.
  • Qualifications: Basic ITIL knowledge, familiarity with operating systems, databases, and scripting languages required.

The predicted salary is between £42,000 and £84,000 per year.

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day. Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve. At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

This job is responsible for partnering with engineering and technology teams to implement measures as prescribed by lead/senior SRE engineers. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting and on-call routines are in place for key services, identifying root causes of issues through production triage efforts, and suggesting code enhancements to technology teams to automate services and improve reliability and efficiency. Job expectations include using software development skills to improve efficiency and to address gaps in reliability.

The Global Information Security Application Production Services (GIS APS) SWAT team is looking for a candidate to fill a Site Reliability Engineer role. The candidate should have experience supporting business-critical applications in an environment focused on information security. Responsibilities include monitoring for and driving the resolution of incidents using methodologies such as ITIL, analyzing data with tools like Splunk or Dynatrace, and interacting with both engineering teams and clients to handle requests or issues.

To meet these responsibilities, the candidate should have at least a working knowledge of operating systems (Windows and Linux/Unix), databases (Oracle, MS SQL), and networking and authentication standards such as TCP/IP and SAML, as well as an understanding of how Java and middleware applications function. Additionally, the candidate should exhibit a self-starting attitude towards driving various types of project work to completion. Examples include creating and maintaining dashboards, writing and updating technical documentation, and owning or assisting with the development of enhancements aimed at improving the environment.

Responsibilities:

  • Develops and maintains reliability scripts, tools and libraries and leverages them for common instrumentation, automation, and operational needs, and when mentoring Site Reliability Engineer (SRE) resources on reliability practices and established tools/capabilities.
  • Collaborates with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined in the application and system monitoring designs put forward by the SRE Lead.
  • Partners to implement code changes to make use of common reliability libraries and tools and helps Application Production Services and Application Development teammates understand how to use them.
  • Identifies vulnerabilities and opportunities for reliability improvement, such as investigating low level error rates and 'noise' in monitoring, and defines solutions to reduce manual support effort and/or improve system reliability.
  • Engages as a subject matter expert in major incident triage efforts and failure scenario modelling, and diagnoses root causes with the Problem Manager for major incident/problem management investigations.
  • Participates regularly in an on-call rotation with Production Support teammates to learn more about reliability issues affecting their portfolio.

Required Qualifications:

  • Foundational knowledge of core ITIL processes such as the management of incidents, changes, and problems.
  • A disciplined, process-driven, and results-oriented approach when providing support.
  • Comfortable in the Splunk environment – able to analyze logs, create/modify dashboards, and utilize reporting and alerting functionality.
  • Basic understanding of Federated IAM protocols such as SAML, OAuth, OpenID Connect, and FIDO2.
  • Able to understand and analyze HTTP traces/Wireshark captures.
  • Database/SQL knowledge - basic understanding of how a database functions and able to craft queries to pull data.
  • Working knowledge of both Unix and Windows Operating Systems.
  • Ability to understand and utilize various programming or scripting languages such as shell scripting, Perl, and PowerShell.
  • Practical knowledge of SSL/TLS cryptography and PKI.
  • Knowledge of LDAP and Active Directory services.

Desired Qualifications:

  • Strong knowledge of and troubleshooting experience in Windows, Linux, Oracle, and MS SQL environments.
  • Analytical skills and expertise in finding root causes and isolating complicated issues with various tools such as Splunk.
  • Knowledge of Multi-Factor Authentication, Single Sign-On, Password Management, and Passwordless Authentication (FIDO2) solutions.
  • Exposure to supporting Web Access Management solutions, such as Ping Access or CA SiteMinder.
  • Experience with Apache and IIS solutions.
  • Understanding of the OSI model.
  • Knowledge of the Software Development Life Cycle.
  • Familiarity with and understanding of high-availability environments.

Skills:

  • Analytical Thinking
  • Automation
  • Collaboration
  • Production Support
  • Result Orientation
  • Application Development
  • Architecture
  • Influence
  • Project Management
  • Solution Design
  • Adaptability
  • DevOps Practices
  • Risk Management
  • Solution Delivery Process
  • Stakeholder Management

Shift: 1st shift (United States of America)

Hours Per Week: 40

Employer: Bank of America

At Bank of America, we pride ourselves on being a great place to work, fostering a culture that prioritises the well-being and growth of our employees. As a Site Reliability Engineer II, you will have access to exceptional career development opportunities, a supportive work environment, and the chance to make a meaningful impact in the financial sector. Join us to be part of a team that values innovation, collaboration, and community engagement.

StudySmarter Expert Advice🤫

We think this is how you could land the Site Reliability Engineer II role in London

Tip Number 1

Familiarise yourself with ITIL processes, especially incident management and problem resolution. Understanding these frameworks will help you demonstrate your ability to manage incidents effectively during interviews.

Tip Number 2

Brush up on your skills with monitoring tools like Splunk or Dynatrace. Being able to discuss your experience with these tools and how you've used them to analyse logs or create dashboards can set you apart from other candidates.

Tip Number 3

Showcase your knowledge of both Windows and Linux operating systems. Be prepared to discuss specific scenarios where you've troubleshot issues in these environments, as this is crucial for the role.

Tip Number 4

Highlight any experience you have with scripting languages like PowerShell or shell scripting. Being able to automate tasks and improve system reliability through code will be a significant advantage in your application.

We think you need these skills to ace the Site Reliability Engineer II role in London

ITIL Knowledge
Incident Management
Change Management
Problem Management
Splunk Proficiency
Log Analysis
Dashboard Creation

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights relevant experience and skills that align with the Site Reliability Engineer II role. Focus on your knowledge of ITIL processes, database management, and any programming or scripting languages you are proficient in.

Craft a Strong Cover Letter: In your cover letter, express your enthusiasm for the role and the company. Mention specific projects or experiences that demonstrate your ability to improve system reliability and your familiarity with tools like Splunk or Dynatrace.

Showcase Problem-Solving Skills: Provide examples in your application that illustrate your analytical thinking and problem-solving abilities. Discuss how you've identified root causes of issues in past roles and the steps you took to resolve them.

Highlight Collaboration Experience: Since the role involves working closely with engineering and technology teams, emphasise your experience in collaborative environments. Mention any instances where you partnered with others to implement solutions or improve processes.

How to prepare for a job interview at Bank of America

Understand the Role

Make sure you have a solid grasp of what a Site Reliability Engineer does, especially in the context of Bank of America. Familiarise yourself with key responsibilities like incident management, monitoring, and automation to demonstrate your understanding during the interview.

Showcase Your Technical Skills

Be prepared to discuss your experience with relevant technologies such as Splunk, Oracle, and both Windows and Linux operating systems. Highlight any projects where you've implemented monitoring tools or automated processes, as this aligns with the job's requirements.

Prepare for Scenario-Based Questions

Expect questions that assess your problem-solving skills. Prepare to discuss how you would handle specific incidents or reliability issues, using examples from your past experiences to illustrate your thought process and technical expertise.

Demonstrate Collaboration Skills

Since the role involves working closely with engineering and technology teams, be ready to talk about your experience collaborating with others. Share examples of how you've successfully partnered with different teams to achieve common goals, particularly in high-pressure situations.