VP, Incident Management & Operational Resilience
Why should you join dLocal?
dLocal enables the biggest companies in the world to collect payments in 40 countries in emerging markets. Global brands rely on us to increase conversion rates and simplify payment expansion effortlessly. As both a payments processor and a merchant of record where we operate, we make it possible for our merchants to make inroads into the world's fastest-growing emerging markets.
By joining us you will be a part of an amazing global team that makes it all happen, in a flexible, remote-first dynamic culture with travel, health and learning benefits, among others. Being a part of dLocal means working with 1000+ teammates from 30+ different nationalities and developing an international career that impacts millions of people's daily lives. We are builders, we never run from a challenge, and we are customer-centric. If this sounds like you, we know you will thrive in our team.
What's the opportunity?
We are looking for a VP of Incident Management & Operational Resilience to design, build, and lead the company-wide incident management function at dLocal. This role owns the global framework, governance, and execution model for all critical incidents across the company, not only Technology. You will be accountable for how dLocal anticipates, responds to, and learns from major events impacting customers, operations, security, compliance, employees, and our reputation. You will work closely with business, operations, technology, and risk leaders to ensure incidents are handled in a consistent, predictable, and transparent way.
What will I be doing?
Define and own the company-wide incident management strategy
Set the vision, strategy, and operating model for global incident management and operational resilience across dLocal, aligned with our NASDAQ-listed public company obligations, board-approved risk appetite, investor expectations, and growth plans
Define and maintain a unified incident management and resilience framework including policies, processes, playbooks, severity/impact matrices, roles and responsibilities, and escalation paths for business, operational, security, and technology incidents
Ensure alignment and integration with Business Continuity, Disaster Recovery, Operational Risk, Security, and Compliance frameworks, and with global regulations and standards such as DORA, PSD2, NIS2, data-breach and outage-reporting regimes, and central-bank expectations across 45+ countries and regions
Build and lead the incident management function
Build and lead a distributed global incident management and resilience organization, with regional leaders and incident commanders in LATAM, EMEA, APAC, and North America, operating in a follow-the-sun, 24/7 model
Define clear ownership boundaries between the central function and domain teams (IT, Security, Operations, CS, Product, Finance, Legal, Compliance, Corporate Communications, etc.) and ensure everyone understands their role during incidents
Develop a network of trained Incident Commanders and functional responders across regions and time zones, with clear expectations and training paths, including participation in a structured, global on-call rotation that guarantees 24/7 executive and technical coverage for high-severity incidents
Govern major incident execution & 24/7 global operations
Ensure that all high-severity incidents follow a structured lifecycle: detection, assessment, triage, containment, mitigation, recovery, and closure, with hand-offs and ownership clearly defined between regions (LATAM, EMEA, APAC, North America) in a follow-the-sun model
Establish criteria for declaring major incidents/crises, and ensure rapid mobilization of the right stakeholders, including executives when needed
Define expectations and standards for real-time decision-making, risk trade-offs, approvals, and business sign-offs during incidents
Personally act as executive sponsor and, when necessary, executive Incident Commander for the most critical, company-level events
Own incident communications and stakeholder alignment
Define and enforce standards for internal and external communications during incidents (frequency, content, channels, approvals)
Ensure effective coordination with Customer Success, Commercial, Operations, Legal, Compliance, Security, and Communications/PR on messaging to merchants, partners, employees, and regulators
Provide clear, concise updates to the Executive Team and Board-level forums during and after major incidents
Institutionalize learning and continuous improvement
Establish and own the post-incident review process for significant incidents, ensuring high-quality root cause analysis, clear corrective and preventive actions, and accountable owners
Create governance to track, prioritize, and close post-incident action items, and to escalate systemic issues that require investment or strategic decisions
Use incident data to identify structural weaknesses (technology, process, organization, capacity, controls) and feed them into roadmaps, risk registers, and investment cases
Enable teams with processes, tooling, and training
Define the strategy and requirements for incident management tooling (alerting, collaboration channels, runbook systems, ticketing/workflow integration, status pages, dashboards)
Partner with Tech/SRE, Security, and Operations to ensure observability, monitoring, and alerting support effective incident detection and response
Design and run training programs, simulations, and game days across the company so all teams know how to respond and escalate appropriately
Measure, report, and challenge performance
Define and own incident KPIs and KRIs (e.g. MTTD, MTTR, incident volume by type/severity, recurrence rate, SLA/SLO breaches, comms timeliness, business impact)
Produce regular executive-level reporting and insights on incident trends, operational risk, and readiness, with clear proposals for improvement
Challenge teams constructively on incident quality (runbooks, communications, action plans) and champion a no-blame, learning-oriented culture
What skills do I need?
Experience
10+ years of experience in roles related to incident management, operations, SRE/DevOps, security operations, or crisis management, with increasing scope and complexity
Proven experience leading major, cross-functional incidents in high-availability, high-risk environments (e.g. payments, fintech, financial services, large-scale SaaS, telco, cloud)
5+ years in people leadership roles, building and leading teams and/or programs that cut across multiple departments or regions
Experience designing and implementing company-wide processes or frameworks (incident management, BCP/DR, operational risk, or similar)
Leadership & Communication
Strong executive presence and the ability to lead under pressure, making decisions with incomplete information and bringing clarity in ambiguous situations
Excellent communication skills in English, both written and spoken, with the ability to tailor messages to ICs, managers, executives, customers, and external stakeholders
Demonstrated ability to influence without formal authority, align conflicting priorities, and work effectively with senior stakeholders (C-level, functional heads)
Process, Risk & Analytical Skills
Structured problem-solving skills and a data-driven mindset, comfortable working with metrics, trends, and root-cause analysis
Strong understanding of operational risk, business continuity, and control environments, ideally in a regulated or audited context
Ability to balance short-term mitigation vs. long-term resilience, making and articulating trade-offs clearly
Domain & Tooling
Familiarity with frameworks and practices such as ITIL/ITSM, SRE, incident command, operational resilience, BCP/DR
Hands-on experience with some of the following (or equivalent) is a plus: monitoring/observability platforms, ticketing/ITSM systems, on-call and alerting tools, collaboration platforms, status page tools, and runbook automation
Comfort working with distributed, multi-time-zone teams and 24/7 operations
Nice to have
Experience interacting with regulators, auditors, or Board-level committees on topics related to incidents, outages, operational risk, or resilience
Exposure to standards such as SOC 1/2, ISO 27001, PCI DSS, DORA, or operational resilience regulations, especially where incident management and business continuity are in scope
Background in payments, banking, or financial infrastructure
What success looks like in this role
dLocal has a clear, trusted, and widely adopted incident management framework that is understood across all departments and regions
Major incidents are handled in a predictable, well-coordinated way, with faster time to detect and resolve, and fewer repeats
Incident communication is transparent, timely, and consistent, building trust with customers, partners, regulators, and internal teams
Post-incident reviews consistently lead to real improvements in technology, processes, controls, and organization
Incident data and learnings are actively used to shape strategy, investments, and risk decisions at the executive level
The incident management function is seen as a key enabler of reliability and growth, not just a reactive firefighting team
What do we offer?
Remote work: work from anywhere or one of our offices around the globe!*
Flexibility: we have flexible schedules and we are driven by performance
Fintech industry: work in a dynamic and ever-evolving environment, with plenty to build and boost your creativity
Referral bonus program: our internal talents are the best recruiters; refer someone ideal for a role and get rewarded
Learning & development: get access to a Premium Coursera subscription
Language classes: we provide free English, Spanish, or Portuguese classes
Social budget: you'll get a monthly budget to chill out with your team (in person or remotely) and deepen your connections!
dLocal Houses: want to rent a house to spend one week anywhere in the world coworking with your team? We've got your back!
*For people based in Montevideo (Uruguay) applying to non-IT roles, 55% monthly attendance to the office is required
What happens after you apply?
Our Talent Acquisition team is invested in creating the best candidate experience possible, so don't worry, you will definitely hear from us. We will review your CV and keep you posted by email at every step of the process!
Also, you can check out our webpage, LinkedIn, Instagram, and YouTube for more about dLocal!
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Contact Detail:
dLocal Recruiting Team