As a Lead DBA, you are expected to own and evolve our mission‑critical database platforms in a high‑availability, 24×7 production environment. This role sits at the intersection of database engineering, SRE practices, and automation, and plays a key part in ensuring reliability, scalability, and operational excellence for business‑critical systems.
You will be responsible not just for keeping databases running, but for engineering resilience, driving automation, and mentoring teams to operate with a production‑first, reliability‑focused mindset.
Duties and Responsibilities
Database Platform Ownership
- Lead the design, build, and operation of large‑scale database platforms, including MongoDB Enterprise, Percona Server for MongoDB, and MongoDB Atlas, along with SQL Server, Cassandra, PostgreSQL, and MySQL.
- Ensure high availability, fault tolerance, backup, and disaster recovery across multi‑region deployments with stringent uptime SLAs.
- Drive capacity planning, performance optimization, and lifecycle management for production databases.
Reliability & Production Excellence
- Own production support and incident management for database services, including high‑severity incidents in 24×7 environments.
- Lead root cause analysis (RCA) and implement permanent fixes to prevent recurrence and reduce MTTR.
- Define and implement SRE practices such as SLIs, SLOs, error budgets, proactive monitoring, and alerting.
- Design and develop automation using Ansible, Python, Bash, and PowerShell to eliminate manual work in patching, deployment, DR testing, monitoring, and compliance validation.
- Integrate database operations into CI/CD pipelines and infrastructure‑as‑code practices.
- Apply software engineering principles to database operations to improve consistency, repeatability, and scalability.
- Identify operational and architectural risks related to database platforms and proactively mitigate them.
- Define and enforce security, governance, and compliance controls aligned with enterprise and regulatory standards.
- Support audits and contribute to policy development and standard operating procedures.
- Work closely with application teams, product managers, architects, cloud, and infrastructure teams to align database solutions with business needs.
- Act as a technical advisor to stakeholders, providing data‑driven recommendations on platform strategy and optimization.
- Mentor and coach DBA and SRE teams across locations, raising the overall operational maturity of the organization.
Technical Skills Required
- 10+ years of experience in database administration
- Automation: Ansible + Python/Shell scripting
- SRE: SLIs/SLOs, observability, incident management, MTTR reduction
- Good communication skills, with the ability to understand technical requirements and implement the resulting changes
Qualifications
Must be educated to at least degree level or equivalent.