At a Glance
- Tasks: Lead technical operations and ensure system reliability for cutting-edge marketing solutions.
- Company: Join Epsilon, a global leader in data-driven marketing technology.
- Benefits: Competitive pay, great perks, hybrid work options, and endless growth opportunities.
- Why this job: Make a real impact in digital marketing while working with innovative technologies.
- Qualifications: 5+ years in Site Reliability, strong tech skills, and leadership experience required.
- Other info: Dynamic team culture focused on collaboration, innovation, and personal development.
The predicted salary is between £48,000 and £84,000 per year.
A subsidiary of Publicis Groupe, Epsilon is a leading provider of multi-channel marketing services, technologies, and database solutions. We do more than collect and store data, and we might be the most important Internet company you’ve never heard of. Join our team for your chance to work in the digital marketing space and solve meaningful problems on a massive scale—and have fun doing it.
The System and Platform Operations Director is a technical leadership role responsible for the support, reliability and stability of Epsilon Retail Media production systems, environments and offerings. The team owns the reliability vision for the company, driving continuous improvement through a combination of development and operations initiatives as well as process excellence. This position and its team have solid-line responsibility for operations, including the deployment, management, monitoring, reporting, troubleshooting, and repair of production systems.
Core to the success of the role is providing a premium customer support experience focused on a “center of excellence” that allows for a full-service delivery support cycle. This role is responsible for managing the Platform Operations Team centralized within a single geo-region, orchestrating regional teamwork, providing both technical and professional support, and championing the company values. The Platform Operations Engineer works closely with the Engineering team to ensure ongoing system stability and supports the Technical Account Managers from an environment perspective. The Platform Operations team is responsible for supporting all retailers once they are live. Critically important is how this team collaborates and liaises with other teams such as Customer Support, Technical Account Management, Engineering and Customer Success.
What you’ll do
- Operational Practices: Establish and manage operational practices and ensure we design, implement and operate a support model that is fit for purpose for our future. Implement proactive solutions for incident and problem detection, response, remediation and continuous improvement. Own the operational integrity of all production environments.
- Production Monitoring and Operational Reporting: Adopt a “Measure Everything” approach to ensure that internal service level objectives and customer service level agreements are exceeded, including executive-level reporting on operational health metrics such as SLAs, incident resolution, performance, availability, reliability and capacity.
- Customer Support & Incident Management: Own incident management processes and on-call response. Take ownership of complex issues related to performance, reliability, and scalability, and lead the resolution of serious incidents and events, including communications with customers and wider stakeholders.
- Change Management: Uphold processes and procedures to manage change across production platforms. Provide insight and expertise on how customers will perceive changes and their impacts, to drive customer organizational change management and communication. Empower the Delivery teams to release new products, features, updates and fixes quickly, while ensuring platforms remain reliable and stable.
- System Reliability: Work with the wider Engineering, Product, Delivery and Security teams to ensure that appropriate attention is given to production and system reliability. Establish Operational Practices in conjunction with the Product and Engineering teams (e.g. understanding how product feature development could affect the system’s overall reliability and performance). Provide delivery status information on System Reliability initiatives to the IT Leadership Team and additional stakeholders, and ensure proper communication concerning changes to agreed milestones, or challenges, risks and blockers that may affect the outcome or agreed completion dates (with proactive suggestions to resolve them).
- IT Service Management: Execute Service Management processes including Change, Config, Service Level, Performance, Incident and Problem Management to deliver a high level of support and system availability. Leverage industry standards and best practices for improving service levels and performance. Uphold Customer Support standards in line with Service Level Agreements. Ensure SLAs and KPIs are met to the best of your ability, with particular focus on first-level response times, escalation paths and resolution times. Uphold the IT Service and Support workflow, with a particular focus on ensuring a best-in-class customer experience. Deliver support and service solutions for the Group in line with industry best practice. Work as a team to ensure all SLAs and practices are well defined, documented and consistently applied and adhered to, in order to provide premium customer support services.
- Organizational Capability: Identify the capabilities needed to meet the current and emerging business needs of a significant function. Evaluate current capabilities, identify gaps, and prioritize development activities. Embed personal development and the fulfillment of personal potential in the culture of the organization. Build capabilities elsewhere in the organization through mentoring and other informal methods.
- Technical Developments, Process Improvement and Simplification: Discuss and recommend more complex or innovative technical developments to improve the quality of software and supporting infrastructure to better meet users' needs. As subject matter expert on the team, maintain understanding of current technology, database management, reliability practices, and future trends through ongoing education, conference attendance and industry press. Ensure all processes and procedures are documented for ease of continuous improvement activities. Proactively identify new opportunities to drive improvements and simplification of our overall technology solutions.
- Personal Capability Building: Develop own capabilities by participating in assessment and development planning activities as well as formal and informal training and coaching; gain or maintain external professional accreditation where relevant to improve performance and fulfill personal potential. Maintain an in-depth understanding of technology, external regulation, and industry best practices through ongoing education, attending conferences, and reading specialist media.
Who You Are
What you’ll bring with you:
- At least 5 years of hands-on experience in Site Reliability focused positions.
- Strong knowledge of containerization technologies (Docker, Kubernetes).
- Experience with infrastructure as code (Terraform).
- Solid understanding of networking, security, and system architecture.
- Proficient in programming and scripting languages (Java, Golang, Python, Bash, or similar).
- Experience with monitoring and observability tools (DataDog, Prometheus, Grafana).
- Knowledge of database management systems (PostgreSQL, Bigtable).
- Understanding of API and microservices architecture.
- Strong people leadership skills with at least a year in leading and driving high-performance technical teams.
- Experience in Operations teams within enterprise environments, with knowledge of DevOps, ITIL, Cloud Services, IT Infrastructure and Operations, supporting and maintaining production and development environments and building cloud services that are secure, reliable, scalable and observable.
- Experience implementing and managing Logging, Monitoring and Alerting frameworks.
- Knowledge and experience of establishing deployment and automation pipelines.
- Expertise with ITSM principles from previous positions held.
- Excellent verbal and written communication skills, with the ability to talk about technology intelligently and passionately to all levels of an organization, including Developers, Architects and senior management (technical and non-technical).
- Past experience establishing support strategies for SaaS or cloud-based backends, with a particular focus on APM deployment (such as Dynatrace or other monitoring tools).
- Experience with establishing Service Delivery strategies that align to new ways of working, including Agile.
- Understanding of international requirements relating to data/information security.
- Experience in the design, development and management of commercial technology contracts, technical service level agreements, and KPIs.
- Experience of establishing and delivering IT support services in a high availability (HA) environment such as 24/7 operations.
Why you might stand out from other talent:
- Google Cloud Architect or Engineer certification preferred.
- Achieved certificates in relevant Database Management Systems, referenced programming languages/scripting tools, or similarly related subject matter.
- Bachelor’s degree or equivalent.
Additional Information
When You Join Us, We’ll Create Something EPIC Together. Epsilon is a global data, technology and services company that powers the marketing and advertising ecosystem. For decades, we’ve provided marketers from the world’s leading brands the data, technology and services they need to engage consumers with 1 View, 1 Vision and 1 Voice.
Epsilon’s comprehensive portfolio of capabilities across our suite of digital media, messaging and loyalty solutions bridges the divide between marketing and advertising technology. We process 400+ billion consumer actions each day using advanced AI and hold many patents on proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Epsilon is a global company with more than 9,000 employees around the world.
Our pillars aren't just words. They’re how we show up every day:
- People centricity: We focus on employee well-being in an environment where colleagues truly care about each other.
- Collaboration: We work together, support one another, and collectively achieve goals.
- Growth: There are endless opportunities for growth through learning, development and career advancement.
- Innovation: We drive progress through cutting-edge solutions and forward-thinking approaches.
- Flexibility: We’ve created a balance between work and personal life, and we encourage adaptability to solve problems creatively.
Our values guide us to create value for our clients, our people and consumers:
- Act with integrity.
- Work together to win together.
- Innovate with purpose.
- Respect all voices.
- Empower with accountability.
Because You Matter. We know that we have some of the brightest and most talented employees in the world, and we believe in rewarding them accordingly. If you work here, expect competitive compensation, a great benefits package and endless opportunities to advance your career. We offer hybrid working opportunities, with our office space located in the iconic Television Centre, White City. As part of our dedication to enhancing our inclusive and diverse workforce, Epsilon is committed to equal access to opportunity for people without regard to race, age, sex, disability, neurodiversity, sexual orientation, gender identity, pregnancy and maternity, marriage and civil partnership, or religion or belief. We are committed to providing reasonable adjustments for candidates in our application process.
Platform Solutions Director (RMN) in London employer: Epsilon Data Management, LLC
Contact Detail:
Epsilon Data Management, LLC Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land Platform Solutions Director (RMN) in London
✨Tip Number 1
Network like a pro! Reach out to people in your industry on LinkedIn or at events. A friendly chat can lead to opportunities that aren’t even advertised yet.
✨Tip Number 2
Prepare for interviews by researching the company and its culture. Understand their values and think about how you can contribute to their mission—this will help you stand out!
✨Tip Number 3
Practice your pitch! Be ready to explain your experience and how it aligns with the role. Keep it concise but impactful—show them why you’re the perfect fit.
✨Tip Number 4
Don’t forget to follow up after interviews! A quick thank-you email can leave a lasting impression and show your enthusiasm for the role. Plus, it keeps you on their radar!
Some tips for your application 🫡
Tailor Your CV: Make sure your CV is tailored to the Platform Solutions Director role. Highlight your experience with system reliability, customer support, and technical leadership. We want to see how your skills align with our needs!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for digital marketing and how you can contribute to Epsilon's mission. Be sure to mention specific experiences that demonstrate your problem-solving skills.
Showcase Your Technical Skills: Don’t hold back on showcasing your technical expertise! Mention your hands-on experience with containerization technologies, scripting languages, and monitoring tools. We love candidates who can talk tech with confidence!
Apply Through Our Website: We encourage you to apply through our website for a smoother application process. It’s the best way for us to receive your application and keep track of it. Plus, it shows you’re keen on joining our team!
How to prepare for a job interview at Epsilon Data Management, LLC
✨Know Your Tech Inside Out
Make sure you brush up on your knowledge of containerization technologies like Docker and Kubernetes, as well as infrastructure as code tools like Terraform. Be ready to discuss how these technologies can enhance system reliability and performance.
✨Showcase Your Leadership Skills
Since this role involves managing a team, be prepared to share examples of how you've successfully led high-performance technical teams in the past. Highlight your experience in driving collaboration across different departments, especially in a fast-paced environment.
✨Prepare for Incident Management Scenarios
Think about complex issues you've faced related to performance and reliability. Be ready to explain how you approached these challenges, the steps you took to resolve them, and how you communicated with stakeholders during serious incidents.
✨Understand Customer Support Excellence
Familiarise yourself with best practices in customer support and incident management. Be ready to discuss how you would uphold service level agreements and ensure a premium customer experience, especially when it comes to managing changes that impact customers.