Manager of System and Platform Operations (RMN)

Full-Time | £60,000 - £80,000 / year (est.) | No working from home possible
Epsilon

At a Glance

  • Tasks: Lead a team to ensure the reliability and stability of production systems.
  • Company: Join a forward-thinking company focused on innovation and excellence.
  • Benefits: Competitive salary, flexible working options, and opportunities for professional growth.
  • Other info: Dynamic environment with a focus on collaboration and cutting-edge technology.
  • Why this job: Make a real impact by driving continuous improvement in system operations.
  • Qualifications: 5+ years in Site Reliability, strong leadership, and technical skills required.

The predicted salary is between £60,000 and £80,000 per year.

Requirements

  • At least 5 years of hands-on experience in Site Reliability focused positions.
  • Strong knowledge of containerization technologies (Docker, Kubernetes).
  • Experience with infrastructure as code (Terraform).
  • Solid understanding of networking, security, and system architecture.
  • Proficient in programming or scripting languages (Java, Golang, Python, Bash, or similar).
  • Experience with monitoring and observability tools (DataDog, Prometheus, Grafana).
  • Knowledge of database management systems (PostgreSQL, Bigtable).
  • Understanding of API and microservices architecture.
  • Strong people leadership skills, with at least a year of leading and driving high-performance technical teams.
  • Experience in operations teams within enterprise environments, with knowledge of DevOps, ITIL, Cloud Services, and IT Infrastructure and Operations, supporting and maintaining production and development environments.
  • Experience establishing Service Delivery strategies that align with new ways of working, including Agile.
  • Experience establishing and delivering IT support services in a high-availability (HA) environment, such as 24/7 operations.

What the job involves

  • The System and Platform Operations Manager is a technical leadership role responsible for the support, reliability, and stability of Epsilon Retail Media production systems, environments, and offerings.
  • The team owns the reliability vision for the company, driving continuous improvement through a combination of development and operations initiatives as well as process excellence.
  • This position and its team have solid-line responsibility for operations, including the deployment, management, monitoring, reporting, troubleshooting, and repair of production systems.
  • Core to the success of the role is providing a premium customer support experience focused on a “center of excellence” that allows for a full-service delivery support cycle.
  • This role is responsible for managing the Platform Operations Team centralized within a single geo-region, orchestrating the regional teamwork, providing both technical and professional support, and championing the company values.
  • The Platform Operations Engineer works closely with the Engineering team to ensure ongoing system stability and supports the Technical Account Managers from an environment perspective.
  • The Platform Operations team is responsible for supporting all retailers once they are live.
  • Critically important is how this team collaborates and liaises with other teams such as Customer Support, Technical Account Management, Engineering, and Customer Success teams.
  • You'll establish and manage operational practices and ensure we design, implement, and operate a support model that is fit for purpose for our future.
  • Adopt a “Measure Everything” approach to ensure that internal service level objectives and customer service level agreements are exceeded, including executive-level reporting on operational health metrics such as SLAs, incident resolution, performance, availability, reliability, and capacity.
  • Take ownership of complex issues related to performance, reliability, and scalability and lead resolution of serious incidents and events including communications with customers and wider stakeholders.
  • Provide insight and expertise on how customers will perceive changes and their impacts, to drive customer organizational change management and communication.
  • Empower the Delivery teams to release new products, features, updates, and fixes quickly, while ensuring Platforms remain reliable and stable.
  • Work with the wider Engineering, Product, Delivery, and Security teams to ensure that appropriate attention is given to production/system reliability.
  • Identify the capabilities needed to meet the current and emerging business needs of a significant function.
  • As the subject matter expert on the team, maintain an understanding of current technology, database management, reliability practices, and future trends through ongoing education, conference attendance, and industry press.

Manager of System and Platform Operations (RMN) employer: Epsilon

Epsilon Retail Media is an exceptional employer that fosters a dynamic work culture centred on innovation and collaboration. With a strong emphasis on employee growth, we offer extensive training opportunities and encourage continuous learning in cutting-edge technologies. Our commitment to a premium customer support experience and a supportive team environment makes this role not just a job, but a meaningful career path in the heart of a thriving tech community.

Contact Details:

Epsilon Recruitment Team

StudySmarter Expert Advice 🤫

We think this is how you could land Manager of System and Platform Operations (RMN)

Tip Number 1

Network like a pro! Attend industry meetups, webinars, and conferences to connect with folks in the field. You never know who might have the inside scoop on job openings or can put in a good word for you.

Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those involving containerization or infrastructure as code. This gives potential employers a taste of what you can do beyond your CV.

Tip Number 3

Prepare for interviews by brushing up on common technical questions related to system reliability and operations. Practice explaining your past experiences in leading teams and managing high-availability environments to demonstrate your leadership skills.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive about their job search!

We think you need these skills to ace Manager of System and Platform Operations (RMN)

Site Reliability Engineering
Containerization Technologies (Docker, Kubernetes)
Infrastructure as Code (Terraform)
Networking and Security
System Architecture
Programming and Scripting Languages (Java, Golang, Python, Bash)
Monitoring and Observability Tools (DataDog, Prometheus, Grafana)

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the specific skills and experiences mentioned in the job description. Highlight your hands-on experience with containerization technologies and scripting languages, as these are key for us.

Craft a Compelling Cover Letter: Use your cover letter to tell us why you're the perfect fit for the Manager of System and Platform Operations role. Share examples of how you've led high-performance teams and improved system reliability in past positions.

Showcase Your Technical Skills: Don’t just list your technical skills; demonstrate them! Mention specific projects where you used tools like Terraform or monitoring solutions like DataDog. We love seeing real-world applications of your expertise.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for this exciting opportunity with StudySmarter!

How to prepare for a job interview at Epsilon

Know Your Tech Inside Out

Make sure you brush up on your knowledge of containerization technologies like Docker and Kubernetes, as well as infrastructure as code with Terraform. Be ready to discuss your hands-on experience and how you've applied these technologies in real-world scenarios.

Showcase Your Leadership Skills

Since this role requires strong people leadership skills, prepare examples of how you've led high-performance technical teams. Think about specific challenges you faced and how you motivated your team to overcome them.

Understand the Bigger Picture

Familiarise yourself with the company's vision for reliability and stability in production systems. Be prepared to discuss how you can contribute to continuous improvement and what strategies you would implement to enhance service delivery.

Prepare for Scenario-Based Questions

Expect questions that assess your problem-solving abilities, especially around performance, reliability, and scalability issues. Think of complex incidents you've managed and be ready to explain your approach to resolution and communication with stakeholders.