At a Glance
- Tasks: Investigate and resolve complex application issues while designing automation for improved reliability.
- Company: Join a forward-thinking tech company with a hybrid work culture.
- Benefits: Enjoy private medical insurance, birthday off, and flexible working hours.
- Why this job: Make a real impact in cloud operations and enhance your technical skills.
- Qualifications: 3+ years in third-line support or cloud operations with strong problem-solving skills.
- Other info: Collaborative environment with excellent career growth opportunities.
The predicted salary is between £36,000 and £60,000 per year.
You will be the bridge between support, engineering, and cloud operations.
Responsibilities
- Investigating and fixing complex application and infrastructure issues.
- Monitoring capacity, performance, and error budgets across all deployments.
- Designing automation and tooling to improve reliability and reduce manual work.
Responsibilities and Tasks
- Monitor single-tenant (ST) and multi-tenant (MT) environments for server performance, response times, error rates, and application health.
- Detect and resolve database issues, stalled file processing, or misplaced storage objects.
- Use Azure diagnostics and telemetry to troubleshoot and resolve complex incidents.
- Provide third-line support for escalated customer cases, collaborating with development for code-level fixes.
Reliability Engineering (Fleet Level)
- Maintain uptime, performance, and scalability across all ST and MT deployments.
- Define and track service-level objectives (SLOs) and error budgets for different environment types.
- Perform capacity planning for servers, databases, and storage, scaling resources before issues occur.
- Identify systemic patterns causing downtime and implement fixes at scale.
Automation & Tooling
- Build scripts and automation (PowerShell, C#, Azure Functions, Logic Apps) to detect and remediate common application or infrastructure issues.
- Automate environment health checks and reporting.
- Develop self-healing routines for recurring problems.
Monitoring & Reporting
- Analyse incident trends and recurring issues.
- Provide regular reliability reports and improvement recommendations to stakeholders.
- Feed recurring issues and systemic risks into the continuous improvement programme.
- Contribute to post-incident reviews with actionable follow-ups.
- Maintain troubleshooting guides and technical runbooks for common issues.
Success Measures (KPIs)
- Uptime: Meet the target SLO percentage for ST and MT environments.
- Error Budget Burn Rate: Maintain within agreed thresholds.
- Reduce MTTR for P1/P2 incidents.
- Reduce recurrence rate of common issues.
- Automation Impact: Number of recurring issues automated/self-healed.
- Hours saved through automation vs manual intervention.
- Customer Impact: Reduced escalations from L1/L2 support. Improved customer satisfaction for technical cases.
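For illustration only, two of the measures above (error budget burn rate and MTTR) can be computed roughly as follows. This is a minimal sketch, not part of the role specification: the function names, the request-count definition of the SLO, and the sample figures are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def error_budget_burn_rate(slo_target: float, good_events: int, total_events: int) -> float:
    """Ratio of the observed error rate to the error rate the SLO allows.

    A value of 1.0 means the budget is being spent exactly on pace;
    above 1.0 means it would run out before the SLO window ends.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 0.0  # no traffic, so no budget consumed
    allowed_error_rate = 1.0 - slo_target
    observed_error_rate = 1.0 - good_events / total_events
    return observed_error_rate / allowed_error_rate


def mean_time_to_restore(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average of (resolved - opened) across a list of incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [resolved - opened for opened, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)


# Hypothetical figures: a 99.9% SLO with 200 failed requests out of 100,000
burn = error_budget_burn_rate(0.999, good_events=99_800, total_events=100_000)

# Hypothetical P1/P2 incidents: one lasting 30 minutes, one lasting 90 minutes
mttr = mean_time_to_restore([
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),
    (datetime(2024, 1, 2, 12, 0), datetime(2024, 1, 2, 13, 30)),
])
```

With these sample figures the burn rate comes out at roughly 2, meaning the error budget is being consumed about twice as fast as the SLO allows, and the MTTR is one hour.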
Your Qualifications, Technical Skills and Experience
- 3+ years in third-line support, SRE, or cloud operations for enterprise SaaS.
- Proven track record in incident resolution and root cause analysis.
- Experience working with both multi-tenant and single-tenant cloud architectures.
- Strong background in supporting C#/.NET Core/MVC web applications with SQL Server backends and Azure Blob Storage.
- Proficient in SQL for investigation and remediation.
- Scripting and automation skills in PowerShell and/or C#.
- Understanding of Azure components: App Services, VMs, SQL DB, Blob Storage, scaling strategies.
- Experience in capacity planning, SLOs, and error budget management.
Your Personal Skills and Attributes
- Exceptional problem-solving skills with strong attention to detail.
- Ability to clearly document findings and communicate with technical and non-technical audiences.
- Calm under pressure during high-priority incidents.
- Collaborative mindset, working closely with support, dev, and ops teams.
This job description is not intended to be an exhaustive list of duties and responsibilities. You may be expected to perform different tasks as the needs of the business and your role evolve. Your job description will be reviewed and updated accordingly.
Benefits
- Private Medical Insurance: Your health matters, and we've got you covered.
- Birthday Off: Celebrate your day your way - it's on us.
- Holiday Purchase: Need more downtime? Purchase up to an additional 5 days of holiday.
- Employee Assistance Programme: Confidential 24/7 helpline and support for you and your immediate family.
- Time for You: We value your personal time. That's why we aim to finish work at 2pm on Fridays.
- Better Working: We embrace hybrid working and, where it is operationally practicable, support employees splitting their working time between the office and home.
- Pension: Plan for tomorrow with our pension scheme via NEST.
Senior Azure SaaS Reliability & Support Engineer in Kingston upon Thames. Employer: BOSS ERP Consulting
Contact Details:
BOSS ERP Consulting Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Senior Azure SaaS Reliability & Support Engineer role in Kingston upon Thames
✨Tip Number 1
Network like a pro! Reach out to folks in your industry on LinkedIn or at local meetups. You never know who might have the inside scoop on job openings or can put in a good word for you.
✨Tip Number 2
Prepare for those interviews by practising common questions and scenarios related to Azure and SaaS. We recommend doing mock interviews with friends or using online platforms to get comfortable with the process.
✨Tip Number 3
Show off your skills! Create a portfolio or GitHub repository showcasing your automation scripts and troubleshooting guides. This gives potential employers a tangible look at what you can do.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive about their job search.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that match the job description. Highlight your experience with Azure, incident resolution, and automation tools like PowerShell or C#. We want to see how you can bridge support, engineering, and cloud operations!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're the perfect fit for the Senior Azure SaaS Reliability & Support Engineer role. Share specific examples of how you've tackled complex issues and improved reliability in past roles.
Showcase Your Problem-Solving Skills: In your application, don’t just list your skills—demonstrate them! Provide examples of how you've resolved high-priority incidents or improved system performance. We love seeing candidates who can think on their feet and deliver results under pressure.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen on joining the StudySmarter team!
How to prepare for a job interview at BOSS ERP Consulting
✨Know Your Azure Inside Out
Make sure you brush up on your Azure knowledge, especially around App Services, VMs, and SQL DB. Be ready to discuss how you've used these components in past roles, as well as any specific challenges you've faced and how you overcame them.
✨Showcase Your Problem-Solving Skills
Prepare to share examples of complex incidents you've resolved. Highlight your approach to root cause analysis and how you documented your findings. This will demonstrate your exceptional problem-solving skills and attention to detail.
✨Demonstrate Your Automation Expertise
Since automation is key for this role, come prepared with examples of scripts or tools you've built using PowerShell or C#. Discuss how these have improved reliability and reduced manual work in your previous positions.
✨Communicate Clearly and Collaboratively
Practice explaining technical concepts in a way that non-technical audiences can understand. This will show your ability to communicate effectively across teams, which is crucial for bridging support, engineering, and cloud operations.