At a Glance
- Tasks: Develop and enhance monitoring systems to ensure performance and security.
- Company: Join a forward-thinking tech company focused on innovation and collaboration.
- Benefits: Enjoy competitive pay, health perks, remote work options, and growth opportunities.
- Other info: Dynamic team environment with a commitment to continuous improvement.
- Why this job: Make a real impact on cloud infrastructure and product reliability.
- Qualifications: Bachelor's degree in computer science and 2+ years in a related role.
The predicted salary is between £50,000 and £60,000 per year.
You will be responsible for the continued development of our monitoring systems and will use them to proactively identify and communicate performance, reliability, security and cost issues. You will assist in responding to incidents and remediating vulnerabilities in our platform. You will also identify, plan and implement improvements to our cloud infrastructure and deployment processes in a secure and robust way, working alongside other engineers to support our product roadmap. As part of the wider product engineering team, you will advocate throughout the design process for effective monitoring to ensure the performance, stability and security of our products, in line with our commitment to ISO 27001 compliance.
Minimum Qualifications:
- Bachelor's degree (2:1 or above) in computer science or a related field
- 2+ years' experience in a professional DevOps, SRE, Platform Engineering or similar role
- Self-motivated with strong problem-solving and analytical skills
- Experience using and configuring monitoring tools, ideally Grafana and Prometheus, to surface insights and alert on potential issues
- Experience using and configuring cloud infrastructure (ideally GCP but Azure also desirable)
- Experience with IaC tools (ideally Terraform)
- Experience with Docker, Kubernetes and Helm
- Knowledge of security and reliability best practices for cloud infrastructure and application deployments to Kubernetes
- Experience using Python and Bash for scripting or small CLI applications
- Experience using Git for professional software development
- Experience responding to and investigating security or reliability incidents in a distributed cloud environment
- The ability to communicate technical challenges and opportunities to people outside your area of expertise
- Some familiarity with the applications in our tech stack: NGINX, Flask (Python), React (TypeScript), PostgreSQL, OpenSearch, Valkey, Keycloak
- Knowledge of administering Linux-based systems
- Experience using CI tools, ideally CircleCI, to manage application deployments
- Experience applying and monitoring compliance with information security policies
- Experience applying Agile methodologies and working in sprints
The above is not an exhaustive list. You are required to be flexible in your approach to carrying out your duties, which may change from time to time to reflect business needs or the company’s continuous improvement.
Site Reliability Engineer employer: CW (Cambridge Wireless) Ltd
Contact Details:
CW (Cambridge Wireless) Ltd Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Site Reliability Engineer role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with other Site Reliability Engineers. You never know when a casual chat could lead to your next big opportunity.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those involving monitoring tools like Grafana and Prometheus. This gives potential employers a taste of what you can do.
✨Tip Number 3
Prepare for interviews by brushing up on common SRE scenarios. Think about how you’d handle incidents or improve cloud infrastructure. Practising these responses will help you stand out during the interview process.
✨Tip Number 4
Don’t forget to apply through our website! We’re always on the lookout for talented individuals like you. Plus, it’s a great way to ensure your application gets the attention it deserves.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your experience with monitoring tools like Grafana and Prometheus, as well as your cloud infrastructure skills. We want to see how your background aligns with the responsibilities of a Site Reliability Engineer.
Showcase Your Problem-Solving Skills: In your cover letter, share specific examples of how you've tackled performance or reliability issues in the past. We love seeing candidates who can think on their feet and come up with effective solutions!
Be Clear and Concise: When writing your application, keep it straightforward. Use clear language to explain your technical skills and experiences. We appreciate candidates who can communicate complex ideas simply, especially when it comes to security and reliability.
Apply Through Our Website: We encourage you to submit your application through our website. It’s the best way for us to receive your details and ensures you’re considered for the role. Plus, it shows you’re keen on joining our team!
How to prepare for a job interview at CW (Cambridge Wireless) Ltd
✨Know Your Monitoring Tools
Make sure you’re well-versed in monitoring tools like Grafana and Prometheus. Be ready to discuss how you've used these tools in past roles to identify performance issues and improve system reliability.
✨Showcase Your Cloud Experience
Familiarise yourself with cloud infrastructure, especially GCP and Azure. Prepare examples of how you've implemented improvements or resolved incidents in a cloud environment, highlighting your experience with IaC tools like Terraform.
✨Communicate Technical Concepts Clearly
Practise explaining complex technical challenges in simple terms. You’ll need to communicate effectively with team members who may not have a technical background, so think of examples where you’ve done this successfully.
✨Demonstrate Problem-Solving Skills
Be prepared to discuss specific incidents you've responded to, detailing your approach to troubleshooting and remediation. Highlight your analytical skills and how they’ve helped you resolve security or reliability issues in the past.