Reliability Engineering Lead - Microsoft Studios - Rare - Twycross/London/Cambridge


Job Title Reliability Engineering Lead - Microsoft Studios - Rare - Twycross/London/Cambridge
Job Category Programming
Job Description

OVERVIEW
Rare: we're not your typical developer. Our 30+ years in the game-making business have been dedicated to crafting one beloved title after another, constantly trying new things, and infusing the fun we have every day into the games we create. Check out the Rare Replay collection for three whirlwind decades of evidence! We strive to keep Rare a fantastic place to work, from its beautiful location and state-of-the-art facilities to a strong focus on work/life balance.

If you're a keen gamer, chances are you've already caught wind of our epic shared world pirate adventure Sea of Thieves, released in March 2018 and a journey we're committed to for the long haul even as we begin to think about what's out there beyond the horizon. With every day bringing new challenges and discoveries in equal measure, there's never been a more rewarding time to join our daredevil crew!

ROLE PURPOSE
At Rare, customer experience is our priority. An essential part of good customer experience is maintaining both high availability and quality of service across all our software services.

The goal of reliability engineering is to ensure excellence of service provision for the retail customer. This involves accountability for non-functional characteristics of service performance, including:
•    Continuity: minimisation of interruption to retail customer experience. 
•    Performance: quality of customer experience. 
•    Resilience to surges, peaks and malicious attacks. 
•    Recoverability of user data and rapid restoration of service in the face of a data-centre disaster.

The Reliability Engineering Lead will both lead and grow an engineering team and manage its practices, managing both workload and structure of the team to hit.

KEY ACCOUNTABILITIES
The Reliability Engineering Lead is responsible for management of the live site, co-ordinating the following:
•    Working with the Service Engineering team to understand usage patterns and specify non-functional characteristics of new services work. 
•    Quality/risk/acceptance of new deployments – including failure model analysis, pre-release validation and testing in production. 
•    Working with the Deployment Pipeline team to reduce downtime and improve consistency and reliability of our release/handover process. 
•    24-7 live operational management of service environment: deployments, scale and topology. 
•    Ongoing improvement of our capability to detect and respond to incidents of service behaviour that negatively impact customers.
•    Ensuring reliability engineering incident bridges are supported with data on scope and impact and with first-line actions involving deployment, rollback or hotfixes. Co-ordinating with partner engineering teams to drive solutions that improve customer experience.
•    Definition of metrics that represent high-quality service to players, and outward reporting of historical performance. 
•    Analysis and forecasting of demand based on historical volume and forward commercial guidance, with associated outward risk reporting. 
•    Identification and prioritisation of service engineering work items to address efficiency, performance or scalability needs.

REQUIRED SKILLS AND EXPERIENCE
•    At least five years' enterprise-level experience managing operations in an IT and/or Critical Environment infrastructure. 
•    At least three years' experience leading and motivating a diverse, technical workforce.
•    Enterprise-level experience in managing large-scale and complex projects/programs. 
•    Working knowledge of audit and compliance requirements in a large global enterprise. 
•    Financial management experience and good business acumen. 
•    Strong problem-solving skills, analytical capabilities, data analysis and attention to detail. 
•    Strong verbal and written communication and organisation skills. 
•    Ability to multitask and project manage many tasks simultaneously.

This role can be based in Twycross, London or Cambridge. If the candidate we hire is based in London or Cambridge, they will be expected to travel to Rare in Twycross as business needs require.
Salary Competitive + benefits
Location Twycross, United Kingdom
Date posted 14/01/2019
Recruiter This job is advertised on behalf of Microsoft Rare using their internal reference 549651.