REQ 138974 - Keabetswe Modise
Closing Date: 13 May 2025
Job Family: Information Technology
Career Stream: Application Development
Leadership Pipeline: Manage Self: Professional
Job Purpose: To serve as an IT professional specialising in Site Reliability Engineering (SRE) at Nedbank, contributing to the strategic capability of the organisation as part of a dynamic team. The role focuses on advancing the SRE discipline and working with other domains to influence its adoption. It is a strategic, consultancy-based role that involves enabling and contributing to solutions aligned with the principles of reliability, availability, and resilience, while also promoting frequent and efficient delivery from development teams.
Job Responsibilities:
- Collaborate with stakeholders, engineers, and operational SMEs to keep all relevant parties up to date on what is top of mind within the reliability service offerings.
- Evolve production services based on customer needs and technology to ensure we remain competitive in the financial services industry/market.
- Influence squads during service or platform design to prevent system failures and improve performance.
- Engage with leadership and teams to adopt SRE practices, with a core focus on contributing to incident management and advocating for blameless postmortems.
- Engage and influence all teams involved in the software development life cycle with regard to observability and high availability, utilising new or existing technology, and improve disaster recovery plans.
- Implement automation-based solutions to achieve high availability and efficiency, reduce cost, and improve system performance.
- Coach teams on best practices within the organisation via internal forums to position fundamental SRE knowledge and promote enterprise-wide knowledge sharing.
- Assist with creating and maintaining system health and performance metrics reflecting real-time data, enabling proactive resolution and faster troubleshooting.
- Collaborate and partner with the DevOps engineer/coach to ensure efficient continuous integration/continuous deployment pipelines, resolve any failures, and improve the flow.
- Take charge of technical leadership, engage with teams to identify the best solutions, and mentor Junior Site Reliability Engineers to resolve technical challenges.
- Assist in defining and implementing metrics such as SLIs and SLOs to gain insight into user experience and application performance.
- Define and deliver technical standards in partnership with all disciplines of software engineering for the adoption of site reliability engineering.
- Participate in and work closely with relevant COEs to improve the release of new features and facilitate faster time to market.
- Build and maintain strategic relationships with business units and vendors to stay in sync on current ways of work and the business decisions being embraced.
- Conduct maturity assessments within teams to measure the level of SRE adoption, and use the results to outline a plan that helps teams reach the next level of maturity.
- Utilise application monitoring tools to generate reports for informed decision making and to drive visibility of Site Reliability Engineering.
- Adhere to and comply with Nedbank group information management, data integrity and security policies and best practices to protect client data.
- Manage concurrent objectives, projects, groups, activities and time allocation based on prioritisation for effective delivery.
- Stay abreast of the most recent industry trends and practices and implement learnings back into the business to ensure alignment with the industry.
- Take responsibility for the success of the team and projects by taking ownership of issues and ensuring their resolution.
- Articulate technical concepts to diverse audiences through proficient written and verbal communication to ease understanding of the SRE discipline.
- Contribute to the successful implementation of the business strategy in an innovative, high-paced environment.
Essential Qualifications - NQF Level:
- Matric / Grade 12 / National Senior Certificate
- Advanced Diplomas/National 1st Degrees
Preferred Qualification: B-Tech Computer Systems, BSc - Info Sys/Computer Systems or related qualification
Preferred Certifications: Associate or professional (Amazon Web Services/Azure Solutions), ITIL, DevOps
Minimum Experience Level: Min 8 years IT experience with 5 years in relevant technologies or domains
Business Drivers:
- Technical Expert
- Analyst
- Consultant
- Problem solver
Technical / Professional Knowledge:
- Microservices and containerization (K8s or Docker)
- Troubleshooting and root cause analysis
- Site Reliability Engineering best practices
- DevOps framework
- Relevant programming/scripting languages
- Infrastructure and application monitoring
- Incident management and post-incident analysis
Behavioural Competencies:
- Tech Savvy
- Decision Making
- Building Networks
- Influencing
- Communication
- Troubleshooter
- Emotional intelligence
Please contact the Nedbank Recruiting Team at +27 860 555 566