Bangalore, IND
13 days ago
Manager Site Reliability Engineering
**Position Overview**

We are seeking a dynamic and experienced Manager for our Site Reliability Engineering (SRE) team. This individual will play a critical role in ensuring the stability, performance, and scalability of our infrastructure. The ideal candidate will possess excellent leadership skills, deep technical expertise, and the ability to thrive in a fast-paced, collaborative environment.

**Key Responsibilities**

**Leadership and Team Management**

- Lead, mentor, and develop a team of highly skilled Site Reliability Engineers.
- Promote a culture of continuous improvement and high performance.
- Foster collaboration and communication within the team and with other departments.
- Monitor team performance and provide constructive feedback.

**Technical Expertise**

- Oversee the design, implementation, and maintenance of reliable and scalable infrastructure.
- Develop and enforce best practices for system reliability, monitoring, and incident management.
- Ensure the availability, performance, and security of our services.
- Collaborate with software engineering teams to design and implement solutions that improve system reliability and performance.
- Use automation and DevOps practices to streamline operations and enhance productivity.
- Experience with Terraform.
- Extensive knowledge of multi-cloud environments (AWS, GCP, etc.) is an added advantage.

**Collaboration and Communication**

- Work closely with cross-functional teams, including engineering, product management, and operations, to ensure alignment and successful project execution.
- Communicate effectively with stakeholders at all levels, providing regular updates on SRE initiatives and performance metrics.
- Facilitate incident response and post-mortem meetings, ensuring thorough analysis and follow-up on action items.

**Qualifications**

- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Proven experience in a leadership role within a Site Reliability Engineering or DevOps team.
- Strong technical background with extensive knowledge of cloud infrastructure, containerization, automation, and monitoring tools.
- Proficiency in scripting languages such as Python, Bash, or similar.
- Excellent problem-solving skills and a proactive approach to identifying and mitigating risks.
- Exceptional communication and interpersonal skills.

**Why Join Us?**

- Be part of a forward-thinking company that values innovation and excellence.
- Work in a supportive and collaborative environment where your contributions are recognized and rewarded.
- Opportunities for professional growth and development through ongoing training and mentorship.
- Competitive compensation and benefits package.

If you are a motivated and visionary leader with a passion for site reliability engineering, we would love to hear from you. Apply today and join our team in ensuring the robustness and efficiency of our cutting-edge infrastructure.

1086664

**Job:** Cloud and Hosting Services

**Job Family:** TECHNOLOGY

**Organization:** Corporate Strategy & Technology

**Schedule:** FULL\_TIME

**Req ID:** 19846