Durham, NC, USA
6 days ago
Manager, Site Reliability Engineering

Manager, Site Reliability Engineering (Cloud Engineering Focus)

Location: Raleigh/Durham, NC (Hybrid)

Summary

Pearson’s Digital and Technology division is seeking a Manager of Site Reliability Engineering to lead a high-performing SRE team focused on building, maintaining, and optimizing Content APIs in AWS. This team ensures the scalability, reliability, security, and cost-efficiency of core services supporting Pearson’s global education platform.

You’ll manage an established team of engineers and help mature our cloud operations by fostering a culture of automation, continuous improvement, and operational excellence in a high-uptime, 24/7 environment.

 

What You’ll Be Doing

Lead and grow a high-performance SRE team delivering resilient and scalable systems.
Manage and continuously improve customer-facing systems with high availability expectations.
Apply Agile/Scrum methodologies to drive execution and iterative delivery.
Facilitate sprint planning, retrospectives, and other agile ceremonies.
Own the incident management lifecycle, including tooling, documentation, postmortems, and team training.
Oversee the on-call strategy and support readiness across engineering teams.
Drive collaboration across infrastructure, product, architecture, and security teams.
Monitor and optimize cloud spend while designing cost-efficient and scalable architecture.
Promote an “automate everything” mindset to reduce toil and improve system reliability.
Champion clear, actionable, and accessible documentation for systems and processes.

 

What We’re Looking For

Strong communication and leadership skills; ability to engage cross-functional stakeholders.
Deep familiarity with DevOps/SRE tools, practices, and mindset.
Hands-on experience with Infrastructure as Code tools (Terraform preferred, CloudFormation acceptable).
Configuration management experience using Puppet, Chef, Ansible, or SaltStack.
Proficient in scripting or development with Python, Go, Java, or similar languages.
Proven expertise in diagnosing and resolving complex performance issues across the stack.
Experience with observability and incident response platforms (e.g., PagerDuty, Grafana).
Strong documentation habits and attention to operational details.
Track record of optimizing cloud environments for both performance and cost.

 

Nice to Have

AWS Certifications (Solutions Architect, DevOps Engineer, Developer Associate)
Agile or Scrum certifications
Security background or experience working with secure architectures
Experience managing global or distributed engineering teams