ZAPOPAN, JALISCO, Mexico
4 days ago
Site Reliability Engineer

About Oracle Cloud Observability and Management Platform: Oracle Cloud Observability and Management Platform is a comprehensive set of monitoring, management, diagnostic, and analytics services. It enables visibility and insight across cloud native and traditional technology, whether deployed in multi-cloud or on-premises environments, with broad, standards-based ecosystem support. It's designed to help enterprises better manage their increasingly diverse and distributed IT portfolios, while reducing troubleshooting time, preventing outages, and enabling IT to manage applications from a business perspective.

About The Job:

At Oracle, we're seeking a talented and skilled Site Reliability Engineer to work on the Oracle Cloud Observability and Management Platform.

As a member of the software engineering division, you will take an active role in the evolution of standard practices and procedures. You will be responsible for defining and developing software for tasks associated with developing, designing, and debugging software applications or operating systems.

As a Site Reliability Engineer, you will solve interesting technical challenges by designing, deploying, and troubleshooting key Cloud services, platforms, and infrastructure, always thinking about reliability, scalability, resilience, security, and performance. Technically, you will understand the full stack of the services you support (Network to Application) and be able to dig deep into a service to determine how best to mitigate customer impact. Further, you will drive improvements by developing tools and engaging partner teams to drive down incident counts, reduce the severity of events, and minimize downtime. As a member of the O&M Site Reliability team, you will be surrounded by "willing to help" individuals representing some of the brightest and most innovative minds in the industry. You will be part of an organization that prides itself on providing training, empowerment, and career progression. Our team provides 24/7/365, follow-the-sun coverage while pushing the boundaries of what can be accomplished in the cloud. Advancing cloud computing means great growth opportunities and highly rewarding experiences working in our expanding computing environments and SRE team.

Work is super fun, non-routine, and challenging, involving the application of advanced skills in your area of specialization.

What You Need to Have:

2+ years of experience.
A BE/BTech or ME/MTech in Computer Science, or an equivalent educational background.
The successful Site Reliability Engineer should be highly motivated, dig deep into solving problems, and be able to work independently. They should also be able to collaborate successfully with partner teams and stakeholders.
Good analytical and problem-solving skills with a strong customer service orientation.
Ability to work effectively in a multi-location team.
Excellent communication skills and strong interpersonal skills.
Able to work as part of a 24x7x365 operations team.

Software skills:

Strong scripting skills (in Java, Python, Shell, or equivalent).
Well versed in microservices architecture, Linux administration, and Oracle Database.
Clear understanding of CI/CD pipeline concepts.
Knowledge of Cloud technologies such as Chef, Terraform, Docker, Kubernetes, and Solr.
Experience writing automation utilities to streamline workloads.
Ability and willingness to learn quickly in a dynamic environment.
Ability to participate in technical discussions and communicate clearly.

Career Level - IC2
