Senior Site Reliability Engineer
EPAM Systems, Inc.
📍 Remote, Mexico
Job Description
We are seeking an experienced **Senior Site Reliability Engineer** to join our team.
As a key member of the Reliability Tooling team, you will be responsible for writing and reviewing code, contributing to critical technical decisions, and mentoring engineers within your squad. This role requires a deep understanding of SRE principles and best practices, as well as the ability to guide and support your team in achieving operational excellence.
**Responsibilities**
- Deploy and manage modern cloud technologies using Infrastructure as Code (IaC), self-healing mechanisms, and automated security patterns
- Create effective telemetry, alerts, and response mechanisms to reduce Mean Time to Recovery (MTTR)
- Collaborate within and across teams to provide technical leadership and ensure high-quality solutions
- Advise on best practices and develop tools to enable smooth adoption of service reliability methods, including sustainable incident response and blameless postmortems