Lead Site Reliability Engineer
EPAM Systems, Inc.
📍 Remote (work from home), Mexico
Job Description
We are looking for an experienced **Lead Site Reliability Engineer** to join our team.
In this role, you will play a pivotal part in the Reliability Tooling team, taking responsibility for writing and reviewing code, making key technical decisions, and mentoring engineers within your squad. This position requires a strong grasp of SRE principles and best practices, as well as the ability to lead and support your team in achieving operational excellence.
**Responsibilities**
- Deploy and manage cloud technologies using Infrastructure as Code (IaC), self-healing mechanisms, and automated security strategies
- Develop telemetry, alerting, and response systems to minimize Mean Time to Recovery (MTTR)
- Collaborate with internal and cross-functional teams to provide technical leadership and deliver high-quality solutions
- Advise on best practices and create tools to promote the adoption of service reliability methods, including sustainable incident management and blameless...