Job Description
Tecsys is seeking a Site Reliability Engineer to enhance system reliability and performance in a remote setting. Your role will focus on maintaining our cloud infrastructure via automation and incident management.
As part of our Network and Security Operations Center (NOC), you will ensure optimal performance of mission-critical systems running on AWS and Kubernetes. This position requires a blend of technical expertise and operational leadership, allowing you to play a crucial role in reliability engineering. Collaborating with multiple teams, you will drive initiatives that promote innovation and development within our platforms.
Key Responsibilities:
• Collaborate on service planning before launch
• Identify system pain points and drive creative solutions
• Measure availability and overall system health post-deployment
• Manage observability and monitoring tools like Datadog
• Automate processes through Terraform and CI/CD pipe...