Job Description
We are looking for a Cluster Test Engineer to join the AMAX solution engineering group for computing solution development, validation, deployment, and support. This is an extremely hands-on role. With hardware and software expertise, you will explore and address technical challenges and ensure the success of the solution launch and deployment.
Primary duties include:
- Install, monitor, and fine-tune GPUs, job schedulers, and clusters.
- Define and implement AI and HPC tests and validation for data center solutions.
- Investigate and introduce new test technology and methodology to improve solution robustness.
- Debug complex hardware and software issues during solution bring-up phases.
- Contribute to diagnostic generation and validation infrastructure.
- Support customers with compute infrastructure build-up and with workload implementation and validation.
Requirement...