Terminal Bench Expert – AI Evaluation & Debugging
Highbrow Technology Inc
📍 Multan, Pakistan
Job Description
We're Hiring: Terminal Bench Expert (Contract | Remote)
Contribute to cutting-edge AI evaluation systems by designing real-world benchmark tasks that challenge advanced AI agents. If you're passionate about debugging, system analysis, infrastructure, data pipelines, or AI evaluation, this opportunity is for you.
Position Details
Role: Terminal Bench Expert
Employment Type: Contractor Assignment
Duration: 5 Weeks
Location: Remote (India, Bangladesh, Brazil, Colombia, Egypt, Ghana, Indonesia, Kenya, Nigeria, Pakistan, Turkey, Vietnam)
Experience: 3–10 Years
Start Date: Immediate
Commitment: Full-time (40 hrs/week)
What You'll Do
✔ Design and develop realistic AI benchmark tasks
✔ Create debugging, investigation, and system failure scenarios
✔ Define evaluation criteria and validation logic
✔ Document solutions and technical workflows
✔ Collaborate with reviewers to improve task quality and difficulty
Ideal Candidate
✅ Strong softwa...