AI Runtime and MLOps Engineer

Confidential

📍 Bengaluru, Karnataka, India

Full-time | Computer Occupations

Job Description

JD - ML Platform (AI Runtime & MLOps Stack)

Team: AI Platform Engineering

About the AI Platform

We are building a next-generation AI platform to power intelligent, AI-driven experiences across our global marketplace. Our platform supports the full lifecycle of large-scale foundation models, from distributed pretraining on high-performance GPU clusters to high-throughput production inference, enabling commerce intelligence for hundreds of millions of users worldwide.

We focus on building state-of-the-art AI runtime infrastructure that uses vLLM and TensorRT-LLM as pluggable inference engines behind a standardized AI runtime layer, alongside Megatron-LM and DeepSpeed for distributed training. This infrastructure is integrated with provisioned throughput management, a distributed KV cache, prefill/decode disaggregation, and a robust MLOps stack spanning experiment management, fine-tuning automation, and pro...
