Meetup · Jaipur, India
Stop the GPU Madness! Making LLM Inference Actually Efficient on K8s
AWS User Group Jaipur
LLM · Kubernetes · GPU · Inference · AWS
Abstract
A meetup talk at AWS User Group Jaipur (main auditorium, RIC Jaipur) on running LLM inference workloads on Kubernetes without burning through GPU budgets.
More Talks
- Meetup
Breaking Down Inference Optimization: The Three Different Layers
CNCG Colombo · Colombo, Sri Lanka
- Conference
Help! My LLM is a Resource Hog: How We Tamed Inference with Kubernetes and Open Source Muscle
KubeCon + CloudNativeCon North America 2025 · Atlanta, USA
- Conference
Conformance for Inference: How We Reduced Bad Deploys on a GPU Platform
KubeCon + CloudNativeCon Japan 2026 · Tokyo, Japan
- Meetup
Identity Propagation in MCP: OBO, Multi-Hop Chains, and the Trust Problem
The AI Infrastructure Meetup: BLR · Bengaluru, India