Hrittik Roy
Meetup · February 28, 2026 · Jaipur, India

Stop the GPU Madness! Making LLM Inference Actually Efficient on K8s

AWS User Group Jaipur

LLM · Kubernetes · GPU · Inference · AWS

Abstract

Presented at AWS User Group Jaipur in the main auditorium, RIC Jaipur. A meetup talk on running LLM inference workloads on Kubernetes without burning through GPU budgets.

Resources

  • Event page on awsugjaipur.in

More Talks

  • Meetup

    Breaking Down Inference Optimization: The Three Different Layers

    CNCG Colombo · Colombo, Sri Lanka

  • Conference

    Help! My LLM is a Resource Hog: How We Tamed Inference with Kubernetes and Open Source Muscle

    KubeCon + CloudNativeCon North America 2025 · Atlanta, USA

  • Conference

    Conformance for Inference: How We Reduced Bad Deploys on a GPU Platform

    KubeCon + CloudNativeCon Japan 2026 · Tokyo, Japan

  • Conference

    Squeezing Every Millisecond: A Practical Guide to Optimizing Time To First Token with OSS Muscle

    Open Source Summit Korea 2026 · Seoul, South Korea
