NICE Talk 141🌟invites Ph.D. at Georgia Tech Hao Kang @GT_HaoKang to discuss ThunderAgent: 4× Faster LLM Agent Inference!

Time
⏰ PST 3.07 18:00–19:00
⏰ EST 3.07 21:00–22:00
⏰ Beijing 3.08 10:00–11:00

Watch live: youtube.com/live/kHw6LZsXcH0
Register: luma.com/ezn8ho93

In this talk, the speaker will talk about:
🚀 How can we make LLM agent workflows faster, simpler, and more robust?
❌ Traditional request-level engines (vLLM, SGLang) struggle with KV cache thrashing, memory imbalance, and resource leaks.
✅ ThunderAgent introduces Program Abstraction, treating multi-step agent workflows as programs, unifying GPU, CPU, and remote tool scheduling.

With just two lines of code, ThunderAgent boosts inference throughput by 1.5–3.6×, rollout throughput by 1.8–3.9×, and saves 4.2× disk space, while ensuring high concurrency stability.

Join us to explore a principled, program-level approach to distributed agent inference and RL rollouts.

#AI #LLM #AgenticAI #ReinforcementLearning #DistributedSystems #ProgramAbstraction #ThunderAgent