SGLang

SGLang

9 Photos and videos

Tweets

Pinned Tweet

SGLang

@sgl_project

Jun 9

👋 @sgl_project is back! Welcome to the home of the SGLang community! While @lmsysorg keeps you posted on technical drops and partner news, this space is for you! Here's what we've got lined up: 🚀 Version releases: every new SGLang drop, unpacked 🎙️ Office Hours: deep dives, live deployments, and team Q&A 📺 Tutorials: short how-tos and the best Office Hour moments 🌟 Community spotlights: the cool stuff you're building with SGLang 📅 Event updates: meetups, workshops, and where to catch us next And we'd love to hear from you! What do you want to see? Benchmarks? Model deep dives? A topic for the next Office Hour? Drop it below 👇 Every idea gets read!

11,477

SGLang

SGLang

@sgl_project

Jun 13

🎉 SGLang v0.5.13 is out! First, new model support! Nemotron 3 Ultra, Step-3.7-Flash, Command A , plus new diffusion models: Cosmos3, FLUX.2-Klein, Ideogram 4, LingBot-World, SANA-WM, and Ernie-Image. Here are the highlights for this release: - Speculative Decoding V2 is now the default! Tree drafting (topk>1) for faster generation - Breakable CUDA Graphs now make prefill faster - Qwen 3.5 runs faster on NVIDIA Blackwell with new GDN kernels - HiCache with UnifiedTree on by default for hybrid SWA/Mamba models - SGLang-Diffusion now supports realtime generation! Plus progressive resolution - Multiple performance and feature updates for DeepSeek V4 Thanks to our amazing partners and model makers: @NVIDIAAI @AMD @intel @awscloud @boson_ai @cohere @bfl_ml @ideogram_ai @deepseek_ai @Kimi_Moonshot @Alibaba_Qwen @StepFun_ai @Baidu_Inc @robbyant_brain

18,632

SGLang

SGLang

@sgl_project

Jun 13

Check the full notes: github.com/sgl-project/sglan…

Release v0.5.13 · sgl-project/sglang

Highlights New Model Support: Autoregressive: Nemotron 3 Ultra (Day-0, blog), Step-3.7-Flash, Command A Diffusion: Cosmos3, LingBot-World, SANA-WM, Ernie-Image, FLUX.2-Klein 4B/9B, Ideogram 4 Sp...

github.com

656

LMSYS Org

SGLang retweeted

LMSYS Org

@lmsysorg

Jun 12

🚀New record on GB300 NVL72: SGLang exceeds 12K tok/s per GPU on DeepSeek V4 Pro 1.6T (FP4, 8K/1K), orchestrated with NVIDIA Dynamo (SGLang) and MTP. Per @SemiAnalysis_ InferenceX benchmarks, performance stays strong across the entire interactivity curve. More to come with @NVIDIAAIInfra 🤝

120

30,773

LMSYS Org

SGLang retweeted

LMSYS Org

@lmsysorg

Jun 12

SGLang Office Hour 6/11 Higgs Audio V3 TTS x.com/i/broadcasts/1nKOLLdYN…

LMSYS Org

SGLang Office Hour 6/11 Higgs Audio V3 TTS

697

SGLang

SGLang

@sgl_project

Jun 11

Honor to co-host with @CrusoeAI!

Crusoe

@CrusoeAI

Jun 11

We joined @HOFCapital, @CloudflareDev @ArklexAI to host SGLang’s #NYTechWeek Happy Hour Lightning Talk exploring AI inference in finance. Our AI Research Manager Omri Berkovitch spoke on efficiently running large-scale inference in production. 📸 #AIinference #FinanceAI

472

Byron Hsu

SGLang retweeted

Byron Hsu

@hsu_byron

Jun 11

An absolutely impressive contribution to SGLang (@lmsysorg @sgl_project) by @BrianCChao, providing a 2× speedup with no loss in quality! In my opinion, there are three levels of contributions one can make to SGLang. The first is fixing bugs or extending existing features, such as fused kernels. The second is adding a new feature based on an existing paper, such as PD disaggregation. The third is inventing a new algorithm and integrating it into SGLang. This work clearly falls into the third category, and I’m excited to hear feedback from the community! Thanks a lot to the review from @mick_qian and supports from @ying11231 @BanghuaZ @lm_zheng

Brian Chao

@BrianCChao

Jun 11

Aside from the official code release, I am thrilled to share that Spectral Progressive Diffusion a.k.a. SPEED (arxiv.org/abs/2605.18736) is now integrated into @lmsysorg's SGLang (@sgl_project)! 🚀 Instead of always running diffusion at full resolution, SPEED progressively grows resolution across denoising steps, drastically cutting token count and achieving >2× speedup with no quality loss. SPEED is now supported in SGLang for FLUX.1 & 2, Z-Image, Qwen-Image, and Wan. Support for Ideogram 4 incoming. Try it out now: docs.sglang.io/docs/sglang-d…. [1/4]

11,413

SGLang

SGLang

@sgl_project

Jun 11

Hot take: most of your denoising steps don't need full resolution👌 SPEED proves it with progressive resolution growth during denoising, up to 2× speedup, and quality untouched. Now shipping in SGLang Diffusion: FLUX.1 & 2, Z-Image, Qwen-Image, Wan. Ideogram 4 next Congrats @BrianCChao @howard_xhc @YarivLior @GordonWetzstein 👏 and thanks to the support from @hsu_byron and @mick_qian Don't forget to check out this doc: docs.sglang.io/docs/sglang-d…

Progressive Resolution Generation - SGLang Documentation

Experimental spectral progressive resolution growing for selected SGLang Diffusion pipelines.

docs.sglang.io

Brian Chao

@BrianCChao

Jun 11

4,559

SGLang

SGLang

@sgl_project

Jun 10

📣 SGLang Office Hour: How to Build Fast, Modern Voice Cloning with @boson_ai Our Office Hour is back! This week we're joined by Ke and Huapeng from the Boson AI team to talk about Higgs Audio V3 TTS, how to build fast, modern voice cloning that actually sounds human, and their work optimizing SGLang-Omni for TTS production serving. Bring your questions for Ke and Huapeng, see you tomorrow!

2,442

SGLang

SGLang

@sgl_project

Jun 10

Higgs Audio V3 TTS: How to Build Fast, Modern Voice Cloning · Luma

📣 Higgs Audio V3 TTS: How to Build Fast, Modern Voice Cloning SGLang Office Hours is back! This week we're joined by Ke and Huapeng from the Boson AI team to…

luma.com

360

SGLang

SGLang

@sgl_project

Jun 10

Heard about Speculative Decoding? Come hear how Speculative Speculative Decoding (SSD) pushes inference speed to a new ceiling. RSVP 👇

Delta Institute

@DeltaInstitutes

Jun 9

We're hosting a research talk lunch event with @TanishqKumar07 in SF! Come learn about SSD, which is up to 2x faster than the best inference engines in the world, RSVP below!

3,673

SGLang

SGLang

@sgl_project

Jun 9

→ Michelle Chen (@CloudflareDev) on AI platforms for agents

6,364

SGLang

SGLang

@sgl_project

Jun 9

→ Prof. Zhou Yu (Columbia / @ArklexAI) on agent evaluation in production

395

SGLang

SGLang

@sgl_project

Jun 9

→ Mao Cheng (@radixark) on Miles & RL

457

SGLang

SGLang

@sgl_project

Jun 9

→ Omri Berkovitch (@CrusoeDev) on inference infra at scale

349

SGLang

SGLang

@sgl_project

Jun 9

🏙️ SGLang NY Tech Week Happy Hour Recap Last Wednesday, SGLang hosted a NY Tech Week Happy Hour in NYC, co-hosted with @HOFCapital, @Cloudflare, @CrusoeAI, and @ArklexAI. 380 registered, 200 in the room, and one unforgettable night. 🧡 The room was packed with engineers, researchers, and enthusiasts from quant funds, banks, and trading firms, all there to talk about one thing: where inference is headed as LLMs move into latency-sensitive production across trading, research, compliance, and risk. NYC, you showed up and brought the energy. We loved every minute. Until next time! ☀️ #NYTechWeek @Techweek_

3,610

SGLang

SGLang

@sgl_project

Jun 9

Huge thanks to our lightning talk speakers → Liangsheng Yin (@radixark) on SGLang for inference in finance

593

LMSYS Org

SGLang retweeted

LMSYS Org

@lmsysorg

Jun 9

🙌The community account @sgl_project is back! If you're building with SGLang, this is your space. Follow for releases, Office Hours, tutorials & events👇

SGLang

@sgl_project

Jun 9

4,437

SGLang

SGLang

@sgl_project

Jun 8

Sleeper hit no more 😄 304M downloads and 744% in 3 months — humbled to see SGLang growing alongside this incredible community. Thank you all for building with us!

ClickHouse

@ClickHouseDB

Jun 8

PyPI May 2026: 163.8 billion downloads ( 9.2% MoM). The AI agent race is now measurable in download counts: ⚡ sglang: 304M ( 744% in 3 months), @lsyincs @sgl_project - the sleeper hit 🔀 litellm: 513M ( 425% in 3 months), @krrish_dh @ishaan_jaff @LiteLLM 📐 pydantic-ai-slim: 268M ( 85% MoM), @samuelcolvin @pydantic 🌐 langchain-protocol: 37M ( 524% in one month), @hwchase17 @LangChain 🕷️ browser-use: 24M ( 132% MoM), @mamagnus00 @gregpr07 @browser_use 🤝 crewai: 14.5M ( 108% MoM), @joaomdmoura @crewAIInc boto3 still #1 at 3.3B. AWS wins the boring prize. Full data: clickhou.se/43USFbm

2,675

SGLang

SGLang

@sgl_project

Jun 8

🎉 Congrats on the launch! Exciting to see the agent infra space leveling up.

GMI Cloud

@gmi_cloud

Jun 8

Today, we are launching GMI Agent Box. A complete infrastructure stack for production-ready AI agents: native Docker, flexible deployment, 200 models under one API key, dedicated compute across regions, and a marketplace for distribution. Available now.

1:31

1,041