Brian Chao

Brian Chao

100 Photos and videos

Tweets

Gordon Wetzstein retweeted

Brian Chao

@BrianCChao

Jun 11

Aside from the official code release, I am thrilled to share that Spectral Progressive Diffusion a.k.a. SPEED (arxiv.org/abs/2605.18736) is now integrated into @lmsysorg's SGLang (@sgl_project)! 🚀 Instead of always running diffusion at full resolution, SPEED progressively grows resolution across denoising steps, drastically cutting token count and achieving >2× speedup with no quality loss. SPEED is now supported in SGLang for FLUX.1 & 2, Z-Image, Qwen-Image, and Wan. Support for Ideogram 4 incoming. Try it out now: docs.sglang.io/docs/sglang-d…. [1/4]

Howard Xiao

@howard_xhc

Jun 11

Today we release the code and a demo for our recent Spectral Progressive Diffusion paper🎉 Play around with it anytime! Just as what we have been doing also, we hope that it encourages the integration of our plug-and-play framework into latest and greatest image and video generation models!! We also included an agent skill wrapper in the repo, to making things easier. 🛜Project website: howardxiao.ca/speed/ 📄Paper: arxiv.org/abs/2605.18736 💻Github (with ComfyUI): github.com/howardhx/speed 🤗Demo (HuggingFace): huggingface.co/spaces/howard…

0:31

22,446

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

Jun 11

Speed up your diffusion or flow matching model using our plug & play SPEED module! No distillation necessary.

Howard Xiao

@howard_xhc

Jun 11

0:31

2,095

Brian Chao

Gordon Wetzstein retweeted

Brian Chao

@BrianCChao

Jun 9

Excited to release the code and model weights for our recent paper "Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation"! We hope this inspires continued research on mixed-resolution diffusion and efficient token allocation for image and video generation. 📰 Paper: arxiv.org/abs/2603.23491 💻 Code: github.com/bchao1/foveated_d… 🌐 Website: bchao1.github.io/foveated-di… 🤗 Weights: huggingface.co/bchao1/foveat… 🚀 Demo: huggingface.co/spaces/bchao1… We put together a fun demo on Hugging Face Spaces — try it out here: huggingface.co/spaces/bchao1…. Sketch your own tokenization layouts and pick model variants to run mixed-resolution diffusion live!

0:14

5,321

Brian Chao

Gordon Wetzstein retweeted

Brian Chao

@BrianCChao

Jun 5

I integrated our recent work, Spectral Progressive Diffusion a.k.a. SPEED (arxiv.org/abs/2605.18736), with the open-source @ideogram_ai model released yesterday. Turns out our method works out of the box and speeds up inference by up to 1.6× while preserving the high image quality! [1/4]

132

11,130

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

Jun 2

🔍 Ill-posed inverse problems (deblurring, inpainting, phase retrieval) need strong priors, and diffusion models are great ones. But how you use them matters. Posterior samplers fold the prior & data term into the reverse diffusion, needing costly score backprop or inner MCMC. MAP variable-splitting methods decouple them, but lack a dual variable to fix constraint violations. DDiff tackles both. At #CVPR2026 🧵👇

132

8,514

more replies

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

Jun 2

In short, DDiff is an ADMM-inspired diffusion solver that delivers: ✅ Higher image quality across metrics ✅ Robustness to high measurement noise ✅ Better measurement fidelity / lower residual ✅ Faster sampling ✅ Latent-diffusion compatibility ✅ Provable convergence under mild assumptions, a first for diffusion-based posterior optimization

806

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

Jun 2

Work led by @SoniaMinseoKim, together with @axlevy0. 📄 Paper: arxiv.org/abs/2505.17353 💻 Code: github.com/SoniaMinseoKim/dd… 🌐 Project page: soniaminseokim.github.io/ddi…

Dual Ascent Diffusion for Inverse Problems

Ill-posed inverse problems are fundamental in many domains, ranging from astrophysics to medical imaging. Emerging diffusion models provide a powerful prior for solving these problems. Existing...

arxiv.org

712

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

Jun 2

The era of ultra-high-resolution imaging has arrived. Modern image sensors exceeding 200 MP resolution are common in smartphones, with over 400 MP sensors under development. However, the large number of pixels poses significant challenges for acquisition and processing, especially on edge devices. Which pixels should be acquired, and when, for bandwidth-efficient imaging and perception? We introduce Policy-based Foveated Imaging and Perception, an on-device, real-time, predictive, and task-aware framework that dynamically allocates sensor resolution to prioritize important regions under specific perception objectives. This paper will be presented at #SIGGRAPH2026! [1/6]

0:13

128

17,976

more replies

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

Jun 2

We further demonstrate that our framework works in real-world settings on a 200 MP Samsung ISOCELL HP2 prototype. For both object tracking and scene text recognition tasks, our sensor attention policy reproduces the smooth pursuit and saccadic scanpaths seen in simulations, achieving 200 MP bandwidth-efficient sensing under real-world conditions with sensor noise. [5/6]

0:17

731

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

Jun 2

Work led by @howard_xhc, together with @jan_on_x and @boyang_deng. 📄 Paper: arxiv.org/abs/2606.02565 🌐 Project page: howardxiao.ca/foveated/ Please check out our paper Fast Forward on July 19 and our paper presentation on July 23 at 2:50 PM at #SIGGRAPH2026!

569

Hansheng Chen

Gordon Wetzstein retweeted

Hansheng Chen @HanshengCh

May 22

Pixel diffusions are getting hot🔥, but are they really good at texture details? Here's a one-shot comparison (no cherry picking) - HiDream: plain x0 prediction - AsymFLUX.2: plain AsymFlow prediction - L2P: velocity prediction through detailer head

113

9,224

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

May 20

Gaussian splatting models the structure of the observable parts of a 3D scene well, but what about the unobserved part? Video generation models should generate this!

Liyuan Zhu

@liyuan_zz

May 20

🚀Tired of floaters, flickering, and blur in 3DGS? We introduce a geometry-informed video generator that refines 3DGS renderings in the wild. 🎥✨ We let the video model actually "see" the rendering process using a Gaussian Primitive buffer. #CVPR2026 @CVPR Project page: research.zhuliyuan.net/proje… 🔥 Highlights: ✅ Geometry-Buffer-conditioned video generator ✅ Refines optimization-based & feed-forward 3DGS ✅ Novel artifact simulation pipeline ✅ Highly efficient bidirectional processing ⚡ Thread 👇

0:31

5,341

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

May 19

High-fidelity generation is hitting a scaling crisis as DiT compute grows with image resolution and video length. But do we need high-resolution denoising at every step? We introduce Spectral Progressive Diffusion, a plug-and-play framework for efficient image and video generation that directly exploits the spectral autoregression property of diffusion to grow resolution during denoising. [1/7]

0:09

415

87,568

more replies

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

May 19

The spectral-domain perspective of our approach further enables a new editing capability: keeping low-frequency structure and modifying high-frequency details. This allows for efficient texture and style edits while better preserving geometry compared to SDEdit-style editing. [6/7]

0:11

2,804

Gordon Wetzstein

Gordon Wetzstein

@GordonWetzstein

May 19

Work led by @howard_xhc, together with @BrianCChao and @YarivLior. 📄 Paper: arxiv.org/abs/2605.18736 🌐 Project page: howardxiao.ca/speed/

2,642