Research Scientist @GoogleAI #GoogleResearch. Adjunct Faculty @CarnegieMellon.

Joined October 2010
8 Photos and videos
Lu Jiang retweeted
How do we generate videos on the scale of minutes, without drifting or forgetting about the historical context? We introduce Mixture of Contexts. Every minute-long video below is the direct output of our model in a single pass, with no post-processing, stitching, or editing. 1/4
22
98
607
157,128
Lu Jiang retweeted
14 Apr 2025
Glad to share Seaweed-7B, a cost-effective foundation model for video generation. Our tech report highlights the key designs that significantly improve compute efficiency and performance given limited resources, achieving comparable quality against other industry-level models. To unleash the power of the foundation model, Seaweed-7B further enables a wide range of downstream applications including image-to-video generation, human video generation, subject-consistent video generation, video-audio joint generation, long video generation and storytelling, real-time generation, super-resolution generation, camera controlled generation. Check out our webpage and report for more details: Webpage: seaweed.video/ Paper: seaweed.video/seaweed.pdf It's a wonderful journey of the last year. Thanks to all teammates for their contributions, sincerely.
34
96
515
77,430
Lu Jiang retweeted
28 Mar 2025
Synthetic Video Enhances Physical Fidelity in Video Synthesis A turtle swimming in a green background. video matting illustration
4
14
100
16,628
Lu Jiang retweeted
14 Mar 2025
We propose Long Context Tuning (LCT) for scene-level video generation to bridge the gap between current single-shot generation and real-world narrative video productions. Homepage: guoyww.github.io/projects/lo… Report: arxiv.org/abs/2503.10589
4
23
103
46,814
Lu Jiang retweeted
13 Jan 2025
Want the deep dive? • arXiv: arxiv.org/abs/2501.06173 • Project Page: videoauteur.github.io See how VideoAuteur CookGen are shaping long narrative video generation. Big shout out to my co-authors and advisors: @fncheng2333 @liangkegui @YuilleAlan @roadjiang

1
2
494
Lu Jiang retweeted
15 Jan 2025
Seaweed APT Diffusion Adversarial Post-Training for One-Step Video Generation Existing diffusion and autoregressive generative models require repeated neural network evaluations. It is extremely slow for the high-resolution video generation task, as a few-second video can take many minutes to generate. Our work is the first to demonstrate the generation of an entire video using a single neural function evaluation (1NFE) by using our proposed adversarial post-training technique. Our model generates 2 seconds of 1280x720 24fps videos in real-time. We showcase some of the results below:
9
34
203
21,199
23 Dec 2023
Interesting comparison between our VideoPoet and other competitive models. The comparison is incredibly helpful and reinforces my belief that VideoPoet excels in generating larger motions. We know the exact reasons for this and are working on improving single frame quality.
22 Dec 2023
Google VideoPoet, Runway, Pika & Genmo Google recently announced Video Poet. Google's VideoPoet is a large language model (LLM) that is capable of a wide variety of video generation tasks, including: - text-to-video - image-to-video - video stylization - video inpainting and outpainting - video-to-audio. I tried some of their text-to-image prompts (from their demo) in Pika, Runway and Genmo. Here are the results: 10 examples 1/10 Two teddy bears holding hands, walking down rainy 5th avenue.
6
1,286
11 Dec 2023
Excited to be at #NeurIPS2023 this week! Can't wait to reconnect with colleagues and make new connections. If you're up for a coffee chat, feel free to reach out. Find me at our spotlight/posters. arxiv.org/abs/2306.17842 Tue 12 5:15 p.m. arxiv.org/abs/2306.00983 Wed 13 10:45 a.m
1
5
904
Lu Jiang retweeted
We introduce W.A.L.T, a diffusion model for photorealistic video generation. Our model is a transformer trained on image and video generation in a shared latent space. 🧵👇
49
247
1,233
431,121
20 Nov 2023
😲While preparing the meta-review for #aaai24, I stumbled upon a new form of parallelism. It wasn't about the paper's concepts, but rather in the review comments, where two reviewers listed identical comments, word for word, over 200 matching words. #PeerReview #AIResearch

ALT Shocked Joey GIF

1
4
853
2 Aug 2023
📢 Call for Papers! International Journal of Computer Vision (IJCV) invites submissions for its special issue on "Generative Models for Content Creation and Manipulation." 🗓️ Manuscript Submission Deadline: February 28, 2024 🔗 Check it out here: springer.com/journal/11263/u…

1
4
534
4 Jul 2023
Fascinating research by Google reveals the power of Language Models (LLMs) like PaLM or GPT in tackling visual tasks using in-context learning. This novel method enables LLMs to perform image generation tasks without requiring any parameter updates. #palm #GPT4 #LLMs
2
67
249
149,805
5 Jul 2023
Images can be described using multilingual words and utilized to reconstruct the image with varying levels of fidelity.
3
1
20
2,595
5 Jul 2023
Paper link: huggingface.co/papers/2306.1… Can GPT solve visual tasks by in-context learning? This paper shows this seems plausible as long as we can translate the image (or other non-linguisitic modality) into a language that the LLM can comprehend.
1
10
67
14,241
21 Feb 2023
Which model do you think is more responsible in generating images with genders: #dalle2 or #stablediffusion? Our new paper on Gender Presentation Differences in Text-to-Image Models compares these models!" Check out our webpage: salt-nlp.github.io/GEP/ #AIart #generativeart

2
2
3
555
21 Feb 2023
1
345
21 Feb 2023
Text-to-image models can generate images based on text, but we don't fully understand how they see gender. Our work proposes a new way to study this. We also propose a metric called GEP to estimate differences automatically. #GenderPresentationDifferences #TextToImage #AI
1
1
277