Vincent Sitzmann

Tweets

Pinned Tweet

Vincent Sitzmann

@vincesitzmann

Jun 8

Introducing MilliVid, our new method for long-context video generation! MilliVid creates videos that are consistent over long time spans, without using retrieval heuristics or 3D maps! (1/n) davidcharatan.com/millivid/#

Vincent Sitzmann

@vincesitzmann

Jun 11

An fantastic scenario analysis of what future is in store for Europe if it doesn't change its approach to AI, taking it seriously as the transformative technology that it is. Beyond LLMs, we are about to witness a revolution in robotics and automation - we aren't close to humanoid robots in homes yet, but much of the classical know-how of robot automation that was so essential to build Europe's industrial base will be disrupted by a new kind of robot workflow that, as of today, is dominated by American companies. Just as with LLMs before, this kind of robot automation will be dependent on compute as the key resource - if Europe does not (1) get compute and (2) do everything in its power to foster AI research & development in Europe - it will be relegated to *buying the full stack of automation - robot hardware and the software that controls them* - from the US and China.

Judith Dada @DadaJudith

Jun 11

Most of Europe has not yet absorbed what AI is about to do to us. The few who have are not saying it loudly enough. We wrote Europe 2031: a five-year scenario of the continent's slide into irrelevance, how AI is driving it, and what can still be done to change course.

Vincent Sitzmann

@vincesitzmann

Jun 11

Yikes, *A* fantastic example, but of course X won't let me edit 🥲

Vincent Sitzmann

@vincesitzmann

Jun 11

Yet, at the same time, my key criticism of this report is that it focuses only on the short term. Europe is in the position it is in today b/c it has never seriously built a research -> R&D -> product pipeline. The US has numerous mechanisms for seeding research topics that eventually emerge as products *decades* later: Consider the DARPA Grand Challenge for self-driving, which, more than *20 years ago*, planted the seed of autonomous driving research. A decade later, the US had an ecosystem of talent and experience to rely on when this technology started to become ready for prime time, and today, we have Waymo. Further, the US is based on *concentration* of talent and capital, something that is very much antithetical to Europe, but also to my home country Germany, which - due to an ancient political tradition - chooses to distribute resources across the whole country, where the US fosters hubs such as MIT & Harvard or the Bay Area to ensure the critical mass of R&D that is needed to actually get ambitious projects off the ground. I believe that the German way is better for society *if there are enough resources to go around*, which unfortunately is no longer the case in Europe. In my opinion, for Europe to have a shot would require investing in a single university and a surrounding industry and startup ecosystem that concentrates talent and resources in a desirable location. There will be another revolution *after* this current "AI" wave has blown over, 10-15 years from now. But if Europe doesn't change course, we will miss that one, too.

Vincent Sitzmann

@vincesitzmann

Jun 8

Also, shoutout to some related / relevant work: Of course, FramePack by Lvmin Zhang! Then, inspiring work on flexible tokenization by folks such as @ShivamDuggal4, Roman Bachmann, @JRAllardice, David Mizrahi, @andrew_atanov, @_xwen_, @BingchenZhao, some of it in @zamir_ar's lab!

Vincent Sitzmann

@vincesitzmann

Jun 9

Arxiv link here: arxiv.org/abs/2606.09056

MilliVid: Hierarchical Latents for Long-Range Consistency in Video...

Video generative models have become increasingly powerful, but long-range consistency remains challenging to achieve because even a few dozen frames require impractically long transformer sequence...

Vincent Sitzmann

@vincesitzmann

Jun 9

ArXiv paper now available here: arxiv.org/abs/2606.09056

Vincent Sitzmann

@vincesitzmann

Jun 7

A really cool idea! The question of how we can train sequence models such that they remember things that are T timesteps in the past without backpropping through T timesteps remains one of the core problems in ML, and this looks like an inspiring approach!

Akarsh Kumar

@akarshkumar0101

Jun 7

We never really knew how to train nonlinear RNNs well… BPTT struggled with vanishing grads (no long-range memory) and sequential rollout (hard to parallelize). What if instead an oracle told us the optimal memory state m_t at each step? Then the RNN could do one-step supervised learning on (m_t, x_{t+1}) → m_{t+1} labels. We call this Supervised Memory Training (SMT): a replacement for BPTT that trains RNNs without unrolling them. SMT is time-parallelizable and solves vanishing gradients. Website: akarshkumar.com/smt/ arXiv: arxiv.org/abs/2606.06479
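To make the one-step idea above concrete, here is a toy sketch in NumPy. It is not the SMT implementation: the post does not say how the oracle memory states are obtained, so this example assumes they are simply handed to us, and fakes them by rolling out a known tanh cell. The names `make_pairs`, `cell`, and `train_one_step` are hypothetical and chosen only for illustration. What it shows is that, once every m_t is given, training reduces to ordinary regression over all timesteps at once, with no unrolling.

```python
import numpy as np


def make_pairs(memory, inputs):
    """Build one-step pairs ((m_t, x_{t+1}), m_{t+1}) for every t at once.

    memory: array of shape (T + 1, d_m) holding m_0 .. m_T (assumed given).
    inputs: array of shape (T, d_x) holding x_1 .. x_T.
    """
    features = np.concatenate([memory[:-1], inputs], axis=1)
    targets = memory[1:]
    return features, targets


def cell(weights, features):
    """A single tanh RNN cell applied to a batch of (m_t, x_{t+1}) rows."""
    return np.tanh(features @ weights)


def train_one_step(features, targets, steps=500, lr=0.5, seed=0):
    """Fit the cell by gradient descent on mean squared error.

    Every timestep is an independent training example, so the gradient
    never flows through the recurrence.
    """
    rng = np.random.default_rng(seed)
    weights = 0.1 * rng.standard_normal((features.shape[1], targets.shape[1]))
    losses = []
    for _ in range(steps):
        pred = cell(weights, features)
        err = pred - targets
        losses.append(float(np.mean(err ** 2)))
        # Chain rule through tanh only: d tanh(z) / dz = 1 - tanh(z)^2.
        grad = features.T @ (err * (1.0 - pred ** 2)) * (2.0 / err.size)
        weights -= lr * grad
    return weights, losses


# Stand-in "oracle" (an assumption of this sketch): memory states produced
# by rolling out a known tanh cell on random inputs.
rng = np.random.default_rng(1)
T, d_m, d_x = 200, 4, 3
true_w = 0.5 * rng.standard_normal((d_m + d_x, d_m))
inputs = rng.standard_normal((T, d_x))
memory = np.zeros((T + 1, d_m))
for t in range(T):
    memory[t + 1] = np.tanh(np.concatenate([memory[t], inputs[t]]) @ true_w)

features, targets = make_pairs(memory, inputs)
weights, losses = train_one_step(features, targets)
print(f"MSE before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

The only sequential loop here is the one that fabricates the stand-in oracle; the training step itself treats all T pairs as one batch, which is the sense in which one-step supervision is time-parallelizable.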

Vincent Sitzmann

@vincesitzmann

Jun 7

My students @RyuHyunwoooo and @evnkimm are presenting their paper “Scaling View Synthesis Transformers” today at 11:45 am at poster session 5. They are also brilliant and can chat about lots of things in the broader embodied intelligence landscape. Come by!!

Vincent Sitzmann

@vincesitzmann

Jun 7

I am incredibly grateful to be awarded the PAMI Young Researcher award. CVPR this year was amazing fun; I am very excited to be part of this community and feel honored by this vote of support for my students' and my work :) These are exciting times and I can't wait for next year!

Dima Damen @CVPR @dimadamen

Jun 6

Congratulations @vincesitzmann for winning the outstanding Young researcher @CVPR #CVPR2026 PAMI-TC awards! I’m sure many more awards to come,

Vincent Sitzmann

@vincesitzmann

Jun 7

Thanks a lot, Dima, I am so grateful to be in a research community with folks like you around!

Vincent Sitzmann

@vincesitzmann

Jun 4

Many of my students / collaborators are at CVPR - find them & chat!
@ottogin1, diffusion models
@ericmchen1, latent actions & robotics
@RyuHyunwoooo, latent actions & robotics
@ekim2339, robotics, view synthesis
@SimulatedAnneal, robotics
@twmitchel, latent actions & video

Vincent Sitzmann

@vincesitzmann

Jun 4

Oh no I tagged the wrong Evan! This is the right one: @evnkimm sorry to both Evans!!

Vincent Sitzmann retweeted

Mason Kamb @MasonKamb

Jun 3

If you’re @CVPR: come by our tutorial tomorrow, June 4th 8:30-5:00, on Analytic Understanding of Diffusion Models. We’ll be covering how and why diffusion models generalize, learning about state-of-the-art analytical theories for their behavior, and covering key open questions.
