Sven Elflein

Tweets

Sven Elflein @s_elflein

Jun 7

We are presenting today in the afternoon poster session (15:30-17:30) at Poster No. 28! #CVPR2026

Sven Elflein @s_elflein

Feb 27

🚀 Exciting news! We’re introducing VGG-T³: a scalable model for offline feed-forward 3D reconstruction that finally tackles the "quadratic bottleneck." Ever wanted to have VGGT reconstruct a 1,000-image scene in seconds instead of 10 minutes and use it for visual localization?

5,432

Haithem Turki

Sven Elflein retweeted

Haithem Turki @Haithem_Turki

Jun 4

The ArtiFixer code and model weights are now released! Links to both on our project page: research.nvidia.com/labs/sil… Per-scene methods like 3DGS look great on captured views but collapse off-trajectory. We repair those artifacts and beat prior SOTA by 1–3 dB PSNR. 🧵(1/6)

0:08

247

23,443

Despoina Paschalidou

Sven Elflein retweeted

Despoina Paschalidou @paschalidoud_1

Jun 3

It’s been a while since I posted here, but I’m very excited to share what our team at @nvidia has been building over the past year! After a year of active development, we’re getting ready to release SIL-Wheel to the world: a one-stop shop platform for data-centric workflows in large-scale video model training. Built by researchers, for researchers, SIL-Wheel brings together search, curation, annotation, evaluation, and analysis for large video datasets in one centralized framework. Want a sneak peek before the official release? Come by the NeXD26 Workshop @CVPR tomorrow at 10:30!🚀

10,671

Ruilong Li

Sven Elflein retweeted

Ruilong Li @ruilong_li

Jun 3

World models are moving beyond offline generation towards interactive, real-time experiences. Introducing ⚡FlashDreams⚡: an open-source high-performance inference and serving library built for autoregressive world models:
🔥 Up to 3.10× faster LingBot-World inference
🔥 Up to 2.12× faster Self-Forcing inference
🔥 Up to 1.40× faster Wan2.1 inference
🔥 8 integrated models
🔥 Multi-GPU, streaming, low-latency serving
🔥 Agentic skills that teach you how to use it
FlashDreams is designed for a new generation of AI systems that continuously evolve over time while responding to user interactions. It powers applications across robotics, autonomous vehicle simulation, gaming, and virtual worlds.
Github: github.com/NVIDIA/flashdream…
Docs: nvidia.github.io/flashdreams
Research page: research.nvidia.com/labs/sil…
Join the #flashdreams Discord channel at discord.gg/yTdHDqFP
FlashDreams is also the runtime backbone behind NVIDIA OmniDreams (github.com/nv-tlabs/omni-dre…) 1/n
#AI #WorldModels #FastInference #PhysicalAI #OpenSource #NVIDIA

0:47

367

87,430

Ashkan Mirzaei

Sven Elflein retweeted

Ashkan Mirzaei @ashmrz10

Jun 1

I’m excited to share what our team has been building at @NVIDIAAI since I joined: Cosmos 3, an omnimodal world model for Physical AI. Project: research.nvidia.com/labs/cos… HF: huggingface.co/collections/n… Code: github.com/NVIDIA/cosmos

0:19

157

12,250

Nikhil Keetha

Sven Elflein retweeted

Nikhil Keetha @Nik__V__

May 30

Looks like I didn't do a good job of sharing this before but... Yes, you can infer, visualize, compare, train and finetune all the geometry foundation models in the MapAnything codebase‼️

3,978

Sven Elflein

Sven Elflein @s_elflein

May 30

Traditional 3D reconstruction pipelines like COLMAP operate in a loop, growing the scene piece-by-piece. 🧩🔄 With Déjà View, we introduce this inductive bias for feed-forward 3D reconstruction by running a single alternating-attention block in a loop! Awesome work from the team 👇

Tobias Fischer @TobiasFischer11

May 30

Do 3D reconstruction transformers really need a billion parameters, or are most of those layers just doing the same thing over and over? Introducing Déjà View: a single transformer block, looped K times, that matches or beats models 8–10× its size with lower compute. 🧵

0:14

126

10,934
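The "single block, looped K times" idea in the tweet above can be illustrated with a toy sketch. This is not the Déjà View architecture: a small residual map (`x + tanh(Wx)`) stands in for the alternating-attention block, and every name here (`make_block`, `apply_block`, `looped_forward`, `stacked_forward`, `param_count`) is hypothetical. It only shows that looping one block is the same computation as a stack whose layers share weights, while storing a single set of parameters.

```python
import math
import random


def make_block(dim, seed=0):
    """Create weights for one toy block (hypothetical stand-in for a transformer block)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]


def apply_block(weights, x):
    """Residual update x -> x + tanh(W x)."""
    out = []
    for row, xi in zip(weights, x):
        pre = sum(w * v for w, v in zip(row, x))
        out.append(xi + math.tanh(pre))
    return out


def looped_forward(weights, x, k):
    """Apply the same block k times (weights are reused at every step)."""
    for _ in range(k):
        x = apply_block(weights, x)
    return x


def stacked_forward(layers, x):
    """Apply a conventional stack of blocks, one set of weights per layer."""
    for weights in layers:
        x = apply_block(weights, x)
    return x


def param_count(layers):
    """Count stored parameters across a list of distinct blocks."""
    return sum(len(w) * len(w[0]) for w in layers)


shared = make_block(dim=4, seed=0)
x0 = [1.0, -0.5, 0.25, 0.0]
looped = looped_forward(shared, x0, k=8)
tied_stack = stacked_forward([shared] * 8, x0)
distinct_stack = [make_block(dim=4, seed=s) for s in range(8)]
```

In this toy setup the looped model stores 16 parameters while the eight-layer stack with distinct weights stores 128, which is the kind of size gap the tweet refers to; whether quality holds up is an empirical question for the paper, not something this sketch shows.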

Sherwin Bahmani

Sven Elflein retweeted

Sherwin Bahmani @sherwinbahmani

May 27

📢 Stop by our #CVPR2026 workshop next week: GeoFreeNVS: Geometry-Free Novel View Synthesis and Controllable Video Models geofreenvs.github.io/ Especially excited to discuss how large generative models can solve 3D tasks like novel view synthesis!

6,731

Xuanchi Ren

Sven Elflein retweeted

Xuanchi Ren @xuanchi13

May 26

The latent-vs-pixel debate misses the point. GPT Image 2 shows what users notice: pixel-level fidelity. Latent models show what scales: compact semantic structure. We connect them by replacing VAE/RAE decoders with a Pixel Diffusion Decoder. Code and Model available: research.nvidia.com/labs/sil… 🧵(1/N)

0:49

411

668,488

Shuhong Zheng

Sven Elflein retweeted

Shuhong Zheng @zhengshuhong

May 26

Excited to share our work "Good Token Hunting" 🔍 (Yes, the name is inspired by the classic movie "Good Will Hunting" 🎬!), which focuses on accelerating visual geometry transformers 🚀 by limiting the number of keys/values each query can attend to in global attention layers. [1/6]

16,748
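A rough sketch of what "limiting the number of keys/values each query can attend to" can look like follows. The selection rule used here (keep the k highest-scoring keys per query) is an assumption for illustration; the tweet does not say how "Good Token Hunting" picks tokens. The function name `topk_attention` is hypothetical, and this toy still scores every key first, so it shows the restriction itself rather than any speedup.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def topk_attention(queries, keys, values, k):
    """Scaled dot-product attention where each query mixes only its k best keys.

    Assumption: "best" means highest attention score; other rules are possible.
    """
    outputs = []
    for q in queries:
        scores = [dot(q, key) / math.sqrt(len(q)) for key in keys]
        # Indices of the k highest-scoring keys for this query.
        keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
        weights = softmax([scores[i] for i in keep])
        out = [0.0] * len(values[0])
        for w, i in zip(weights, keep):
            for d, v in enumerate(values[i]):
                out[d] += w * v
        outputs.append(out)
    return outputs


queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
only_best = topk_attention(queries, keys, values, k=1)
full = topk_attention(queries, keys, values, k=2)
```

With `k` equal to the number of keys this reduces to ordinary softmax attention; with `k=1` each query simply copies the value of its best-matching key.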

Sven Elflein

Sven Elflein @s_elflein

May 25

We just released code and model! Go check it out! Code: github.com/nv-dvl/vgg-ttt Model: huggingface.co/nvidia/vgg-tt…

[CVPR'26] Official code for the paper "VGG-T³: Offline Feed-Forward 3D Reconstruction at Scale" - nv-dvl/vgg-ttt

github.com

Sven Elflein @s_elflein

Feb 27

472

45,301

Anagh Malik

Sven Elflein retweeted

Anagh Malik @anagh_malik

May 20

📢📢📢 Velox 🚀: Learning Representations of 4D Geometry and Appearance
In our #CVPR2026 paper, we introduce a method for learning a native 4D representation, useful for many downstream tasks, such as video-to-4D, 3D tracking, cloth simulation, and others!
🌐: apple.github.io/ml-velox
📝: arxiv.org/abs/2605.04527

0:21

173

20,386

Kangxue Yin

Sven Elflein retweeted

Kangxue Yin @kangxue_yin

Apr 21

🚀We just released Asset Harvester, an image-to-3D model and end-to-end pipeline that extracts real object assets from autonomous driving videos! 🌐 Website: research.nvidia.com/labs/sil… 💻 Code: github.com/nvidia/asset-harv… [1/5] #AssetHarvester #AVSimulation #WorldModel #AutonomousDriving

0:19

130

793

106,888

Xuanchi Ren

Sven Elflein retweeted

Xuanchi Ren @xuanchi13

Apr 15

We scaled up Lyra to generate explorable 3D worlds! 🚀 Introducing Lyra 2.0 — turning a single image into a 3D world you can walk through, look back, and even drop a robot into 🤖 Code and Model available today! 🌐 Website: research.nvidia.com/labs/sil… (1/N)

0:58

122

874

1,145,314

Jorge Condor

Sven Elflein retweeted

Jorge Condor @Arcanous98

Apr 9

Introducing Neural Harmonic Textures: our new method for real-time novel view synthesis that outperforms all 3DGS and NeRF derivatives including (finally) ZipNeRF in terms of quality across all benchmarks. The code is released (Apache 2.0): (research.nvidia.com/labs/sil…) 🧵

0:36

103

610

43,697

Zan Gojcic

Sven Elflein retweeted

Zan Gojcic @ZGojcic

Mar 16

A new generation in AV simulation is here! We are announcing AlpaDreams, a real-time, interactive generative world model for AV simulation! Just a year ago it took minutes to generate a few seconds of video; today it is real-time and interactive! research.nvidia.com/labs/sil…

OmniDreams (formerly AlpaDreams) — NVIDIA SIL

This page has moved. AlpaDreams is now OmniDreams.

research.nvidia.com

106

18,674

Zan Gojcic

Sven Elflein retweeted

Zan Gojcic @ZGojcic

Mar 2

We're releasing DiffusionHarmonizer, an online diffusion enhancer bridging neural reconstruction and photorealistic simulation by correcting artifacts and harmonizing inserted objects so they truly belong in the scene: matching shadows, lighting & color. research.nvidia.com/labs/sil…

0:05

273

45,731

Sven Elflein

Sven Elflein @s_elflein

Feb 27

551

84,115

Sven Elflein

Sven Elflein @s_elflein

Feb 27

6/7 ⚙️ Making it work: We find 1) it is critical to initialize from a pre-trained softmax-attention checkpoint. 2) TTT exhibits length generalization issues! Please check out the paper for more details on initialization and tricks towards closing the gap to softmax attention!

2,156

Sven Elflein

Sven Elflein @s_elflein

Feb 27

7/7 🥂 Huge congrats to the team: @ruilong_li, @sragostinho, @ZGojcic, @lealtaixe, @QunjieZhou, & @AljosaOsep! See you at #CVPR2026! 📄 Paper: arxiv.org/abs/2602.23361 🤗HuggingFace: huggingface.co/papers/2602.2… 🌐 Project Page: research.nvidia.com/labs/dvl…

VGG-T³: Offline Feed-Forward 3D Reconstruction at Scale

We present a scalable 3D reconstruction model that addresses a critical limitation in offline feed-forward methods: their computational and memory requirements grow quadratically w.r.t. the number...

arxiv.org

2,096
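To make the "quadratic bottleneck" from the thread concrete, here is a back-of-the-envelope sketch. It assumes the cost is dominated by a global attention layer that compares every token with every other token, and that each image contributes a fixed number of tokens; `tokens_per_image=1000` is a made-up figure, not one taken from the paper. The function names are hypothetical.

```python
def attention_pairs(num_images, tokens_per_image):
    """Query-key pairs scored by one global attention layer over all images."""
    total_tokens = num_images * tokens_per_image
    return total_tokens * total_tokens


def linear_cost(num_images, tokens_per_image):
    """Work for a hypothetical layer whose cost grows linearly with token count."""
    return num_images * tokens_per_image


# Assumed token budget per image (illustrative only).
T = 1000
quadratic_growth = attention_pairs(1000, T) // attention_pairs(100, T)
linear_growth = linear_cost(1000, T) // linear_cost(100, T)
```

Under these assumptions, going from 100 to 1,000 images multiplies the pairwise work by 100 but a linear-cost layer's work by only 10, which is why scene size matters so much more for the quadratic case.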