Thomas Kipf

Tweets

Pinned Tweet

Thomas Kipf

@tkipf

4 May 2020

My PhD thesis "Deep Learning with Graph-Structured Representations" is now available for download: hdl.handle.net/11245.1/1b63b… -- It covers a range of emerging topics in Deep Learning: from graph neural nets (and graph convolutions) to structure discovery (objects, relations, events)

Thomas Kipf retweeted

Lukasz Kaiser

@lukaszkaiser

Jun 3

I believe that exploring and making mistakes is key to learning and research.

Thomas Kipf

@tkipf

May 29

Genuinely amazed by how many generalist visual capabilities one can squeeze out of this model

fofr

@fofrAI

May 29

A quick test of using Omni to edit a video and add labelled bounding boxes around objects.

> Add a labelled bounding box around the monster truck and the flag

Thomas Kipf retweeted

Jerrod Lew

@jerrod_lew

May 22

Gemini Omni can create action replays from different angles. I referenced a video clip with agent in Google Flow. Then asked it to give new angles that follow the original video timing, environment and movement. This test came really close to real-time consistency!

Thomas Kipf retweeted

Dumitru Erhan

@doomie

May 22

World Models ftw :)

CHRIS FIRST

@chrisfirst

May 21

I uploaded a screenshot of Google Maps to Gemini Omni with a route drawn on it. Then I prompted it to create a first person view of someone driving a taxi cab along the route in the reference image. Pretty close to the real thing.

Thomas Kipf

@tkipf

May 19

Getting a sloth into the office for this interview was the hardest part, I heard.

Shlomi Fruchter

@shlomifruchter

May 19

We sat down with @OfficialLoganK @nbrichtova @doomie @gbarthmaron to talk about Gemini Omni Flash. It was pretty wild.

Thomas Kipf retweeted

Jay Whang @jaywhang_

May 19

Super excited to see Gemini Omni finally out in the world! Having been part of this project since its inception, I've seen how its native multimodal capabilities can redefine what's possible. We're truly entering the "Nano Banana era" for video generation. Give it a try!

Thomas Kipf

@tkipf

May 19

Gemini Omni allows me to step into an alternative timeline where Graph Convolutional Nets (GCNs) made it to the big stage 🙃 Jokes aside: excited to finally share how far we've come with multimodal reference conditioning.

Thomas Kipf

@tkipf

May 19

... and it is *so* fast ⚡️⚡️⚡️

Logan Kilpatrick

@OfficialLoganK

May 19

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost, putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real-world use cases. It's available everywhere now!

Thomas Kipf

@tkipf

May 13

Confession: I never had a single work-related sleepless night or ever pulled an all-nighter during my career incl. PhD. Don’t sacrifice your health. Sleep is a superpower — your brain on 8hrs of sleep is a lot smarter than your brain on sleep deprivation. Don’t listen to people who tell you to chronically sacrifice sleep for work. Sacrificing sleep for your kids/family is a different story.

Sarvesh Gharat @SarveshGharat12

May 12

Replying to @npparikh

I doubt all those things are really possible. In fact, I believe you are not doing a good PhD unless you have sleepless nights. Definitely, just working on your thesis is possible if you follow a 9-6 schedule, but a good PhD, which involves exploring, collabs, etc., needs extra hours.

Thomas Kipf

@tkipf

May 13

You can work long hours (if you want to) and still prioritize sleep.

Thomas Kipf retweeted

Ravid Shwartz Ziv

@ziv_ravid

Apr 19

This week on The Information Bottleneck, we are hosting @wellingmax 🥳🥳🥳 Max is one of the most influential ML researchers of the last two decades - Professor at UvA, ex-VP at Qualcomm and MSR, co-founder of ELLIS, two Test of Time awards, and advisor to a long line of people who shaped modern generative modeling (@dpkingma, @TacoCohen, @tkipf). His own work is behind a lot of the machinery the field is still building on - VAEs, graph convolutional networks, and equivariant neural networks, to name a few. He's now CTO and co-founder of CuspAI, designing new materials for carbon capture and PFAS removal using generative models plus physics-based simulation. What would you ask him? Drop questions below.

Thomas Kipf retweeted

Logan Kilpatrick

@OfficialLoganK

Apr 14

Introducing Gemini Robotics ER 1.6, our new SOTA robotics model 🤖 which excels at visual and spatial reasoning, now available via the Gemini API!

Thomas Kipf

@tkipf

Apr 9

Veo's Reference-to-Video capability is still #1 👑

Design Arena

@Designarena

Apr 7

BREAKING: Veo 3.1 Fast and Veo 3.1 by @GoogleDeepMind are in 1st and 2nd place on Multi-Image to Video Arena. These models can successfully reference multiple input images to create a video that users love. At an average generation time of 48 seconds, they are also the two fastest video generation models. Huge congrats to the @GoogleDeepMind team for this achievement!

Thomas Kipf retweeted

Anthropic

@AnthropicAI

Apr 6

We've signed an agreement with Google and Broadcom for multiple gigawatts of next-generation TPU capacity, coming online starting in 2027, to train and serve frontier Claude models.

Thomas Kipf retweeted

Ziyi Wu @Dazitu_616

Apr 3

To build multi-player games with video models, we likely need a map. One challenge here is the action binding problem, which we solve with simple RoPE-based attention biasing. While existing multi-actor models specialize in one game, we generalize to 46 games and diverse actions!

Alexander Pondaven @alexpondaven

Apr 3

Introducing ActionParty: the first video world model that controls up to 7 players simultaneously on the same screen across 46 game environments. We tackle the action binding problem in video diffusion, ensuring each player's action is applied to the right subject. 🧵

Thomas Kipf retweeted

Demis Hassabis

@demishassabis

Apr 3

Gemma 4 outperforms models over 10x their size! (note the x-axis is log scale!)
