Robert Tercek retweeted

Bilawal Sidhu

@bilawalsidhu

18 Feb 2024

I’ve used the Apple Vision Pro for 2 weeks now and here are my unfiltered thoughts, you might even call it a hot take 🌶️ 😅
Overall: I'm blown away, absolutely hyped... but also? Frustrated.
Why is Apple making it SO HARD to tap into the existing VR media scene? There is a plethora of VR 180 and 360 content out there. And they've got this sweet immersive video player tucked away in Apple TV, with that signature Apple polish. Great, I can watch an Alicia Keys video. YouTubeVR has millions more videos I can’t currently watch.
So I try to drop manually converted spatial videos directly into the Mac Photos app: nope! Gotta jump through iCloud hoops to retain the metadata? It's ridiculous. Almost feels like Apple wants to gatekeep ALL immersive content on the headset.
Ok fine, but you can shoot spatial videos on iPhone or on the headset itself. Awesome, but to edit them you need a 3rd party converter to turn them back into well-adopted VR media formats to edit in Adobe or Resolve lol.
There are killer ARKit apps, it might’ve made sense to get some of those ported over! Missed layup. And don’t even get me started on the broken WebXR support. 3D websites like Luma AI and Polycam should’ve been immersive 3D on day 1.
It feels like they just shipped an unfinished product. Which might be the case, considering how much of that mind-blowing WWDC 2023 stuff is STILL missing. Remember those awesome shared SharePlay experiences they demoed? Poof, gone. Even simple stuff, like shared spatial anchors that ARKit already supports for multiplayer AR, is nowhere to be found. I mean they can pass through reality with imperceptible latency, but I can’t have shared experiences with users who also have Apple Vision Pro in the room, and I am relegated to iPads on screens? Not much better than Zoom and a massive underutilization of what this hardware is capable of in terms of realistic co-presence and remote collaboration.
Look, maybe this is all growing pains. Maybe they shipped with 20% of the roadmap ready, and we'll just have to wait. But right now, the Apple Vision Pro feels like this super shiny walled garden in the middle of a sprawling VR playground. Here's hoping they open the gates soon... 🔐 #AppleVisionPro #MetaQuest3 #frustratedfanboy

Robert Tercek retweeted

Jim Fan

@DrJimFan

16 Feb 2024

Apparently some folks don't get "data-driven physics engine", so let me clarify. Sora is an end-to-end, diffusion transformer model. It inputs text/image and outputs video pixels directly. Sora learns a physics engine implicitly in the neural parameters by gradient descent through massive amounts of videos. Sora is a learnable simulator, or "world model". Of course it does not call UE5 explicitly in the loop, but it's possible that UE5-generated (text, video) pairs are added as synthetic data to the training set.

Robert Tercek retweeted

Jim Fan

@DrJimFan

16 Feb 2024

I see some vocal objections: "Sora is not learning physics, it's just manipulating pixels in 2D". I respectfully disagree with this reductionist view. It's similar to saying "GPT-4 doesn't learn coding, it's just sampling strings". Well, what transformers do is just manipulating a sequence of integers (token IDs). What neural networks do is just manipulating floating numbers. That's not the right argument.
Sora's soft physics simulation is an *emergent property* as you scale up text2video training massively.
- GPT-4 must learn some form of syntax, semantics, and data structures internally in order to generate executable Python code. GPT-4 does not store Python syntax trees explicitly.
- Very similarly, Sora must learn some *implicit* forms of text-to-3D, 3D transformations, ray-traced rendering, and physical rules in order to model the video pixels as accurately as possible. It has to learn concepts of a game engine to satisfy the objective.
- If we don't consider interactions, UE5 is a (very sophisticated) process that generates video pixels. Sora is also a process that generates video pixels, but based on end-to-end transformers. They are on the same level of abstraction.
- The difference is that UE5 is hand-crafted and precise, but Sora is purely learned through data and "intuitive".
Will Sora replace game engine devs? Absolutely not. Its emergent physics understanding is fragile and far from perfect. It still heavily hallucinates things that are incompatible with our physical common sense. It does not yet have a good grasp of object interactions - see the uncanny mistake in the video below.
Sora is the GPT-3 moment. Back in 2020, GPT-3 was a pretty bad model that required heavy prompt engineering and babysitting. But it was the first compelling demonstration of in-context learning as an emergent property. Don't fixate on the imperfections of GPT-3. Think about extrapolations to GPT-4 in the near future.

[Video, 0:09]

Robert Tercek retweeted

Jim Fan

@DrJimFan

15 Feb 2024

If you think OpenAI Sora is a creative toy like DALLE, ... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths. I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!
Let's break down the following video. Prompt: "Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee."
- The simulator instantiates two exquisite 3D assets: pirate ships with different decorations. Sora has to solve text-to-3D implicitly in its latent space.
- The 3D objects are consistently animated as they sail and avoid each other's paths.
- Fluid dynamics of the coffee, even the foams that form around the ships. Fluid simulation is an entire sub-field of computer graphics, which traditionally requires very complex algorithms and equations.
- Photorealism, almost like rendering with raytracing.
- The simulator takes into account the small size of the cup compared to oceans, and applies tilt-shift photography to give a "minuscule" vibe.
- The semantics of the scene does not exist in the real world, but the engine still implements the correct physical rules that we expect.
Next up: add more modalities and conditioning, then we have a full data-driven UE that will replace all the hand-engineered graphics pipelines. openai.com/sora

[Video, 0:15]

Robert Tercek @Superplex

18 Oct 2023

This is well worth reading

Nathan Benaich

@nathanbenaich

12 Oct 2023

🪩The @stateofai 2023 is now here. Our 6th installment is one of the most exciting years I can remember. The #stateofai report covers everything you *need* to know, covering research, industry, safety and politics. There’s lots in there, so here’s my director’s cut 🧵

Robert Tercek retweeted

Nathan Benaich

@nathanbenaich

12 Oct 2023

Robert Tercek retweeted

TomLikesRobots🤖

@TomLikesRobots

21 Jul 2023

I'm absolutely blown away by @runwayml's #Gen2 using image input. The movement is so natural. Using it with @midjourney is a winning combination. If you want your video to stay true to your image, don't use a text prompt. (Thanks to @Uncanny_Harry and @Merzmensch for the tip!). This shows huge potential for creating #aicinema

[Video, 0:09]

Robert Tercek retweeted

Voidz @voidzto

16 Jan 2023

I think my reality is broken… #mixedrealityart #NFTCommunity #Web3 #nftarti̇st #NFTs #digitalartists

[Video, 1:28]

Robert Tercek retweeted

Jim Fan

@DrJimFan

30 Jun 2023

Google is hosting the first "Machine Unlearning" challenge. Yes you heard it right - it's the art of forgetting, an emergent research field. GPT-4 lobotomy is a type of machine unlearning. OpenAI tried for months to remove abilities it deems unethical or harmful, sometimes going a bit too far. Unlike deleting data from disk, deleting knowledge from AI models (without crippling other abilities) is much harder than adding. But it is useful and sometimes necessary:
▸ Reduce toxic/biased/NSFW contents
▸ Comply with privacy, copyright, and regulatory laws
▸ Hand control back to content creators - people can request to remove their contribution to the dataset after a model is trained
▸ Update stale knowledge as new scientific discoveries arrive
Check out the machine unlearning challenge: ai.googleblog.com/2023/06/an…

Robert Tercek retweeted

Andrej Karpathy

@karpathy

30 Jun 2023

I think this is mostly right.
- LLMs created a whole new layer of abstraction and profession.
- I've so far called this role "Prompt Engineer" but agree it is misleading. It's not just prompting alone, there's a lot of glue code/infra around it. Maybe "AI Engineer" is ~usable, though it takes something a bit too specific and makes it a bit too broad.
- ML people train algorithms/networks, usually from scratch, usually at lower capability.
- LLM training is becoming sufficiently different from ML because of its systems-heavy workloads, and is also splitting off into a new kind of role, focused on very large scale training of transformers on supercomputers.
- In numbers, there's probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers.
- One can be quite successful in this role without ever training anything.
- I don't fully follow the Software 1.0/2.0 framing. Software 3.0 (imo ~prompting LLMs) is amusing because prompts are human-designed "code", but in English, and interpreted by an LLM (itself now a Software 2.0 artifact). AI Engineers simultaneously program in all 3 paradigms. It's a bit 😵‍💫

swyx

@swyx

30 Jun 2023

🆕 Essay: The Rise of the AI Engineer latent.space/p/ai-engineer Keeping up on AI is becoming a full time job. Let's get together and define it.

Robert Tercek @Superplex

14 Jun 2023

I had a lively discussion with Jim Rutt about the WGA and copyright in the age of AI. Check it out!

Jim Rutt

@jim_rutt

14 Jun 2023

🎙️ w/ @Superplex on the writers' strike and IP in the era of generative AI. The history of Hollywood union negotiations, likely impacts on writers, the threat to influencers, why ChatGPT empowers writers in the near term, AI for education, & much more. jimruttshow.com/robert-terce…

Robert Tercek retweeted

Drake Facts @NewsIn6ix

8 Jun 2023

AI-generated QR Code Art will be the next big thing. Here's what's possible. 1. Snowy Village

Robert Tercek retweeted

Matt Wolfe

@mreflow

6 Jun 2023

We've seen text-to-image, text-to-3d object, and even text-to-video... Now check out text-to-3d character from @daz3d. Use natural language to create any character you can imagine in near-AAA game quality and then export that character directly into Blender, Unreal or Unity!

[Video, 1:00]

Robert Tercek retweeted

Moritz Kremb

@moritzkremb

7 Jun 2023

You can easily create your own animated avatar. It takes less than 10 minutes. I'll show you how in 3 simple steps:

[Video, 0:11]

Robert Tercek retweeted

fofr

@fofrAI

4 May 2023

🧵 A big #Midjourney thread on how to write prompts to get good cinematic images. In this thread I’ll build up a single prompt with cinematic elements, and show their effects. Each prompt will use a 16:9 aspect ratio, and to minimise variation I've locked in a seed.

Robert Tercek @Superplex

20 May 2023

These examples don’t reveal anything that could plausibly “disrupt Hollywood” any time soon. But the progress is impressive and the trajectory is clear.

Nathan Lands

@NathanLands

18 May 2023

AI video has started to produce mindblowing results and could eventually disrupt Hollywood. (PT9) Here are the best AI videos I've found:

Robert Tercek @Superplex

8 May 2023

Prompt tips for MJ 5.1. Enjoy

fofr

@fofrAI

4 May 2023

Robert Tercek @Superplex

8 May 2023

Good info and charts here

Misha

@mishadavinci

7 May 2023

By 2025, ChatGPT will destabilize hundreds of millions of white-collar jobs. It could mean mass job loss for the highly educated. Here's what you need to know:

Robert Tercek retweeted

Rowan Cheung

@rowancheung

21 Apr 2023

Another huge day in the world of AI with announcements from:
- Snapchat 'My AI'
- Synthesis AI
- Google Brain and Deepmind
- Martin Shkreli
Here's a rundown on everything you need to know:

Robert Tercek retweeted

Sully

@SullyOmarr

20 Apr 2023

Stability just released their new LLM. It's open-source, has 7B parameters, and it's entirely free to use commercially. And it's a MASSIVE deal that has the potential to change up everything in AI. Here's why:
