Saad Khan

Tweets

Pinned Tweet

Saad Khan @saadventures

30 Jun 2015

"There are decades where nothing happens; then there are weeks where decades happen." -- Vladimir Ilich Lenin

Saad Khan @saadventures

19 Oct 2024

Wow. Beautiful.

David Rowe @mrdavidrowe

15 Aug 2024

'Hopefulness is not a neutral position. It is adversarial. It is the warrior emotion that can lay waste to cynicism. Each redemptive or loving act, as small as you like – such as reading to your little boy... keeps the Devil down in the hole.' Nick Cave.

268

Saad Khan @saadventures

22 Feb 2024

Thinking of Brother Malcolm (X) today. إِنَّا ِلِلَّٰهِ وَإِنَّا إِلَيْهِ رَاجِعُونَ

312

Saad Khan @saadventures

13 Jan 2024

Gangster

Andrej Karpathy @karpathy

12 Jan 2024

I touched on the idea of sleeper agent LLMs at the end of my recent video, as a likely major security challenge for LLMs (perhaps more devious than prompt injection). The concern I described is that an attacker might be able to craft a special kind of text (e.g. with a trigger phrase), put it up somewhere on the internet, so that when it later gets picked up and trained on, it poisons the base model in specific, narrow settings (e.g. when it sees that trigger phrase) to carry out actions in some controllable manner (e.g. jailbreak, or data exfiltration). Perhaps the attack might not even look like readable text - it could be obfuscated in weird UTF-8 characters, base64 encodings, or carefully perturbed images, making it very hard to detect by simply inspecting data. One could imagine computer security equivalents of zero-day vulnerability markets, selling these trigger phrases. To my knowledge the above attack hasn't been convincingly demonstrated yet. This paper studies a similar (slightly weaker?) setting, showing that given some (potentially poisoned) model, you can't "make it safe" just by applying the current/standard safety finetuning. The model doesn't learn to become safe across the board and can continue to misbehave in narrow ways that potentially only the attacker knows how to exploit. Here, the attack hides in the model weights instead of hiding in some data, so the more direct attack here looks like someone releasing a (secretly poisoned) open weights model, which others pick up, finetune and deploy, only to become secretly vulnerable. Well-worth studying directions in LLM security and expecting a lot more to follow.

293

Saad Khan @saadventures

1 Oct 2023

Damn.

Epic Maps 🗺️ @theepicmap

30 Sep 2023

1000 years of history in 1 image

406

Saad Khan @saadventures

19 Sep 2023

Oh snap! Finally some shoes for my wild children :) (cc @sidraqasim )

Atoms @Atoms

19 Sep 2023

Introducing Kids Model 123 – comfortable & durable, made with a redesigned outsole that flexes with every movement. This has become a personal project for us, as we set out to make the best shoes that our kids will love wearing everyday! atoms.com/model123

1,118

Saad Khan retweeted

Pillars Fund @pillars_fund

1 May 2023

Applications for the 2024 Pillars Artist Fellowship are now open! Don’t miss out on this incredible opportunity if you are a Muslim director or screenwriter living in the U.S. or U.K. Apply here: pillarsfund.org/artist-fello…

129

39,758

Saad Khan @saadventures

19 Jul 2023

AI is everywhere. Mubarak @FahadsEmpire !

fahadkhan @FahadsEmpire

17 Jul 2023

Excited about @unity's beta for Safe Voice, a project I worked on that uses #AI / #ML tech to detect and end player toxicity for in-game voice chat. A process that's traditionally been resource-intensive and highly-manual, much more automated, efficient and scalable for studios

310

Saad Khan @saadventures

8 Jun 2023

Sweet! Let the games begin 🕹

.@adamgazz

7 Jun 2023

Excited about the release of EndeavorOTC, no-prescription required, non-drug, video game treatment for adults with ADHD! Built on the same technology as Akili’s EndeavorRx, the world’s first FDA-authorized pediatric video game treatment. Available on Apple’s App Store.

282

Saad Khan @saadventures

31 May 2023

Still got @SynBioBeta on the brain. @johncumbers Reflecting on potential tracks for you next year. Question: Doesn’t AI Biology = ‘I’? Just saying :)

181

Saad Khan @saadventures

23 May 2023

About to get our DNA on this week. Feels like the night before the first day of school. :) cc @SynBioBeta @johncumbers synbiobeta.com/

SynBioBeta - Synthetic Biology Events, Info, & Industry Information

Join SynBioBeta, the global community of biological engineers, entrepreneurs, investors, and innovators working to make biology easier to engineer.

synbiobeta.com

828

Saad Khan @saadventures

21 May 2023

Dope.

Ayush Tiwari @sighyush

21 Feb 2023

Javed Akhtar's masterclass in Lahore on the problem with the idea of a 'pure language'. @Javedakhtarjadu

290

Saad Khan @saadventures

19 May 2023

Anyone rolling to @SynBioBeta next week? Programming DNA is just better with friends :) (cc @johncumbers @PaulStamets ) synbiobeta.com.

1,814

Saad Khan @saadventures

13 May 2023

Warriors lost. I am in need of an angel

201

Saad Khan @saadventures

22 Apr 2023

Gonna miss this Ramadan. Eid Mubarak y’all.

159

Saad Khan @saadventures

21 Apr 2023

Eid Mubarak, Eastern Hemisphere :)

267

Saad Khan retweeted

The Berkeley Scanner @BerkeleyScanner

5 Jan 2023

Three lightning "flashes" hit Berkeley almost simultaneously early Thursday morning, setting off car alarms and sending fear and excitement through the city. berkeleyscanner.com/2023/01/…

Lightning strikes Berkeley during 'bomb cyclone' storm

According to the National Weather Service, there were three lightning "flashes" — two that hit the ground and a "cloud pulse."

berkeleyscanner.com

102

22,123

Saad Khan @saadventures

1 Jan 2023

Happy New Year, world :)

357

Saad Khan @saadventures

1 Jan 2023

2022, I’m gonna miss you (kinda sorta).

548

Saad Khan @saadventures

25 Dec 2022

Merry Christmas, world :) 🎁

832

Saad Khan retweeted

Chris Evangelista @cevangelista413

11 Dec 2022

Every James Cameron story is like “Some asshole came into my office to complain about the budget and I shot him with a harpoon gun” and it’s funny every time

902

14,142