There is NO SITUATION that is NOT improved by bringing ENERGY and PASSION to it. There is no future. There is no past. There is ONLY an ever unfolding NOW. Throw your heart and soul into THIS present moment.

SunLight retweeted

Enigma (insane arc) @TuringsEye

24 Oct 2025

Superman advises to not be stuck in the past.

SunLight retweeted

GRITCULT

@GRITCULT

23 Oct 2025

SunLight retweeted

Richard Sutton

@RichardSSutton

20 Oct 2025

To learn more about temporal difference learning, you could read the original paper (incompleteideas.net/papers/s…) or watch this video (videolectures.net/videos/dee…).

Khurram Javed

@kjaved_

18 Oct 2025

The Dwarkesh/Andrej interview is worth watching. Like many others in the field, my introduction to deep learning was Andrej’s CS231n. In this era when many are involved in wishful thinking driven by simple pattern matching (e.g., extrapolating scaling laws without nuance), it’s refreshing to hear an influential voice that is tethered to reality.

One clarification for the podcast is that when Andrej says humans don’t use reinforcement learning, he is really saying humans don’t use returns as learning targets. His example of LLMs struggling to learn to solve math problems from outcome-based rewards also elucidates the problem with learning directly from returns.

Fortunately for RL, this exact problem is solved by temporal difference (TD) learning. All sample-efficient RL algorithms that show human-like learning (e.g., sample-efficient learning on Atari, and our work on learning from experience directly on a robot) rely on TD learning.

Now Andrej is not primarily an RL person; he is looking at RL through the lens of LLMs these days, and all RL done in LLMs uses returns as targets, so it’s understandable that he is assuming that RL is all about learning from observed returns. But this assumption leads him to the incorrect conclusion that we need process-based dense rewards for RL to work.

If you embrace TD learning, then you don’t necessarily need a dense reward. Once you have learned a value function that encodes useful knowledge about the world, you can learn on the fly in the absence of rewards, just like humans and animals. This is possible because in TD learning there is no difference between learning from an unexpected reward and learning from an unexpected change in perceived value.
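A rough sketch of the distinction drawn above, not taken from the post or from any cited paper: the same sparse-reward task learned once with observed returns as targets and once with one-step TD targets. The five-state random walk, the function names (`td_error`, `run_episode`, `learn_td0`, `learn_from_returns`), and the step size are all assumptions chosen for illustration.

```python
import random


def td_error(reward, next_value, value, gamma=1.0):
    """One-step TD error: reward + gamma * V(s') - V(s)."""
    return reward + gamma * next_value - value


def run_episode(rng, n_states=5):
    """Random walk on states 0..n_states-1, starting in the middle.

    Returns a list of (state, reward, next_state) tuples; next_state is
    None on termination. Only leaving the chain on the right pays a
    reward of 1, so the reward signal is sparse.
    """
    state = n_states // 2
    transitions = []
    while True:
        nxt = state + rng.choice((-1, 1))
        if nxt < 0:
            transitions.append((state, 0.0, None))
            return transitions
        if nxt >= n_states:
            transitions.append((state, 1.0, None))
            return transitions
        transitions.append((state, 0.0, nxt))
        state = nxt


def learn_td0(episodes=5000, alpha=0.01, n_states=5, seed=0):
    """TD(0): each step's target is reward + value of the next state."""
    rng = random.Random(seed)
    values = [0.5] * n_states
    for _ in range(episodes):
        for state, reward, nxt in run_episode(rng, n_states):
            next_value = 0.0 if nxt is None else values[nxt]
            values[state] += alpha * td_error(reward, next_value, values[state])
    return values


def learn_from_returns(episodes=5000, alpha=0.01, n_states=5, seed=0):
    """Every-visit Monte Carlo: each step's target is the observed return."""
    rng = random.Random(seed)
    values = [0.5] * n_states
    for _ in range(episodes):
        transitions = run_episode(rng, n_states)
        # Undiscounted, and the only reward arrives at the end, so the
        # return is the same from every state visited in the episode.
        episode_return = sum(reward for _, reward, _ in transitions)
        for state, _, _ in transitions:
            values[state] += alpha * (episode_return - values[state])
    return values


if __name__ == "__main__":
    print("TD(0):       ", [round(v, 2) for v in learn_td0()])
    print("From returns:", [round(v, 2) for v in learn_from_returns()])
    # A surprise reward and a surprise jump in predicted value yield the
    # same learning signal:
    print(td_error(0.3, 0.2, 0.2), td_error(0.0, 0.5, 0.2))
```

In this toy setting both learners approach the same values, so it says nothing about sample efficiency. What it does show is the last point of the post: `td_error` treats a step with reward 0.3 and no change in predicted value the same as a step with no reward and a 0.3 rise in predicted value.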

SunLight retweeted

Dylan Madden

@Dylanmadden

20 Oct 2025

For the next 24 hours, only speak to yourself as you would to a friend you highly respect. Then do it another 24 hours. Day by day you'll get better and better.

SunLight @SunLightGrow

20 Oct 2025

Happy Diwali fam 🪔🎇

SunLight retweeted

Atlas

@DentesLeo

21 Sep 2025

"This wise man observed that wealth is a tool of freedom. But the pursuit of wealth is the way to slavery." ~ Dune.

SunLight retweeted

DR22 Ω 🪬🎭

@DejaRu22

22 Sep 2025

SunLight retweeted

Natan Zucchetti @natan_zucchetti

1 Sep 2025

Never seen a great owner/manager procrastinating on decisions. You take what’s on your table and you choose instantly what’s the smartest move. Stop being a p*ssy.

SunLight @SunLightGrow

18 Sep 2025

this liquid *ss thing makes windows 11 look good @apple @tim_cook

SunLight retweeted

DR22 Ω 🪬🎭

@DejaRu22

11 Sep 2025

Reminder: no matter what you are going through right now, you are eternally fortunate for even having the privilege of experiencing human life in the first place. Your existence is a mathematical impossibility, but a divine certainty. Your mere presence defies all odds.

SunLight @SunLightGrow

8 Sep 2025

Was cleaning my gallery and half of it is filled with @abombayboy’s tweet screenshots! They stay forever!

SunLight retweeted

Sidharth II सिद्धार्थ

@sidharthgehlot

1 Sep 2025

Work In Progress....
