~6000K, Milky Way

Joined July 2020
110 Photos and videos
Liquid G*ass
This guy should’ve been Apple’s new CEO
25
He likely approved Liquid Glass
Live look at Craig Federighi after John Ternus was announced as the next Apple CEO
1
41
might as well update to iOS 26 to reduce my phone usage to none. it's unintuitive, unpredictable, unnatural.
1
43
SunLight retweeted
There is NO SITUATION that is NOT improved by bringing ENERGY and PASSION to it. There is no future. There is no past. There is ONLY an ever unfolding NOW. Throw your heart and soul into THIS present moment.
11
161
1,546
24,747
SunLight retweeted
Superman advises to not be stuck in the past.
1
2
70
SunLight retweeted
23 Oct 2025
9
49
525
15,251
SunLight retweeted
To learn more about temporal difference learning, you could read the original paper (incompleteideas.net/papers/s…) or watch this video (videolectures.net/videos/dee…).

18 Oct 2025
The Dwarkesh/Andrej interview is worth watching. Like many others in the field, my introduction to deep learning was Andrej’s CS231n. In this era when many are involved in wishful thinking driven by simple pattern matching (e.g., extrapolating scaling laws without nuance), it’s refreshing to hear an influential voice that is tethered to reality.

One clarification for the podcast is that when Andrej says humans don’t use reinforcement learning, he is really saying humans don't use returns as learning targets. His example of LLMs struggling to learn to solve math problems from outcome-based rewards also elucidates the problem with learning directly from returns. Fortunately for RL, this exact problem is solved by temporal difference (TD) learning. All sample-efficient RL algorithms that show human-like learning (e.g., sample-efficient learning on Atari, and our work on learning from experience directly on a robot) rely on TD learning.

Now Andrej is not primarily an RL person; he is looking at RL through the lens of LLMs these days, and all RL done in LLMs uses returns as targets, so it’s understandable that he is assuming that RL is all about learning from observed returns. But this assumption leads him to the incorrect conclusion that we need process-based dense rewards for RL to work.

If you embrace TD learning, then you don't necessarily need a dense reward. Once you have learned a value function that encodes useful knowledge about the world, you can learn on the fly in the absence of rewards, just like humans and animals. This is possible because in TD learning there is no difference between learning from an unexpected reward and learning from an unexpected change in perceived value.
19
119
1,061
159,611
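The contrast drawn in the post above, between using observed returns as learning targets and using TD targets, can be sketched in a few lines. This is only an illustrative toy, not code from the post or the linked paper: the random-walk environment, the function names (`run_episode`, `td0_update`, `mc_update`, `train`), and the step size are all assumptions chosen for the example.

```python
import random

def run_episode(n_states, rng):
    """Random walk on states 0..n_states-1, starting in the middle.
    The episode ends when the walker steps off either end.
    Reward is 1.0 only when exiting on the right, otherwise 0.0 (sparse)."""
    s = n_states // 2
    transitions = []  # list of (state, reward, next_state or None if terminal)
    while True:
        s_next = s + rng.choice((-1, 1))
        if s_next < 0:
            transitions.append((s, 0.0, None))
            return transitions
        if s_next >= n_states:
            transitions.append((s, 1.0, None))
            return transitions
        transitions.append((s, 0.0, s_next))
        s = s_next

def td0_update(values, transitions, alpha=0.1, gamma=1.0):
    """TD(0): the target is reward plus the current estimate of the next state.
    A zero reward still produces learning if the estimated value changed."""
    for s, r, s_next in transitions:
        bootstrap = 0.0 if s_next is None else values[s_next]
        td_error = r + gamma * bootstrap - values[s]
        values[s] += alpha * td_error

def mc_update(values, transitions, alpha=0.1, gamma=1.0):
    """Every-visit Monte Carlo: the target is the observed return,
    which is only known once the episode has finished."""
    g = 0.0
    for s, r, _ in reversed(transitions):
        g = r + gamma * g
        values[s] += alpha * (g - values[s])

def train(update, episodes=5000, n_states=5, alpha=0.02, seed=0):
    """Run `update` over many sampled episodes and return the value estimates."""
    rng = random.Random(seed)
    values = [0.5] * n_states
    for _ in range(episodes):
        update(values, run_episode(n_states, rng), alpha=alpha)
    return values

if __name__ == "__main__":
    print("TD(0):", [round(v, 2) for v in train(td0_update)])
    print("MC:   ", [round(v, 2) for v in train(mc_update)])
```

In this sketch, a single transition with zero reward still moves the TD estimate whenever the next state's value differs from the current one, whereas the Monte Carlo update only moves toward whatever return the episode eventually produced.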
SunLight retweeted
For the next 24 hours, only speak to yourself as you would to a friend you highly respect. Then do it another 24 hours. Day by day you'll get better and better.
7
7
122
4,929
20 Oct 2025
Happy Diwali fam 🪔🎇
15
SunLight retweeted
21 Sep 2025
"This wise man observed that wealth is a tool of freedom. But the pursuit of wealth is the way to slavery." ~ Dune.
25
33
205
8,258
SunLight retweeted
9
70
860
33,573
SunLight retweeted
Never seen a great owner/manager procrastinating on decisions. You take what’s on your table and you choose instantly what’s the smartest move. Stop being a p*ssy.
3
54
44,107
18 Sep 2025
this liquid *ss thing makes windows 11 look good @apple @tim_cook
40
SunLight retweeted
Reminder: no matter what you are going through right now, you are eternally fortunate for even having the privilege of experiencing human life in the first place. Your existence is a mathematical impossibility, but a divine certainty. Your mere presence defies all odds.
17
127
1,323
28,410
Was cleaning my gallery and half of it is filled with @abombayboy’s tweet screenshots! They stay forever!
1
1
2
241
Work In Progress....
61
575
13,226
462,965