EE PhD Student at Stanford University

Joined October 2021
Karan Singh retweeted
Stanford CS25 Talk TOMORROW (Thurs, 5/21) at 4:30pm PST 🤖 Victoria Lin (@VictoriaLinML) from Thinking Machines Lab (@thinkymachines) on: From Language Models to Native Multimodal Intelligence What comes after LLMs? AI that natively understands the multimodal world! 👇(1/6)
Karan Singh retweeted
Stanford CS25 Talk TODAY (Thurs, 5/14) at 4:30pm PST 🤖 Vivek Natarajan (@vivnat) from @GoogleDeepMind on: Advancing science and medicine with collaborative AI agents AI co-scientists. AI co-physicians. This is one of the most futuristic CS25 talks yet 👀👇 (1/6)
Karan Singh retweeted
Stanford CS25 Talk TOMORROW (Thurs, 5/7) at 4:30pm PST 🤖 Andrew Lampinen (@AndrewLampinen) from @AnthropicAI on: How models generalize from parameters vs. context Turns out they behave very differently! 👇 (1/6)
Karan Singh retweeted
This is the type of slop blowing up the number of submissions to conferences, screwing up the review process and wasting reviewers' time, and furthering the stochasticity of acceptances.
Such a great evening to start a brand new research for NeurIPS in 3.5 days.🧘‍♂️ Day 1: planning. Night 1: running experiments and sending the abstract. Day 2: reading results fighting with Claude, and sending again. Night 2: sleep (optional). Day 3: opening Codex, and finally, write the paper in parallel. Night 3: resolving the “beef” with Claude (temporary peace) and going to sleep. Day 4: final reading, last-minute fixes, submission then some relaxation, maybe a beach walk. I’ll keep you posted on the results. This will be my only single-author paper, so I can’t hide behind other submissions if it gets rejected 😅
Karan Singh retweeted
Our @Stanford CS25 lectures are getting a lot of engagement! Thanks to @hazel_heejeong and @lucasmaes_ for the great talk about JEPA and world models, as proposed by @ylecun. Check our course website for recordings, slides, and more info: cs25.stanford.edu/ Also, CS25 is open to everyone! We feature talks from top researchers each week. Lectures are Thursdays at 4:30pm PDT at Skilling Auditorium (Stanford) and on Zoom: stanford.zoom.us/j/921967293… @_KaranPS_ @StanfordOnline @stanfordaiclub @stanfordnlp @StanfordAILab @agihouse_org @MongoDB @modal
Stanford's latest seminar is a deep dive into the evolution of world modeling in AI. Focuses on the shift in the world model from traditional reconstruction methods toward latent space prediction. Covers topics like: - Introduction to JEPA & World Models - Causal JEPA - LOWER Model - Practical Applications & Planning - Future Outlook
Karan Singh retweeted
Stanford CS25 Talk TODAY (Thurs, 4/30) at 4:30pm PST 🤖 Shrimai Prabhumoye (@shrimai_) from @MistralAI (prev. @nvidia) on: The Future of Pretraining What comes after next-token prediction? 👇(1/6)
Karan Singh retweeted
Stanford CS25 Talk TODAY (Thurs, 4/23) at 4:30pm PST 🤖 Nouamane Tazi (@Nouamanetazi) from @huggingface on: Scaling training to thousands of GPUs If you care about how frontier LLMs are actually trained - don’t miss this 👇(1/6)
Karan Singh retweeted
Stanford CS25 Talk TOMORROW (Thurs, 4/16) at 4:30pm PST 🤖 Albert Gu (@_albertgu) [CMU, Cartesia AI] on: Transformer alternatives - SSMs, Mamba, and beyond If you care about the future of sequence models, you won't want to miss this! 👇(1/5)
Karan Singh retweeted
Stanford CS25 Talk Today: Hazel Nam & Lucas Maes, Brown University & Mila [JEPA and World Models]

Today (Thurs, 4/9) at 4:30pm PDT, @hazel_heejeong & @lucasmaes_ will be giving a talk for CS25 (cs25.stanford.edu) at Skilling Auditorium (Stanford). The talk will also be livestreamed on Zoom at stanford.zoom.us/j/921967293…. As always, we are *open to everybody*, so drop by!

Presentation Title: From Representation Learning to World Modeling through Joint Embedding Predictive Architectures

Presentation Abstract: World models are increasingly moving away from reconstruction and toward prediction in latent space. In this talk, we will present two recent JEPA-based approaches that illustrate this shift from complementary angles. Causal-JEPA induces object-level relational bias to promote representations that capture entities, and interactions, leading to stronger reasoning and more efficient planning. LeWorldModel shows that such predictive world models can also be trained stably end-to-end from raw pixels using a minimal objective and a clean architectural recipe, while remaining competitive on control tasks. Taken together, these works argue for a unified view of world modeling: predictive latent learning becomes most powerful when combined with both structural bias and architectural simplicity. This perspective suggests a promising path toward robust world models that support abstraction, reasoning, and control.

Speaker Bios: Heejeong (Hazel) Nam (@hazel_heejeong) is a Master's student at Brown University, working on representation learning, causality, and self-supervised learning. Lucas Maes (@lucasmaes_) is a PhD student at Mila and the University of Montreal, working on JEPA and planning.

Recordings, Slides, & More Info: The recordings will be released approx. 3 weeks after each talk on our YouTube playlist: youtube.com/playlist?list=PL…. Slides and more info are posted on our Discord server (discord.gg/2vE7gbsjzA) and course website (cs25.stanford.edu).

Looking forward to seeing you all later today! @_KaranPS_ @Stanford @StanfordAILab @stanfordnlp @StanfordHAI @StanfordOnline @stanfordaiclub @agihouse_org @MongoDB @modal #AI #ArtificialIntelligence #ML #DeepLearning #NLP #NLProc #Transformers #Stanford #Education #Innovation #TechEd #Community #naturallanguageprocessing
Karan Singh retweeted
[CL] To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining K Singh, M Yu, V Gangal, Z Tao… [Stanford University & Patronus AI] (2026) arxiv.org/abs/2604.00715
Karan Singh retweeted
To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining Introduces a three-dimensional scaling framework modeling performance as a function of model size, pretraining tokens, and retrieval corpus size. 📝 arxiv.org/abs/2604.00715 👨🏽‍💻 github.com/DegenAI-Labs/RAG-…
Karan Singh retweeted
23 Apr 2024
#watermarking helps to identify #AI-generated text. But how does it affect #LLM quality? 🤔 Our new @TmlrOrg paper develops fine-grained tools to evaluate watermarks: popular watermarking methods can reduce coherence and depth of generations openreview.net/pdf?id=PuhF0h… Great work by @_KaranPS_
Karan Singh retweeted
26 Dec 2023
As a Xmas present 🎄🎁, super excited to announce the public release of the lectures for CS 25: Transformers United V3 (cs25.stanford.edu) held @Stanford See the course preview below 👇: Our first two lectures are live on Youtube and the rest to follow after the breaks 🔥 youtube.com/watch?v=fz8wf9hN… #AI #Transformers
Karan Singh retweeted
Late tweet, but I've started my PhD at @Stanford! Excited to work with @StanfordAILab and @stanfordnlp. Grateful to all who've supported me, including @VarunGangal, @ehovy, @malihealikhani, @drjessehoey, and others at @LTIatCMU, @UWaterloo, @UWCheritonCS, @WaterlooMath, @Laurier