Jaskaran Singh

Tweets

Jaskaran Singh

@jasksing

Jun 14

New Delhi when?

SemiAnalysis

@SemiAnalysis_

Jun 13

SITUATION DETECTED: The city of Rio de Janeiro has post-trained a model. Based on Qwen 7/2, Rio 3.5 Open 397B adds SwiReasoning on top of the base Qwen model. SwiReasoning is a framework that dynamically switches between standard chain-of-thought and latent-space reasoning, guided by entropy-based confidence signals, so the model only "thinks out loud" when it needs to and otherwise reasons silently in hidden space for better token efficiency.

Jaskaran Singh

@jasksing

Jun 13

Not having sovereign LLMs will repeat history. The countries that adopted industrialization first in the 1900s exploited the rest of the world, which was late. Sovereign LLMs do not mean simply fine-tuning open LLMs, but having access to the whole recipe, reproducibly.

Harveen Singh Chadha

@HarveenChadha

Jun 13

I wish I could do a podcast with people who were against building frontier models about why they didn't see this coming

Jaskaran Singh retweeted

Gaurav Aggarwal @fooobar

Jun 10

If there was ever any doubt, we cannot afford to not have our own models that compete with the best - never too late!

SemiAnalysis

@SemiAnalysis_

Jun 9

BREAKING NEWS: Anthropic's latest model will NOT help you if it thinks your ML research/ML engineering is interesting, and/or will secretly degrade its IQ so that the average engineer won't notice. We are already seeing Anthropic's latest model's moderation filter our GPU inference research and programming 😭

Jaskaran Singh retweeted

Siddhartha Saxena

@siddsax

Jun 7

Michael Scott, the most qualified person to judge Claude vs Codex, has entered the chat.

Jaskaran Singh

@jasksing

May 29

Amazing!

Dongmin Park @dongmin_park11

May 27

Raon-OpenTTS paper is finally out! We fully open-sourced 615K hours of TTS data and a 1B model competitive with Qwen3-TTS-1B and Voxtral-TTS-4B. Like DCLM and DataComp, our work closed the gap towards SOTA closed-data models in TTS, which will help push the TTS community forward!

Jaskaran Singh retweeted

Siddhartha Saxena

@siddsax

May 24

Anthropic onboarding day: Michael Scott introducing Karpathy like he just signed Wemby in free agency.

Jaskaran Singh

@jasksing

May 16

The space of AI is becoming gamma contractive

Jaskaran Singh retweeted

Jaynit

@jaynitx

May 3

Terence Tao: "Previously, you needed a PhD to contribute to math research. Now a high school student can."
Dwarkesh asks the world's most famous mathematician: what's your advice for someone considering a career in math, especially in light of AI progress?
Tao is honest about uncertainty: "We live in a time of change. A particularly unpredictable era. Things that we've taken for granted for centuries may not hold anymore. The way we do everything... not just mathematics... will change."
He admits his preference: "In many ways, I would prefer a much more boring, quiet era where things are much the same as they were 10 or 20 years ago. But one just has to embrace this. There's going to be a lot of change. The things you study... some of them may become obsolete or revolutionized. But some things will be retained."
On new opportunities: "Previously, you had to go through years and years of education and get a math PhD before you could contribute to the frontier of math research. But now it's quite possible at the high school level that you could get involved in a math project and actually make a real contribution... because of all these AI tools and Lean and everything else."
His advice: "There will be a lot of non-traditional opportunities to learn. You need a very adaptable mindset. There'll be things worth pursuing just for curiosity and for playing around. Still go through traditional education and learn math and science the old-fashioned way for a while... credentials will still be important. But you should also be open to very, very different ways of doing science. Some of which don't exist yet."
He concludes: "It's a scary time. But also very exciting."

Jaskaran Singh retweeted

Reyaa

@snr_boost

May 2

Replying to @kingofknowwhere

Dumb person's idea of a smart person. That doesn't mean he's not smart. He probably is, given his credentials. But it's not his job to know the nitty-gritty of GANs or PCPs

Jaskaran Singh retweeted

Ravid Shwartz Ziv

@ziv_ravid

Apr 24

New episode of The Information Bottleneck is out, this time with @liuzhuang1234 (Princeton). We talked about ConvNeXt and whether architecture still matters; dataset bias and what "good data" actually looks like; ImageBind and why vision is the natural bridge across modalities; CLIP's blind spots; memory as the real bottleneck behind the agent hype; whether LLMs have world models; and Transformers Without Normalization. For years, the vision community debated what actually matters: architecture, inductive bias, self-attention vs convolution. After a lot of back-and-forth, we ended up in a funny place: ViT and ConvNet give roughly the same performance once you tune the details. What I find interesting is that once you reach a certain performance level, it becomes much easier to swap and tweak components without really changing the outcome. Talking to Zhuang on this episode, I kept wondering whether the same is now true for LLMs. If we spent serious time on an alternative architecture today, would we actually get a meaningfully different model, or just land on the same Pareto curve with extra steps? I'm starting to suspect it's the latter. Architecture matters less than we think. Data, compute, and a handful of pillars do most of the work.

Jaskaran Singh

@jasksing

Apr 13

THIS IS DEEPSEEK FOR TTS LITERALLY!!!

Shruti

@heyshrutimishra

Apr 13

🚨 Someone just open-sourced a 2B parameter TTS model that does what ElevenLabs charges $330/month for.
> Zero-shot voice cloning.
> 48kHz studio-grade audio.
> 30 languages, including 8 Chinese dialects.

Jaskaran Singh

@jasksing

Apr 4

Open data, open weights, and a detailed technical report in this economy??

Kangwook Lee

@Kangwook_Lee

Apr 3

My team has been cooking nonstop for a while... and I’m so excited to finally share what we’ve been building!!! Today, we’re releasing four open models, many of which are the best models of the same size 🥳!!!
tldr;
1) Raon-Speech: 9B SOTA speech LLM
2) Raon-SpeechChat: 9B full duplex model
3) Raon-OpenTTS: 0.3B/1B open-data-open-weight SOTA TTS
4) Raon-VisionEncoder: 0.4B vision encoder trained only with public data
huggingface.co/collections/K…
===
1) Raon-Speech (9B)
Raon-Speech is a speech LLM (LLM + speech understanding + speech generation). It's a bilingual model (English/Korean), and it's ranked #1 on both leaderboards 😎 tldr; it's the best open-model alternative to ChatGPT voice mode.
Model: huggingface.co/KRAFTON/Raon-…
Tech report: huggingface.co/KRAFTON/Raon-…
Web demo: raon.krafton.ai/ ("Speech Chat" menu here. "auto" is a bit unstable, so use "manual" and choose the language!)
2) Raon-SpeechChat (9B)
While a speech LLM is useful, it’s kind of like a walkie-talkie. A full-duplex model is more like a phone, so it is even more useful in many applications. That’s why we also built and are releasing Raon-SpeechChat. Again, on several quantitative evaluation metrics, Raon-SpeechChat scored the best on average.
Model: huggingface.co/KRAFTON/Raon-…
Tech report: huggingface.co/KRAFTON/Raon-…
Web demo: raon.krafton.ai/ ("Full Duplex" menu here.)
3) Raon-OpenTTS (0.3B, 1B)
We’re also releasing Raon-OpenTTS, a state-of-the-art open-data, open-weight TTS model.
Model data: huggingface.co/KRAFTON/Raon-…
The 1B model and a detailed tech report are coming soon!
4) Raon-VisionEncoder (0.4B)
Last but not least, we’re releasing Raon-VisionEncoder, a vision encoder trained from scratch using only public data. It closely matches the SOTA vision encoder quality too!
Model: huggingface.co/KRAFTON/Raon-…
Tech blog: krafton.ai/blog/posts/2026-0…
===
That’s it! I’m incredibly proud of what my team has built! My AI research team at KRAFTON (@Krafton_AI), which undoubtedly is the most cracked team in Korea, has been cooking nonstop for a while for this 😅... This is just the beginning of our planned model releases, so stay tuned!
ps1/ Ah, by the way, you may ask why “Raon”? “Raon” is an old Korean word meaning happy. And, well, we’re kRAftON :-)
ps2/ KRAFTON is one of the four teams participating in Korea’s national frontier-model project, together with SK Telecom. We’re training something very exciting together... and more to come soon!

Jaskaran Singh retweeted

Tanmay

@imnottanmay

Mar 27

Replying to @HarveenChadha

Pradhan Mantri Har Ghar Backprop Yojana (Prime Minister's "Backprop in Every Home" Scheme)

Jaskaran Singh

@jasksing

Mar 26

Was waiting for JEPA to come to audio. Clearly, working in latent space proves to be effective!

This tweet is unavailable

Jaskaran Singh

@jasksing

Mar 26

Really cool stuff!

This tweet is unavailable

Jaskaran Singh

@jasksing

Mar 19

IITM doesn't get appreciated enough

IIT Madras

@iitmadras

Mar 18

Shaping the future of AI—responsibly and at scale. Meet Prof. Krishna Pillutla, Assistant Professor at the Wadhwani School of Data Science and AI, IIT Madras, whose research advances privacy-preserving and robust machine learning in the era of generative AI. His work focuses on building trustworthy, reliable AI systems designed for real-world impact. At WSAI, innovation goes beyond the lab. With cutting-edge academic programmes, interdisciplinary research centres, and industry collaborations, the school is driving AI breakthroughs while preparing the next generation of data scientists and AI leaders—firmly placing India on the global AI map. 🎥 Watch as he takes you inside WSAI and shares why this is a defining moment for Data Science and AI at IIT Madras. @iitmadras @WSAI_IITM

Jaskaran Singh retweeted

Pedro Domingos

@pmddomingos

Mar 18

Geoff Hinton set out to figure out how the brain works and failed. Andrew Ng set out to build a complete robot and failed. Demis Hassabis set out to achieve AGI using deep RL and failed. Yet they all succeeded.

Jaskaran Singh retweeted

vittorio

@IterIntellectus

Mar 14

this is actually insane
> be tech guy in australia
> adopt cancer riddled rescue dog, months to live
> not_going_to_give_you_up.mp4
> pay $3,000 to sequence her tumor DNA
> feed it to ChatGPT and AlphaFold
> zero background in biology
> identify mutated proteins, match them to drug targets
> design a custom mRNA cancer vaccine from scratch
> genomics professor is “gobsmacked” that some puppy lover did this on his own
> need ethics approval to administer it
> red tape takes longer than designing the vaccine
> 3 months, finally approved
> drive 10 hours to get rosie her first injection
> tumor halves
> coat gets glossy again
> dog is alive and happy
> professor: “if we can do this for a dog, why aren’t we rolling this out to humans?”
one man with a chatbot, and $3,000 just outperformed the entire pharmaceutical discovery pipeline. we are going to cure so many diseases. I don't think people realize how good things are going to get

Séb Krier

@sebkrier

Mar 14

This is wild. theaustralian.com.au/busines…

Jaskaran Singh

@jasksing

Mar 10

LeGOAT

Yann LeCun

@ylecun

Mar 10

Unveiling our new startup Advanced Machine Intelligence (AMI Labs). We just completed our seed round: $1.03B / 890M€, one of the largest seeds ever, probably the largest for a European company. We're hiring! [the background image is the Veil Nebula - a picture I took from my backyard, most appropriate for an unveiling] More details here: techcrunch.com/2026/03/09/ya…

Jaskaran Singh retweeted

Rahul

@selfawareatom

Mar 10

Now that our 15 member llm team is infamous, time to expand for next time! If you have done one or more of the following, then please reach out.
- pretrained a model of any size, from scratch
- posttrained any base model, end to end (data curation, sft, rl)
- are a pytorch wizard
- are a cuda kernel master
- you have any other relevant skills and work to back it up
firstname<at>sarvam<dot>ai
