simrat hanspal

simrat hanspal

16 Photos and videos

Tweets

Pinned Tweet

simrat hanspal @simsimsandy

31 Jul 2024

My recent blog with @hasgeek - “Decoding Llama3” is out. It’s a deep dive into the Llama3 model code released in April this year. This is a fun blog with a code-first approach. hasgeek.com/simrathanspal/th…

Decoding Llama3: An explainer for tinkerers

A not-so-quick 7-part guide to using the Llama3 open source AI model

hasgeek.com

364

Vizuara

simrat hanspal retweeted

Vizuara

@VizuaraAI

Jun 11

1. Diffusion Language Models Language models do not always have to generate text one token at a time. Diffusion language models start with a masked sentence, predict many words in parallel, and refine them over multiple rounds. Models like LLaDA and Mercury are exploring a new path for text generation. courses.vizuara.ai

1:43

423

simrat hanspal

simrat hanspal @simsimsandy

11 Nov 2025

🫣 Softmax is unstable with very large and very small numbers. 🤓 Here is a simple illustration of how (x-max) makes softmax stable for use.

simrat hanspal

simrat hanspal @simsimsandy

7 Nov 2025

What does it mean to have dropout in Attention computation? Dropouts are used to prevent overfitting. In case of attention, we drop some attention scores, which means that if the model learnt to attend to some token, it now has to focus on other related tokens. #LLM #Attention

simrat hanspal

simrat hanspal @simsimsandy

31 Oct 2025

Simple illustration of what token to word embedding conversion looks like.

119

simrat hanspal

simrat hanspal @simsimsandy

31 Oct 2025

I mean token to token embedding :’D

simrat hanspal

simrat hanspal @simsimsandy

30 Oct 2025

The tokeniser lies about how many tokens it holds ;) What the tokeniser returns is the size of the base vocabulary that it learnt during training. Everything after that are special tokens. Special tokens are like metadata and help structure context.

simrat hanspal

simrat hanspal @simsimsandy

31 Oct 2025

So, you use len(tokenizer) Not sure why colab is not recognising len() :D

simrat hanspal

simrat hanspal @simsimsandy

28 Oct 2025

Trivial but worth a reminder use np.matmul for dot product instead of np. dot. np. dot is meant to be a flexible function that will adjust according to the input shape, instead of raising an error. Example np. dot(np.array([[1, 2], [3, 4]]), 10)

simrat hanspal

simrat hanspal @simsimsandy

15 Jul 2024

Budding entrepreneur 🥹 I purchased more than I planned.

Zainab Bawa @zainabbawa

15 Jul 2024

And also assisting madame in coordinating logistics and order shipping the day after #fifthel @jackerhack @_waabi_saabi_

744

simrat hanspal

simrat hanspal @simsimsandy

10 Jul 2024

Looking forward to it.

anwesha @Anwxsha

5 Jul 2024

Tech x society enthusiasts, show up for The Fifth Elephant Annual Conference on 13th July! I'll be hosting the session on Deploying AI in Key Sectors: Robust Risk Mitigation Strategies with @jnkhyati, @bargava, @simsimsandy & @fooobar @fifthel @hasgeek @anthillin @zainabbawa

156

Bengaluru Systems (fka Bengaluru Systems Meetup)

simrat hanspal retweeted

Bengaluru Systems (fka Bengaluru Systems Meetup)@BengaluruSys

6 Jul 2024

First, @simsimsandy walked us through GPU architecture, optimizations, CUDA, and the challenges of running large ML models on GPUs, with a special look at the attention mechanism, KV-Cache optimizations, and PagedAttention!

887

simrat hanspal

simrat hanspal @simsimsandy

6 Jul 2024

Thank you for the shoutout @TheOtherRaghav. It was a lovely event. x.com/theotherraghav/status/…

286

simrat hanspal

simrat hanspal @simsimsandy

28 Jun 2024

If you are into GenAI, @hasgeek is organizing a call today to build a community on #ResponsibleAI. Join for cross-learning. 🔗 Meeting Link: Register here to confirm your participation - lnkd.in/gFSt8bYc 🕰Time: 7 PM IST Friday, 28 June (tonight)

This link will take you to a page that’s not on LinkedIn

lnkd.in

227

Tune AI

simrat hanspal retweeted

Tune AI @Tunehq_ai

26 Jun 2024

🚀Join us in Chennai next week for our hands-on workshop: "Building AI Agents with RAG and Functions" 🤖✨ Limited seats available, so hurry and secure your spot! 🏃‍♂️💨 🔗 Register now: lu.ma/6lrnyo1b #AIWorkshop #ChennaiEvents #llm #genai

Workshop: Building AI Agents with RAG and Function Calling · Luma

Instructors: Vikrant Guleria and team from Tune AI Venue: ChargeBee at Coworks Perungudi Agenda: 1. Introduction to LLM APIs and Function Calling 30 mins,…

luma.com

1,762

simrat hanspal

simrat hanspal @simsimsandy

22 Jun 2024

Thank you for the call out @zainabbawa :) Best wishes to all the speakers at FifthEl 2024, looking forward to networking in person.

Zainab Bawa @zainabbawa

22 Jun 2024

Replying to @anscombes4tet @Aditi_ahj @fifthel

.@simsimsandy introduced Bhumika Makwana @GalaxEyeSpace who will speak about multimodal fusion as the new game changer. Reach out to Simrat for review and feedback on #nlp work, and for simplifying complex AI concepts. 3/5

240

simrat hanspal

simrat hanspal @simsimsandy

15 May 2024

Really fun video on the basics - dot prd and inner prd. Also, potentially a great resource on Quantum Mechanics #QuantumSense YT channel. youtube.com/watch?v=3N2vN76E… Inner product is an important concept for Rotary Positional Embedding, which is used by #LLM like #Llama3 (#Llama).

Ch 4: What is an inner product? | Maths of Quantum Mechanics

Hello!This is the fourth chapter in my series "Maths of Quantum M...

youtube.com

128

1LittleCoder💻

simrat hanspal retweeted

1LittleCoder💻

@1littlecoder

13 May 2024

Surprised how Twitter Influencers have got more insider informations and conclusions than anyone else! Also, because Apple uses OpenAI means it couldn't make it on its own 🤦🏽

9,276

simrat hanspal

simrat hanspal @simsimsandy

12 May 2024

Concise end-to-end #RAG tutorial by @jasonzhou1993 youtu.be/u5Vcrwpzoz8?si=kPwf… This YouTube channel is a Gem with a lot of use case-based tutorials. Check it out :) #LLM #GenAI #AI

"I want Llama3 to perform 10x with my private knowledge" - Local...

Advanced RAG 101 - build agentic RAG with llama3Get free HubSpot ...

youtube.com

146

simrat hanspal

simrat hanspal @simsimsandy

9 May 2024

I discovered fairscale today, PyTorch extension by Meta that helps you with distributed model training. Looks like #Llama3 used it. github.com/facebookresearch/… I am definitely going to be reading more about it. Please drop in your thoughts if you have used it or read something.

GitHub - facebookresearch/fairscale: PyTorch extensions for high performance and large scale...

PyTorch extensions for high performance and large scale training. - facebookresearch/fairscale

github.com

121