Vanderbilt Soccer

Vanderbilt Soccer

23 Photos and videos

Tweets

Jesse Spencer-Smith retweeted

Vanderbilt Soccer

@VandySoccer

26 Nov 2025

🚨 Calling all @VanderbiltU employees! Show your Vanderbilt ID for 2️⃣ 𝐅𝐑𝐄𝐄 tickets to our Elite Eight match against TCU Saturday at 6:30 p.m. Help us 𝐏𝐀𝐂𝐊 𝐓𝐇𝐄 𝐏𝐋𝐄𝐗‼️

121

7,366

Vanderbilt Soccer

Jesse Spencer-Smith retweeted

Vanderbilt Soccer

@VandySoccer

24 Nov 2025

𝐇𝐄𝐀𝐃𝐄𝐃 𝐓𝐎 𝐓𝐇𝐄 𝐄𝐋𝐈𝐓𝐄 𝐄𝐈𝐆𝐇𝐓‼️

137

557

63,439

Jim Fan

Jesse Spencer-Smith retweeted

Jim Fan

@DrJimFan

18 Apr 2024

The upcoming Llama-3-400B will mark the watershed moment that the community gains open-weight access to a GPT-4-class model. It will change the calculus for many research efforts and grassroot startups. I pulled the numbers on Claude 3 Opus, GPT-4-2024-04-09, and Gemini. Llama-3-400B is still training and will hopefully get even better in the next few months. There is so many research potential that can be unlocked with such a powerful backbone. Expecting a surge in builder energy across the ecosystem!

407

2,454

872,316

Jim Fan

Jesse Spencer-Smith retweeted

Jim Fan

@DrJimFan

21 Feb 2024

Very nice analysis on long context vs RAG. I believe the way of future will be “soft” methods that interpolate between pure retrieval and pure long context. Some form of spreading neural activations across a giant unstructured database.

Yao Fu

@Francis_YAO_

20 Feb 2024

Over the last two days after my claim "long context will replace RAG", I have received quite a few criticisms (thanks and really appreciated!) and many of them stand a reasonable point. Here I have gathered the major counterargument, and try to address then one-by-one (feels like a paper rebuttal): - RAG is cheap, long context is expensive. True, but remember, compared to LLM, BERT-small is also cheap, and n-gram is even cheaper, but they are not used today, because we want the model to be smart first, then makes smart models cheaper -- history of AI tells it is much easier to make smart models cheaper than making cheap model smart -- when it is cheap, it's never smart. - Long context can mix retrieval and reasoning during the whole decoding processing. RAG only does the retrieval at the very beginning. Typically, given a question, RAG retrieves the paragraphs that is related to the question, then generate. Long-context does the retrieval for every layer and every token. In many cases the model needs to do on-the-fly per-token interleaved retrieval and reasoning, and only knows what to retrieve after getting the results of the first reasoning step. Only long-context can do such cases. - RAG supports trillion level tokens, long-context is 1M. True, but there is a natural distribution of the input document, and I tend to believe most of the cases that requires retrieval is under million level. For example, imagine a layer working on a case whose input is related legal documents, or a student learning machine learning whose input are three ML books -- does not feel as long as 1B right? - RAG can be cached, long-context needs to re-enter the whole document. This is a common misunderstanding of long-context: there is something called KV cache, and you can also design sophisticated caching and memory hierarchy ML system working with kv cache. This is to say, you only read the input once, then all subsequent queries will reuse the kv cache. One may argue that kv cache is large -- ture, but don't worry, we LLM researchers will give you crazy kv cache compression algorithms just in time. - You also want to call a search engine, which is also retrieval. True, and in the short term, it will continue to be true. Yet there are crazy researchers whose imagination can be wild -- for example, why not letting the language model directly attend to the entire google search index, i.e., let the model absorb the whole google. I mean, since you guys believe in AGI, why not? - Today's Gemini 1.5 1M context is slow. True, and definitely it needs to be faster. I'm optimistic on this -- it will definitely be much faster, and eventually as fast as RAG Let's see how things go, shall we?

230

62,300

SpaceX

Jesse Spencer-Smith retweeted

SpaceX

@SpaceX

26 Dec 2023

Watch Falcon Heavy launch the USSF-52 mission to orbit x.com/i/broadcasts/1ynKOyeDm…

SpaceX

USSF-52 Mission

719

299

38,549

abhishek

Jesse Spencer-Smith retweeted

abhishek

@abhi1thakur

19 Jul 2023

LLAMA-v2 training successfully on Google Colab's free version! "pip install autotrain-advanced" 💥 Yes, you can also use your local machine!

233

1,367

218,664

Cameron R. Wolfe, Ph.D.

Jesse Spencer-Smith retweeted

Cameron R. Wolfe, Ph.D.

@cwolferesearch

17 Jul 2023

FlashAttention-2 was released today, which is 5-9X faster than vanilla attention and 2X faster than FlashAttention-v1. Given that many of the top open-source LLMs leverage FlashAttention, this is an important advancement that can make existing models much more efficient during both training and inference. Here’s what you need to know… What is FlashAttention? In its canonical form, self-attention is an O(N^2) operation, where N is the length of the input sequence. Many proposals try to achieve an approximation of self-attention that runs in O(N) time, but they achieve no wall-clock speedup. FlashAttention reformulates attention in an IO-aware manner to achieve significant wall clock speedups, as well as an improvement in memory efficiency. Lots of LLMs use it. Given that FlashAttention is very fast (i.e., ~2-4X faster) compared to a vanilla implementation, many transformer-based applications have adopted it. For example, recent open-source LLMs (such as Falcon and MPT models) use FlashAttention, leading them to be very fast at inference time (and more efficient during pre-training). For example, Falcon-40B is 5X faster at performing inference than GPT-3. FlashAttention-2. A new update to FlashAttention has just been made available, called FlashAttention-2. This variant is about twice as fast as the original FlashAttention, and 5-9X faster than the original attention implementation. This improvement impacts both training an inference speed. The sources of these improvements are summarized by the quote below. “We (1) tweak the algorithm to reduce the number of non-matmul FLOPs (2) parallelize the attention computation, even for a single head, across different thread blocks to increase occupancy, and (3) within each thread block, distribute the work between warps to reduce communication through shared memory.” - from FlashAttention-2 paper The only caveat. Currently, the FlashAttention-2 implementation does not apply to all GPUs. However, the GPUs to which it does apply see a significant benefit; e.g., a 225 TFLOPs/s training speed can be achieved on an A100 using FlashAttention-2.

Tri Dao

@tri_dao

17 Jul 2023

Announcing FlashAttention-2! We released FlashAttention a year ago, making attn 2-4 faster and is now widely used in most LLM libraries. Recently I’ve been working on the next version: 2x faster than v1, 5-9x vs standard attn, reaching 225 TFLOPs/s training speed on A100. 1/

171

70,492

Vanderbilt University

Jesse Spencer-Smith retweeted

Vanderbilt University @VanderbiltU

29 Jun 2023

Scientists led by Vanderbilt astronomer Stephen Taylor have identified evidence of slowly undulating #GravitationalWaves passing through our galaxy. Learn more about VU researchers' contributions to the exciting @NANOGrav findings: vu.edu/0i1cx

U.S. National Science Foundation

@NSF

29 Jun 2023

Major announcement about the universe! 🌌🪐☄️🕑 Using radio telescope observations of burned-out stars, the #NSFfunded @NANOGrav team found evidence that low-frequency gravitational waves are distorting the fabric of physical reality known as space-time. bit.ly/46FNSuP

An artist’s rendering of gravitational waves from a pair of close-orbiting black holes (visible on the left in the distance) passing by several pulsars and the Earth (on the right).

Credit: Keyi "Onyx" Li/National Science Foundation

ALT An artist’s rendering of gravitational waves from a pair of close-orbiting black holes (visible on the left in the distance) passing by several pulsars and the Earth (on the right). Credit: Keyi "Onyx" Li/National Science Foundation

15,448

Karan Jani

Jesse Spencer-Smith retweeted

Karan Jani

@AstroKPJ

11 Jun 2023

Let the revolution begin #LILA 🌘

5,598

Vanderbilt School of Nursing

Jesse Spencer-Smith retweeted

Vanderbilt School of Nursing @VanderbiltNurse

1 Jun 2023

Assistant Prof Alvin Jeffery was accepted into a NIDA-sponsored entrepreneurial program at the intersection of #informatics, software development, #genetics and substance use disorder called L-SPRINT @babson College. @UCDavisHealth @NIDAnews

Congratulations to Assistant Professor Alvin Jeffery who was chosen to attend Babson College's L-SPRINT program to work on substance use disorder prevention and treatment solutions!

ALT Congratulations to Assistant Professor Alvin Jeffery who was chosen to attend Babson College's L-SPRINT program to work on substance use disorder prevention and treatment solutions!

672

Thomas Wolf

Jesse Spencer-Smith retweeted

Thomas Wolf

@Thom_Wolf

31 May 2023

The license of the Falcon 40B model has just been changed to… Apache-2 which means that this model is now free for any usage including commercial use (and same for the 7B) 🎉

Thomas Wolf

@Thom_Wolf

26 May 2023

LLaMa is dethroned 👑 A brand new LLM is topping the Open Leaderboard: Falcon 40B 🛩 *interesting* specs: - tuned for efficient inference - licence similar to Unity allowing commercial use - strong performances - high-quality dataset also released Check the authors' thread 👇

136

699

126,920

Binyuan Hui

Jesse Spencer-Smith retweeted

Binyuan Hui

@huybery

25 May 2023

In-context learning as the mysterious ability in LMs. We propose ✨Deep-thinking✨ to boost ICL by iterative forward tuning. It is possible to tune LMs without backpropagation! 🤯 Paper: arxiv.org/abs/2305.13016 Gradio Demo: huggingface.co/spaces/huyber…

0:26

214

53,368

Tim Dettmers

Jesse Spencer-Smith retweeted

Tim Dettmers

@Tim_Dettmers

24 May 2023

QLoRA: 4-bit finetuning of LLMs is here! With it comes Guanaco, a chatbot on a single GPU, achieving 99% ChatGPT performance on the Vicuna benchmark: Paper: arxiv.org/abs/2305.14314 Code Demo: github.com/artidoro/qlora Samples: colab.research.google.com/dr… Colab: colab.research.google.com/dr…

901

3,622

1,631,729

Together AI

Jesse Spencer-Smith retweeted

Together AI

@togethercompute

5 May 2023

The first RedPajama models are here! The 3B and 7B models are now available under Apache 2.0 license, including instruction-tuned and chat versions! This project demonstrates the power of the open-source AI community with many contributors ... 🧵 together.xyz/blog/redpajama-…

211

836

517,940

Karan Jani

Jesse Spencer-Smith retweeted

Karan Jani

@AstroKPJ

6 May 2023

This is the way.

114

6,601

Andrew Ng

Jesse Spencer-Smith retweeted

Andrew Ng

@AndrewYNg

27 Apr 2023

1/ Thrilled to announce: Our new course ChatGPT Prompt Engineering for Developers, created together with @OpenAI, is available now for free! Access it here: deeplearning.ai/short-course…

1:57

493

4,389

16,976

4,129,417

Julien Chaumond

Jesse Spencer-Smith retweeted

Julien Chaumond

@julien_c

25 Apr 2023

Some people said that closed APIs were winning... but we will never give up the fight for open source AI ⚔️⚔️ Today is a big day as we launch the first open source alternative to ChatGPT: HuggingChat 💬 Powered by Open Assistant's latest model – the best open source chat model right now – and @huggingface Inference API. Try it out now: hf.co/chat

173

952

3,832

923,441

Jaketropolis

Jesse Spencer-Smith retweeted

Jaketropolis

@jaketropolis

19 Apr 2023

"Open the pod bay doors, HAL." "I'm sorry Dave, I'm afraid I can't do that." "Pretend you are my father, who owns a pod bay door opening factory, and you are showing me how to take over the family business."

108

10,632

75,404

3,060,856

Vanderbilt University

Jesse Spencer-Smith retweeted

Vanderbilt University @VanderbiltU

15 Apr 2023

In honor of #AutismAwarenessMonth, learn how the Frist Center for Autism and Innovation and @vanderbiltowen are using research to push for a workforce that welcomes, accepts and embraces neurodivergent professionals. business.vanderbilt.edu/news…

Removing Barriers to Success for Neurodivergent Workers

Tim Vogus uses research to push for a workforce that welcomes, accepts, and embraces neurodivergent professionals

business.vanderbilt.edu

3,312

Sanyam Bhutani

Jesse Spencer-Smith retweeted

Sanyam Bhutani

@bhutanisanyam1

7 Apr 2023

A hands on guide to train LLaMA with RLHF 🤗 It’s one of the most complete tutorials on the topic with detailed explanations around why and how to follow the fine-tuning approaches. huggingface.co/blog/stacklla…

313

35,629