AI DevAdvocate for AI Dev Platform at Intuit. I blog about AI, Machine Learning and wine. Tweets are my own

Joined March 2008
Thank you @michelleefang ♥️
Replying to @michelleefang
Thursday 4/16
‣ Agent Builders Breakfast - Founders & Builders in SOMA, SF luma.com/xmv1m6qt @miradu
‣ AI Breakfast: Build your AI workspace in one morning luma.com/aibreakfast2 @rajoshighosh
‣ Platform Engineering & AI luma.com/intuitossmtvapr2026 @cbergman
‣ Voice AI Meetup: Medical Mode luma.com/609tv1po @ryanseams @theaievangelist
‣ Claws Out🦞 GMI ClawHub Demo Night luma.com/tklag2kv @nicoleegong @gmi_cloud @yuqih
‣ Voice AI builders night luma.com/voice_builders @modal @braintrust
‣ AI Meets HumanX Social w/ Rootly AI, MongoDB, Runpod, & More! luma.com/n22qf90w
‣ Slides Down: AI Founders Party by DigitalOcean & Zendesk Apr 16 luma.com/jzopt9nv @neffko @julianachyzhova @yfilipch
💓@AndrewYNg Note to self: look here before next CFP submission or helping others. Ask the model to summarize best advice per conference CFP rules and topic submitter wants to talk about...
24 Nov 2025
Releasing a new "Agentic Reviewer" for research papers. I started coding this as a weekend project, and @jyx_su made it much better.
I was inspired by a student who had a paper rejected 6 times over 3 years. Their feedback loop -- waiting ~6 months for feedback each time -- was painfully slow. We wanted to see if an agentic workflow can help researchers iterate faster.
When we trained the system on ICLR 2025 reviews and measured Spearman correlation (higher is better) on the test set:
- Correlation between two human reviewers: 0.41
- Correlation between AI and a human reviewer: 0.42
This suggests agentic reviewing is approaching human-level performance.
The agent grounds its feedback by searching arXiv, so it works best in fields like AI where research is freely published there.
It’s an experimental tool, but I hope it helps you with your research. Check it out here: paperreview.ai
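For readers unfamiliar with the metric quoted above, here is a minimal sketch of Spearman rank correlation in plain Python. The review scores are made-up illustration values, not data from ICLR or paperreview.ai, and the function names are hypothetical; this is not the Agentic Reviewer's evaluation code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation = Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores two reviewers gave the same five papers.
reviewer_a = [6, 5, 8, 3, 6]
reviewer_b = [5, 6, 8, 3, 7]
print(f"Spearman correlation: {spearman(reviewer_a, reviewer_b):.2f}")
```

A value near 1 means the two reviewers rank papers in nearly the same order, which is why similar human-human and AI-human values are read as comparable agreement.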
Don't🍷about #OOM running out of memory! @huggingface is making it easier to run huge #TransformerandDiffuser models on consumer GPUs w quantization, tensor parallelism, offloading. Hear from @stevhliu how to fit these models on your setup. lu.ma/taf3lmvj #HuggingFace
Christy Bergman retweeted
27 Feb 2025
GPT-4.5 is ready! good news: it is the first model that feels like talking to a thoughtful person to me. i have had several moments where i've sat back in my chair and been astonished at getting actually good advice from an AI. bad news: it is a giant, expensive model. we really wanted to launch it to plus and pro at the same time, but we've been growing a lot and are out of GPUs. we will add tens of thousands of GPUs next week and roll it out to the plus tier then. (hundreds of thousands coming soon, and i'm pretty sure y'all will use every one we can rack up.) this isn't how we want to operate, but it's hard to perfectly predict growth surges that lead to GPU shortages. a heads up: this isn’t a reasoning model and won’t crush benchmarks. it’s a different kind of intelligence and there’s a magic to it i haven’t felt before. really excited for people to try it!
Christy Bergman retweeted
28 Feb 2025
🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
⚡ 3.66 TiB/min throughput on GraySort benchmark in a 25-node cluster
⚡ 40 GiB/s peak throughput per client node for KVCache lookup
🧬 Disaggregated architecture with strong consistency semantics
✅ Training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search & KVCache lookups for inference in V3/R1
📥 3FS → github.com/deepseek-ai/3FS
⛲ Smallpond - data processing framework on 3FS → github.com/deepseek-ai/small…
I just published a blog in #DataScienceCollective, the new free open version of @TDataScience. Here, I look at 9 different discords and prompt #LLMs to do #Clustering on user messages. linkedin.com/posts/christybe…

Thanks @pacoid ! I'd better get started preparing my talk for that! #SonomaAI #FoodWineAI
🤔hmm, but this paper shows w8a8-fp (symmetric weight and dynamic per-token activation quantization in fp8) is "essentially lossless" in accuracy. arxiv.org/pdf/2411.02355

Interesting! The most common inference quantization int8/fp8 is not necessarily the best. bf16 #quantization is a way better accuracy/latency tradeoff.
Seems devil is in the details for accuracy/latency tradeoff decisions. #w8a8fp: 1. Weights quantized using usual symmetric fp8 method. 2. Activations quantized without pre-calibration i.e. symmetric quantization parameters calculated on-the-fly during model inference.
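The "dynamic per-token" idea in point 2 can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: plain Python has no fp8 type, so an int8-style integer grid (qmax=127) is swapped in as a stand-in, and the function names are hypothetical. The point is only that each token (row) gets its own symmetric scale computed at inference time, with no calibration pass.

```python
def quantize_per_token(activations, qmax=127):
    """Symmetric quantization with one scale per token (row), computed on the fly.

    Assumption: an integer grid in [-qmax, qmax] stands in for fp8.
    """
    quantized, scales = [], []
    for row in activations:
        max_abs = max(abs(v) for v in row)
        # Symmetric: zero maps to zero, the range is [-max_abs, +max_abs].
        scale = max_abs / qmax if max_abs > 0 else 1.0
        q = [max(-qmax, min(qmax, round(v / scale))) for v in row]
        quantized.append(q)
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Map quantized values back to floats using each row's scale."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


# Two hypothetical tokens with very different magnitudes.
acts = [[0.5, -1.0, 0.25], [10.0, 0.0, -5.0]]
q, scales = quantize_per_token(acts)
restored = dequantize(q, scales)
```

Because the scale is recomputed per token, a token with small activations is not forced to share a coarse grid with a token that has large ones.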
Interesting! The most common inference quantization int8/fp8 is not necessarily the best. bf16 #quantization is a way better accuracy/latency tradeoff.
aidan bench update: i ran llama 3.1 405b at bf16 (shoutout to @hyperbolic_labs) and we got a *way* better score. 405b fp8 is around gpt-4o-mini-level 405b bf16 beats claude-3.5-sonnet give me bf16 or give me death
Nice to meet and chat w/you too! @adamse @felipehoffa It was fun to get some hands-on time and see what's new with @awscloud Bedrock.
16 Oct 2024
Replying to @felipehoffa
@felipehoffa @cbergman so great to see you at the @awscloud GenAI Loft today!
I just tried this hack. Thanks, I really needed that! 😂
29 Sep 2024
Self-care life hack: if you feel a bit down/tired, paste the url of your website/linkedin/bio in Google's NotebookLM to get 8 min of realistically sounding deep congratulations for your life and achievements from a duo of podcast experts 😂
Christy Bergman retweeted
21 Sep 2024
CUDA MODE hackathon today! Here's @karpathy on the 🏖️ origin story of llm.c, and what it hints at for the fast, simple, llm-compiled future of custom software.
Interesting takedown on how to do LoRA properly, quickly, with less memory, on all layers. See @danielhanchen's tweet and blog unsloth.ai/blog/contpretrain… > For continued pretraining, I advise people to train on all layers (inc gate) lm_head, embed_tokens, use RS LoRA, use rank>=256
My take on "LoRA Learns Less and Forgets Less"
1) "MLP/All" did not include gate_proj. QKVO, up & down trained but not gate (pg 3 footnote)
2) Why does LoRA perform well on math and not code? lm_head & embed_tokens wasn't trained, so domain shifts not modelled. Also reason why "LoRA Forgets Less". Use "modules_to_save" in HF PEFT or "lm_head", "embed_tokens" in @UnslothAI
3) Code rank=256 used α=32 (too small!) (pg 18), but Maths α=2*r=512. RS LoRA paper showed α/sqrt(r) needed for larger ranks. & common practice is 2*r. So also why Code did worse than Maths
4) Extrapolating Maths vs fft looks good. Small datasets LoRA>fft, but I theorize that's because of reason 2
5) LoftQ & PiSSA paper init LoRA from SVD(W) => papers show comparable perf of LoRA
6) LoRA paper shows B matrix needs larger lr. DoRA (mentioned in paper) learns these scalars.
TLDR: Code worse since α=32 is too small. No embed_tokens, lm_head (or layernorms), not even gate_proj? Better init & lr scaling can help
For continued pretraining, I advise people to train on all layers (inc gate) lm_head, embed_tokens, use RS LoRA, use rank>=256
LoRA paper: arxiv.org/abs/2405.09673
RS LoRA paper: arxiv.org/pdf/2312.03732
LoRA paper: arxiv.org/pdf/2402.12354
PiSSA paper: arxiv.org/pdf/2404.02948
DoRA paper: arxiv.org/pdf/2402.09353
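The α arithmetic in point 3 can be checked with a quick sketch. It assumes the usual conventions: standard LoRA multiplies the low-rank update by α/r, while rank-stabilized (RS) LoRA uses α/sqrt(r). The helper name `lora_scale` is hypothetical and not part of any library.

```python
import math


def lora_scale(alpha, rank, rank_stabilized=False):
    """Multiplier applied to the low-rank update B @ A.

    Standard LoRA: alpha / rank. Rank-stabilized LoRA: alpha / sqrt(rank).
    """
    if rank_stabilized:
        return alpha / math.sqrt(rank)
    return alpha / rank


rank = 256
for label, alpha in [("code run, alpha=32", 32), ("math run, alpha=2*r", 2 * rank)]:
    standard = lora_scale(alpha, rank)
    stabilized = lora_scale(alpha, rank, rank_stabilized=True)
    print(f"{label}: standard={standard}, rank-stabilized={stabilized}")
```

With rank 256, α=32 gives a standard multiplier of 0.125 versus 2.0 for α=512, a 16x gap in how strongly the adapter update is applied, which is the sense in which the tweet calls α=32 too small.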
Christy Bergman retweeted
🌟Join our expert panel at The AI Conference 2024 to explore advanced RAG (Retrieval-Augmented Generation) techniques. Learn how integrating information retrieval with generative models is revolutionizing AI, making it more contextually rich and useful in real-world applications. Don’t miss out—register now to be part of the future of AI! aiconference.com #developers #TAIC2024 #data #programmers #software #innovators #techindustry #engineer #scientists #theaiconference
Christy Bergman retweeted
30 Jul 2024
Monday Meetup is right around the corner! 🗣 Join us in SF on August 5 for exciting talks: 🔢 Using Ray Data for Multimodal Embedding Inference with @cbergman 📐 A Different Angle: Retrieval Optimized Embedding Models @marqo_ai 🛠 Building the Future of Neural Search: How to Train State-of-the-Art Embeddings with @mixedbreadai 🔗 Save your spot: lu.ma/3q2brqp8 #Meetup #AI #RAG #SFevents