Rishi Khare

Tweets

Rishi Khare

@rishiskhare

May 9

100% agree on the Context Hub. Developers constantly tweak their approaches to manage context in their prompts with each new model and tool suite release. I would even say that the core problem we face is a "Config Hub" problem - different configurations of models, harnesses, and data augmentation techniques are making managing agents in prod much more brittle and difficult.

Harrison Chase

@hwchase17

May 9

x.com/i/article/205315683745…

5,659

Rishi Khare retweeted

Lakshya A Agrawal

@LakshyAAAgrawal

Apr 29

Excited to share that my ICLR 2026 Oral Talk for GEPA is available on YouTube. I go deeper into why GEPA works better than prior optimization techniques, along with touching on many aspects of GEPA! youtu.be/HbGah-uP1fI

Lakshya A Agrawal

@LakshyAAAgrawal

Apr 23

Thrilled to present GEPA as an Oral Talk and Poster at ICLR 2026 this Friday in Rio! 🇧🇷
Apr 24
Oral Session 3A (Agents), 10:30 AM BRT, Amphitheater
Poster Session 4, 3:15 PM, Pavilion 3
x.com/LakshyAAAgrawal/status…
Let's recap what's happened since we released GEPA last year 🧵

245

30,223

Rishi Khare

@rishiskhare

Apr 26

"we have reached agi..."

Rishi Khare retweeted

Lakshya A Agrawal

@LakshyAAAgrawal

15 Aug 2025

Very excited to share that GEPA is now live on @DSPyOSS as dspy.GEPA! This is an early code release. We’re looking forward to community feedback, especially about any practical challenges in switching optimizers.

323

45,582

Rishi Khare

@rishiskhare

5 Aug 2025

Can't wait to try this on my mac... @lmstudio gpt-oss-20b and 120b from @OpenAI just released <1hr ago

384

Rishi Khare retweeted

Marc Andreessen 🇺🇸

@pmarca

4 Aug 2025

S&P 10 vs S&P 490 🫣

492

1,013

10,150

1,679,108

Rishi Khare retweeted

DSPy

@DSPyOSS

4 Aug 2025

🤯 must watch

Connor Shorten

@CShorten30

4 Aug 2025

GEPA is a SUPER exciting advancement for @DSPyOSS and a new generation of optimization algorithms re-imagined with LLMs! 🧩🚀

Starting with the title of the paper, the authors find that Reflective Prompt Evolution can outperform Reinforcement Learning!! 🤯 Using LLMs to write and refine prompts (for another LLM to complete a task) is outperforming (!!) highly targeted gradient descent updates using cutting-edge RL algorithms! ⚖️

GEPA makes three key innovations on how exactly we use LLMs to propose prompts for LLMs: (1) Pareto Optimal Candidate Selection, (2) Reflective Prompt Mutation, and (3) System-Aware Merging for optimizing Compound AI Systems. 🧠🧠

The authors further present how GEPA can be used for training at test-time, one of the most exciting directions AI is evolving in! 🚀

Here is my review of the paper! I hope you find it useful! 🎙️

191

10,066

Rishi Khare retweeted

Justine Moore

@venturetwins

3 Aug 2025

Guy has his fingers positioned wrong on his keyboard and types gibberish into ChatGPT. It decodes what he meant to say by assuming his hand was shifted right and re-mapping his fingers 🤯 (from u/mimic751)

136

237

4,698

328,001

Rishi Khare

@rishiskhare

4 Aug 2025

Strongly recommend ML practitioners try @lmstudio. Tried it for the first time today and found it remarkable how far private local models have come since the last time I used @ollama with Llama 3.

3,225

Rishi Khare retweeted

alphaXiv

@askalphaxiv

30 Jul 2025

Iterative reflections for LLMs can outperform heavy RL? This paper shows that having the LLM reflect on its own trajectories, rewrite its own prompts, and evolve a diverse pool of candidates beats RL w/ GRPO so far on four reasoning tasks. 10% improvement with 35x fewer rollouts!

223

15,320

Rishi Khare retweeted

Dexerto

@Dexerto

2 Aug 2025

Elon Musk has uncovered Vine video archives that were thought to be lost media He's restoring access so users can watch their favorite Vines again

859

3,408

103,624

6,158,557

Rishi Khare retweeted

Y Combinator

@ycombinator

30 Jul 2025

Infrastructure for Multi-Agent Systems @koomen

Multi-agent systems are powerful but hard to build. Think agentic MapReduce with thousands of subagents running in parallel. We're looking for folks who've felt the pain of scaling these systems and want to make operating fleets of agents as easy as deploying a web service.

11,498

Rishi Khare retweeted

DSPy

@DSPyOSS

31 Jul 2025

Our latest optimizer GEPA writes beautiful prompts, even with a “mini” model. Stay tuned for a lot more these coming days.

Lakshya A Agrawal

@LakshyAAAgrawal

28 Jul 2025

Replying to @LakshyAAAgrawal

We implemented GEPA as a new @DSPyOSS optimizer (release soon!). This means that it works for even sophisticated agents or compound systems you've already implemented. GEPA outperforms the MIPROv2 optimizer by as much as 11% across 4 tasks for Qwen3 and GPT-4.1-mini. Of course: Weight updates remain necessary to teach the models completely new tasks and still excel at general-purpose (massively multi-task!) post-training! However, we show that for specialization to downstream systems, reflective prompt optimization can go really far with tiny data sizes and rollout budgets! (2/n)

391

30,521

Rishi Khare retweeted

Jack Morris

@jxmnop

31 Jul 2025

this seems really important: it is totally plausible that a model could get IMO gold without *any* reinforcement learning, given a perfectly-crafted prompt

we just don't know, and lack tools to efficiently search through prompt space. glad to see at least someone is trying

Lakshya A Agrawal

@LakshyAAAgrawal

28 Jul 2025

How does prompt optimization compare to RL algos like GRPO? GRPO needs 1000s of rollouts, but humans can learn from a few trials—by reflecting on what worked & what didn't. Meet GEPA: a reflective prompt optimizer that can outperform GRPO by up to 20% with 35x fewer rollouts!🧵

440

42,086

Rishi Khare retweeted

Omar Khattab

@lateinteraction

29 Jul 2025

Methods that may not have even existed when you wrote your DSPy program...

Policy gradient RL (GRPO)
vs
Bayesian search over grounded instruction/fewshot proposals (MIPRO)
vs
Reflective prompt learning (GEPA)

All optimizing identical DSPy programs! And better optimizers to come

Lakshya A Agrawal

@LakshyAAAgrawal

28 Jul 2025

179

16,251

Rishi Khare retweeted

Omar Khattab

@lateinteraction

28 Jul 2025

New paper: Reflective Prompt Evolution Can Outperform GRPO. It's becoming clear that learning via natural-language reflection (aka prompt optimization) will long be a central learning paradigm for building AI systems. Great work by @LakshyAAAgrawal and team on GEPA and SIMBA.

Lakshya A Agrawal

@LakshyAAAgrawal

28 Jul 2025

487

70,566

Rishi Khare retweeted

Rohan Paul

@rohanpaul_ai

29 Jul 2025

Brilliant paper. GEPA (Genetic-Pareto), a prompt optimizer that thoroughly incorporates natural language reflection to learn high-level rules from trial and error.

GEPA shows that evolving prompts with natural‑language feedback outperforms reinforcement learning by up to 19% and needs 35× fewer rollouts. It teaches a multi‑step AI system by rewriting its own prompts instead of tweaking model weights.

The standard approach, reinforcement learning (RL), particularly Group Relative Policy Optimization (GRPO), is effective but computationally expensive. A full rollout is expensive, which is 1 complete try where the language model tackles a task, gets judged, and sends that score back to the training loop.

Algorithms like GRPO need a huge batch of these episodes, often 10,000 to 100,000, because policy‑gradient math works on averages. A reliable gradient estimate appears only after you have sampled lots of different action paths across the problem space. Fewer samples would leave the update too noisy, so the optimiser keeps asking for more runs.

🧬 What GEPA actually is

GEPA flips the script by reading its own traces, writing natural‑language notes, and fixing prompts in place, so the learning signal is richer than a single score.

After each run it asks the model what went wrong, writes plain‑text notes, mutates the prompt, and keeps only candidates on a Pareto frontier so exploration stays broad.

The Pareto‑based rule chooses a candidate that looks good but has room to improve. This avoids wasting time on hopeless or already‑perfect options.

Quick tests on tiny batches spare rollouts, and winning changes migrate into the live prompt.

Across 4 tasks it beats GRPO by up to 19% while needing up to 35X fewer runs, also overtaking MIPROv2.

🧵 Read on 👇

52,772
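
The "keeps only candidates on a Pareto frontier" step described in the tweet above can be illustrated with a toy sketch. This is not GEPA's implementation or the `dspy.GEPA` API; the function names, the prompt names, and the per-task scores are all hypothetical, and it assumes each candidate prompt has already been scored on the same small set of tasks. A candidate stays on the frontier unless some other candidate is at least as good on every task and strictly better on one.

```python
from typing import Dict, List


def dominates(a: List[float], b: List[float]) -> bool:
    """True if score vector a is >= b on every task and > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(scores: Dict[str, List[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, vec in scores.items()
        if not any(
            dominates(other_vec, vec)
            for other_name, other_vec in scores.items()
            if other_name != name
        )
    ]


# Hypothetical per-task scores for four candidate prompts (made-up numbers).
scores = {
    "prompt_a": [0.9, 0.2, 0.5],
    "prompt_b": [0.4, 0.8, 0.5],
    "prompt_c": [0.3, 0.1, 0.4],
    "prompt_d": [0.9, 0.2, 0.6],
}

frontier = pareto_frontier(scores)
print(frontier)  # ['prompt_b', 'prompt_d']
```

Here `prompt_a` is dropped because `prompt_d` matches or beats it everywhere, and `prompt_c` is dropped because `prompt_b` beats it on every task. `prompt_b` and `prompt_d` both survive because each wins on a different task, which is the sense in which a frontier keeps exploration broad.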