Professor, MIT EECS and Chief Scientist, Cambridge Mobile Telematics

Joined May 2011
Cool new work from my student Joshua.
Recent agentic systems (Claude Code, Codex, RLM, etc.) push context out of the prompt and into the environment (e.g., as files). This helps them maintain long-term knowledge about their goals and functionality. 🚨 While this is a good idea, we show a surprising result: systems that use external environments like this perform much better when given a small, fixed-size, in-context, agent-managed cache that "peeks into" these environments. 🚀 Our paper, PEEK: a system for building and maintaining an orientation cache for LLM agents, introduces this idea. Compared with strong baselines, including RAG, Compaction Agents, and SOTA prompt-learning frameworks, PEEK dominates the cost–quality Pareto frontier: achieving 6.3–34.0% in quality, with fewer iterations and lower cost. Paper: arxiv.org/abs/2605.19932 GitHub: github.com/zhuohangu/peek More in the thread below! (1/N)
4
22
6,836
Sam Madden retweeted
📢 CfP: Conference on Innovative Data Systems Research #CIDR2027 (Amsterdam, The Netherlands). Papers due: August 4, 2026. @ViktorLeis, @samrmadden, and I are looking forward to your paper & demo submissions! cidrdb.org/cidr2027/

3
12
619
Interesting article about #SemBench, a new benchmark for semantic AI operators; our Palimpzest system does super well on it!
📣 We're spreading the word about #SemBench -- a brand new benchmark for semantic query processing over multimodal workloads including text, image, audio, and tabular data! 📜Paper: bit.ly/3WGsZf6 💻Website: sembench.org 💾Code: bit.ly/49DeuAg
1
673
21 Jun 2025
Start thinking about your CIDR submissions now!
17 Jun 2025
📢 CfP: Conference on Innovative Data Systems Research #CIDR2026 (Chaminade, USA). Papers due: August 5, 2025. Gustavo Alonso, @samrmadden, and I are looking forward to your paper & demo submissions! cidrdb.org/cidr2026/
1
2
1,338
Sam Madden retweeted
Henry Corrigan-Gibbs, a professor in @MITEECS, has been named a recipient of the 2024 Junior Bose Award.
5
11
1,291
19 Dec 2024
Last chance to register for CIDR 2025 in Amsterdam!
19 Dec 2024
The CIDR 2025 program is online: cidrdb.org/cidr2025/program.… Registration closes today -- Join us in Amsterdam: cidrdb.org/cidr2025/registra…
1
4
1,775
Sam Madden retweeted
27 Aug 2024
Vol:17 No:12 → Databases Unbound: Querying All of the World's Bytes with AI vldb.org/pvldb/vol17/p4546-m…
1
5
30
4,097
Sam Madden retweeted
It took three years to finish, but our follow-up to the 2006 "What Goes Around Comes Around" is finally out! Stonebraker and I examine the last 20 years in databases and discuss why relational databases and SQL will continue to remain on top. 📄PDF: db.cs.cmu.edu/papers/2024/wh…
24
335
1,262
172,952
Sam Madden retweeted
13 Jun 2024
Sam Madden: "The world is the database" @samrmadden @SIGMODConf #SIGMOD2024 🎉
5
51
10,394
30 May 2024
Excited to announce the release of our Palimpzest system and paper!
Replying to @RussoMatthew
If this (very high-level) summary of our work has piqued your interest -- go read our full paper! 📄Paper: arxiv.org/pdf/2405.14696 💻Code: github.com/mitdbg/palimpzest… We would love to hear any feedback, ideas for more use cases, and/or opportunities for collaboration.
9
1,554
Sam Madden retweeted
Submit to #NEDB2024 by 4/12 (11:59pm EST): cmt3.research.microsoft.com/… * Talks (2-page abstracts) on recent/new research or industry experience * Posters (1-page abstract) Stay tuned for more details on registration! bu-disc.github.io/nedbday/20… @samrmadden @vkalavri @__ssarkar

4
6
2,397
Sam Madden retweeted
Amazon Redshift Serverless is launching next gen AI-driven scaling and optimizations. With the great work of @tim_kraska and his team, Redshift pushes the capabilities of serverless data analytics! #amazon #redshift #serverless #aidrivenscaling
6
16
3,405
Sam Madden retweeted
It's official -- our company, @ponderdata, is all set to join Snowflake! So delighted that we're joining forces with the leading cloud data warehouse -- and bringing our data science capabilities to all their customers.
23 Oct 2023
Snowflake announced its intent to acquire @ponderdata to further enable Python data scientists in the Data Cloud. We look forward to welcoming the Ponder team and the Modin community to Snowflake. Learn more: okt.to/I6WXlc
26
15
220
47,051
21 Sep 2023
I make a guest appearance!
5
861
Sam Madden retweeted
31 Jul 2023
What are the most influential database papers of all time? Everyone has an opinion, but here's some data! I ran PageRank on the VLDB/SIGMOD/CIDR citation graph. Search for a paper, view a specific year, or look up an author's most "influential" papers. rmarcus.info/blog/2023/07/25…

7
23
116
17,087
Sam Madden retweeted
19 Jun 2023
Paper is available here: github.com/tli2/tli2.github.… DARQ is joint work with @badrishc @sebburckhardt @samrmadden and will soon be available in open source as part of the Microsoft FASTER project. Come hang out on Wednesday for the talk if you are interested!

1
6
1,676
27 Feb 2023
By the way, the registration deadline is this Wednesday! Hope to see you all there!
25 Feb 2023
Program is up and registration is now open for Northeast Database Day, March 10, 2023, at Northeastern University in Boston! northeastern-datalab.github.…
1
1
5
2,267
25 Feb 2023
Program is up and registration is now open for Northeast Database Day, March 10, 2023, at Northeastern University in Boston! northeastern-datalab.github.…

2
9
3,769