Assistant Prof. @ColumbiaCompSci working on Systems Agents. Former Google, @Stanford. Athenian.

Joined September 2010
Kostis Kaffes retweeted
May 23
The models causing failures cannot be the systems we rely on to prevent them.
A Cursor-based coding agent deleted a production database and its backups. Amazon’s Kiro was reportedly involved in a 13-hour outage after deleting and recreating an environment. A zero-click vulnerability in ChatGPT Deep Research enabled Gmail data exfiltration via a malicious email.
Better prompts and guardrails will not prevent disasters because, once agents can read and write state, failures become data loss, leakage, outages, and irreversible actions. A probabilistic model cannot be the only safety boundary around its own actions. No single component or checkpoint can reliably contain these risks because unsafe side effects emerge across interconnected systems, data flows, and external tools in ways that are difficult to fully anticipate.
Safety therefore has to be enforced by the surrounding data environment itself, which must constrain, isolate, and manage how agents interact with state. Agents need environments designed for agent workloads.
At Columbia Data Agents Process Lab (DAPLab), we call these Agentic Data Environments: systems that actively prepare, expose, constrain, and version the state agents operate over.
An Agentic Data Environment must:
🚀 Amplify agent capabilities: deliver relevant information in forms agents can use to complete real tasks.
🛡️ Bound downside risks: enforce constraints on what agents can read, combine, modify, and release, while supporting safe exploration over branched state.
Building this requires new systems infrastructure: branch-native environments, deterministic data-flow controls, agent-oriented information management, and retrieval that works over real data lakes.
Over the next couple of weeks, we are publishing a series that lays out this agenda and the open research problems behind it.
First overview post here: daplab.cs.columbia.edu/gener…
#agentautomation #safety #agenticdataenironments
1
6
317
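One way to picture the "constrain" and "branched state" ideas in the post above is the rough Python sketch below. It is not DAPLab's design: the `Environment`, `Branch`, and `PolicyError` names, the protected-key rule, and the `approve` callback are all hypothetical, chosen only to show an agent working on an isolated branch while deterministic checks, rather than the model itself, decide what reaches the main state.

```python
from typing import Any, Callable, Dict, Iterable, Set


class PolicyError(Exception):
    """Raised when an agent action violates an environment constraint."""


class Branch:
    """Isolated overlay on the main state; the base is never mutated here."""

    def __init__(self, base: Dict[str, Any], protected: frozenset):
        self._base = base
        self._protected = protected
        self.writes: Dict[str, Any] = {}
        self.deleted: Set[str] = set()

    def _check(self, key: str) -> None:
        # Deterministic rule, enforced outside the model (hypothetical policy).
        if key in self._protected:
            raise PolicyError(f"agents may not modify {key!r}")

    def read(self, key: str) -> Any:
        if key in self.deleted:
            raise KeyError(key)
        if key in self.writes:
            return self.writes[key]
        return self._base[key]

    def write(self, key: str, value: Any) -> None:
        self._check(key)
        self.deleted.discard(key)
        self.writes[key] = value

    def delete(self, key: str) -> None:
        self._check(key)
        self.writes.pop(key, None)
        self.deleted.add(key)


class Environment:
    """Holds the main state and decides which branch changes are released."""

    def __init__(self, state: Dict[str, Any], protected: Iterable[str] = ()):
        self._state = dict(state)
        self._protected = frozenset(protected)

    def branch(self) -> Branch:
        return Branch(self._state, self._protected)

    def snapshot(self) -> Dict[str, Any]:
        return dict(self._state)

    def merge(self, branch: Branch,
              approve: Callable[[Dict[str, Any], Set[str]], bool]) -> bool:
        # Changes reach the main state only if the reviewer callback agrees.
        if not approve(dict(branch.writes), set(branch.deleted)):
            return False
        for key in branch.deleted:
            self._state.pop(key, None)
        self._state.update(branch.writes)
        return True
```

In this sketch an agent that deletes a table on its branch leaves the main state untouched, and a merge policy such as `lambda writes, deleted: not deleted` rejects any branch that contains deletions.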
Kostis Kaffes retweeted
Apr 24
📢 Upcoming AI Entrepreneurship Series Talk
Title: Empowering Future Gen-AI Enterprise and Research Through AI-Native Cloud: Together AI's Perspective
Speaker: Leon Song
Location: Davis Auditorium
Date/Time: Thursday, April 30, 2026, 11:30 AM ET
Bio: Leon Song is the Vice President of Research at Together AI, where he leads the R&D organization to develop large-scale, industry-leading inference system solutions. Prior to Together AI, he was a Senior Principal Research Manager at Microsoft, working on DeepSpeed and Brainwave, and served as Chief Scientist for the DeepSpeed4Science initiative. Earlier in his career, he was a tenured professor of computer science and also worked for the U.S. Department of Energy. He is an ACM Distinguished Speaker.
Abstract: We are living in the era of GenAI, which has transformed not only the computing industry but also our daily lives. In this talk, Leon Song will discuss the development of the Together AI Native Cloud, designed to empower next-generation enterprise-scale GenAI through customized, end-to-end solutions across the entire AI lifecycle, powered by open-source models. He will highlight innovations in inference system research and their impact on real-world applications, and explore future trends in GenAI, including large-scale agentic systems, multi-silicon adoption, and the evolution of AI-native cloud infrastructure.
1
2
246
Kostis Kaffes retweeted
Calling all academic researchers working on AI Agents!! 👇 We’ve extended the deadline to April 6 for North East AI Agents Day — a one-day workshop in NYC bringing together _academic_ researchers across ML, Systems, and HCI pushing the frontier of agentic AI. If you're building, studying, or questioning agents — this is your crowd. 📍 May 8 📍 NYC (Jane Street offices) 📝 Submit a short extended abstract by April 6: ne-agents-day.github.io/#sub… More details: ne-agents-day.github.io/ --- Come meet the people shaping the future of agents. #AIAgents #AgenticAI #MachineLearning #SystemsResearch #HCI #AIResearch #AcademicTwitter #NYCTech

3
6
1,167
Kostis Kaffes retweeted
👀👀 New open-source package alert!
If you're building AI agents, you're probably overpaying by 10–100x. Not because of bad prompts. Because you're running the same expensive model on every step, and nobody's questioned it.
We built AgentOpt: an open-source tool that finds the best model combination for your agent, cutting costs while maintaining performance. Works with almost any agent framework (LangChain, CrewAI, OpenAI, LlamaIndex, and more). Zero changes to your agent code.
⁉️ What we found using this package:
1⃣ Cost reduction: Matching frontier-model accuracy costs 20–118x less with the right model combination.
2⃣ Counter-intuitive combos: The best combo doesn’t contain the “best” individual models. On HotpotQA, the weakest planner (Mistral 3 8B) paired with Claude Opus outperforms Opus-as-both by delegating correctly to search tools.
3⃣ Search efficiency: Our bandit-based Arm Elimination algorithm finds near-optimal combos using 40–60% less evaluation budget than brute force.
Docs blog: agentoptimizer.github.io/age…
GitHub: github.com/AgentOptimizer/ag…
Such a pleasure to work with @HuaWenyue31539, @kkaffes, Qian Xie, Sripad Karne, Armaan Agrawal, Nikos Pagonas to make this tool possible! This project is supported by @DAP__Lab at Columbia University.
#LLM #AIAgents #OpenSource
2
3
7
753
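To make the "bandit-based Arm Elimination" idea in the AgentOpt post above more concrete, here is a minimal Python sketch of textbook successive elimination over model combinations. It is not AgentOpt's code or API: the function name, the Hoeffding-style confidence radius, and the accuracy table are assumptions for illustration, and the sketch ranks combinations by accuracy only, ignoring cost.

```python
import math
import random
from itertools import product


def successive_elimination(arms, pull, budget, delta=0.05):
    """Return (best_arm, evaluations_used) using successive elimination.

    arms:   list of hashable candidates (here, model combinations)
    pull:   callable(arm) -> reward in [0, 1] for one evaluation
    budget: maximum total number of evaluations
    delta:  allowed failure probability for the confidence intervals
    """
    active = list(arms)
    totals = {arm: 0.0 for arm in arms}
    rounds = 0
    used = 0
    # Evaluate every surviving arm once per round while the budget allows it.
    while len(active) > 1 and used + len(active) <= budget:
        for arm in active:
            totals[arm] += pull(arm)
        used += len(active)
        rounds += 1
        # Hoeffding-style confidence radius shared by all surviving arms.
        radius = math.sqrt(
            math.log(4 * len(arms) * rounds * rounds / delta) / (2 * rounds)
        )
        best_mean = max(totals[arm] for arm in active) / rounds
        # Drop arms whose upper bound falls below the leader's lower bound.
        active = [
            arm for arm in active
            if totals[arm] / rounds + radius >= best_mean - radius
        ]
    # Survivors have equal pull counts, so comparing totals compares means.
    return max(active, key=lambda arm: totals[arm]), used


# Hypothetical accuracies for (planner, solver) pairs; not measured data.
HYPOTHETICAL_ACCURACY = {
    ("small", "small"): 0.40,
    ("small", "large"): 0.80,
    ("large", "small"): 0.45,
    ("large", "large"): 0.70,
}
rng = random.Random(0)


def evaluate_once(combo):
    """Simulate one task: 1.0 if the agent answers correctly, else 0.0."""
    return 1.0 if rng.random() < HYPOTHETICAL_ACCURACY[combo] else 0.0


combos = list(product(["small", "large"], repeat=2))
best, used = successive_elimination(combos, evaluate_once, budget=4000)
print(best, used)
```

The saving over brute force comes from clearly weak combinations being dropped early, so later evaluations are spent only on the candidates that are still plausibly best.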
Kostis Kaffes retweeted
Mar 11
📢 [New AI Entrepreneurship Series Talks]
Title: AI Attacks
Speaker: Dr. Neil Daswani
Location: Davis Auditorium
Date/Time: Thursday, March 19, 2026, 11:30 AM ET
Bio: Dr. Neil Daswani is a CISO-in-Residence at Firebolt Ventures and Co-Academic Director of Stanford’s Advanced Cybersecurity Program. After completing his PhD at Stanford University and leading security initiatives at Google, he co-founded Dasient, a cybersecurity company funded by Google Ventures and later acquired by Twitter/X. After his time at Twitter, he served as CISO of several public companies including LifeLock, Symantec’s Consumer Business Unit, and QuantumScape. Today, he advises multiple venture capital funds and focuses on both securing artificial intelligence and applying AI to cybersecurity. Dr. Daswani has co-authored two books, Big Breaches: Cybersecurity Lessons for Everyone and Foundations of Security: What Every Programmer Needs to Know. He holds over a dozen patents, has published numerous technical articles, and earned his PhD and MS in Computer Science from Stanford and his BS in Computer Science with honors and distinction from Columbia University.
Abstract: In this talk, Dr. Daswani will discuss the emerging era of non-human adversaries, where AI does not merely assist hackers but autonomously executes the majority of attack workflows. He will examine key developments in AI-driven cyber threats, from AI-orchestrated espionage campaigns to multimillion-dollar deepfake fraud incidents, and discuss what these developments mean for the future of cybersecurity and artificial intelligence.
2
1
249
Kostis Kaffes retweeted
Feb 24
📢 [Upcoming AI Entrepreneurship Series Talks]
Guest name: Ivan Burazin
Title of the speech: Scaling RL Rollouts: Agent-Native Infrastructure with Daytona
Bio of the guest: Ivan Burazin is the co-founder and CEO of Daytona, one of the fastest-growing infrastructure companies of its generation. Daytona is building agent-native cloud infrastructure that enables AI agents to securely run, fork, and manage stateful runtime environments at scale. Backed by $31M, including a $24M Series A led by FirstMark Capital, Daytona powers millions of sandboxes per day for startups and Fortune 500 companies building autonomous AI systems. Previously, Ivan co-founded Codeanywhere, one of the first cloud IDEs (2009), and created Shift, Europe’s leading developer conference, acquired by Infobip in 2021. He later joined Infobip’s executive board as Chief Developer Experience Officer.
Abstract of the talk: In this talk, we’ll outline why a new class of agent-native infrastructure is emerging, what problems it is designed to solve, and the core use cases driving it, from autonomous coding agents to large-scale evaluation and training workloads. Daytona is an agent-native control plane designed to orchestrate isolated, stateful sandbox environments at scale. We’ll break down the infrastructure challenges behind isolation, state management, and massive parallelism, and why traditional VM and container stacks fall short. As a concrete example, we’ll walk through scaling RL rollouts, showing how tens of thousands of environments can be provisioned and orchestrated in minutes as part of a high-throughput RL pipeline.
Location: Davis Auditorium, 530 W 120th St, New York, NY 10027, USA
Date: March 5, 2026
Time: 11:30 AM - 1:00 PM
2
792
Kostis Kaffes retweeted
Feb 11
🤖 Calling all academic researchers in Agents! We are excited to announce North East AI Agents Day, a one-day workshop bringing together communities in ML, Systems, and HCI! 📅 May 8th 📍 New York 💡 Submit your extended abstract (DDL: Apr 1st)! More: ne-agents-day.github.io

12
38
17,087
Kostis Kaffes retweeted
📡 Columbia Engineering AI Entrepreneurship Series
Title: A Talk about Parallel.AI (TBD)
Speaker: Parag Agrawal
Location: Davis Auditorium
Date/Time: Thursday, February 5, 2026, 11:00 AM ET
Bio: Parag Agrawal is the founder of Parallel Web Systems, a company unlocking the web for AI agents. Previously, he spent 11 years at Twitter, where he joined as an engineer before serving as CTO, and then CEO. Parag has a PhD from Stanford University in Computer Science and a Bachelor’s degree in Computer Science and Engineering from IIT, Bombay.
1
4
6
3,010
Kostis Kaffes retweeted
Jan 18
Why Vibe Coding Fails and How to Fix It
Everyone is talking about how AI agents will 10x developer productivity. But anyone who has actually built a real app with Cursor, Cline, or Replit knows the reality: The first draft looks amazing. But as soon as you try to iterate? The application starts breaking.
This is the struggle of Vibe Debugging. It starts out looking great. But then you encounter silent errors and buggy logic. You realize the AI doesn't actually understand what you are building, and you are stuck trying to fix a black box.
At Columbia DAPLab, we are investigating exactly why this happens. We have written a blog series on the reality of Vibe Debugging and how to close the gap between demo and production.
Read our first part here! daplab.cs.columbia.edu/gener…
1
4
7
681
I am at #NeurIPS2025 and excited to chat about DAPLab’s projects. Find me or message me!
3 Dec 2025
🚀 Excited to share that DAP Lab has 6 papers accepted at #NeurIPS2025, covering multi-agent reasoning, LLM caching, persona risks, system tuning via LLM agents, simulation-first agent training, and RL theory 👇
🔍 Check them out if you are at #NeurIPS2025! We’d love feedback, discussions, and potential collaborations.
Paper list here:
• Multi-agent Markov Entanglement (Shuze Chen, Tianyi Peng): Spotlight winner of INFORMS JFIG & 2nd place in George Nicholson Student Paper Competition 🏆
• Tail-Optimized Caching for LLM Inference (Wenxin Zhang, Yueying Li, Ciamac C. Moallemi, Tianyi Peng): improving LLM inference efficiency 👏
• LLM Generated Persona Is a Promise With a Catch (Ang Li, Haozhe Chen, Hongseok Namkoong, Tianyi Peng): a position paper reflecting on strengths & caveats of LLM-derived personas 👩‍👩‍👦‍👦
• LLM Agents for Always-On Operating System Tuning (Georgios Liargkovas, Vahab Jabrayilov, Hubertus Franke, Kostis Kaffes): leveraging LLMs for live OS tuning, showing better performance than classical ML tuning 🔧
• RAISE: Reliable Agent Improvement via Simulated Experience (Sahar Omidi Shayegan, Joshua Meyer, Victor Shih, Sebastian Sosa, Tianyi Peng, Kostis Kaffes, Eugene Wu, Andi Partovi, Mehdi Jamei): simulation-first AI-agent training framework 🔄
• Q-learning with Posterior Sampling (Priyank Agrawal, Shipra Agrawal, Azmat Azati): a new RL algorithm achieving near-optimal theory guarantees in tabular episodic MDPs 🎯
#MachineLearning #AI #LLM #Systems #MultiAgent #NeurIPS
2
168
28 Nov 2025
Heading to NeurIPS for the first time this year, and I am excited to share that we will be presenting two workshop papers on agent infrastructure over the weekend. If you want to chat about systems, agents, and environments, reach out. I would love to connect.
1
1
7
288
28 Nov 2025
An Expert in Residence: LLM Agents for Always-On Operating System Tuning (led by @gliargko) @ ML4Sys. We explore an online LLM-driven “expert” loop for live OS tuning, and what it takes to make this kind of control safe and auditable in real systems.
1
1
122
28 Nov 2025
RAISE: Reliable Agent Improvement via Simulated Experience (with @DAP__Lab partners @veris_ai) @ SEA. We present a simulation-first framework for improving reliability and policy compliance of enterprise agents via high-fidelity tool/user/policy environments and trajectories.
2
119
14 Oct 2025
Check out the first exciting batch of systems work coming out of @DAP__Lab
14 Oct 2025
This week, we are presenting a slate of new research at SOSP workshops, spanning agentic infrastructure and self-tuning kernels. Our work lays the foundation for a future agentic infrastructure that will enable the safe, reliable and efficient operation of LLM agents in real-world environments. Highlights below and all papers available on our website.
1
5
533
Kostis Kaffes retweeted
📢Call for Papers 📢 The CFP for the 2025 eBPF workshop is out! 📅 Deadline: May 8th 🔗 More info: lnkd.in/dFfpTcYf Join @ngsrinivas, @get_gianni_up, @apanda, Paul Chaignon, and me to share your #eBPF 🐝use cases or research challenges!
5
7
1,507
Kostis Kaffes retweeted
19 Feb 2025
Exciting post-doc opportunity at the Columbia DAPlab, co-directed by Eugene Wu and myself! We are looking for AI Agents for Fall 2025. 🔥 Check out the details in the job post here: daplab.cs.columbia.edu/posit… Please retweet 🙏 #ColumbiaUnversity #AI #AIAgents
9
34
5,550
Kostis Kaffes retweeted
Artificial intelligence is changing our world. And you can be in the room with the people leading the way. Submit a poster for inclusion at the Columbia AI Summit. Form: bit.ly/40IcHnD Learn more and register for the summit: ai.columbia.edu/ai-summit
1
3
8
1,011
Kostis Kaffes retweeted
23 Dec 2024
The program of the 8th Computing Systems Research Day (January 7th, NTUA, Zografou Campus) with Kostis Kaffes @kkaffes Nikos Vasilakis @nikosvasilakis Chloe Alverti @chpoppins Stratos Psomadakis @ps0mas and Demos Masouros is here: cslab.ece.ntua.gr/ See you there!
1
5
16
1,737
Kostis Kaffes retweeted
10 Dec 2024
We are excited to be starting a new lab that rethinks the systems, ML, and HCI stack to support agent-based automation! Columbia's PhD deadline is 12/15 -- please share with potential applicants! columbia-dap-lab.github.io
4
16
1,532
Kostis Kaffes retweeted
🚀Join the DAPLab to make agent-based automation more accountable, reliable, and efficient. 🌐🤖 For more info on the group - columbia-dap-lab.github.io/. Apply to our #computerscience PhD program bit.ly/CSPhDprogram by December 15.
2
5
1,039