**Thrilled to announce** that our paper **"VeriStruct: AI-assisted Automated Verification of Data-Structure Modules in Verus"** (arXiv:2510.25015) has been **accepted to #TACAS2026**!
**Key results:** VeriStruct tackles complex Rust data-structure modules in Verus and crushes the benchmarks.
- Successfully verifies **10 out of 11** modules
- Verifies **128 out of 129** functions overall (**99.2%** coverage!)
- Baselines manage only **4/11** modules and **52** functions
Compared to **Claude Code (Sonnet 4.5)**, which calls Verus autonomously:
- Claude verifies **102** functions across **8** benchmarks
- VeriStruct still outperforms it while using fewer tokens: **~22k per benchmark** vs. **~24k** for Claude
Takeaway: **Structured AI workflows beat single-shot prompting**, delivering better verification coverage, higher success rates, and comparable (or even lower) token costs!
Huge thanks to my amazing co-authors: Yican Sun, Daneshvar Amrollahi, Ethan Zhang, Shuvendu Lahiri, Shan Lu, David Dill, and Clark Barrett!
Paper: arxiv.org/abs/2510.25015
Code: github.com/ChuyueSun/VeriStr…
#FormalVerification #AI4Code #Verus #ProgramVerification #TACAS2026 #RustLang
LLMs show "great promise" in code synthesis. Can LLMs keep "the promise" and ensure that their synthesized code is formally correct?
Check out our @FSEconf 2024 paper: "Towards AI-Assisted Synthesis of Verified #Dafny Methods"
Paper: arxiv.org/abs/2402.00247
Code: github.com/Mondego/dafny-syn…