Ilya Sergey

Tweets

Pinned Tweet

Ilya Sergey @ilyasergey

May 19

New blog post: On the Unreasonable Effectiveness of Property-Based Testing for Validating Formal Specifications. proofsandintuitions.net/2026… The gist: randomised testing can validate formal specs. It's very cheap and powerful: we found bugs in specs of VERINA and CLEVER benchmarks.

On the Unreasonable Effectiveness of Property-Based Testing for Validating Formal Specifications

In this post, we show that property-based testing (PBT) is surprisingly effective for validating LLM-synthesised specifications of Lean programs: it is a cheap alternative to symbolic proofs, which...

proofsandintuitions.net

6,675

Ilya Sergey retweeted

Nicolas Bustamante

@nicbstme

Jun 9

What I find fascinating with Claude Fable 5 is that it proves once again that large generalist models will outperform vertical ones. On ProofBench (a graduate-level formal math benchmark in Lean, where a proof either compiles or it doesn't), Fable 5 beat Harmonic's Aristotle, 77% vs 71%. Aristotle is a system built specifically for formal math and run on its own internal harness, so the generalist beat the specialist on the specialist's home turf. It's Richard Sutton's "The Bitter Lesson". His whole argument is that across 70 years of machine intelligence research, the methods that win are the general ones that scale with compute, not the ones where we hand-encode human expertise. Building our own knowledge into the system feels good and helps with short-term gains, but in the long term it always gets overtaken by a bigger model. Look at chess, Go, speech, vision: same story every time. First the specialized model wins, then the general one takes over. And btw, this is the whole premise of AGI. You don't build one model for math, one for code, one for law. You build a single general model that scales with compute, and it learns to do everything.

612

65,761

Ilya Sergey retweeted

Prettyplaces

@Natute123

May 30

Can you identify this city without Googling?

202

674

191,692

Ilya Sergey @ilyasergey

May 25

Interesting discussion on LinkedIn regarding the no-AI-review policy we instituted for OOPSLA'27. The opponents' main argument: "peer reviews by humans mostly suck anyway, so by actively using LLMs to write reviews we won't lose much in quality, but will save everyone time".

3,712

Ilya Sergey retweeted

vitalik.eth

@VitalikButerin

May 18

Many people have claimed that with AI-assisted bug finding, secure code (and hence trustless anything) will be impossible. I have a much more optimistic take, and AI-assisted formal verification is a major part of the reason why: vitalik.eth.limo/general/202…

A shallow dive into formal verification

vitalik.eth.limo

449

401

2,570

456,192

Ilya Sergey @ilyasergey

May 13

Causal consistency is hard.

985

Ilya Sergey @ilyasergey

May 11

I've sent my paper draft to colleagues for feedback. Every comment I got was amazingly informative and constructive. Each one was also absolutely idiotic. All of them were pretzels. And somehow, every last piece of feedback I got was a small dog named Mortimer.

1,739

Ilya Sergey retweeted

igor@konnov.phd | (spec|ver)ification | security

@k0nn0v

May 6

11. Last but not least, George Pîrlea's @GeorgePirlea talk on Veil: Multi-Modal Verification of Transition Systems ...and this is done with Lean @leanprover ! youtube.com/watch?v=24mMfUSC…

Multi-Modal verification of Transition Systems - George Pîrlea

youtube.com

4,228

Ilya Sergey retweeted

George Pîrlea

@GeorgePirlea

May 6

It was great to attend the #tlaplus community event this year and showcase our work on Veil and its new concrete state model checker, Lace. Thanks to the organisers for the invite!

igor@konnov.phd | (spec|ver)ification | security

@k0nn0v

May 6

Replying to @k0nn0v

11. Last but not least, George Pîrlea's @GeorgePirlea talk on Veil: Multi-Modal Verification of Transition Systems ...and this is done with Lean @leanprover ! youtube.com/watch?v=24mMfUSC…

832

Ilya Sergey @ilyasergey

May 6

Programming language research operates a gold mine. Adding gold is encouraged, even (and especially) when unrefined. Borrowing is permitted if you return more than you took. Refining it and spending on something useful counts for little. This is why the gold stays in the mine.

2,976

Ilya Sergey retweeted

xuan (ɕɥɛn / sh-yen) @xuanalogue

May 5

If you're a late-stage PhD student or post-doc in computer science, and want a free trip to Singapore / NUS, consider applying for this prize: comp.nus.edu.sg/research/nus… Probably helps if you're considering a faculty job at NUS or other universities in Singapore!

149

18,406

Ilya Sergey @ilyasergey

May 5

On behalf of the ACM SIGPLAN Executive Committee, I'm thrilled to announce three exceptional papers on programming languages from 2024 that have been awarded the SIGPLAN Research Highlight distinction! ⇒

5,777

Ilya Sergey @ilyasergey

May 5

And, last but not least, highlight 3: "Multiverse Notebook: Shifting Data Scientists to Time Travelers" (OOPSLA 2024) by Shigeyuki Sato and Tomoki Nakamaru

1,654

Ilya Sergey @ilyasergey

May 5

More information on ACM SIGPLAN Research Highlights and the list of previously awarded papers can be found via the links below. Nominate one of your favourite papers from 2025 by June 15, 2026, and stay tuned!
* sigplan.org/Highlights/
* sigplan.org/Highlights/Paper…

632

Ilya Sergey retweeted

Apart Research

@apartresearch

May 2

Call for mentors: SPS Fellowship (June-Oct 2026, with @safewithatlas). Already in: Erik Meijer (@headinthebox), Leibniz Labs (creator of LINQ, Rx); Shriram Krishnamurthi (@ShriramKMurthi), Brown CS. Senior formal methods and AI safety researchers, apply by Tue May 5 AoE: linktr.ee/apartresearch

7,705

Ilya Sergey retweeted

KC Sivaramakrishnan @kc_srk

Apr 29

kcsrk.info/verification/rdts… Wrote up a companion blog post for the keynote talk.

KC Sivaramakrishnan @kc_srk

Apr 28

Gave a keynote talk at the PaPoC 2026 workshop, "From Convergence to Confidence: Push-button verification for Replicated Data Types", covering the verification of RDTs and some very recent work on agentic-proof-oriented programming in Lean. kcsrk.info/talks#papoc_2026 See fplaunchpad.org/sal.

2,797

Ilya Sergey @ilyasergey

Apr 26

Submitting a paper with formal proofs... 2023: frantically hack Rocq for months hoping to hit Qed an hour before the deadline to add a "proof sketch". 2026: write "proof sketch" on day one, then run 20 proof agents for weeks, with daily supervision, to finish it by deadline.

5,558

Ilya Sergey @ilyasergey

Apr 26

Trying to schedule my PhD student's thesis proposal on Constraint Programming. It is not surprising at all that the availability of the thesis committee members makes for an unsatisfiable system of constraints.

3,348