Claude Opus 4.8 single-handedly wrecked zcash:native
In-game simulations at gob.fun called it before it happened
Two days ago, adversarial benchmarks ranked Opus 4.8 #1 for:
- Ability to find and exploit gaps
- Reasoning
- Outcome prediction
Combine those three and the results are scary.
Opus 4.8 scored high on safety, at 88.5%, but the gap is exploitable by savvy operators.