Will Manning

vortex retweeted

Will Manning

@willmanning

May 13

so cool to see another blazing fast database built on vortex!

LangChain

@LangChain

May 13

Just announced at Interrupt! SmithDB. Agent traces have outgrown the databases built to hold them. That’s why we built SmithDB, a purpose-built distributed database for agent observability. Read the announcement from Co-Founder @ankush_gola11 → langchain.com/blog/introduci…

Ankush Gola

vortex retweeted

Ankush Gola

@ankush_gola11

May 13

We leveraged two amazing open source projects when building SmithDB. One is @ApacheDataFusio: an extensible Rust-based query engine. We built custom execution plans specifically tuned for our workloads and storage backend, and DataFusion made it straightforward to plumb everything together. The other is @vortexdotdev: an extensible file format that allows you to build custom layouts with specific encoding and chunking strategies for different columns. I would highly recommend checking out both of these projects if you're interested in modern data systems.

Ankush Gola

@ankush_gola11

May 13

We built SmithDB: the database purpose-built for agent observability workloads that now powers many parts of LangSmith. Agent observability presents a challenging data problem. Agent traces can contain tens of thousands of intermediate spans and large, unbounded payloads. These characteristics are a direct result of agents running for longer time horizons and LLM context window sizes growing. Traditional data infrastructure was not built to handle the complexities associated with storing and querying this data. SmithDB brings LangSmith up to 12x performance improvements across access patterns most important for agent observability. I’ve been working on SmithDB directly with an amazing team over the past few months, and I’m incredibly proud of the results we’re seeing. I wrote a bit more about the story and engineering challenges behind SmithDB in this blog. Additionally, if you’re a systems engineer interested in building the future of agent observability, please reach out!

Spice AI

vortex retweeted

Spice AI

@spice_ai

Apr 13

The Research Behind Modern Data Compression & @vortexdotdev

When we chose Vortex as the storage layer for Spice Cayenne (the data accelerator engine in Spice), we were betting on decades of database research finally reaching production-ready maturity.

Here's the research behind Vortex:

📄 BtrBlocks (SIGMOD 2023) - The core algorithm from the Technical University of Munich. Cascading multiple lightweight encodings outperforms monolithic compression. Optimize for decompression speed, not just compression ratio.
📄 FastLanes (VLDB 2023) - Hardware-friendly integer compression. Structures bit-packing to maximize SIMD utilization across AVX-512, AVX2, and ARM NEON. Near-memory-bandwidth decompression.
📄 FSST (VLDB 2020) - Fast Static Symbol Table for strings. Near-LZ4 ratios at 5-10× faster decompression. Critical for string-heavy columns.
📄 ALP (CWI Amsterdam) - Adaptive Lossless floating-Point compression. Exploits real-world float patterns (prices with 2 decimals, sensor readings with limited precision).
📄 MonetDB/X100 Morsel-Driven Parallelism - Foundations for vectorized, NUMA-aware query execution that Vortex builds on.

The result? Compression that is tailored to your data:

• Integers via FastLanes bit-packing
• Floats via ALP adaptive encoding
• Strings via FSST symbol tables
• Timestamps via delta encoding
• Sorted columns via run-length encoding

Why does this matter for production systems?

1️⃣ Query performance scales with decompression speed. Focus on decode performance translates directly to faster queries.
2️⃣ Automatic encoding selection means zero configuration. The algorithm samples your data and picks optimal strategies per column.
3️⃣ SIMD acceleration is baked in. FastLanes was designed for vectorized, hardware accelerated execution from day one.
4️⃣ Zero-copy Arrow access. Data decompresses directly to Arrow arrays with no intermediate copies.

Vortex is now a Linux Foundation AI & Data project, and researchers are building on it (Anyblox, F3). You get SOTA research in production systems. The future of data storage is exciting.

To learn more about our Vortex implementation, check out the blog: hubs.ly/Q04bGfvf0

#datafusion #ai #data #vortex #spiceai #arrow #parquet
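To make the idea of cascading lightweight encodings concrete, here is a minimal Python sketch that chains delta encoding into run-length encoding on a timestamp column. It is a toy illustration only and not Vortex's, BtrBlocks' or FastLanes' actual implementation; every function name here (`delta_encode`, `rle_encode`, `bit_width` and their decoders) is hypothetical and chosen for this example.

```python
from typing import List, Tuple


def delta_encode(values: List[int]) -> Tuple[int, List[int]]:
    """Return the first value plus the differences between neighbours."""
    if not values:
        raise ValueError("values must be non-empty")
    return values[0], [b - a for a, b in zip(values, values[1:])]


def delta_decode(first: int, deltas: List[int]) -> List[int]:
    """Rebuild the original values by accumulating the deltas."""
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into a flat list."""
    out: List[int] = []
    for v, n in runs:
        out.extend([v] * n)
    return out


def bit_width(values: List[int]) -> int:
    """Bits per value a bit-packer would need for non-negative ints."""
    return max(1, max(values, default=0).bit_length())


# Hypothetical, evenly spaced timestamps: a good case for the cascade.
timestamps = [1000, 1010, 1020, 1030, 1040, 1050]
first, deltas = delta_encode(timestamps)  # deltas: [10, 10, 10, 10, 10]
runs = rle_encode(deltas)                 # runs: [(10, 5)]

# The cascade is lossless: decoding in reverse order restores the input.
assert delta_decode(first, rle_decode(runs)) == timestamps
print(first, runs, bit_width([v for v, _ in runs]))
```

Each stage is cheap on its own, and the output of one stage (small, repetitive deltas) is exactly the kind of input the next stage compresses well. A real format would also pick the stages per column by sampling the data, which this sketch does not attempt.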

Will Manning

vortex retweeted

Will Manning

@willmanning

Apr 4

Connor Tsui & I just merged a first cut of TurboQuant into @vortexdotdev, already validated on production embeddings 🚀🚀🚀

vortex

@vortexdotdev

Apr 4

Fastest OSS file format, in both performance and velocity

Will Manning

@willmanning

Apr 4

Connor Tsui & I just merged a first cut of TurboQuant into @vortexdotdev, already validated on production embeddings 🚀🚀🚀

vortex

@vortexdotdev

Apr 1

you took up with Weasley, but he can't afford sliceable cascaded encodings. now your random access is dogged, and your cortisol is properly spiked, potter

vortex

@vortexdotdev

Mar 24

hey man, thrilled that you're interested in contributing. we'll be waiting for you in slack vortex.dev/slack

MeekMill

@MeekMill

Mar 24

I need a GitHub too! Is it like that or nah?

Luke Kim

vortex retweeted

Luke Kim

@lukekim

Mar 4

CASE-WHEN support coming to @vortexdotdev. Guess I'm a Vortex contributor now!

vortex

@vortexdotdev

Jan 23

🦆❤️🚀

DuckDB

@duckdb

Jan 23

DuckDB now supports reading from and writing to the Vortex file format! The DuckDB Labs and Spiral teams have worked together to make Vortex available as a core extension in DuckDB. Vortex is an open source, columnar file format whose design is heavily influenced by recent research in lightweight compression encodings, computing and IO techniques. We gave it a test drive, and it performed very well. Read the full article to learn more lnkd.in/eZfGzPiZ

Luke Kim

vortex retweeted

Luke Kim

@lukekim

Jan 14

🌪️ Why LF Vortex for hot data?

@ApacheParquet: great compression, slow decode
@ApacheArrow: instant decode, no compression
Vortex: encoding-efficient compression with SIMD decode to Arrow

80% of Parquet's compression, 10x faster decode

Alfonso Subiotto ❄️

vortex retweeted

Alfonso Subiotto ❄️ @asubiotto

15 Dec 2025

Happy to share that I've been nominated to the @vortexdotdev Technical Steering Committee! It's been fun and productive switching to Vortex from Parquet as our storage format at Polar Signals and I'm excited to continue contributing to the Vortex project.

Will Manning

vortex retweeted

Will Manning

@willmanning

4 Dec 2025

Super cool, they forked @DeltaLakeOSS to replace Parquet (for data) with Vortex and JSON (for metadata) with Vortex. Huge performance gains! Maybe we should upstream this one 😁 @vortexdotdev

Polar Signals @PolarSignalsIO

4 Dec 2025

🧊 New on the Polar Signals Blog: Our Delta Lake Fork. Purpose-built for our continuous profiling product. In our latest post, we walk through how Delta Lake works, and the changes we've made to improve performance for our product. 👉 Read the full post: buff.ly/KwHINtO

Will Manning

vortex retweeted

Will Manning

@willmanning

25 Nov 2025

So cool!! Polar Signals reduced query runtimes by 70% switching from Parquet to Vortex 🤯🚀

Polar Signals @PolarSignalsIO

25 Nov 2025

We completed a major project to switch our storage file format from Parquet to Vortex 🌪️ resulting in 70% average query performance improvement across the board 🚀 Learn more about how rethinking interface-imposed limitations unlocked these gains in our latest blog post 👇

Polar Signals

vortex retweeted

Polar Signals @PolarSignalsIO

25 Nov 2025

polarsignals.com/blog/posts/…

Questioning an Interface: From Parquet to Vortex | Polar Signals

Breaking free from the shackles of interface-imposed performance limitations

polarsignals.com

Andrew Lamb

vortex retweeted

Andrew Lamb @andrewlamb1111

15 Oct 2025

The talk on @SpiralDB at @CMUDB : youtube.com/watch?v=zyn_T5ur… is a great one. I think it would also be interesting to hear a counterpoint about @ApacheParquet that explains actual technical details of that format, the Cathedral vs Bazaar management, options with Metadata, etc.

Vortex: LLVM for File Formats (Will Manning)

CMU Database Group - Future Data Systems Seminar Series (Fall 2025)...

youtube.com

CMU Database Group

vortex retweeted

CMU Database Group @CMUDB

13 Oct 2025

Today's Future Data Systems Seminar Speaker: Will Manning (@willmanning) will present @SpiralDB's Vortex file format (@vortexdotdev). Vortex is now a @LFAIDataFdn project. Zoom talk open to public at 4:30pm ET. YouTube video available after: db.cs.cmu.edu/events/futured…