Author of "In-Memory Analytics with Apache Arrow" | Co-Founder at columnar.tech | Lover of Randomness and cats | @ApacheArrow PMC Member and member of the ASF
Time for @oredev !! If you're here don't miss my talk today at 10:10 on using @ApacheArrow with #ml workflows! Looking forward to a day of interesting talks and discussions
In SF during Snowflake Summit June 1-3? Duck out (ha!) to The Dive! Hear from rockstars at Anthropic, Braintrust, Lovable, Hex, & more.
See the future of lakehouses with @J_ , creator of Apache Parquet, and @zeroshade, founder at Columnar (& me!)
Register! thedive.motherduck.com/
The fastest operation is the one you don’t have to do.
When a database natively supports @ApacheArrow, ADBC can speed up fetching and ingestion by eliminating costly row/column conversions.
How much faster is it in practice? We ran some benchmarks to find out. Link below 👇
If you aren't paying attention to some of the Apache Spark Acceleration projects like Gluten, you should!
Gluten just graduated as a Top-Level Project @TheASF
Fetch query results without ODBC / JDBC bottlenecks. The new ADBC driver for @databricks is now in early release. Install it with dbc. Details in comments.
Data and AI are evolving fast, but much of today’s infrastructure still runs on standards from the 90s.
@columnar_tech, from the team behind Apache Arrow, is bringing an Arrow-native protocol (ADBC) that moves data 10–100× faster across systems like Snowflake and DuckDB.
We're excited to lead Columnar's $4M seed round.
Read the full Q&A to learn more: bessemervp.team/47d8gWk
The future of data connectivity is columnar. Today we launched @columnar_tech to accelerate the shift from slow, row-oriented APIs like ODBC and JDBC to >10x faster alternatives powered by @ApacheArrow. Learn more 👉 columnar.tech/blog/announcin…⚡️
ODBC is getting tired. It can't keep up with the fast new kids in the data world these days. The next generation is ready to take the torch. Meet ADBC, a fast, modern data connectivity standard built on @ApacheArrow. Watch my talk from the @CMUDB seminar: youtu.be/TjlmNGNx77E
We're building the data infrastructure that AI actually needs.
Current systems were built for humans reading dashboards. But an H100 can consume 4 million images per second.
The future isn't human-scale. It's machine-scale.
Introducing Spiral: Data 3.0 🌀
1/8
I'd like to start using this platform as a place to post about open source work I do in my off time.
To lead it off, I have posted a hash join spilling proposal in Apache DataFusion. Check it out if you're interested 😀:
github.com/apache/datafusion…
In September the @columnar_tech crew are headed to @PyDataParis 2025 and the first ever @ApacheArrow Summit. The organizer @QuantStack is a dedicated supporter of Apache Arrow. We’re delighted to be sponsoring the event.
🚀 Introducing Bauplan
A serverless, code-native platform for building data and AI pipelines, directly on your object store. No clusters. No notebooks. No GUI-based workflows.
Just Python, SQL, S3.
👉 bauplanlabs.com/blog/hello-b…
I’m excited about xorq! Ibis and DataFusion brought together to orchestrate multi-engine data pipelines, all powered by @ApacheArrow github.com/xorq-labs/xorq
A lot of people are ignoring that Go is becoming a commonly used language for prompting pipelines. Python in prototypes and Go in production is another common combo.
Me and the team at @lovable just spent two months rewriting 42,000 lines of code from Python to Go.
Technical deep dive on why we did it and what this means:
// 1