morgan gallant

Tweets

morgan gallant @morgallant

Jun 2

excited to get this out there in the hopes that other folks can benefit from our work here! a lot goes into making a good tokenizer. also wanna make sure I thank the tantivy folks: we've built our earlier tokenizers (everything before word_v4) atop their lovely open-sourced work

Simon Eskildsen

@Sirupsen

Jun 2

so @morgallant has optimized FTS tokenization throughput to 423 MiB/s and open-sourced it (github.com/turbopuffer/alyze). I keep telling him that it would be really high agency to get to DRAM bandwidth (~100 GiB/s), and he keeps getting annoyed at me and making it faster

morgan gallant @morgallant

May 11

fangirling over being on dash 8 for the first time in a while — i think my first ever flight was on a dash 8!!!

morgan gallant retweeted

turbopuffer

@turbopuffer

Apr 27

BM25 efficiently scores text, but relevance often depends on more than text (recency, popularity, PageRank).
we score non-text attributes as clauses in the same MAXSCORE plan that evaluates BM25 → better first-stage relevance, still scales to 100M
tpuf.link/rank-by-attr

Mixing numeric attributes into text search for better first-stage relevance

turbopuffer now allows you to combine attribute values into the scoring function of text queries. Ranking by attribute helps achieve better relevance in the first stage with the same scalability...
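
To make the idea of mixing a numeric attribute into a text score concrete, here is a minimal Python sketch. It is not turbopuffer's implementation: it scores every document exhaustively with textbook BM25 instead of using a MAXSCORE plan, and the function names, the `popularity` attribute, the `weight` parameter, and the `log1p` dampening are all assumptions made for illustration.

```python
import math
from collections import Counter

def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Score every document against the query with textbook BM25 (exhaustive)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def rank(docs, popularity, query, weight=0.5):
    """Rank text matches by BM25 plus a dampened numeric attribute (hypothetical blend)."""
    text = bm25_scores(docs, query)
    combined = {
        i: text[i] + weight * math.log1p(popularity[i])
        for i in range(len(docs))
        if text[i] > 0  # only documents that match the text query are candidates
    }
    return sorted(combined, key=lambda i: combined[i], reverse=True)

docs = ["red apple pie", "red apple tart", "green pear"]
popularity = [1, 1000, 5000]
print(rank(docs, popularity, "red apple"))
```

In this toy corpus the first two documents have identical BM25 scores, so the attribute term alone decides their order, while the third document is excluded because it does not match the text query at all.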

morgan gallant @morgallant

Mar 9

hoping to write a blog post on this eventually, it's a fascinating topic.. so many different possible approaches, each with their own tradeoffs. for v1, as is always the case at tpuf, we've optimized for simplicity

turbopuffer

@turbopuffer

Mar 9

new in tpuf: regex indexes
regex and glob filters can now use a trigram index to avoid full-table scans

A chart showing p90 query latency performance gains for regex and glob filters after the introduction of a trigram regex index on turbopuffer. Latency improved from 115 milliseconds to 33.8 milliseconds for a representative regex filter and from 267 milliseconds to 34.2 milliseconds for a representative globbing filter. The chart suggests that while these kinds of filters already worked, now they puff!
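
As a rough illustration of how a trigram index can spare a regex filter a full-table scan, here is a small Python sketch. It is an assumption-laden toy, not turbopuffer's design: the helper names are hypothetical, and the literal that every match must contain is passed in explicitly as `required_literal` rather than derived from the regex itself.

```python
import re
from collections import defaultdict

def trigrams(s):
    """Return the set of all 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_index(values):
    """Map each trigram to the set of row ids whose value contains it."""
    index = defaultdict(set)
    for row_id, value in enumerate(values):
        for gram in trigrams(value):
            index[gram].add(row_id)
    return index

def regex_filter(values, index, pattern, required_literal):
    """Narrow candidates via trigram postings, then verify with the real regex."""
    grams = trigrams(required_literal)
    if grams:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # Literal shorter than 3 characters: no trigrams, fall back to a full scan.
        candidates = set(range(len(values)))
    rx = re.compile(pattern)
    return sorted(i for i in candidates if rx.search(values[i]))

values = ["error: disk full", "warning: disk slow", "error: net down", "disk errors"]
index = build_index(values)
print(regex_filter(values, index, r"^error: \w+", "error: "))
```

The trigram step only produces candidates; the regex is still run on each candidate, because sharing all trigrams with the literal does not guarantee a match.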

morgan gallant @morgallant

Jan 26

btw hiring for a systems eng. role at @turbopuffer soon on the text team, you’d work closely w/ @jpountz and me on billion-scale {storage,ranking,query eval}. DMs open!

morgan gallant @morgallant

Jan 14

joint work with @jpountz, covers some of the storage work behind the ftsv2 launch!

turbopuffer

@turbopuffer

Jan 14

for FTS v2, we redesigned our inverted index structure
• tighter compression
• less KV overhead
• better MAXSCORE interaction
up to 10x smaller indexes → up to 20x faster text search!
tpuf.link/fts-index
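
The tweet does not say which compression scheme FTS v2 uses. As a generic illustration of why posting lists compress well, here is a Python sketch of delta encoding plus variable-length integers, a common textbook technique chosen here as an assumption and not as a description of turbopuffer's format.

```python
def encode_postings(doc_ids):
    """Encode a sorted list of doc ids as varint-encoded gaps."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        gap = doc_id - prev  # gaps are small when ids are dense
        prev = doc_id
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)  # low 7 bits, continuation bit set
            gap >>= 7
        out.append(gap)  # final byte, continuation bit clear
    return bytes(out)

def decode_postings(data):
    """Invert encode_postings: rebuild absolute doc ids from varint gaps."""
    doc_ids, current, gap, shift = [], 0, 0, 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += gap
            doc_ids.append(current)
            gap, shift = 0, 0
    return doc_ids

ids = [3, 7, 300, 301, 70000]
encoded = encode_postings(ids)
print(len(encoded), decode_postings(encoded) == ids)
```

Five doc ids that would take 20 bytes as fixed 32-bit integers fit in 8 bytes here, because most gaps fit in a single byte.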

morgan gallant @morgallant

Jan 6

still in awe of the talent density on the @turbopuffer team… truly such an honour and a privilege to get to work with these folks every day

morgan gallant @morgallant

30 Dec 2025

rewrote my website, again: morgangallant.com/blog/websi…

morgan gallant @morgallant

10 Dec 2025

I really hope that the continued success of Zig will help convince more systems programmers that treating memory allocations as infallible is a terrible idea for PLs, and that OS-level memory overcommit in Linux server environments is similarly terrible!

morgan gallant @morgallant

4 Dec 2025

joint work with @jpountz and @nikhilbenesch; excited to start rolling this out more broadly! insane amount of progress on FTS recently :)

turbopuffer

@turbopuffer

4 Dec 2025

FTS v2: up to 20x faster full-text search
turbopuffer is now on par with Tantivy and Lucene for many queries, more to come
v2 now in beta. 2 PRs away from all query plans being implemented. will be enabled in prod for all, shortly.

Image: horizontal bar chart comparing turbopuffer FTS v1 vs v2 latencies for five queries on English Wikipedia; v2 is much faster (3–20 ms) than v1 (8–174 ms).

morgan gallant retweeted

Adrien Grand @jpountz

21 Nov 2025

Something that's great with Lucene is how you can run luceneutil benchmarks on a PR to assess the performance impact. So I built something similar for turbopuffer (off the Tantivy benchmark!). Here, baseline is what runs in production now, contender is what will get deployed soon

morgan gallant @morgallant

11 Nov 2025

hand-rolled a fully uax#29-compliant tokenizer today, brain is a little fried but it was lowkey pretty fun

morgan gallant @morgallant

9 Nov 2025

one of my favourite ways to deploy to @Railway! build the binary locally, package it up in a light alpine image and ship it off

morgan gallant @morgallant

9 Nov 2025

11001001

morgan gallant @morgallant

24 Oct 2025

ContainsAllTokens (shipped a while back) uses posting lists to evaluate filters; this is the opposite (filters as postings)

turbopuffer

@turbopuffer

24 Oct 2025

new: rank by filter! boost scores when docs match a condition (e.g. scale > 1pib). plugs straight into rank_by, and works alongside full-text search
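
The two directions mentioned here (posting lists used to evaluate a filter, and a filter used as a scoring clause) can be sketched in a few lines of Python. This is only a toy under stated assumptions: the function names, the in-memory `postings` dict, and the additive `boost` are hypothetical and do not reflect turbopuffer's API or query plans.

```python
def contains_all_tokens(postings, tokens):
    """Filter via posting lists: docs that appear in every token's postings."""
    lists = [set(postings.get(t, ())) for t in tokens]
    return set.intersection(*lists) if lists else set()

def rank_with_boost(text_scores, matches_filter, boost=1.0):
    """Filter as a scoring clause: add a fixed boost to docs that match a condition."""
    scored = {
        doc: score + (boost if matches_filter(doc) else 0.0)
        for doc, score in text_scores.items()
    }
    # Highest score first; ties broken by doc id for determinism.
    return sorted(scored, key=lambda d: (-scored[d], d))

postings = {"fast": [1, 2, 4], "search": [2, 3, 4]}
print(contains_all_tokens(postings, ["fast", "search"]))

text_scores = {1: 2.0, 2: 1.5, 3: 1.0}
scale_pib = {1: 0.5, 2: 2.0, 3: 3.0}  # hypothetical numeric attribute per doc
print(rank_with_boost(text_scores, lambda d: scale_pib[d] > 1))
```

In the second call, doc 2 overtakes doc 1 because the boost outweighs its lower text score, while doc 3 is boosted only to a tie with doc 1.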

morgan gallant @morgallant

15 Oct 2025

this wework has cold brew on tap holy moly

morgan gallant @morgallant

5 Oct 2025

starting to learn a bit more about unicode and its intricacies. wrote a zig program to parse out property tables.. tried to do it at comptime, succeeded, but unfortunately made compilation times unreasonable. so instead, had to embed .txt in binary and parse it on the fly (sigh)
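
For readers unfamiliar with Unicode property tables, here is a rough Python sketch of parsing one at runtime and looking up a character's property. The original program was written in Zig and is not shown in the tweet; the function names and the tiny embedded excerpt (written in the style of a Unicode Character Database property file) are assumptions for illustration.

```python
import bisect

# Small excerpt in the style of a UCD property file (e.g. word-break properties).
SAMPLE = """\
# code point(s) ; property # comment
000A          ; LF # Cc       <control-000A>
000D          ; CR # Cc       <control-000D>
0030..0039    ; Numeric # Nd  [10] DIGIT ZERO..DIGIT NINE
0041..005A    ; ALetter # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
"""

def parse_property_table(text):
    """Parse 'XXXX[..YYYY] ; Property # comment' lines into sorted (first, last, prop)."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code_points, prop = (part.strip() for part in line.split(";", 1))
        first, _, last = code_points.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), prop))
    ranges.sort()
    return ranges

def lookup(ranges, ch, default="Other"):
    """Binary-search the sorted ranges for the property of a single character."""
    cp = ord(ch)
    # The key sorts after any range starting at cp, so i is the last range with first <= cp.
    i = bisect.bisect_right(ranges, (cp, 0x110000, "")) - 1
    if i >= 0 and ranges[i][0] <= cp <= ranges[i][1]:
        return ranges[i][2]
    return default

table = parse_property_table(SAMPLE)
print(lookup(table, "A"), lookup(table, "7"), lookup(table, "a"))
```

Lowercase letters fall through to the default here only because the excerpt omits them; a real property file covers far more ranges.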

morgan gallant @morgallant

3 Oct 2025

life of a showgirl!!!!!!

morgan gallant @morgallant

30 Sep 2025

stay calm and drink milk

morgan gallant @morgallant

24 Sep 2025

for any sf folks interested in search! luma.com/y6d6hke8

Semantic search in prod w/ Notion, Cursor & turbopuffer · Luma

Panel Discussion Moderated by Braintrust co-founder and CEO Ankur Goyal Notion will show how they built AI-powered search across millions of workspaces. Cursor…
