Another day of digging interesting facts from open-source codebases!
💡 I just discovered how ClickHouse stores Arrays efficiently using values offsets
ClickHouse has a reputation for being ridiculously fast at analytical queries. Part of that speed comes from its columnar storage design, but one detail that surprised me is how it stores arrays and nested data structures.
Instead of keeping each row’s array as is, ClickHouse explodes arrays into two separate columns:
👉🏻 Values column: a flat sequence of all elements from all arrays.
👉🏻 Offsets column: an array of integers marking where each row’s array ends.
📋 A Naive Approach (Why Not Just Store JSON?)
Imagine you have a table with arrays:
CREATE TABLE example (
id UInt32,
tags Array(String)
) ENGINE = MergeTree();
And you insert some rows:
id tags
1 [ "db", "metrics" ]
2 [ "ai", "ml", "genai" ]
3 [ "infra" ]
A naive database might:
👉🏻 Store each row’s array as a blob (["db", "metrics"]), bad for compression.
👉🏻 Store each row in a separate subtable, bad for query speed.
✅ The ClickHouse Way: Exploding Arrays
ClickHouse stores the same data in this format:
tags.values = [ "db", "metrics", "ai", "ml", "genai", "infra" ]
tags.offsets = [ 2, 5, 6 ]
👉🏻 tags.values holds all elements in sequence, concatenated across rows.
👉🏻 tags.offsets tells you how many elements belong to each row, by marking the ending index.
So how do you reconstruct the original row arrays?
✓ Row 1: take values from index 0 to 1 (offset 2) → [ "db", "metrics" ]
✓ Row 2: take values from index 2 to 4 (offset 5) → [ "ai", "ml", "genai" ]
✓ Row 3: take values from index 5 to 5 (offset 6) → [ "infra" ]
That’s it! Two flat columns, but the ability to map back to nested arrays.
✅ Why This Is Brilliant
1. Append-friendly
Adding a new row is just appending more values and a new offset. No reshuffling required.
2. Cache-friendly scans
Columnar queries (like arrayJoin(tags)) only need to scan the flat tags.values, which is contiguous in memory.
3. Compression wins
Flat columns compress better than row-level blobs because similar data (like repeated tags) sit close together.
4. Vectorized execution
Queries operate on contiguous memory regions, enabling SIMD acceleration and fast parallel scans.
📋 Attaching the exact source code for reference:
github.com/ClickHouse/ClickH…
#DatabaseInternals #Clickhouse