Joined October 2023
#ClickHouse mistake 3: treating PRIMARY KEY like Postgres. It doesn't enforce uniqueness. It's a sparse index for skipping data ranges. The real lever is ORDER BY: lowest → highest cardinality, matching your filters. Learn more 👇 glassflow.dev/blog/clickhous… #DataAnalytics
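To make the ORDER BY point concrete, here is a toy Python model of a sparse index. It is a sketch under stated assumptions, not ClickHouse's actual algorithm: the granule size of 4, the two-column key, and the helper names `sparse_marks` and `granules_to_read` are all made up for illustration. It keeps one mark per granule and counts how many granules a filter on `status` has to read under each key order.

```python
GRANULE = 4  # toy value for illustration; real tables use far larger granules


def sparse_marks(rows, granule=GRANULE):
    """Sort rows by their key and keep only the first key of each granule."""
    ordered = sorted(rows)
    return [ordered[i] for i in range(0, len(ordered), granule)]


def granules_to_read(marks, col, value):
    """Indices of granules that may hold rows where key column `col` == value.

    Simplified model for a two-column key: the leading column is range-checked
    between neighbouring marks; the second column can only be range-checked
    when the leading column is identical at both marks.
    """
    hits = []
    for i, mark in enumerate(marks):
        nxt = marks[i + 1] if i + 1 < len(marks) else None
        if col == 0:
            keep = mark[0] <= value and (nxt is None or value <= nxt[0])
        elif nxt is not None and mark[0] == nxt[0]:
            keep = mark[1] <= value <= nxt[1]
        else:
            keep = True  # the index cannot rule this granule out
        if keep:
            hits.append(i)
    return hits


# 16 hypothetical events: `status` has 4 distinct values, `user_id` has 16.
events = [("abcd"[i % 4], i) for i in range(16)]

# Key (status, user_id): low cardinality first, filter on status.
low_first = granules_to_read(sparse_marks(events), col=0, value="b")

# Key (user_id, status): high cardinality first, same filter on status.
flipped = [(user_id, status) for status, user_id in events]
high_first = granules_to_read(sparse_marks(flipped), col=1, value="b")

print(len(low_first), len(high_first))  # granules read under each key order
```

In this toy data set the low-cardinality-first key lets the filter skip half the granules, while the high-cardinality-first key forces a read of all of them. Nothing in the model rejects a repeated key, which mirrors the point that the index is for skipping, not for uniqueness.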
We went through hundreds of Stack Overflow threads, GitHub issues, and Reddit posts about #ClickHouse. The same 5 mistakes keep coming up. Every. Single. Time. Here are the first two 🧵
Mistake 2: Using MergeTree because it's the default engine. ClickHouse has many MergeTree variants for a reason:
Building CDC or event tracking? → ReplacingMergeTree
Pre-computing metrics? → AggregatingMergeTree & Mat Views
Choose the engine wisely before your first insert.
GlassFlow retweeted
Best tool out there for Kafka to dedup
Stop writing glue code to connect #OTel to #ClickHouse. 🛑 GlassFlow now ingests OTLP data natively: deduplicating, masking PII, and delivering traces/logs/metrics query-ready. ✅ No #Kafka ✅ No custom transforms ✅ Just clean data See how: glassflow.dev/blog/clean-enr… #OpenTelemetry
Before v3.1.0, when a downstream stage couldn't keep up, GlassFlow had no coordinated way to slow down. Events would pile up, NATS memory would fill, pipeline would fail. That's fixed now. 🧵
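As a generic illustration of the idea (not GlassFlow's implementation, and not how NATS behaves), backpressure can be sketched with a bounded buffer that refuses new events when it is full, so the producer gets a signal to slow down instead of memory filling up. The buffer size of 3 and the helper name `offer` are assumptions made for this example; `queue.Queue` is Python's standard-library queue.

```python
import queue

# Bounded buffer between two stages; maxsize=3 is an arbitrary toy limit.
buffer = queue.Queue(maxsize=3)


def offer(event):
    """Try to hand an event downstream; return False as a slow-down signal."""
    try:
        buffer.put_nowait(event)
        return True
    except queue.Full:
        return False


# The upstream stage pushes five events while nothing is being consumed.
accepted = [offer(i) for i in range(5)]

# The downstream stage catches up by one event, which frees a slot.
drained = buffer.get_nowait()
accepted_after_drain = offer(5)

print(accepted, drained, accepted_after_drain)
```

The unbounded alternative would accept all five events and keep growing, which is the failure mode the post describes.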
Also in v3.2.0: OTLP receiver concurrency caps, operator reconciles 4 pipelines in parallel, and full backpressure signals from every component to the control plane.
ReplacingMergeTree deduplicates eventually in #ClickHouse. Your counts might be wrong until the merge runs. Fix it upstream with GlassFlow → glassflow.dev #DataEngineering #GlassFlow #OpenSource #DataPipelines
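A small Python simulation shows why counts can be off before the merge. It is a simplified model, not ClickHouse code: each insert becomes its own part, a plain count sees every row in every part, and only a merge collapses rows that share a key, keeping the highest version. The helper names and the order data are hypothetical.

```python
def insert(parts, rows):
    """Each insert creates a new part; no deduplication happens here."""
    parts.append(list(rows))


def count_rows(parts):
    """What a plain count() sees: every row in every part."""
    return sum(len(part) for part in parts)


def merge(parts):
    """Collapse all parts into one, keeping the highest version per key.

    On a version tie the later-inserted row wins.
    """
    latest = {}
    for part in parts:
        for key, version, payload in part:
            if key not in latest or version >= latest[key][1]:
                latest[key] = (key, version, payload)
    return [sorted(latest.values())]


parts = []
insert(parts, [("order-1", 1, "created"), ("order-2", 1, "created")])
insert(parts, [("order-1", 2, "paid")])     # newer version of order-1
insert(parts, [("order-2", 1, "created")])  # retried duplicate of order-2

before = count_rows(parts)  # duplicates still visible
parts = merge(parts)
after = count_rows(parts)   # one row per key remains

print(before, after)
```

Until the merge runs, the count reflects four rows for two logical orders, which is the window in which dashboards and aggregates can be wrong.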
If you're sending #OTel data to ClickHouse without a processing layer in between, you're probably storing 15–30% duplicate logs and fighting nested JSON schema issues. We built GlassFlow to fix this: dedup, enrichment, & schema mapping for #ClickHouse. → glassflow.dev/use-cases/real…
Still looking for a Founding Engineer 🚀🚀 Not a "senior dev" role. Not a backlog ticket role. A "here's the problem, let's build for it" role. → Low-latency #eventstreaming infra → Billions of events → GlassFlow stack: #Go, #Kafka, #NATS Apply here: join.com/companies/getglassf…
Your Kafka → #ClickHouse pipeline shouldn't be a 2 AM nightmare. ReplacingMergeTree doesn't deduplicate on insert. It merges "eventually", leading to messy count()s. GlassFlow fixes this by handling dedup, PII masking, and more upstream ➡️ Details: glassflow.dev #Kafka
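One common way to deduplicate upstream is to remember event IDs for a fixed time window and drop repeats before they reach the table. The sketch below illustrates that general technique in Python. It is not GlassFlow's implementation: the class name `WindowDeduplicator`, the 60-second window, and the event IDs are assumptions, and it expects timestamps to arrive in non-decreasing order.

```python
from collections import OrderedDict


class WindowDeduplicator:
    """Drop events whose ID was already seen within the last `window` seconds."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.seen = OrderedDict()  # event_id -> first-seen timestamp, oldest first

    def accept(self, event_id, ts):
        """Return True if the event should be forwarded, False if it is a repeat."""
        # Evict IDs that have fallen out of the window.
        while self.seen:
            oldest_id, oldest_ts = next(iter(self.seen.items()))
            if ts - oldest_ts < self.window:
                break
            self.seen.popitem(last=False)
        if event_id in self.seen:
            return False
        self.seen[event_id] = ts
        return True


dedup = WindowDeduplicator(window_seconds=60)

# (event_id, timestamp) pairs; "a" is retried at t=10 and reappears at t=61.
events = [("a", 0), ("a", 10), ("b", 20), ("a", 61)]
forwarded = [event_id for event_id, ts in events if dedup.accept(event_id, ts)]

print(forwarded)
```

The retry at t=10 is dropped, while the event at t=61 passes because the original has aged out of the window. The window length is the trade-off: longer windows catch later retries but hold more state.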
ClickHouse for #observability is great, until you have to deal with:
• Duplicate spans from collector retries
• PII leaking into trace attributes
New guide: #OTel Collector → GlassFlow → #ClickHouse
Dedup, PII masking, tail sampling.
🔗 Link in replies #OpenTelemetry
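For the PII point, a minimal Python sketch of masking span attributes before they are stored. This is a generic illustration rather than the method from the guide: the attribute keys in `SENSITIVE_KEYS`, the email pattern, the truncated SHA-256 digest, and the sample span are all assumptions chosen for the example.

```python
import hashlib
import re

# Simple email pattern for illustration; real-world detection needs more care.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical attribute keys treated as always sensitive.
SENSITIVE_KEYS = {"user.email", "enduser.id"}


def mask_attributes(attrs):
    """Return a copy of span attributes with PII hashed or redacted."""
    masked = {}
    for key, value in attrs.items():
        if key in SENSITIVE_KEYS:
            # Hash so the value can still be grouped on without being readable.
            masked[key] = hashlib.sha256(str(value).encode()).hexdigest()[:12]
        elif isinstance(value, str):
            # Redact emails embedded in free-form values such as URLs.
            masked[key] = EMAIL.sub("[REDACTED_EMAIL]", value)
        else:
            masked[key] = value
    return masked


span = {
    "user.email": "ada@example.com",
    "http.url": "/reset?to=ada@example.com",
    "http.status_code": 200,
}
clean = mask_attributes(span)

print(clean["http.url"], clean["http.status_code"])
```

Hashing keeps known-sensitive fields joinable across spans, while the regex pass catches values that leak into attributes nobody flagged in advance.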