Last week, a customer nearly left Telebugs because of a mistake I made at the design level.
Their queue database grew to 95GB in a single day. Their app was processing ~40,000 jobs daily, and I had designed the main ingestion job to carry large payloads directly in the job arguments. On SQLite, that combination became toxic - even with cleanup running on schedule.
Here's what went wrong, why SQLite made it worse, and the fix 👇