#sqlhelp does a parallel batch mode hash aggregate need a lot more memory than what SQL Server normally estimates? 44 billion row columnstore table. Query is simple: SELECT DISTINCT column_one, column_two. Data types are bigint and nvarchar(500). Optimizer guesses about 5 million unique combinations when in actuality there are about 2.5 million. Memory grant is 1.5 GB which seems like it ought to be enough to store the hash table based on the amount and size of the data. When it spills, it uses about 2TB of tempdb (it must be spilling the data pages of the table?) to avoid the spill, I actually need about 23GB of memory. Any explanation of this behavior would be helpful!
When it spills, it gets to spill level 3 and reports about 263 million pages spilled.