our members ran a great reading group on the major architecture changes in DeepSeek V4!!
they covered: hyper connections, manifold hyper connections, a KV cache + MQA/GQA intro, DeepSeek Sparse Attention (a prerequisite to understanding the new CSA) & a walkthrough of CSA and HCA