When I'm introduced to a new board game, each friend takes a turn explaining their most essential strategy, so I end up with virtually every single game detail. Sometimes, we even spend the first 20 minutes reading said details from the rulebook... As soon as we start playing, however, nearly all these details turn out to be irrelevant for actually playing the game. We really only need the rulebook so that I can convince Jonas that it is indeed possible to achieve world domination in a single turn, even though I seemingly did nothing for 3 hours and swept in at the very last minute to take it all (board game: Risk, 6 players).
Anyways, for that reason, I skip almost all details when introducing someone to a new game, and try to mention only the 2 or 3 compressed scenarios I think are the most important to understand (filling in details as needed). When it works, it's certainly much more effective, yet my friends' approach, in contrast, always works. When mine fails, though, I don't think the problem is a lack of detail; it's more a question of how the scenarios are structured (also in the rulebooks). And even when it fails, just repeating the information, the same (word-for-word) information, once or twice, usually gets everyone on the same page.
To the same tune, many approaches towards making neural networks more efficient identify seemingly redundant or repeated details and remove them. Here, as with many other things in life, we might've slightly overlooked the significance of the details, especially those that are repeated to us. If we can't see the forest for the trees, we tend to burn a few of them to widen our field of view. I'd argue instead that, many times, the forest would be easier to see were we to plant more trees among those already there. In that sense, repeatability and similarity between details can also help reveal the bigger picture, not just obfuscate it.
Besides this (more personal) lens, we take up the perspective of algorithmic complexity in our recent work, "Algorithmic Simplification of Neural Networks with Mosaic-of-Motifs", asking why neural networks, much like our board game rules for world domination, are suited for compression. We demonstrate that the parameters of trained models have more structure and, hence, exhibit lower algorithmic complexity than the weights at (random) initialization. In turn, we present a constrained model parameterization (MoMos) that induces repeatability and structure in neural networks, yielding models with lower algorithmic complexity, along with a theoretical justification for how the parameterization settings control this complexity.
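For a rough feel of why repetition lowers complexity, here is a minimal toy sketch. It is not the MoMos method from the paper: it simply uses the zlib-compressed size of a weight vector as a crude, computable stand-in for algorithmic complexity, and compares a purely random vector against one tiled from a handful of reusable blocks. The function names and the settings `num_motifs` and `motif_len` are my own assumptions for illustration.

```python
import random
import struct
import zlib


def compressed_size(values):
    """zlib-compressed size (bytes) of the float32 encoding of `values`.

    This is only a rough upper-bound proxy for algorithmic complexity.
    """
    raw = struct.pack(f"{len(values)}f", *values)
    return len(zlib.compress(raw, 9))


def random_weights(n, rng):
    """Gaussian weights, as in a typical random initialization."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def motif_weights(n, num_motifs, motif_len, rng):
    """Build a weight vector by tiling blocks drawn from a small pool.

    Hypothetical stand-in for a motif-constrained parameterization.
    """
    motifs = [random_weights(motif_len, rng) for _ in range(num_motifs)]
    out = []
    while len(out) < n:
        out.extend(rng.choice(motifs))
    return out[:n]


rng = random.Random(0)
n = 4096
dense = random_weights(n, rng)
tiled = motif_weights(n, num_motifs=4, motif_len=16, rng=rng)

dense_size = compressed_size(dense)
tiled_size = compressed_size(tiled)
print(f"random: {dense_size} bytes, tiled from motifs: {tiled_size} bytes")
```

In this toy setting, the tiled vector compresses far better than the random one, and shrinking the pool of blocks makes it compress further. Repeating the same few details is what makes the whole thing cheap to describe.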
Paper:
arxiv.org/pdf/2602.14896
ABC:
abitofcomplexity.com
A repeated, but not redundant, thank you to my co-authors Tong Chen, Jonathan Wenshøj, Erik B Dam, and Raghavendra Selvan, for making it easier to see the forest for the trees, and a special thanks to Eduardo Yuji Sakabe for planting some more along the way.