Turns out it's only effective for small-palette images. The 64-color images actually got worse, so I had to make it multi-pass, checking which 2D-tapping model compresses best. Slightly slower, but <=16-color images compress a LOT, especially bitmaps.
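Rough sketch of what "multi-pass" means here, not the actual encoder. The tap models (`left`, `up`, `up_left`), the function names, and the use of zlib as a stand-in for the real back-end compressor are all assumptions made up for illustration:

```python
import zlib

# Hypothetical tap models: which already-decoded neighbour predicts the pixel.
# Offsets are (dx, dy) relative to the current pixel.
TAP_MODELS = {
    "left": (-1, 0),
    "up": (0, -1),
    "up_left": (-1, -1),
}

def residuals(pixels, width, tap):
    """Subtract the tapped neighbour from each pixel (mod 256).

    Taps that fall outside the image predict 0.
    """
    dx, dy = tap
    height = len(pixels) // width
    out = bytearray()
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            inside = 0 <= nx < width and 0 <= ny < height
            pred = pixels[ny * width + nx] if inside else 0
            out.append((pixels[y * width + x] - pred) & 0xFF)
    return bytes(out)

def pick_best_model(pixels, width):
    """One pass per model; keep whichever gives the smallest compressed size."""
    sizes = {
        name: len(zlib.compress(residuals(pixels, width, tap), 9))
        for name, tap in TAP_MODELS.items()
    }
    best = min(sizes, key=sizes.get)
    return best, sizes

# Example: a 16x16 image of palette indices where every row is identical.
row = bytes([0, 1, 2, 3, 3, 2, 1, 0, 0, 0, 1, 1, 2, 2, 3, 3])
best, sizes = pick_best_model(row * 16, 16)
print(best, sizes)
```

The extra passes are where the slowdown comes from: every candidate model means one more full encode. Whichever model wins also has to be recorded in the file so the decoder can undo the same prediction.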
Tool link below:
This Saturday evening I was thinking about the pixel compression format I made, and had a stark realization: I'd restricted the prediction encoder to 1D instead of treating the image as a 2D plane!!
Quick test just now: compressed further, from 849 bytes down to 545 bytes 😳 It finally beats JPEG XL
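To show why 2D prediction can help, here's a toy comparison. It's a minimal sketch that assumes the simplest possible predictors (previous pixel for 1D, pixel directly above for 2D). The post doesn't say which neighbours the real format taps, so the function names and the test image are hypothetical:

```python
def residuals_1d(pixels):
    """1D: predict each pixel from the previous one in raster order."""
    prev, out = 0, []
    for p in pixels:
        out.append((p - prev) & 0xFF)
        prev = p
    return out

def residuals_2d(pixels, width):
    """2D: predict from the pixel directly above.

    The first row has nothing above it, so it falls back to the left neighbour.
    """
    out = []
    for i, p in enumerate(pixels):
        if i >= width:
            pred = pixels[i - width]   # pixel above
        elif i > 0:
            pred = pixels[i - 1]       # first row: pixel to the left
        else:
            pred = 0                   # very first pixel
        out.append((p - pred) & 0xFF)
    return out

# 8x8 image of vertical stripes using palette indices 0..3
width = 8
image = [0, 1, 2, 3, 0, 1, 2, 3] * 8

zeros_1d = residuals_1d(image).count(0)
zeros_2d = residuals_2d(image, width).count(0)
print(zeros_1d, zeros_2d)  # prints: 1 57
```

On this striped image the 1D predictor misses at almost every pixel, while the 2D one gets every row after the first exactly right. Long runs of zero residuals are what the entropy coder downstream can squeeze.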