I just pushed a big refactoring of the DS4 backends, with CUDA support and single-direction activation steering. The Metal path should be unaffected. Note: I only support hardware I own (or have full access to), so just M3 (no M5 NE for now) and DGX Spark.
Soon in DS4: 1. CUDA support (14 t/s generation, 350 t/s prefill on DGX Spark). 2. Single-direction steering support. 3. Huge refactoring to support Metal / CUDA / CPU in a more sensible way.