I mostly use frontier models through a $200 sub from OpenAI/Anthropic, but I also run my DGX Spark for custom CUDA-accelerated pipelines for work, and I maintain some local models on that hardware, such as Qwen3.6:35b, all connected to my unified local/remote SSH fleet across my 5 systems. My main desktop is a 5070 Ti machine with 64 GB RAM and 16 TB storage, my 16-inch MacBook Pro with 48 GB is the main memory hub, and my DGX Spark/Windows desktop/Windows laptop with a 5070 12 GB, a 24-core Intel CPU, and 32 GB all connect through the same memories across Claude/Codex/grokbuild. I guess what I'm basically saying is that you might as well take advantage of the best of both worlds and have your own 3D knowledge graph memory archaeologist to help you manage your fleet. These arguments are kind of pointless when the only thing that matters is being able to execute.