There have been a lot of algorithmic innovations in offline RL recently, along with a silent evolution of minor design choices.
What if we applied these seemingly minor modifications to an established minimalistic baseline by @shaneguML?
Turns out, the gains are enormous.