5/ We really do mean an unmodified Transformer: no explicit calculation of pairwise distances, no graph-based features, no rotational equivariance, etc. Because it can take full advantage of modern software and hardware, a 1B-parameter Transformer trains and runs inference faster than a 6M-parameter equivariant GNN.
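To make "unmodified" concrete, here is a rough sketch of what feeding atoms to plain self-attention could look like. It is not the model from this thread: the way coordinates enter the tokens (a linear projection of raw xyz added to a per-element embedding), the width `D_MODEL`, and every function name are assumptions for illustration, and a real Transformer adds multiple heads, MLP blocks, residual connections, and normalization. The point is only that nothing here computes pairwise distances, builds a graph, or enforces rotational equivariance.

```python
import math
import random

D_MODEL = 8  # hypothetical width, chosen only for illustration


def rand_matrix(rows, cols, rng):
    """Small random weight matrix (stand-in for learned parameters)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(vec, mat):
    """Multiply a length-r vector by an r x c matrix, giving a length-c vector."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def embed_atoms(atomic_numbers, coords, species_table, coord_proj):
    """One token per atom: element embedding plus a projection of raw xyz.

    Assumption: raw coordinates go straight in. No distances, no neighbor lists.
    """
    tokens = []
    for z, xyz in zip(atomic_numbers, coords):
        pos = matvec(xyz, coord_proj)  # 3 -> D_MODEL
        tokens.append([s + p for s, p in zip(species_table[z], pos)])
    return tokens


def self_attention(tokens, wq, wk, wv):
    """Standard single-head scaled dot-product attention over all atom tokens."""
    q = [matvec(t, wq) for t in tokens]
    k = [matvec(t, wk) for t in tokens]
    v = [matvec(t, wv) for t in tokens]
    scale = 1.0 / math.sqrt(len(q[0]))
    weights = [
        softmax([scale * sum(a * b for a, b in zip(qi, kj)) for kj in k])
        for qi in q
    ]
    out = [
        [sum(w * vj[d] for w, vj in zip(row, v)) for d in range(len(v[0]))]
        for row in weights
    ]
    return out, weights


rng = random.Random(0)
species_table = {z: [rng.gauss(0.0, 0.1) for _ in range(D_MODEL)] for z in (1, 8)}
coord_proj = rand_matrix(3, D_MODEL, rng)
wq, wk, wv = (rand_matrix(D_MODEL, D_MODEL, rng) for _ in range(3))

# Water molecule (O, H, H) with approximate coordinates in angstroms
atomic_numbers = [8, 1, 1]
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]

tokens = embed_atoms(atomic_numbers, coords, species_table, coord_proj)
out, weights = self_attention(tokens, wq, wk, wv)
print(len(out), len(out[0]))  # one output vector per atom
```

Any geometric structure such a model uses has to be learned from the data, since the attention weights depend on the tokens alone rather than on hand-built distance or graph features.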