The problem of ensuring the same results of machine learning when trained on different hardware is solved by Reproducible Operators (RepOps), a framework by Gensyn that offers deterministic algorithms of popular machine learning tasks, making sure that they provide the same output.
Although both models are the same, differences in the architecture of the gpu and the driver versions may result in differences in the bits generated when carrying out machine learning functions.
Output errors are caused by driver changes, concurrent engineering, and floatingpoint roundoff error.
This is not a problem with centralized training, but with decentralized networks, where you must ensure work is done by anonymous machines, it becomes a severe problem.
Repops eliminates this problem by applying deterministic execution. In Brazil, when a matrix is multiplied using a GPU it produces the same result as it would using a CPU in Korea.
Results.ge means that distributed workloads do not deviate or change inconsistently because it can predict its results with accuracy on a bit.
The calculation is correct when the results of two nodes are identical. Otherwise, Gensyn will be able to spot the problem step immediately.
RepOps provides a basis of trust; in contrast to traditional decentralized computing which can experience inconsistencies, RepOps has deterministic operations and can, on the fly, punish dishonest nodes and check the accuracy of its operation cheaply.
It is a small aspect, which allows the possibility of converting global hardware into a reliable and integrated engine that can train serious AI models with ease, without a necessity of any single machine, just math.
@gensynai