Got somewhat confused by this article from Microsoft: based on earlier reading from leaders in the space, I expected scaling laws to apply to cell biology. See in particular Noetik’s TARIO piece:
noetik.blog/p/scaling-behavi…
Microsoft and Noetik reach opposite verdicts on whether AI scales in cell biology, but the contradiction is superficial:
- Microsoft froze compute and varied only data, on a public dissociated-cell corpus, scored against easy tasks that plain PCA already solves, and saw performance plateau at 1% of the training set.
- TARIO co-scaled parameters, context, and data, on proprietary spatial tumor data, scored against a hard generative target with significant headroom, and saw no plateau.
Key differences: compute design, modality, data quality, and task difficulty.
One point of convergence, though: both find that scaling within a narrow distribution doesn't buy out-of-distribution generalization. The most reasonable conclusion seems to be that, in bio and elsewhere, scaling works when compute co-scales with data, the data is high-quality and distribution-matched, and the task has room to improve.
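
For intuition on why "freeze compute, vary data" and "co-scale everything" can give opposite pictures, here is a toy sketch. It assumes a generic parametric loss of the form E + A/N^alpha + B/D^beta, borrowed from the language-model scaling literature, and treats "frozen compute" as a fixed model size N. Neither Microsoft nor Noetik is known to have fit this form, and every constant below is invented for illustration.

```python
# Toy illustration only: the functional form and all constants are assumptions,
# not values taken from the Microsoft or Noetik results.

def loss(n_params, n_samples, E=1.0, A=400.0, B=400.0, alpha=0.3, beta=0.3):
    """Hypothetical loss: irreducible term + model-size term + data-size term."""
    return E + A / n_params**alpha + B / n_samples**beta

data_sizes = [1e6, 1e7, 1e8, 1e9, 1e10, 1e11, 1e12]

# Design 1: model size frozen at 1e7, only data grows.
frozen_n = 1e7
frozen = [loss(frozen_n, d) for d in data_sizes]

# Design 2: model size grows in step with data.
co_scaled = [loss(d, d) for d in data_sizes]

# With N frozen, the loss can never drop below this floor, however much data is added.
frozen_floor = 1.0 + 400.0 / frozen_n**0.3

for d, f, c in zip(data_sizes, frozen, co_scaled):
    print(f"data={d:.0e}  frozen={f:.3f}  co_scaled={c:.3f}")
print(f"floor with frozen model size: {frozen_floor:.3f}")
```

Under these assumptions the frozen curve flattens toward its floor while the co-scaled curve keeps heading toward the irreducible term E, so the same underlying behavior can read as "plateau" or "no plateau" depending on the experimental design.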