I finally read the Kosmos "AI Scientist" paper from FutureHouse. Here is a bit about what they did and what I think about it.
The general idea behind this paper, and others like it, is that science follows a series of steps and that many of these steps can be automated. Those steps are:
- Search the literature. Read stuff.
- Use your reading to come up with new hypotheses. Try to draw connections between things.
- Analyze data to draw conclusions. Write up your results.
- Repeat.
Kosmos uses two kinds of agents, one for data analysis and another for literature searches, to go out and do these tasks while sharing information with each other. In other words, each agent can see what the other agents have learned, which is super useful. They exist within a single "world model." A single run of Kosmos can execute up to 42,000 lines of code across 166 different data analysis agents, and also read 1,500 scientific papers using 36 literature review agents. Each run takes up to 12 hours.
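To make the shared "world model" idea a bit more concrete, here is a toy sketch of two agents writing to, and reading from, one common store. Everything in it (the `WorldModel` class, the two agent functions, the file name) is hypothetical and invented for illustration; the post does not describe how Kosmos actually implements this.

```python
from dataclasses import dataclass, field


@dataclass
class WorldModel:
    """Hypothetical shared store; NOT Kosmos's actual data structure."""
    findings: list[str] = field(default_factory=list)

    def add(self, source: str, note: str) -> None:
        # Tag each note with the kind of agent that produced it.
        self.findings.append(f"[{source}] {note}")


def literature_agent(world: WorldModel, topic: str) -> None:
    # Stand-in for a literature search: just records a placeholder note.
    world.add("literature", f"summary of papers about {topic}")


def analysis_agent(world: WorldModel, dataset: str) -> None:
    # The analysis agent can see everything written so far.
    prior = len(world.findings)
    world.add("analysis", f"analysis of {dataset} using {prior} prior notes")


world = WorldModel()
literature_agent(world, "hypothermia")
analysis_agent(world, "metabolomics.csv")  # hypothetical file name
for note in world.findings:
    print(note)
```

The only point of the sketch is that the second agent's output depends on what the first one wrote, which is the "agents can see what the others have learned" part.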
So that’s the gist. You spin this thing up, give it a huge prompt, and then let it cook. In this preprint, they report seven discoveries that they say were made by Kosmos; “three discoveries made by Kosmos reproduce findings from preprinted or unpublished manuscripts,” which are not in its training dataset, “while the remaining four make novel contributions to the scientific literature.”
FutureHouse handed Kosmos to researchers around the world, working in myriad fields (electronics, neurology, materials, etc.), and let them test it out. Here are some of the “discoveries” they reported:
1. Given some mouse brain metabolomics data, Kosmos suggested that cooling the brain might activate nucleotide-salvage pathways, which help preserve neurons during hypothermia. This had been shown in an unpublished paper and was later re-confirmed.
2. Using environmental sensor data from a recent arXiv paper, Kosmos identified a linear relationship between the solvent vapor on a solar cell and that cell’s current. In other words, humidity matters a lot? Not sure if this is surprising or not, as I have no background in this field. But again, it was a sort of “re-discovery” to see if Kosmos could find results that humans had already identified (but had not yet published).
3. Higher levels of an enzyme called superoxide dismutase 2 (SOD2) in the blood may reduce myocardial fibrosis. Published papers had previously identified a correlation between SOD2 and myocardial fibrosis, but Kosmos pointed at it again, and humans followed up to show it’s causal.
Here are my quick thoughts:
1. Many other AI scientists, built at both nonprofits and for-profits and not yet released, are trying to do the same thing. We clearly need better benchmarks to know what is real and what is fake. It seems like Kosmos is real, but how does it compare to what Google and others are building?
2. I’m not wholly convinced that extremely long runs will be palatable to most biology researchers. My take is that researchers are looking for more of a real-time collaborator, where you’re constantly prompting and getting immediate feedback, rather than just delegating huge, open-ended tasks to agents. If a “general user” tests out Kosmos, pays the large price tag, and is disappointed by the results, will they keep using it? The wait time is a huge barrier, as is the price (even though academics get generous access). It also seems difficult to prompt engineer?
3. This paper tries to quantify “the time it would take for a human scientist to complete the work that Kosmos performs in an individual run,” but I find it a bit hand-wavy. They say it takes a typical researcher 15 minutes to read a paper and 2 hours to write a Jupyter notebook for data analysis and, since Kosmos can read 1,500 papers per run, it offers a huge time savings.
But human scientists don’t need to read hundreds of papers to make a discovery! The best scientists have an innate ability to “triangulate to innovation,” to find the right combo of papers and discussions that enables them to make conceptual advances. This seems difficult to replicate.
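To show what that time estimate looks like as arithmetic, here is a rough sketch using the per-task figures quoted in point 3. The 15 minutes per paper, 2 hours per notebook, and 1,500 papers come from the text above; the notebook count is my own assumption (one notebook per data analysis agent, so 166), not a number the paper gives in this form.

```python
MINUTES_PER_PAPER = 15    # quoted estimate: time to read one paper
HOURS_PER_NOTEBOOK = 2    # quoted estimate: time to write one notebook
PAPERS_PER_RUN = 1_500    # papers read in a single Kosmos run
NOTEBOOKS_PER_RUN = 166   # ASSUMPTION: one notebook per data analysis agent


def human_hours(papers: int, notebooks: int) -> tuple[float, float, float]:
    """Return (reading, coding, total) hours for a human doing the same work."""
    reading = papers * MINUTES_PER_PAPER / 60
    coding = notebooks * HOURS_PER_NOTEBOOK
    return reading, coding, reading + coding


reading, coding, total = human_hours(PAPERS_PER_RUN, NOTEBOOKS_PER_RUN)
print(f"reading: {reading:.0f} h, coding: {coding:.0f} h, total: {total:.0f} h")
```

Under those assumptions, reading alone comes to 375 hours. The total scales linearly with the number of papers, which is exactly the part I find hand-wavy: it only holds if a human would actually need to read all 1,500 of them.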
I'd like to have more discussions about AI Scientists, if any of you are interested.