Introducing BioCLIP: A Vision Foundation Model for the Tree of Life
imageomics.github.io/bioclip…
A foundation model that generalizes strongly across the tree of life (2M species), outperforms OpenAI CLIP by 18% in zero-shot classification, and supports open-ended classification over almost the entire tree of life.
What are the secret ingredients?
> Data: we curate and release TreeOfLife-10M, the largest and most diverse ML-ready dataset of organism images to date. It contains 10.4M images for over 450K taxa, sourced from iNaturalist, BIOSCAN, and Encyclopedia of Life.
> Modeling: we creatively repurpose CLIP's multimodal contrastive learning objective for hierarchical image classification. The autoregressive language model naturally encodes the hierarchy of the tree of life taxonomy, which in turn bakes the hierarchical representation into the vision transformer encoder.
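To make the modeling idea concrete, here is a minimal sketch, not the released training code. It assumes the taxonomy is flattened into one text string ordered from kingdom down to species, and it uses a standard CLIP-style symmetric contrastive loss on toy embeddings. The helper names, the text template, and the temperature value are illustrative assumptions.

```python
import math

# Linnaean ranks from root to leaf. The exact text template is an assumption.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]

def taxonomic_caption(taxon):
    """Flatten a taxon record into one string, root rank first."""
    return " ".join(taxon[rank] for rank in RANKS if rank in taxon)

def _normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def _cross_entropy(logits, target):
    # Numerically stable log-sum-exp minus the logit of the correct pair.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[target]

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; row i of each input is a matched pair."""
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    n = len(imgs)
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    image_to_text = sum(_cross_entropy(logits[k], k) for k in range(n)) / n
    text_to_image = sum(
        _cross_entropy([logits[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2

monarch = {
    "kingdom": "Animalia", "phylum": "Arthropoda", "class": "Insecta",
    "order": "Lepidoptera", "family": "Nymphalidae",
    "genus": "Danaus", "species": "plexippus",
}
caption = taxonomic_caption(monarch)
```

Because the caption reads root to leaf, two species in the same family share a long text prefix, which is one intuition for how the text side can pass hierarchy to the vision encoder.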
Key results
> Strong zero/few-shot classification for animals/plants/fungi, including rare species, outperforming CLIP by 16-18% absolute on average.
> t-SNE visualization shows that BioCLIP's vision encoder has captured the fine-grained hierarchical structure of the tree of life.
> BioCLIP is a kind of universal classifier for the tree of life. Just give it an organism image and it will likely find the correct species (among the top 5)! But use it with caution; it's not perfect yet.
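For readers curious what "among the top 5" means mechanically, here is a rough sketch of zero-shot top-k classification with a CLIP-style model: embed the image, embed each candidate species name, and rank the names by cosine similarity. The embeddings below are made-up toy vectors, and `top_k_labels` is a hypothetical helper, not part of any BioCLIP release.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_labels(image_emb, label_embs, k=5):
    """Return the k label names whose text embedding is closest to the image."""
    ranked = sorted(
        label_embs.items(),
        key=lambda item: cosine(image_emb, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Toy text embeddings for three candidate species (values are invented).
label_embs = {
    "Danaus plexippus": [0.9, 0.1, 0.0],
    "Vulpes vulpes": [0.1, 0.9, 0.0],
    "Amanita muscaria": [0.0, 0.1, 0.9],
}
image_emb = [0.8, 0.2, 0.1]  # pretend this came from the vision encoder

predictions = top_k_labels(image_emb, label_embs, k=2)
```

With a real model the candidate list can cover hundreds of thousands of taxa, so the text embeddings would typically be computed once and cached.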
Final remarks
> AI for Science is really hard but extremely rewarding! It took us a ton of time (1 year) and frustration trying to find a plausible way to integrate the tree of life taxonomy into foundation model training. But when the "Eureka!" moment came and the idea hit us (thanks to the great @weilunchao) that CLIP's multimodal contrastive learning objective can be repurposed for that, everything just followed naturally. It was truly a moment of joy and excitement!
> BioCLIP is our first attempt at foundation models for biology, but it certainly won't be the last! There's so much more to do at the intersection of one of the oldest scientific disciplines and the young but thriving field of AI. Biological intelligence is the foundation for artificial intelligence, and artificial intelligence will in turn become the most important tool for us to unravel the mysteries of biological intelligence.
We are hiring postdocs and PhDs at the NSF @imageomics institute to explore this exciting field! Drop us an email. Also happy to chat about it at #NeurIPS2023 with any of Tanya, @weilunchao, or me.
- paper: arxiv.org/abs/2311.18803
- project: imageomics.github.io/bioclip…
- demo: huggingface.co/spaces/imageo…
- model: huggingface.co/imageomics/bi…
- data (TreeOfLife-10M): to be released on Hugging Face soon
Joint work with the amazing @imageomics team: @samstevens6860, Lisa Wu, Matt Thompson, Elizabeth Campolongo, @luke_ch_song, @Carlyn2015, @donglixp, @dahdulw, Chuck Stewart, Tanya Berger-Wolf, @weilunchao, @ysu_nlp