New open weights LLM from @MistralAI
params.json:
- hidden_dim / dim = 14336/4096 => 3.5X MLP expand
- n_heads / n_kv_heads = 32/8 => 4X grouped-query (4 query heads share each KV head)
- "moe" => mixture of experts 8X top 2 👀
Likely related code:
github.com/mistralai/megablo…
Oddly absent: an over-rehearsed professional release video talking about a revolution in AI.
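For anyone who wants the params.json arithmetic spelled out, here is a minimal sketch in plain Python. The four numeric values are the ones quoted above. Everything else is an assumption for illustration: the `NUM_EXPERTS` and `TOP_K` names, the toy experts, and the choice to softmax over only the selected experts' router logits (one common formulation, not confirmed by the release).

```python
import math

# Values quoted from params.json in the post above.
params = {"dim": 4096, "hidden_dim": 14336, "n_heads": 32, "n_kv_heads": 8}

# Assumed reading of "8X top 2"; these names are hypothetical, not real config keys.
NUM_EXPERTS = 8
TOP_K = 2

mlp_expand = params["hidden_dim"] / params["dim"]            # MLP expansion factor
queries_per_kv = params["n_heads"] // params["n_kv_heads"]   # query heads per KV head


def top_k_route(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts, then softmax over just those k logits."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_forward(x, router_logits, experts, k=TOP_K):
    """Weighted sum of the outputs of the k selected experts for one token."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy stand-ins: expert i just multiplies a scalar input by (i + 1).
toy_experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]
toy_logits = [0.1, 2.0, -1.0, 0.5, 1.0, 0.0, -0.3, 1.2]

routes = top_k_route(toy_logits)
output = moe_forward(1.0, toy_logits, toy_experts)
print(mlp_expand, queries_per_kv, routes, output)
```

The point of top-2 routing is that each token only runs through 2 of the 8 expert MLPs, so compute per token stays far below what the total parameter count suggests.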
If people are wondering why there is so much AI activity right around now, it's because the biggest deep learning conference (NeurIPS) is next week.
Meta releases DINOv2, the first method for training computer vision models that uses self-supervised learning (no labeling needed) to achieve industry-standard results.
Meta releases the Segment Anything Project:
- SA-1B dataset containing over 1 billion segmentation masks
- Segment Anything Model (SAM) for general image and video segmentation
Damo releases an open-source text-to-video model with 1.7B parameters. The demo needs only 16GB of CPU RAM and 16GB of GPU RAM to run. Try it out on Hugging Face below: