💡LxMLS 2025 is almost here!
From July 19 to 25, @istecnico will once again host the Lisbon Machine Learning Summer School (LxMLS), a flagship event in the ML community. IT is among the event supporters.
🔗 it.pt/Events/Event/5680#SummerSchool
˗ˏˋ bop and it's live! ˎˊ-
In a world full of cookie-cutter design, we're bringing character to products.
And, we just launched our first website!
Link below ↓
🦅 Eagle & 🐦 Finch
The RWKV v5 and v6 architecture paper is here
arxiv.org/abs/2404.05892
Both improve over RWKV-4 and scale up to 7.5B and 3.1B parameter multilingual models, respectively
Open-source code, weights, and dataset
Apache 2 licensed, under Linux Foundation
RWKV: Reinventing RNNs for the Transformer Era
We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs. Our approach leverages a linear attention mechanism and allows us to formulate the model as either a Transformer or an RNN, which parallelizes computations during training and maintains constant computational and memory complexity during inference, leading to the first non-transformer architecture to be scaled to tens of billions of parameters. Our experiments reveal that RWKV performs on par with similarly sized Transformers, suggesting that future work can leverage this architecture to create more efficient models. This work presents a significant step towards reconciling the trade-offs between computational efficiency and model performance in sequence processing tasks.
paper page: huggingface.co/papers/2305.1…
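The abstract's central point, that a linear attention mechanism can be evaluated either in a Transformer-like form or as an RNN with a fixed-size state, can be illustrated with a small sketch. This is generic normalized causal linear attention, not RWKV's actual formulation (which adds time decay and receptance gating). The function names, the exponential feature map, and the toy dimensions are all assumptions chosen for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_inputs(seed, length, dim):
    """Hypothetical helper: random queries, keys, and values for a toy sequence."""
    rng = random.Random(seed)

    def vec():
        return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

    # Assumed feature map: exp() keeps queries and keys positive,
    # so the normalizer below is never zero.
    qs = [[math.exp(x) for x in vec()] for _ in range(length)]
    ks = [[math.exp(x) for x in vec()] for _ in range(length)]
    vs = [vec() for _ in range(length)]
    return qs, ks, vs


def attention_parallel(qs, ks, vs):
    """Transformer-style view: position t looks back over positions 0..t.

    Written as plain loops here, but each position is independent of the
    others, so the positions could be computed in parallel.
    """
    d_v = len(vs[0])
    outs = []
    for t in range(len(qs)):
        weights = [dot(qs[t], ks[i]) for i in range(t + 1)]
        total = sum(weights)
        outs.append(
            [sum(w * vs[i][j] for i, w in enumerate(weights)) / total
             for j in range(d_v)]
        )
    return outs


def attention_recurrent(qs, ks, vs):
    """RNN-style view: one fixed-size state, updated once per token."""
    d_k, d_v = len(ks[0]), len(vs[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of k_i v_i^T
    norm = [0.0] * d_k                         # running sum of k_i
    outs = []
    for q, k, v in zip(qs, ks, vs):
        for a in range(d_k):
            norm[a] += k[a]
            for b in range(d_v):
                state[a][b] += k[a] * v[b]
        denom = dot(q, norm)
        outs.append(
            [sum(q[a] * state[a][b] for a in range(d_k)) / denom
             for b in range(d_v)]
        )
    return outs


def max_abs_diff(xs, ys):
    return max(abs(a - b) for x, y in zip(xs, ys) for a, b in zip(x, y))


if __name__ == "__main__":
    qs, ks, vs = make_inputs(seed=0, length=6, dim=4)
    diff = max_abs_diff(attention_parallel(qs, ks, vs),
                        attention_recurrent(qs, ks, vs))
    print("max difference between the two views:", diff)
```

The recurrent version keeps only `state` and `norm`, whose sizes depend on the feature dimensions and not on the sequence length, which is the sense in which per-token inference cost stays constant.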
Everyone knows that transformers are synonymous with large language models… but what if they weren’t? Over the past two years @BlinkDL_AI and team have been hard at work scaling RNNs to unprecedented scales. Today we are releasing a preprint on our work
arxiv.org/abs/2305.13048
Next Friday, João Nadkarni presents the paper "CoLT5: Faster Long-Range Transformers with Conditional Computation" in @OutSystems AI RG.
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact.
We launched a petition to democratize AI research by establishing an international, publicly funded supercomputing facility equipped with 100,000 state-of-the-art AI accelerators to train open source foundation models.
laion.ai/blog/petition/
openpetition.eu/petition/onl…
Next Friday, @SSamDav presents the paper "Pretraining Language Models with Human Preferences" in @OutSystems AI RG.
More details on the thread 1/2.
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact.
Next Friday, I'm presenting the paper "Jaint: A Framework for User-Defined Dynamic Taint-Analyses Based on Dynamic Symbolic Execution of Java Programs" in @OutSystems AI RG.
More details on the thread 1/2.
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact.
Today, Bartłomiej Matejczyk presents the paper: "Relational Memory Augmented Language Models" in @OutSystems AI RG.
More details on the thread 1/2.
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact info.
Friday, @joanacoutinho01 presents the paper: "KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs" in @OutSystems AI RG.
More details on the thread 1/2.
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact info.
Friday, A Menezes presents the paper: "Hungry Hungry Hippos: Towards Language Modeling with State Space Models" in @OutSystems AI RG. 🦛
More details on the thread 1/2. 🦛
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact info.
Friday, J. Nadkarni presents the paper: "REPLUG: Retrieval-Augmented Black-Box Language Models" in @OutSystems AI RG.
More details on the thread 1/2.
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact info.
Friday, @SSamDav presents the paper: "A Generalization of ViT/MLP-Mixer to Graphs" in @OutSystems AI RG.
More details on the thread 1/2.
Check out outsyste.ms/ai-reading-grp for archives, zoom link, and contact info.