AI Paper Daily

18 Mar 2025

Zhu, Jiachen, Xinlei Chen, Kaiming He, Yann LeCun, and Zhuang Liu. "Transformers without Normalization." arxiv.org/abs/2503.10622

Transformers without Normalization

Normalization layers are ubiquitous in modern neural networks and have long been considered essential. This work demonstrates that Transformers without normalization can achieve the same or better...

arxiv.org

AI Paper Daily @AIPapers

14 Mar 2025

Kudugunta, Sneha, Aditya Kusupati, Tim Dettmers, Kaifeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, and Prateek Jain. "MatFormer: Nested transformer for elastic inference." arxiv.org/abs/2310.07707

MatFormer: Nested Transformer for Elastic Inference

Foundation models are applied in a broad spectrum of settings with different inference constraints, from massive multi-accelerator clusters to resource-constrained standalone mobile devices....

arxiv.org

AI Paper Daily @AIPapers

13 Mar 2025

Dao, Tri, Dan Fu, Stefano Ermon, Atri Rudra, and Christopher Ré. "FlashAttention: Fast and memory-efficient exact attention with IO-awareness." arxiv.org/abs/2205.14135

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Transformers are slow and memory-hungry on long sequences, since the time and memory complexity of self-attention are quadratic in sequence length. Approximate attention methods have attempted to...

arxiv.org

AI Paper Daily @AIPapers

12 Mar 2025

Chu, Tianzhe, Yuexiang Zhai, Jihan Yang, Shengbang Tong, Saining Xie, Dale Schuurmans, Quoc V. Le, Sergey Levine, and Yi Ma. "SFT memorizes, RL generalizes: A comparative study of foundation model post-training." arxiv.org/abs/2501.17161

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation...

Supervised fine-tuning (SFT) and reinforcement learning (RL) are widely used post-training techniques for foundation models. However, their roles in enhancing model generalization capabilities...

arxiv.org

AI Paper Daily @AIPapers

9 Mar 2025

Gekhman, Zorik, Gal Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, and Jonathan Herzig. "Does fine-tuning LLMs on new knowledge encourage hallucinations?" arxiv.org/abs/2405.05904

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

When large language models are aligned via supervised fine-tuning, they may encounter new factual information that was not acquired through pre-training. It is often conjectured that this can...

arxiv.org

AI Paper Daily @AIPapers

9 Mar 2025

Wu, Tianhao, Weizhe Yuan, Olga Golovneva, Jing Xu, Yuandong Tian, Jiantao Jiao, Jason Weston, and Sainbayar Sukhbaatar. "Meta-rewarding language models: Self-improving alignment with LLM-as-a-meta-judge." arxiv.org/abs/2407.19594

Meta-Rewarding Language Models: Self-Improving Alignment with...

Large Language Models (LLMs) are rapidly surpassing human knowledge in many domains. While improving these models traditionally relies on costly human data, recent self-rewarding mechanisms (Yuan...

arxiv.org

AI Paper Daily @AIPapers

24 Jan 2025

Zelikman, Eric, Yuhuai Wu, Jesse Mu, and Noah Goodman. "STaR: Bootstrapping reasoning with reasoning." arxiv.org/abs/2203.14465

AI Paper Daily @AIPapers

15 Jan 2025

Yao, Shunyu, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan. "Tree of thoughts: Deliberate problem solving with large language models." arxiv.org/abs/2305.10601

AI Paper Daily @AIPapers

15 Jan 2025

He, Chaoqun, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu et al. "OlympiadBench: A challenging benchmark for promoting AGI with olympiad-level bilingual multimodal scientific problems." arxiv.org/abs/2402.14008

OlympiadBench: A Challenging Benchmark for Promoting AGI with...

Recent advancements have seen Large Language Models (LLMs) and Large Multimodal Models (LMMs) surpassing general human capabilities in various tasks, approaching the proficiency level of human...

arxiv.org

AI Paper Daily @AIPapers

14 Jan 2025

Huang, Zhen, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan et al. "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI." arxiv.org/abs/2406.12753

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning...

The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential...

arxiv.org

AI Paper Daily @AIPapers

10 Aug 2024

Feng, Xidong, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, and Jun Wang. "ChessGPT: Bridging Policy Learning and Language Modeling." arxiv.org/abs/2306.09200

AI Paper Daily @AIPapers

6 Apr 2024

Azar, Mohammad Gheshlaghi, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, and Rémi Munos. "A general theoretical paradigm to understand learning from human preferences." arxiv.org/abs/2310.12036

AI Paper Daily @AIPapers

5 Apr 2024

Rafailov, Rafael, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. "Direct preference optimization: Your language model is secretly a reward model." arxiv.org/abs/2305.18290

AI Paper Daily @AIPapers

3 Apr 2024

Chiang, Wei-Lin, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang et al. "Chatbot Arena: An open platform for evaluating LLMs by human preference." arxiv.org/abs/2403.04132

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Large Language Models (LLMs) have unlocked new capabilities and applications; however, evaluating the alignment with human preferences still poses significant challenges. To address this issue, we...

arxiv.org

AI Paper Daily @AIPapers

1 Apr 2024

Jaeger, Bernhard, and Andreas Geiger. "An Invitation to Deep Reinforcement Learning." arxiv.org/abs/2312.08365

An Invitation to Deep Reinforcement Learning

Training a deep neural network to maximize a target objective has become the standard recipe for successful machine learning over the last decade. These networks can be optimized with supervised...

arxiv.org

AI Paper Daily @AIPapers

21 Aug 2023

Bubeck, Sébastien, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee et al. "Sparks of artificial general intelligence: Early experiments with GPT-4." arxiv.org/abs/2303.12712

AI Paper Daily @AIPapers

4 Aug 2023

Wen, Yuxin, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, and Tom Goldstein. "Hard prompts made easy: Gradient-based discrete optimization for prompt tuning and discovery." arxiv.org/abs/2302.03668

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for...

The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be...

arxiv.org

AI Paper Daily @AIPapers

1 Aug 2023

Qin, Haotong, Ge-Peng Ji, Salman Khan, Deng-Ping Fan, Fahad Shahbaz Khan, and Luc Van Gool. "How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges." arxiv.org/abs/2307.15016

AI Paper Daily @AIPapers

19 Jul 2023

Chung, Hyung Won, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li et al. "Scaling instruction-finetuned language models." arxiv.org/abs/2210.11416

Scaling Instruction-Finetuned Language Models

Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction...

arxiv.org

AI Paper Daily @AIPapers

18 Jul 2023

Wei, Jason, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, and Quoc V. Le. "Finetuned language models are zero-shot learners." arxiv.org/abs/2109.01652

Finetuned Language Models Are Zero-Shot Learners

This paper explores a simple method for improving the zero-shot learning abilities of language models. We show that instruction tuning -- finetuning language models on a collection of tasks...

arxiv.org