A secret AI researcher's reading list.

Joined May 2020
Kudugunta, Sneha, Aditya Kusupati, Tim Dettmers, Kaifeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, and Prateek Jain. "MatFormer: Nested transformer for elastic inference." arxiv.org/abs/2310.07707
Chu, Tianzhe, Yuexiang Zhai, Jihan Yang, Shengbang Tong, Saining Xie, Dale Schuurmans, Quoc V. Le, Sergey Levine, and Yi Ma. "SFT memorizes, RL generalizes: A comparative study of foundation model post-training." arxiv.org/abs/2501.17161
Wu, Tianhao, Weizhe Yuan, Olga Golovneva, Jing Xu, Yuandong Tian, Jiantao Jiao, Jason Weston, and Sainbayar Sukhbaatar. "Meta-rewarding language models: Self-improving alignment with LLM-as-a-meta-judge." arxiv.org/abs/2407.19594
Zelikman, Eric, Yuhuai Wu, Jesse Mu, and Noah Goodman. "STaR: Bootstrapping reasoning with reasoning." arxiv.org/abs/2203.14465
Yao, Shunyu, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan. "Tree of thoughts: Deliberate problem solving with large language models." arxiv.org/abs/2305.10601
He, Chaoqun, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu et al. "OlympiadBench: A challenging benchmark for promoting AGI with olympiad-level bilingual multimodal scientific problems." arxiv.org/abs/2402.14008
Feng, Xidong, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, and Jun Wang. "ChessGPT: Bridging Policy Learning and Language Modeling." arxiv.org/abs/2306.09200
Azar, Mohammad Gheshlaghi, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, and Rémi Munos. "A general theoretical paradigm to understand learning from human preferences." arxiv.org/abs/2310.12036
Rafailov, Rafael, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. "Direct preference optimization: Your language model is secretly a reward model." arxiv.org/abs/2305.18290
Bubeck, Sébastien, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee et al. "Sparks of artificial general intelligence: Early experiments with GPT-4." arxiv.org/abs/2303.12712
Qin, Haotong, Ge-Peng Ji, Salman Khan, Deng-Ping Fan, Fahad Shahbaz Khan, and Luc Van Gool. "How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges." arxiv.org/abs/2307.15016