Engineer. Hard assets investor. 10 years in trenches.

Joined April 2014
274 Photos and videos
Jan 1
Who ages better over 100 years: a mathematician (Ramanujan) or a technologist (Elon Musk)?
Aj retweeted
23 Dec 2023
RAG for LLMs. Great to see an overview of all the retrieval-augmented generation (RAG) research that has been happening. Should be a great read for the end of the year. arxiv.org/abs/2312.10997v1
Aj retweeted
20 Sep 2023
The first model to surpass GPT-4 on AlpacaEval. Waiting for more details about the methods used. Xwin 7B: huggingface.co/Xwin-LM/Xwin-… Xwin 13B: huggingface.co/Xwin-LM/Xwin-… Xwin 70B: huggingface.co/Xwin-LM/Xwin-… Codebase: github.com/Xwin-LM
30 Aug 2023
Huggingface TRL - Transformer Reinforcement Learning. A comprehensive toolbox for training transformer language models with Reinforcement Learning. From fine-tuning to Proximal Policy Optimization (PPO). Built on top of the 🤗 transformers library. github.com/huggingface/trl

30 Aug 2023
TRL highlights: the trainers. SFTTrainer for supervised fine-tuning, RewardTrainer for modeling human preferences, and PPOTrainer for optimizing language models.
30 Aug 2023
TRL allows direct loading of pre-trained language models via transformers, seamlessly integrating into the existing workflow.
Aj retweeted
Essential read! Summary/survey from my colleague @Majumdar_Ani about opportunities and challenges in combining large-scale AI models with robotics.
With seemingly endless progress in AI, I decided over the past few months to take a deep dive into the state of robotics 🤖, emerging trends, and research challenges. What do actual field-deployed robotic systems look like now? Will LLMs solve robotics? irom-lab.princeton.edu/wp-co… 🧵
29 Aug 2023
Humans evolve over generations, machines over iterations. At this rate, my ML model might outpace my self-improvement goals for the year. #RacingAgainstAlgorithms
Aj retweeted
🎇Introducing LongLLaMA-Instruct 32K!🎇 Inspired by @p_nawrot #nanoT5, we fine-tune LongLLaMA- on a *single GPU* for ~48h to improve upon OpenLLaMA: 55% on lm-eval (vs. 53%), better perf on long context and code! We open-source our optimized fine-tuning code in PyTorch/HF!🧵
Aj retweeted
28 Aug 2023
“HuggingFace’s leaderboards show how truly blind they are because they actively hurting the open source movement by tricking it into creating a bunch of models that are useless for real usage.” Ouch. semianalysis.com/p/google-ge…
Aj retweeted
17 Nov 2022
The founders of Z Library were arrested in Argentina.
Aj retweeted
Demand response (DR), physical industrial processes: transient, partially efficient. Steel plant: 96% load reduction, 2 hour max. Cement plant: 70% load reduction, 3 hour max. DR, synthetic industrial: fully interruptible, fully efficient. Bitcoin mining: 97% load reduction, unlimited hours.
Aj retweeted
There are two types of Bitcoin owners. Ones who use it as a safe haven and ones who see it as a speculative risk asset. The ascension of the price of Bitcoin is a function of the process of the latter selling to the former over time, so there are always two opposing currents. 1/2
Aj retweeted
Cambridge University Press has just made all 700 textbooks currently available in HTML format on Cambridge Core free to access until the end of May to assist readers during the Covid-19 outbreak. This also includes 58 textbooks in Language and Linguistics. cambridge.org/core/what-we-p…
Aj retweeted
Conventional wisdom: "Not enough data? Use classic learners (Random Forests, RBF SVM, ..), not deep nets." New paper: infinitely wide nets beat these and also beat finite nets. Infinite nets train faster than finite nets here (hint: Neural Tangent Kernel)! arxiv.org/abs/1910.01663

Aj retweeted
I believe I have written more papers than Alan Turing and John Nash! Number of papers alone is a wrong and misleading metric. Please focus instead on writing good papers that advance the field, help the world, and that you’ll be proud of when you look back in 20 or 50 years.
Yes, @GoogleAI (well, all of @AlphabetINC) produces a lot of awesome AI research, but @Stanford and @MIT together produce more (judging by @NeurIPSConf papers!), and @Stanford, @MIT, @UCBerkeley, and @CarnegieMellon produce more than @AlphabetINC, @Microsoft, and @facebook.
Aj retweeted
Machine learning has the potential to accelerate science. This paper from folks at Harvard and Princeton shows how deep learning can speed up progress towards fusion energy: nature.com/articles/s41586-0… It's great to see Keras and TensorFlow being used here.

Aj retweeted
Keras turns 4 years old today 🎂 Congrats to all contributors and the entire community! We're only just getting started 👍