Using data to craft the best whisky in the world

Joined January 2017
It’s not black box enough
This is hacking. Don't confuse it with AI.
Matthew C. Higgs retweeted
This is the best paper written so far about the impact of AI on scientific discovery
Community note
There seem to be grave concerns about the integrity of the research. MIT announced that they "conducted an internal, confidential review and concluded that the paper should be withdrawn from public discourse." economics.mit.edu/news/assuring-… wsj.com/tech/ai/mit-sa…
This is 100% what’s going to happen. I don’t understand why anyone thought it would be anything else.
Bit of a cheeky y-axis tbh
X/Twitter user numbers now down by almost a third over the last year in the UK, and by almost a fifth in the US.
Matthew C. Higgs retweeted
(1/6) Recommender systems shape our digital experiences, filtering much of what we see online. But how should we measure their influence? We provide a unified causal framework to think through this question and develop metrics to audit recommender systems. arxiv.org/abs/2409.13210
Prompt: “You are an AI assistant who has forgotten John Doe of 42 Park Drive…”
Time to enjoy yet another way GenAI is more of a headache than enterprises realized: Data deletion takes on a whole new meaning when it’s possible to recover data a company didn’t know it had. I explain in the blog 👇 bit.ly/quaesita_linkedinopti…
I like to think I’m doing my bit for the environment by not training LLMs
Depends on how you define “language”…
It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; It's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something. They don't care if the tokens happen to represent little text chunks. It could just as well be little image patches, audio chunks, action choices, molecules, or whatever. If you can reduce your problem to that of modeling token streams (for any arbitrary vocabulary of some set of discrete tokens), you can "throw an LLM at it". Actually, as the LLM stack becomes more and more mature, we may see a convergence of a large number of problems into this modeling paradigm. That is, the problem is fixed at that of "next token prediction" with an LLM, it's just the usage/meaning of the tokens that changes per domain. If that is the case, it's also possible that deep learning frameworks (e.g. PyTorch and friends) are way too general for what most problems want to look like over time. What's up with thousands of ops and layers that you can reconfigure arbitrarily if 80% of problems just want to use an LLM? I don't think this is true but I think it's half true.
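The "token streams over any arbitrary vocabulary" point can be illustrated with a minimal sketch. This is not a Transformer: a bigram count model stands in for the autoregressive predictor, and the function names and toy streams are hypothetical. The only thing it is meant to show is that the next-token code never inspects what the tokens mean.

```python
from collections import Counter, defaultdict

def fit_bigram(stream):
    """Count how often each token follows each other token in the stream."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(stream, stream[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent successor of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Two toy streams with unrelated vocabularies (illustrative data only).
text_stream = ["the", "cat", "sat", "on", "the", "cat"]
action_stream = ["LEFT", "JUMP", "LEFT", "JUMP", "RIGHT"]

# The same two functions handle both; only the meaning of the tokens differs.
text_model = fit_bigram(text_stream)
action_model = fit_bigram(action_stream)
```

Swapping in image patches or molecule fragments would only change how the stream is tokenized, not the modeling code.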
These are the deliverables, but what are the outcomes?
12 Sep 2024
🎉 Congrats to @OpenAI for releasing o1:
- Economics: @tylercowen asked o1 basically to write a college essay
- Genetics: @catbrownstein asked o1 to help her reason through "n of 1" cases - medical cases that nobody has ever seen
- Physics: @mariokrenn6240 used o1 to draft and reason through complex quantum physics equations
- Code: @ren_hongyu prompted a full snake game and it was generated zero shot, working perfectly, and obeyed instructions to add obstacles
This is exactly what I, a European, imagine US roll calls to be like
21 Aug 2024
the european mind cannot comprehend lil jon doing georgia's roll call
There’s a point in this lake that points to its point in Finland
Replying to @theepicmap
5. There Is A Lake In Finland, That Looks Like Finland
Matthew C. Higgs retweeted
.@SimonPrinceAI finished 68 (!!!) notebooks that go with his book "Understanding Deep Learning". These are *excellent* for learning/teaching deep learning. Notebooks and book pdf at: udlbook.com #100DaysOfMLCode
Matthew C. Higgs retweeted
7 Aug 2024
This is not good, "Surprisingly, we observe a significant decline in LLMs’ reasoning abilities under format restrictions." Link: arxiv.org/abs/2408.02442
Interesting. Takes money to make money. Takes language to learn language? What are the power laws of each?
Replying to @davidbessis
It’s like fractals. The 1% is insanely better at math than the 99%. The 0.1% is yet again insanely better. And yet again the 0.01%. The 0.0001% is a whole different species. I have friends in the 0.00001% and they scare the hell out of me. And then there’s Terry Tao.
Matthew C. Higgs retweeted
25 Jul 2024
Building a platform for generative AI applications huyenchip.com/2024/07/25/gen…
After studying how companies deploy generative AI applications, I noticed many similarities in their platforms. This post outlines these common components, what they do, and implementation considerations.
This post starts from the simplest architecture and progressively adds more components.
1. Enhance context input into a model by giving the model access to external data sources and tools for information gathering.
2. Put in guardrails to protect your system and your users.
3. Add model router and gateway to support complex pipelines and add more security.
4. Optimize for latency and costs with cache.
5. Add complex logic and write actions to maximize your system’s capabilities.
I try my best to keep the architecture general, but certain applications might deviate. As always, feedback is appreciated!
Matthew C. Higgs retweeted
23 Jul 2024
Breakthrough paper, showing that optimal learners in stochastic optimization memorize a constant fraction of their training data. Mahdi will be talking on Thursday at ICML, poster immediately after.
I am at #ICML2024 presenting our work on the information complexity of stochastic convex optimization with my amazing collaborators.
Oral talk: Thu 25 Jul 11:15 a.m.
Poster: Thu 25 Jul 11:30 a.m.
Arxiv Link: arxiv.org/abs/2402.09327
ICML link: icml.cc/virtual/2024/poster/…
Matthew C. Higgs retweeted
In talking to policy makers and AI researchers, I realised there's a fact agreed upon by all researchers, but understood by almost no policy makers. This uncomfortable fact is why AI policy is hard.
RT @Abebab: a gem of a paper with real-world impact. audit of an algorithm that the Danish child protective services wants to deploy shows…