We must know, we will know.

Joined September 2023
Jun 10
Claude Coda
As far as the name of an AI model goes, "fable" really isn't the best. It sounds ominously like a cautionary tale. How did we go from a sonnet to an opus to a fable? What comes next? Fairytale?
7
2,393
May 17
In my 20s I really cared if I was the one initiating a lot, if I was thought of (not just thought _well_ of), etc. I learned to let go of that in my 30s. Initiating with a friend is vulnerability - be the change you want. Go text a friend you miss and make some plans.
May 17
A lot of the struggle with adult friendships is that they don't have a natural contract. That's okay - be relentlessly earnest. You'd be shocked how close you'll get if you consistently initiate, make it easy to say yes, and express curiosity. People are starved for sincerity.
1
13
419
209,776
May 17
A lot of the struggle with adult friendships is that they don't have a natural contract. That's okay - be relentlessly earnest. You'd be shocked how close you'll get if you consistently initiate, make it easy to say yes, and express curiosity. People are starved for sincerity.
The quiet grief of adult friendship: One of the most beautiful articles I've read in a while. Hits hard timesofindia.indiatimes.com/…
4
21
540
390,921
May 17
Don't respond with "let's catch up soon" - send dates and trust them to follow up with alternatives if those don't work. Text them when you miss them. Ask them what they're worried about and really listen. Send them dumb memes. Be okay with always being the one to make plans.
1
1
55
4,210
May 17
Obviously you need to read their temperature and respect boundaries. Also make it easy for them to say no to plans and don't keep pinging if they aren't responding warmly. But for most people that isn't the problem - it's that both sides are hoping for the other to initiate.
1
38
2,813
May 16
One of the most persistent misconceptions I see about the industry is that systematic = quant = stat arb = HFT. Jane Street has never been an HFT in the sense of Virtu or Tower, and at this point they even have fundamental long/short equity (but traded systematically).
May 16
Replying to @GoshawkTrades
For what it's worth, Jane Street is not a low latency specialist. They have plenty of desks running liquidity taking strategies at daily, weekly and monthly durations.
14
16
472
61,148
May 13
You compared trading - a desk job in an air conditioned office - to mishandling a gun. No one is saying clock in/clock out. They're saying you're being edgy about sitting in a chair for 8 hours a day moving numbers around. Go spend time talking to someone who works in an ER.
Replying to @Ksidiii
“You take what you do too seriously man, you just gotta clock in do your job and clock out man, if you are working that intensely something is wrong man” Oh give me a fucking break 🙄
9
2
129
26,354
May 13
World's most self-aware trader: "Trading is cancerous and dangerous and similar to handling an AK-47. My life is war and I may die."
This game has transformed my life, my family’s life, my friends’ lives, and even the lives of my future grandchildren. For that, I am eternally grateful to God for the success He has given me. At the same time, make no mistake, this game is one of the most cancerous and dangerous in the world. It can ruin and cripple you financially, socially, psychologically, and emotionally in ways you cannot imagine. If you are playing this game at the highest level, not casually investing, you have to be willing and aching to answer its call at every second of the day. And honestly, monetary advancement alone is not enough to keep you in it. Something has to be socially wrong with you to be obsessed in a way that is difficult to even characterize. You have to be obsessed with solving this puzzle every single day. If that is not you, it is probably a better choice to avoid it entirely. Very similar to mishandling an AK-47, small mistakes in this game can be life altering. For my fellow lunatics, another day, and into the fray we go. Blessings and love your way 🫡
4
3
209
30,936
May 9
Here are @METR_Evals long horizon results at 80%, log scale. Their trend line is a simple OLS fit; defensibly simple. But you can see a sharper slope starting after 2024, before which there are only five points. A naive thing to try would be a robust regression, with shrinkage; METR themselves say that doesn't really change much, which is credible because the first four data points have small residuals to the OLS fit. However we can test the hypothesis that a single exponential OLS fit is representative of the entire trend line using a changepoint-search analysis. This is better than just visually choosing a changepoint and testing if means before and after are distinct. When we do that search, we find a structural break in November 2024, at a p-value of 0.00033 (3000 bootstrap samples). This strongly rejects the null hypothesis. An interpretation of these results: reasoning models do have a faster rate of acceleration on long horizon software tasks than non-reasoning models. The implication: instead of considering the doubling time over the entire sample, it is worth considering doubling times before and after the reasoning epoch. Those are 275.5 days versus 124.3 days, respectively.
I think it's time to update the trendline
8
4
66
12,652
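The changepoint search described in the post above can be sketched roughly as follows. This is not the code from the linked gist and it does not use the METR data: the series is synthetic and noise-free, the doubling times of 300 and 100 days are made up for the example, and the names `fit_line`, `search_changepoint` and `doubling_time` are hypothetical. It fits a separate OLS line to log2(horizon) on each side of every candidate split and keeps the split with the lowest total squared error. The bootstrap p-value step is omitted.

```python
import numpy as np

def fit_line(t, y):
    """OLS fit of y = a + b*t. Returns (slope, sum of squared residuals)."""
    X = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef[1], float(resid @ resid)

def search_changepoint(t, y, min_seg=3):
    """Try every split with at least min_seg points per side; keep the best."""
    best = None
    for k in range(min_seg, len(t) - min_seg + 1):
        slope_before, sse_before = fit_line(t[:k], y[:k])
        slope_after, sse_after = fit_line(t[k:], y[k:])
        sse = sse_before + sse_after
        if best is None or sse < best["sse"]:
            best = {"k": k, "sse": sse,
                    "slope_before": slope_before, "slope_after": slope_after}
    return best

def doubling_time(slope_log2_per_day):
    """If log2(horizon) grows by `slope` per day, it doubles every 1/slope days."""
    return 1.0 / slope_log2_per_day

# Synthetic series (assumption): one sample every 60 days, with the growth
# rate switching at day 930 from a 300-day to a 100-day doubling time.
t = np.arange(0.0, 1500.0, 60.0)
break_day = 930.0
log2_horizon = np.where(
    t < break_day,
    t / 300.0,
    break_day / 300.0 + (t - break_day) / 100.0,
)

best = search_changepoint(t, log2_horizon)
print("first point after the break: day", t[best["k"]])
print("doubling time before:", doubling_time(best["slope_before"]))
print("doubling time after: ", doubling_time(best["slope_after"]))
```

On real, noisy data the minimum-SSE split is only a candidate; whether it is a genuine structural break still needs a significance test such as the bootstrap mentioned in the post.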
May 9
This isn't really groundbreaking stuff, but for posterity here is code that reproduces the analysis: gist.github.com/0xfdf/301846…
5
1,879
Apr 25
Game theory is fun, but there is actually a neat deontological basis for both choices, ethically. The "everyone survives" cases are 1) everyone presses red, 2) a majority press blue. Since "everyone presses blue" is a special case of 2, both choices survive Kant's categorical imperative.
Everyone in the world has to take a private vote by pressing a red or blue button. If more than 50% of people press the blue button, everyone survives. If less than 50% of people press the blue button, only people who pressed the red button survive. Which button would you press?
30
17
829
128,132
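A small sketch of the button rule makes the universalization argument concrete. The function name `survivors` is hypothetical, and the quoted prompt does not say what happens at exactly 50% blue; treating a tie like the "less than 50%" case is an assumption made here only so the function is total.

```python
def survivors(n_blue, n_red):
    """Number of people who survive, given how many pressed each button."""
    total = n_blue + n_red
    # Strict majority for blue: everyone survives.
    if 2 * n_blue > total:
        return total
    # Otherwise (including an exact tie, by assumption) only red pressers survive.
    return n_red

# Universalizing either choice leaves everyone alive.
print(survivors(n_blue=0, n_red=10))   # everyone presses red
print(survivors(n_blue=10, n_red=0))   # everyone presses blue
# A mixed vote with a blue minority is the only losing case.
print(survivors(n_blue=4, n_red=6))
```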
Apr 25
This is interesting because the structure of questions like this usually favors utilitarians who do the math, and usually they are cast against the feels-good choice. But in this case, you can derive a feels-good ethical outcome from the aesthetically unethical choice.
2
140
13,858
Feb 4
First, the good part of the Anthropic ads: they are funny, and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we won’t do exactly this; we would obviously never run ads in the way Anthropic depicts them. We are not stupid and we know our users would reject that. I guess it’s on brand for Anthropic doublespeak to use a deceptive ad to critique theoretical deceptive ads that aren’t real, but a Super Bowl ad is not where I would expect it. More importantly, we believe everyone deserves to use AI and are committed to free access, because we believe access creates agency. More Texans use ChatGPT for free than total people use Claude in the US, so we have a differently-shaped problem than they do. (If you want to pay for ChatGPT Plus or Pro, we don't show you ads.) Anthropic serves an expensive product to rich people. We are glad they do that and we are doing that too, but we also feel strongly that we need to bring AI to billions of people who can’t pay for subscriptions. Maybe even more importantly: Anthropic wants to control what people do with AI—they block companies they don't like from using their coding product (including us), they want to write the rules themselves for what people can and can't use AI for, and now they also want to tell other companies what their business models can be. We are committed to broad, democratic decision making in addition to access. We are also committed to building the most resilient ecosystem for advanced AI. We care a great deal about safe, broadly beneficial AGI, and we know the only way to get there is to work with the world to prepare. One authoritarian company won't get us there on their own, to say nothing of the other obvious risks. It is a dark path. As for our Super Bowl ad: it’s about builders, and how anyone can now build anything. We are enjoying watching so many people switch to Codex. 
There have now been 500,000 app downloads since launch on Monday, and we think builders are really going to love what’s coming in the next few weeks. I believe Codex is going to win. We will continue to work hard to make even more intelligence available for lower and lower prices to our users. This time belongs to the builders, not the people who want to control them.
4
71
2,521
68,788
Jan 31
not sure what to make of Anthropic. are they really this earnest? the venn diagram of companies that have extraordinary product market fit and also publish studies about the dangers of their own product has to be 1? they keep winning while trying to snatch defeat from the jaws of victory
AI can make work faster, but a fear is that relying on it may make it harder to learn new skills on the job. We ran an experiment with software engineers to learn more. Coding with AI led to a decrease in mastery—but this depended on how people used it. anthropic.com/research/AI-as…
17
3
304
43,834
Jan 20
daniel litt, tireless math ronin roaming the countryside, sparring with highway bandits peddling AI proofs
Replying to @Archivara
I’m really sorry to put you guys on blast again but you need to hire a subject-matter expert. The content of this paper is (1) Winograd’s 1980 result (basically CRT) that the number of multiplications you’re computing is at most 2*deg-#factors, and (3) 2*5-3=7.
1
4
91
26,304
fdf retweeted
Jan 19
Replying to @francoisfleuret
all have identical EV, but differ in standard deviation (0, 5.8, 11.5, 23.1). what is your utility function of intelligence? linear? you are risk neutral; be indifferent to the choices. convex? be risk seeking; choose [60, 140]. concave? be risk averse; choose [100, 100].
2
1
34
5,525
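The standard deviations quoted in the reply above are consistent with uniform distributions centered on 100. A minimal sketch, assuming the four choices are uniform on [100, 100], [90, 110], [80, 120] and [60, 140] (only the first and last intervals appear in the reply; the middle two are inferred from the quoted standard deviations), and using x, x^2 and sqrt(x) as stand-ins for linear, convex and concave utility:

```python
import math

def uniform_mean_std(a, b):
    """Mean and standard deviation of a uniform distribution on [a, b]."""
    return (a + b) / 2, (b - a) / math.sqrt(12)

def expected_utility(a, b, kind):
    """Closed-form E[u(X)] for X uniform on [a, b] and three example utilities."""
    if a == b:  # degenerate case: a sure outcome
        x = float(a)
        return {"linear": x, "convex": x ** 2, "concave": math.sqrt(x)}[kind]
    if kind == "linear":
        return (a + b) / 2
    if kind == "convex":   # u(x) = x^2
        return (a * a + a * b + b * b) / 3
    if kind == "concave":  # u(x) = sqrt(x)
        return (2 / 3) * (b ** 1.5 - a ** 1.5) / (b - a)
    raise ValueError(f"unknown utility: {kind}")

# Middle two intervals are an assumption, see the note above.
choices = [(100, 100), (90, 110), (80, 120), (60, 140)]

for a, b in choices:
    mean, std = uniform_mean_std(a, b)
    print((a, b), "EV", mean, "std", round(std, 1))

best_convex = max(choices, key=lambda c: expected_utility(*c, "convex"))
best_concave = max(choices, key=lambda c: expected_utility(*c, "concave"))
print("risk seeking picks", best_convex)
print("risk averse picks", best_concave)
```

Under the linear utility every choice has the same expected utility, which is the indifference case in the reply.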
Jan 16
this is correct. agents are an acceleration tool. watching someone fail to solve a problem with an agent reveals that they lack the clarity and understanding of the problem to adequately solve it
Jan 16
What Claude Code has revealed is that most people either have mediocre ideas or no ideas at all. The tool is a force multiplier for those who already know what they want to build and how to think through it systematically; it elevates competence, rewards clarity, and accelerates execution for people who would have gotten there anyway, just slower. If you have a sharp vision and can break it into coherent steps, Claude Code becomes an extension of your own capability. But there's another mode of use entirely. For people without that clarity, the appeal is precisely that the input can stay vague; you gesture at something, hit enter, and wait to see what comes out. This is structurally identical to a slot machine: low effort, variable reward, and that intermittent reinforcement loop that hooks the susceptible. So the same tool that elevates the focused and capable is also manufacturing a kind of gambling behavior in people prone to it.
2
2
74
11,086
fdf retweeted
Jan 7
Replying to @systematicls
no comment on the specific problem setting, but as a general statistical problem this is well-solved with classification instead of regression. MSE loss is inappropriate; you should use binomial logistic loss pairwise on the entire set to be ranked (for each A, B, fit P(A > B)), or multinomial logistic loss on the entire set together (fit P(R1 = A), P(R1 = B), ..., etc).
6
4
59
19,556
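The pairwise variant suggested in the reply above can be sketched as below. This is an illustration on synthetic data, not the setup from the original thread: the features, the true scoring rule, and the names `make_pairs` and `pairwise_logistic_fit` are all hypothetical. For every pair (A, B) it models P(A > B) as a sigmoid of a linear score difference and minimizes binomial logistic loss by plain gradient descent. The multinomial variant is not shown.

```python
import numpy as np

def make_pairs(X, y):
    """All unordered pairs (i, j): feature differences and 1.0 if y[i] > y[j]."""
    i, j = np.triu_indices(len(y), k=1)
    return X[i] - X[j], (y[i] > y[j]).astype(float)

def pairwise_logistic_fit(D, label, lr=0.5, steps=500):
    """Fit w so that P(A > B) = sigmoid(w . (x_A - x_B)) via gradient descent."""
    w = np.zeros(D.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(D @ w)))
        grad = D.T @ (p - label) / len(label)  # gradient of mean logistic loss
        w -= lr * grad
    return w

# Synthetic data (assumption): the outcome to rank on is 2*x0 - x1 plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=40)

D, label = make_pairs(X, y)
w = pairwise_logistic_fit(D, label)

# Fraction of pairs whose order the fitted scores get right.
pair_accuracy = float(np.mean((D @ w > 0) == (label == 1.0)))
ranking = np.argsort(-(X @ w))  # item indices, best first
print("weights:", w)
print("pairwise accuracy:", pair_accuracy)
```

Only the ordering of the fitted scores matters here, so the scale of `w` carries no meaning by itself.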
Jan 3
am I crazy or should a representative of google not be saying something like this publicly? she is a senior engineer who works on the gemini API, saying a competitor's product can do what they failed to do in a year's time
I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options, not everyone is aligned... I gave Claude Code a description of the problem, it generated what we built last year in an hour.
171
22
1,649
334,261
Jan 3
just bizarre -- there are all kinds of NDA/confidentiality/brand comms issues here
8
197
28,373