Sam Finlayson

Tweets

Pinned Tweet

Sam Finlayson

@IAmSamFin

15 Jun 2019

Meta-pin👇

Sam Finlayson retweeted

Shriram Krishnamurthi (primary: Bluesky)

@ShriramKMurthi

Jun 13

As usual, @hadleywickham has one of the cleanest, simplest, nicest explanations of things: in this case, the relationship between LLMs, harnesses, and agents. Better than a million Medium posts. tidydesign.substack.com/p/wh…

200

11,974

Sam Finlayson

@IAmSamFin

15h

I’m admittedly quite close to the subject on this one, but I’ve been having a lot of Gell-Mann amnesia whiplash the last few days seeing a lot of really smart folks make breathless takes about AI trends largely based on: (1) head-to-heads of open web search LLMs vs closed specialty systems on questions (and rubrics!) that are easily googled (2) a lot of really entertainingly confident assumptions about what’s under the hood at OE. I appreciate the impulse, there’s a lot yet to be discovered for this field, and OE can and should and hopes to do more to support publishable evals of these tools. But this one imo isn’t it. If you have ideas for what the ideal clinically relevant eval looks like, hit me up. Can’t promise it’ll happen but can promise it’ll be considered. (COI: work with OE, potentially motivated reasoning but not intentionally so)

OpenEvidence

@EvidenceOpen

17h

Rigorous evaluation of medical AI is good for everyone, and we welcome it. Counter to a half-dozen independent studies from institutions such as the Mayo Clinic that were highly positive on OpenEvidence—a lone paper now purports to show that generalized AI beats specialized clinical AI (@UpToDate, @EvidenceOpen). The paper has a massive undisclosed conflict of interest and irredeemable methodological flaws. Behind the scenes: The study authors run a competing in-house medical AI at their hospital, and asked OpenEvidence for an API to power it — including rights to build a "competing product" with OpenEvidence's own API. OpenEvidence declined. Then, this paper coincidentally appeared. Point-by-point, looking closely at the datasets used in the study, the disingenuous and fatal flaws become immediately apparent 🧵.

13,261

Sam Finlayson

@IAmSamFin

Jun 9

Surely DFW smiles down when this is shared... the conduit for religious experience himself (nytimes.com/2006/08/20/sport…) articulately exploring the themes by which -- through her own lack of introspection -- Tracy Austin broke his heart. gwern.net/doc/psychology/wil… cc @AndrewLBeam

Sam Stoffel

@sam_stoffel

Jun 9

I come back to this speech every once in a while: “in the 1,526 singles matches I played in my career, I won almost 80% of those matches … what percentage of points do you think I won in those matches? only 54%.”

847

Sam Finlayson

@IAmSamFin

May 29

I've always approached the 80k hours work with an optimistic prior. But it's exactly this sort of analysis that reflects -- for me -- significant issues with their approach, which tends toward "super coarse numbers go in, extremely consequential personal decisions come out".

Tomas Pueyo

@tomaspueyo

May 29

Replying to @tomaspueyo

3. People think doctor is the best career to have an impact. When quantified, doctors save on average one life every 10 years. So it's good, but there are better ways. Conversely, making lots of 💰 and donating a share can get you to save 80 lives (20x more than a doctor).

8,090

Sam Finlayson retweeted

Shriram Krishnamurthi (primary: Bluesky)

@ShriramKMurthi

May 27

The reason "grade inflation" and all these other concepts don't really make sense to me is because I have a pretty simple, and I think appropriate, definition of what a grade is (consistent with Alex's).

Your overall course grade is a certificate of how you did: An A means you did Excellent work, B means you did Good work, and C means you did Fair work. I view it as a one-letter recommendation letter (a recommendation letter—ha, ha). I envision a person trying to hire a student with the skills that this class teaches. An A effectively says, “This person knows or can figure out how to do well most or all the tasks that might come up!” A grade of B effectively says, “This person can do several things, but may need some guidance or help.” A grade of C means, “This person has basic competence in the area”.

It should then be obvious that your performance cannot affect that of your classmates, or vice versa. I therefore do not “grade on a curve”, because I consider the notion meaningless. By the same token, there is also no “default” grade in this course. At least in principle, everyone can do well.

Alex Kontorovich

@AlexKontorovich

May 27

As I've said before: a "grade" should mean a band within a dependency graph of skills. These are the skills the school considers "mastered". Imagine: "hey Mom, want to see me add fractions (or solve the quadratic formula)? Click on this node in the graph and it'll spin up 5 problems, and watch how effortlessly I get the right answer." Now *that* would be accountable schooling!

10,505

Sam Finlayson

@IAmSamFin

May 20

I've soured a bit on podcasts over the last few years but will probably buck that trend for this one as I've never heard a Michael I. Jordan take I didn't like.

Machine Learning Street Talk

@MLStreetTalk

May 20

Michael I. Jordan on the new MLST. Four things:
> AGI is a PR term. It confuses young people.
> Discourse is bipolar, either alarmist or exuberant; this is, in his words, "so demoralizing" for 20- and 25-year-old researchers.
> ML's methods came from statistics and operations research, NOT the AI tradition.
> Data markets are Stackelberg games, not optimisation problems. A lot of ML researchers have never computed an equilibrium.

Michael I. Jordan is a no-nonsense original gangster of the field and was described by Science magazine back in 2016 as the most influential living computer scientist.

2,575

Sam Finlayson retweeted

Andrew White 🐦‍⬛

@andrewwhite01

May 14

hallucinated references will land you a 1-year ban from arxiv now. wow

367

3,511

240,334

Sam Finlayson retweeted

Sam Finlayson

@IAmSamFin

May 3

To elaborate: I disagreed with Hinton’s infamous 2016 prediction at the time, but I believe the bull case for radiology’s obsolescence based on CV trends in 2016 was stronger than the bull case for surgery’s obsolescence based on robotics in 2026 and I don’t think it’s close.

532

Sam Finlayson

@IAmSamFin

May 3

Not a gambling man, but I would strongly consider a formal bet on this one. (My position would be against the claim “surgery will be replaced by AI within 10 years”)

Noah Kaufman, MD

@noahkaufmanmd

May 2

If you think surgery won’t be replaced by AI/robotics in the next decade, I think you’re actually nuts. 🤯 instagram.com/reel/DXuxGGRFU…

2,932

Sam Finlayson retweeted

Josh Farkas MD 💊

@PulmCrit

May 2

when the patient getting awakened q1hr for days finally develops delirium:

ALT Mitchell And Webb Are We The Baddies GIF

Nicholas Morris @namorrismd

Apr 24

Hourly Neurological Examinations after Acute Brain Injury rarely detect actionable events after 48h and are associated with delirium. @NeurosurgeryCNS journals.lww.com/neurosurger…

186

15,386

Sam Finlayson

@IAmSamFin

May 1

AI news cycle in a nutshell

5,416

Sam Finlayson retweeted

Isaac Kohane

@zakkohane

May 1

Nuanced TLDR explainer on the import of the recent @ScienceMagazine article on LLM performance vs doctors.

Arjun (Raj) Manrai

@arjunmanrai

May 1

🧵1/ Our new study on AI and physician reasoning just came out in @ScienceMagazine. As co-senior author, I'm excited about our findings, and I do think AI will reshape medicine. But after seeing some of the discussions, I'm also worried about how our findings may be misinterpreted.

3,845

Sam Finlayson retweeted

Andrej Karpathy

@karpathy

Apr 30

This is the quote I've been citing a lot recently.

kache

@yacineMTB

Feb 4

you can outsource your thinking but you cannot outsource your understanding

848

4,387

46,834

2,595,103

Sam Finlayson

@IAmSamFin

Apr 28

You've got to remember that these are just simple jurors...The common clay of the new AI spring.

Pedro Domingos

@pmddomingos

Apr 28

The state of AI, as captured by the jurors in the Musk v. OpenAI trial:

336

Sam Finlayson retweeted

Michael Baym @baym

Apr 25

Stoked for the day when the scientific standard for AI in Biology becomes “does it actually do the thing?” and not “does it point to a potential future in which it does the thing?”

112

6,427

Sam Finlayson retweeted

Isaac Kohane

@zakkohane

Apr 22

Impressive data set. Very pediatrics relevant

NEJM AI @NEJM_AI

Apr 21

A new article introduces EchoNext-Mini, an open dataset of 100,000 electrocardiograms with curated structural heart disease labels and an accompanying convolutional neural network model for detecting structural heart disease from electrocardiogram data. nejm.ai/4sI5Iab

ALT Figure 1. Summary of the Contributions of this Article.

6,246

Sam Finlayson

@IAmSamFin

Apr 18

Interested in seeing what uses people come up with for this

OpenEvidence

@EvidenceOpen

Apr 8

Dot phrases are the EHR’s best-kept secret. Clinical shortcuts, templates, and workarounds, all encoded in fragments of text that only make sense to the person who wrote them. We built the AI-native version. Dotflows are reusable natural language prompts that customize how OpenEvidence responds. Type “.” in the search bar and the platform adapts to your style, your specialty, your thinking. Use .avs to generate a patient-facing after visit summary. .discharge for structured inpatient notes. .prior_auth to write an insurance appeal letter, because of course that’s one of the first things physicians automated. And .succinct, which compresses every answer into high-yield shorthand. Apparently we were being too thorough. Browse the community library to see what other clinicians have created and steal any you like. Or build your own.

ALT Dotflows in OpenEvidence

415

Sam Finlayson retweeted

vas

@vasuman

Apr 17

“Taste is the only moat” - VCs right before investing in gambling apps, AI to raise kids, and startups with 12,000 fake GitHub stars

211

3,335

152,763