michael

michael @_michaelginn

Jun 8

In low-resource languages, speculative decoding may actually be hurting performance! 1/n

michael @_michaelginn

Jun 8

Distillation is a common way to improve acceptance rates, but we find that distillation on one task (translation) tends to generalize poorly to another task (story generation) in the same language 3/n

michael @_michaelginn

Jun 8

So what should we do instead? It turns out that simple n-gram models might be a better choice, thanks to their incredibly fast inference speeds. Sometimes, simpler is better! 4/n

michael retweeted

will brown

@willccbb

Jun 8

the God Model is a useful theoretical construct akin to a Worst-Case Adversary or a Busy Beaver Program or an NP Oracle, less compelling as a target to seek than as a foil for designing minimax programs which can be tangibly realized

michael @_michaelginn

May 17

If you don’t think LLMs hallucinate anymore, try asking any remotely niche question about Logic Pro

michael retweeted

Arthur Conmy

@ArthurConmy

May 5

DPO is substantially more similar to SFT than it is to RL. I will die on this hill.

michael @_michaelginn

Apr 21

1. Workshop lowers publication standards 2. Quality of papers goes down 3. Workshop reputation goes down 4. Next workshop gets fewer submissions 5. Workshop lowers publication standards ....

michael @_michaelginn

Apr 16

Most cringe *so far*

sankalp

@dejavucoder

Apr 15

the most cringe thing that has happened to applied ai is the openclaw hype

michael @_michaelginn

Apr 15

Profoundly embarrassing stuff

David Im

@davidim

Apr 15

Introducing ABG CMO. If your CMO isn’t an ABG, you’re already losing. Try now at abgcmo.com

michael retweeted

Rhys

@RhysSullivan

Apr 13

not having to type is really nice, but i think i want to go back to manually writing the code myself and more leveraging LLMs for research and understanding of the codebase. it's just too easy to defer your thinking today and end up in a bad state

Rhys

@RhysSullivan

Apr 13

from my experience, even the best models (Opus 4.6, 5.4 xhigh / 5.3 codex) cannot write good code today without an amount of work that is equivalent to just doing the work myself. am excited for a world where they can, but in the current state i have very low trust in them

michael retweeted

Lindia Tjuatja @lltjuatja

Apr 6

Really excited about this work w/ my long-time collaborators at Boulder! We address limitations in existing morphosyntactic annotation systems for digitally under-resourced languages and show how *jointly* predicting morphological segmentation helps with glossing performance

michael @_michaelginn

Apr 6

Excited to announce that the PolyGloss paper has been accepted to @aclmeeting! Previously, we trained models to help in endangered language documentation workflows by automatically predicting interlinear glosses. But real-world user studies revealed crucial issues...

michael @_michaelginn

Apr 6

michael @_michaelginn

Apr 6

Check it out here: arxiv.org/pdf/2601.10925 This work is the result of an ongoing collaboration between @lecslab and @LTIatCMU. Many thanks to my collaborators including @lexicutioner @lltjuatja @gneubig, and others!

michael @_michaelginn

Apr 6

The models are available right now on HuggingFace! See usage instructions on our GitHub: github.com/lecs-lab/polyglos…

GitHub - lecs-lab/polygloss: A massively multilingual corpus and pretrained model for IGT

michael @_michaelginn

Apr 6

Excited to announce my work on learning FSTs with RNNs has been accepted to ACL Findings! @aclmeeting

michael @_michaelginn

Jan 20

(1) Learning transducers from data has been an open problem for decades. In a new paper with @lecslab, we present a highly effective approach that learns FSTs by imitating the hidden-state geometry of an RNN.

michael @_michaelginn

Mar 30

Doing some important research
