State-of-the-art generalisation testing in NLP. Tag us for a RT of your NLP generalisation paper tweet!

Joined April 2022
Pinned Tweet
2 May 2024
The GenBench workshop is back! Do you work on generalisation (benchmarking) in #NLProc? Submit to the 2nd edition (genbench.org/workshop/) co-located with #EMNLP2024. We have a regular track and a ✨collaborative benchmarking task (CBT)✨ that's fully LLM-focused this year (1/6)
1
11
22
12,569
17 Nov 2024
That's a wrap! We (@glnmario, @christos_c, @_dieuwke_, @vernadankers, @khuyagbaatar_b, @a_kazemnejad & @ryandcotterell) thank all presenters, authors, reviewers and attendees!! The keynotes, the cats 😻, the posters, the talks and the lively panel: it was fantastic👏 🔥
6
46
2,876
GenBench retweeted
so proud of @HayleyRossLing for getting a best paper award at @GenBench this year!! 🎉🪅🎉 I'm sure @TeaAnd_OrCoffee would be too :) check out our paper and share if you think homemade cats are cats!
New paper with @najoungkim and @TeaAnd_OrCoffee testing if LLMs can draw adjective-noun inferences like humans! Turns out they often can, and even generalize to unseen combinations. But they're more optimistic about "artificial intelligence" than humans. arxiv.org/abs/2410.17482
1
5
60
3,473
GenBench retweeted
Woohoo go tinlab! Congrats @HayleyRossLing @TeaAnd_OrCoffee @najoungkim!!
16 Nov 2024
Replying to @GenBench
Best paper!
1
16
1,271
16 Nov 2024
Congratulations!
3
237
16 Nov 2024
Closing remarks and best paper award by @vernadankers
1
1
12
906
16 Nov 2024
Best paper!
2
7
1,407
16 Nov 2024
Congrats to all the authors!
2
92
16 Nov 2024
And we also have an honourable mention!
1
103
16 Nov 2024
Come listen to the hot takes of our panelists in the Brickell room! Do we still need generalisation evaluation? 🧐 #GenBench2024 #EMNLP2024
3
15
1,491
16 Nov 2024
Still at the poster session? Come join us for keynote 3 by @sameer_!
1
5
741
16 Nov 2024
Did you miss the GenBench poster session? Don't worry, we've got you: here are (nearly all) the posters! 😉 #GenBench2024 #EMNLP2024 Next up: keynote by Sameer Singh at 3!
1
13
830
16 Nov 2024
Spotlight time! Mirella Bueno on MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks aclanthology.org/2024.genben…
1
1
3
535
16 Nov 2024
Continuing with Bastian Bunzeck, presenting The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns aclanthology.org/2024.genben…
1
3
86
16 Nov 2024
Last spotlight presentation: MMLU-SR: A Benchmark for Stress-Testing Reasoning Capability of Large Language Models aclanthology.org/2024.genben… Unfortunately, the authors couldn't make it; the work is kindly presented by their colleague Hengyi Wang 🙏
1
71
16 Nov 2024
Join us for our second keynote by Olmo co-lead @kylelostat
1
3
16
1,223
16 Nov 2024
He already had the whole room snickering at slide 3! 😁
1
2
97
16 Nov 2024
Plus more cat pictures! 😻😻
1
93
16 Nov 2024
Oral presentation two, with @sagnikrayc: Investigating the Generalizability of Pretrained Language Models across Multiple Dimensions: A Case Study of NLI and MRC aclanthology.org/2024.genben…
1
2
7
916