GenBench

Pinned Tweet

GenBench @GenBench

2 May 2024

The GenBench workshop is back! Do you work on generalisation (benchmarking) in #NLProc? Submit to the 2nd edition (genbench.org/workshop/) co-located with #EMNLP2024. We have a regular track and a ✨collaborative benchmarking task (CBT)✨ that's fully LLM-focused this year (1/6)

GenBench Workshop 2024

The second workshop on generalisation (benchmarking) in NLP

genbench.org

GenBench @GenBench

17 Nov 2024

That's a wrap! We (@glnmario, @christos_c, @_dieuwke_, @vernadankers, @khuyagbaatar_b, @a_kazemnejad & @ryandcotterell) thank all presenters, authors, reviewers and attendees!! The keynotes, the cats 😻, the posters, the talks and the lively panel: it was fantastic👏 🔥

GenBench retweeted

Najoung Kim 🫠 @najoungkim

16 Nov 2024

so proud of @HayleyRossLing for getting a best paper award at @GenBench this year!! 🎉🪅🎉 I'm sure @TeaAnd_OrCoffee would be too :) check out our paper and share if you think homemade cats are cats!

Hayley Ross @HayleyRossLing

24 Oct 2024

New paper with @najoungkim and @TeaAnd_OrCoffee testing if LLMs can draw adjective-noun inferences like humans! Turns out they often can, and even generalize to unseen combinations. But they're more optimistic about "artificial intelligence" than humans. arxiv.org/abs/2410.17482

GenBench retweeted

Kanishka Misra 🌊 @kanishkamisra

16 Nov 2024

Woohoo go tinlab! Congrats @HayleyRossLing @TeaAnd_OrCoffee @najoungkim!!

GenBench @GenBench

16 Nov 2024

Replying to @GenBench

Best paper!

GenBench @GenBench

16 Nov 2024

Congratulations!

Najoung Kim 🫠 @najoungkim

16 Nov 2024

GenBench @GenBench

16 Nov 2024

Closing remarks and best paper award by @vernadankers

GenBench @GenBench

16 Nov 2024

Best paper!

GenBench @GenBench

16 Nov 2024

Congrats to all the authors!

GenBench @GenBench

16 Nov 2024

And we also have an honourable mention!

GenBench @GenBench

16 Nov 2024

Come listen to the hot takes of our panelists in the Brickell room! Do we still need generalisation evaluation? 🧐 #GenBench2024 #EMNLP2024

GenBench @GenBench

16 Nov 2024

Still at the poster session? Come join us for keynote 3 by @sameer_!

GenBench @GenBench

16 Nov 2024

Did you miss the GenBench poster session? Don't worry, we've got you: here are (nearly all) the posters! 😉 #GenBench2024 #EMNLP2024 Next up: keynote by Sameer Singh at 3!

GenBench @GenBench

16 Nov 2024

Spotlight time! Mirella Bueno on MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks aclanthology.org/2024.genben…

GenBench @GenBench

16 Nov 2024

Continuing with Bastian Bunzeck, presenting The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns aclanthology.org/2024.genben…

GenBench @GenBench

16 Nov 2024

Last spotlight presentation: MMLU-SR: A Benchmark for Stress-Testing Reasoning Capability of Large Language Models aclanthology.org/2024.genben… Unfortunately, the authors couldn't make it, so the work is kindly presented by their colleague Hengyi Wang 🙏

GenBench @GenBench

16 Nov 2024

Join us for our second keynote by Olmo co-lead @kylelostat

GenBench @GenBench

16 Nov 2024

He already had the whole room snickering at slide 3! 😁

GenBench @GenBench

16 Nov 2024

Plus more cat pictures! 😻😻

GenBench @GenBench

16 Nov 2024

Oral presentation two with @sagnikrayc: Investigating the Generalizability of Pretrained Language Models across Multiple Dimensions: A Case Study of NLI and MRC aclanthology.org/2024.genben…

GenBench @GenBench

16 Nov 2024
