AI2 WildBench Leaderboard (V2) - a Hugging Face Space by allenai
This app shows static tables ranking language models based on various metrics, letting you pick which models, tasks, and ranking methods to view. Simply choose the desired options and the leaderboa...
huggingface.co