Chatbot Arena Update!
1. Multilingual Arena -- four new languages (German, Spanish, Russian, Japanese).
GPT-4o is #1 in English, German, and Spanish. Gemini-1.5-Pro is #1 in Japanese, Chinese, and French. Claude-3 Opus is #1 in Russian. The competition is tight, and we need more votes 🗳️ to confidently rank them.
Let's challenge LLMs in any language!
2. Yi-1.5-34B-Chat shows impressive performance, matching larger models like Qwen-1.5-110B and GPT-4-0613. Congrats @01AI_Yi on this milestone!
3. Phi-3 Medium and Small are finally on the board! Medium (14B) ranks near GPT-3.5-Turbo-0613, and Small (7B) ranks near Llama-2-70B. We also see robust performance in the Hard Prompts category. Congrats @Microsoft Phi team on these great models for the community!
Learn more
- Full leaderboard: leaderboard.lmsys.org
- Chat & vote at chat.lmsys.org