LLM Performance Leaderboard

Interactive comparison of large language models across multiple benchmarks

| Rank | Model Name | Organization | License | Arena Elo Score | Votes |
|------|------------|--------------|---------|-----------------|-------|
| 1 | Grok-3-Preview-02-24 | xAI | Proprietary | 1412 | 3,364 |
| 2 | GPT-4.5-Preview | OpenAI | Proprietary | 1411 | 3,242 |
| 3 | Gemini-2.0-Flash-Thinking-Exp-01-21 | Google | Proprietary | 1384 | 17,487 |
| 4 | Gemini-2.0-Pro-Exp-02-05 | Google | Proprietary | 1380 | 15,466 |
| 5 | ChatGPT-4o-latest (2025-01-29) | OpenAI | Proprietary | 1377 | 17,221 |
| 6 | DeepSeek-R1 | DeepSeek | MIT | 1363 | 8,580 |
| 7 | Gemini-2.0-Flash-001 | Google | Proprietary | 1357 | 13,257 |
| 8 | o1-2024-12-17 | OpenAI | Proprietary | 1352 | 19,785 |
| 9 | Qwen2.5-Max | Alibaba | Proprietary | 1336 | 11,930 |
| 10 | o3-mini-high | OpenAI | Proprietary | 1329 | 9,102 |
| 11 | DeepSeek-V3 | DeepSeek | DeepSeek | 1318 | 22,007 |
| 12 | GLM-4-Plus-0111 | Zhipu | Proprietary | 1311 | 6,035 |
| 13 | Qwen-Plus-0125 | Alibaba | Proprietary | 1310 | 6,054 |
| 14 | Claude 3.7 Sonnet | Anthropic | Proprietary | 1309 | 4,254 |
| 15 | Gemini-2.0-Flash-Lite-Preview-02-05 | Google | Proprietary | 1308 | 12,774 |
| 16 | Step-2-16K-Exp | StepFun | Proprietary | 1305 | 5,132 |
| 17 | o1-mini | OpenAI | Proprietary | 1304 | 54,923 |
| 18 | o3-mini | OpenAI | Proprietary | 1304 | 15,463 |
| 19 | Gemini-1.5-Pro-002 | Google | Proprietary | 1302 | 57,551 |
| 20 | Grok-2-08-13 | xAI | Proprietary | 1288 | 67,038 |
| 21 | Yi-Lightning | 01.AI | Proprietary | 1287 | 28,946 |
| 22 | Claude 3.5 Sonnet (20241022) | Anthropic | Proprietary | 1284 | 59,139 |
| 23 | Deepseek-v2.5-1210 | DeepSeek | DeepSeek | 1279 | 7,247 |
| 24 | Athene-v2-Chat-72B | Nexusflow | Athene V2 | 1275 | 26,092 |
| 25 | GPT-4o-mini-2024-07-18 | OpenAI | Proprietary | 1272 | 66,710 |
| 26 | Hunyuan-Large-2025-02-10 | Tencent | Proprietary | 1271 | 3,860 |
| 27 | Gemini-1.5-Flash-002 | Google | Proprietary | 1271 | 36,979 |
| 28 | Llama-3.1-405B-Instruct-bf16 | Meta | Llama 3.1 | 1269 | 34,228 |
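
To give a feel for what a gap in Elo score means, here is a minimal Python sketch that turns two scores from the leaderboard into an expected head-to-head win rate. It assumes the classic Elo logistic curve with a 400-point scale; the page does not say how its scores are fitted, so the results should be read as a rough guide only. The function name expected_win_rate and the scale parameter are illustrative, not part of the leaderboard.

```python
def expected_win_rate(elo_a: float, elo_b: float, scale: float = 400.0) -> float:
    """Expected score of model A against model B.

    Assumption: the standard Elo logistic curve with a 400-point scale.
    The leaderboard's actual fitting method is not stated on the page.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / scale))


if __name__ == "__main__":
    # Scores copied from the leaderboard.
    grok_3_preview = 1412   # rank 1
    gpt_45_preview = 1411   # rank 2
    llama_31_405b = 1269    # rank 28

    # A 1-point gap is close to a coin flip.
    print(f"rank 1 vs rank 2:  {expected_win_rate(grok_3_preview, gpt_45_preview):.3f}")
    # A 143-point gap is a clear but not overwhelming edge.
    print(f"rank 1 vs rank 28: {expected_win_rate(grok_3_preview, llama_31_405b):.3f}")
```

Under this assumption, adjacent ranks separated by only a few points are close to indistinguishable, especially for models with few votes, while the spread between the top and bottom of this list is still far from a guaranteed win.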

Data aggregated from multiple benchmark sources • Last updated: March 2025