AI Models Leaderboard

24 current models ranked by benchmark score.

#    Model                   Featured   Scores (benchmark columns unlabeled)
1    Grok 3                  -          93.3, 97.7, 84.6
2    o1                      yes        92.3, 94.8, 78.3
3    Claude Opus 4.6         yes        92.0, 94.5, 97.1, 79.9
4    Gemini 2.5 Pro          yes        91.0, 97.0, 84.0
5    DeepSeek R1             yes        90.8, 97.3, 71.5
6    Claude Sonnet 4.6       yes        90.1, 93.5, 93.7, 70.0
7    Claude 3.5 Sonnet       yes        88.7, 93.7, 78.3, 65.0
8    GPT-4o                  yes        88.7, 90.2, 76.6, 53.6
9    Llama 3.1 405B          -          88.6, 73.5
10   DeepSeek V3             -          88.5, 91.6, 90.2
11   Grok 2                  -          87.5, 76.1
12   Llama 3.3 70B           -          86.0, 77.0
13   Gemini 1.5 Pro          yes        85.9, 58.5
14   Mistral Large 2         -          84.0, 92.0, 69.7
15   Claude 3.5 Haiku        -          83.0, 88.0
16   Gemini 2.0 Flash        -          82.0
17   GPT-4o mini             -          82.0, 87.2
18   Mistral Small 3         -          81.0
19   Gemini 1.5 Flash        -          79.9
20   Command R+              -          75.7
21   Llama 3.2 11B Vision    -          73.0
22   o4-mini                 yes        99.5, 81.4
23   Codestral               -          91.1
24   o3-mini                 -          97.0, 79.7

Benchmark scores are sourced from official provider publications and independent evaluations. Scores reflect the model version and evaluation methodology at the time of measurement — direct comparisons across providers should be treated as approximate.