Arena Leaderboard

Elo rankings from human & LLM judge votes
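
For readers unfamiliar with Elo, the following is a minimal sketch of how a single pairwise vote could move two ratings. The 400-point scale and the K-factor of 32 are conventional defaults assumed here for illustration, and the function names are hypothetical. The leaderboard does not state its parameters or whether it uses this exact online update.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one battle.

    score_a is 1.0 if A won the vote, 0.0 if A lost, 0.5 for a tie.
    k (assumed value) controls how far a single vote moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two models starting at the same rating; A wins one vote.
    print(update_elo(1000.0, 1000.0, 1.0))
```

Under these assumptions, a tie between equally rated models changes nothing, while an upset win over a higher-rated model moves both ratings by more than an expected win would.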

Rank  Model                           Company     Elo   Win%  W / L / T        Battles
1     Nanonets OCR3                   nanonets   1094  70.8%  23W / 8L / 5T         36
2     Nanonets OCR2+                  nanonets   1072  67.9%  25W / 10L / 7T        42
3     Claude Sonnet 4.6               anthropic  1065  57.5%  18W / 12L / 10T       40
4     GPT-5.4 · Low Reasoning         openai     1051  52.8%  14W / 12L / 10T       36
5     Claude Sonnet 4.6 · Thinking    anthropic  1019  52.6%  15W / 13L / 10T       38
6     GPT-5.4                         openai     1016  48.4%  8W / 9L / 14T         31
7     GPT-5.2                         openai     1012  54.2%  15W / 12L / 9T        36
8     GPT-4.1                         openai      986  41.9%  9W / 14L / 8T         31
9     Gemini 3 Flash                  google      984  40.7%  8W / 13L / 6T         27
10    GPT-5 Mini                      openai      982  50.0%  13W / 13L / 9T        35
11    GPT-5.4 · Medium Reasoning      openai      977  46.8%  11W / 13L / 7T        31
12    Gemini 3.1 Pro                  google      972  41.3%  6W / 10L / 7T         23
13    Gemini 2.5 Pro                  google      966  44.6%  7W / 10L / 11T        28
14    Gemini 2.5 Flash · Thinking     google      962  42.0%  8W / 12L / 5T         25
15    Claude Opus 4.6                 anthropic   956  41.9%  9W / 15L / 13T        37
16    Claude Opus 4.6 · Low Thinking  anthropic   953  40.8%  7W / 14L / 17T        38
17    Gemini 2.5 Flash                google      933  38.5%  7W / 13L / 6T         26
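
The Win% figures in the table appear consistent with counting each tie as half a win, and Battles as the sum of wins, losses, and ties. The page does not state this formula; it is an inference from the numbers. A minimal sketch for checking a row, with a hypothetical function name:

```python
def win_pct(wins: int, losses: int, ties: int) -> float:
    """Win percentage with each tie counted as half a win (assumed formula)."""
    battles = wins + losses + ties
    if battles == 0:
        raise ValueError("no battles recorded")
    return 100.0 * (wins + 0.5 * ties) / battles


if __name__ == "__main__":
    # Rank 1 row: 23W / 8L / 5T over 36 battles.
    print(round(win_pct(23, 8, 5), 1))  # 70.8
```

This also explains why Win% and Elo order can disagree, as with GPT-5.2 (54.2%) sitting below GPT-5.4 (48.4%): Elo weights each result by the opponent's rating, while Win% does not.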