AI Benchmark Center

Live leaderboards across long-context question answering, multi-document summarization, document understanding, and multilingual translation and reasoning.

Current Leaders by Category

Benchmark · Leader · Score
Long-Context QA (LC-QA-256K) · Cognito AI · 93.8
Multi-Document Summarization (MD-Summ-10Doc) · AuraNet Prime · 88.5
Complex Document Understanding (CDU-Financial) · Vector Engine · 91.2
Legal Document Interpretation (LDI-Contract) · Genesis X · 92.5
Scientific Paper Synthesis (SPS-Interdisciplinary) · Quantum Core · 90.1
Flores-200 · Aurora Pro · 88.5
M3Exam · DeepMind Voyager · 93
XNLI · Microsoft Chimera · 95
IndicMT-Bench · Localized AI · 79.5
AfriTranslate · DeepMind Voyager · 73.1

Overall Model Ranking

(avg. normalized score across the benchmarks in which each model is listed)
Rank · Model · Organization · Avg. score
1 · DeepMind Voyager · Google DeepMind · 100.0
2 · Microsoft Chimera · Microsoft · 100.0
3 · Localized AI · IndicAI Hub · 100.0
4 · Genesis X · Nova Labs · 99.7
5 · Anthropic Aether · Anthropic · 99.5
6 · Vector Engine · Quantum Mechanics Inc. · 99.2
7 · AfriLingua Pro · Ubuntu AI · 99.2
8 · Mindstream v4 · Cerebra Solutions · 99.0
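
The page does not say how the average normalized score is computed. The published figures are consistent with one simple reading: scale each score so that the benchmark leader equals 100, then average over the benchmarks in which a model is listed. The sketch below illustrates that assumed method on a subset of the leaderboard data. The function name and data layout are hypothetical, and the site's actual pipeline may differ.

```python
from collections import defaultdict

# Scores copied from three of the per-benchmark leaderboards (top 3 rows each).
LEADERBOARDS = {
    "MD-Summ-10Doc": {"AuraNet Prime": 88.5, "Genesis X": 87.9, "Mindstream v4": 87.2},
    "LDI-Contract": {"Genesis X": 92.5, "Hyperion Pro": 91.8, "Vector Engine": 91.1},
    "CDU-Financial": {"Vector Engine": 91.2, "NeuraLink Max": 90.5, "Sentinel 1.0": 89.8},
}


def average_normalized_scores(leaderboards):
    """Assumed method: scale each score to the benchmark leader (= 100),
    then average over the benchmarks where the model is listed."""
    normalized = defaultdict(list)
    for scores in leaderboards.values():
        top = max(scores.values())
        for model, score in scores.items():
            normalized[model].append(100.0 * score / top)
    return {model: round(sum(vals) / len(vals), 1) for model, vals in normalized.items()}


ranking = average_normalized_scores(LEADERBOARDS)
print(ranking["Genesis X"])      # 99.7
print(ranking["Vector Engine"])  # 99.2
```

Both values match the overall ranking. Under this reading, a model that appears on a single leaderboard and leads it averages exactly 100.0, which would explain the three-way tie at the top.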

Long-Context QA (LC-QA-256K)

Rank · Model · Organization · Date · Score
1 · Cognito AI · Synapse Corp · Jun 15, 2026 · 93.8
2 · OmniMind v3 · Apex Labs · Jun 12, 2026 · 92.1
3 · Hyperion Pro · Stellar AI · Jun 10, 2026 · 91.5
4 · Quantum Core · Infinity Systems · Jun 8, 2026 · 90.7
5 · EchoMind 2.1 · Horizon Tech · Jun 5, 2026 · 89.9

Multi-Document Summarization (MD-Summ-10Doc)

Rank · Model · Organization · Date · Score
1 · AuraNet Prime · Lumina AI · Jun 16, 2026 · 88.5
2 · Genesis X · Nova Labs · Jun 14, 2026 · 87.9
3 · Mindstream v4 · Cerebra Solutions · Jun 11, 2026 · 87.2
4 · Titan Forge · Atlas Innovations · Jun 9, 2026 · 86.8
5 · Cognito AI · Synapse Corp · Jun 6, 2026 · 86.5

Complex Document Understanding (CDU-Financial)

Rank · Model · Organization · Date · Score
1 · Vector Engine · Quantum Mechanics Inc. · Jun 18, 2026 · 91.2
2 · NeuraLink Max · BioMind AI · Jun 17, 2026 · 90.5
3 · Sentinel 1.0 · Watchdog Tech · Jun 13, 2026 · 89.8
4 · OmniMind v3 · Apex Labs · Jun 7, 2026 · 89.2
5 · AuraNet Prime · Lumina AI · Jun 4, 2026 · 88.7

Legal Document Interpretation (LDI-Contract)

Rank · Model · Organization · Date · Score
1 · Genesis X · Nova Labs · Jun 20, 2026 · 92.5
2 · Hyperion Pro · Stellar AI · Jun 19, 2026 · 91.8
3 · Vector Engine · Quantum Mechanics Inc. · Jun 15, 2026 · 91.1
4 · Sentinel 1.0 · Watchdog Tech · Jun 10, 2026 · 90.4
5 · EchoMind 2.1 · Horizon Tech · Jun 7, 2026 · 89.7

Scientific Paper Synthesis (SPS-Interdisciplinary)

Rank · Model · Organization · Date · Score
1 · Quantum Core · Infinity Systems · Jun 22, 2026 · 90.1
2 · Mindstream v4 · Cerebra Solutions · Jun 21, 2026 · 89.6
3 · NeuraLink Max · BioMind AI · Jun 16, 2026 · 88.9
4 · Titan Forge · Atlas Innovations · Jun 12, 2026 · 88.3
5 · Cognito AI · Synapse Corp · Jun 9, 2026 · 87.7

Flores-200

Rank · Model · Organization · Date · Score
1 · Aurora Pro · Quantum Labs · May 15, 2026 · 88.5
2 · Nexus AI · Synapse Corp · Apr 22, 2026 · 87.9
3 · Baidu Erlang v3 · Baidu · May 1, 2026 · 87
4 · OmniLingua · GlobalTech Solutions · Mar 10, 2026 · 86.2
5 · Transcend · AetherAI · Feb 28, 2026 · 85.1

M3Exam

Rank · Model · Organization · Date · Score
1 · DeepMind Voyager · Google DeepMind · Jun 1, 2026 · 93
2 · Aurora Pro · Quantum Labs · May 15, 2026 · 92.1
3 · Nexus AI · Synapse Corp · Apr 22, 2026 · 91.5
4 · Meta Atlas · Meta Platforms · May 20, 2026 · 91
5 · OmniLingua · GlobalTech Solutions · Mar 10, 2026 · 90.3

XNLI

Rank · Model · Organization · Date · Score
1 · Microsoft Chimera · Microsoft · Jun 5, 2026 · 95
2 · Anthropic Aether · Anthropic · May 25, 2026 · 94.5
3 · Nexus AI · Synapse Corp · Apr 22, 2026 · 94.2
4 · OmniLingua · GlobalTech Solutions · Mar 10, 2026 · 93.8
5 · Baidu Erlang v3 · Baidu · May 1, 2026 · 93.5

IndicMT-Bench

Rank · Model · Organization · Date · Score
1 · Localized AI · IndicAI Hub · May 8, 2026 · 79.5
2 · OmniLingua · GlobalTech Solutions · Mar 10, 2026 · 78.1
3 · Aurora Pro · Quantum Labs · May 15, 2026 · 77.8
4 · Project Shakti · TechMahindra AI · Apr 15, 2026 · 77
5 · Nexus AI · Synapse Corp · Apr 22, 2026 · 76.9

AfriTranslate

Rank · Model · Organization · Date · Score
1 · DeepMind Voyager · Google DeepMind · Jun 1, 2026 · 73.1
2 · AfriLingua Pro · Ubuntu AI · Jun 10, 2026 · 72.5
3 · TransAfrica · PanaGen AI · May 2, 2026 · 71.8
4 · Aurora Pro · Quantum Labs · May 15, 2026 · 70.5
5 · OmniLingua · GlobalTech Solutions · Mar 10, 2026 · 70

About These Benchmarks

MMLU (Massive Multitask Language Understanding)

Tests knowledge across 57 subjects spanning STEM, the humanities, and the social sciences, using more than 14,000 multiple-choice questions.

HumanEval (Coding)

164 hand-written Python programming problems. Measures the ability to generate functionally correct code from a function signature and docstring, checked against unit tests.
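
HumanEval results are commonly reported as pass@k: the probability that at least one of k sampled solutions passes the unit tests. The page does not say which metric its leaderboards use, so the sketch below only illustrates the standard unbiased pass@k estimator, assuming n samples were generated per problem and c of them passed. The function name is a choice made for this example.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed the tests, k: sample budget.
    Computes 1 - C(n - c, k) / C(n, k), the chance that a random size-k
    subset of the n samples contains at least one passing sample.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, which yields pass@k = 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


print(f"{pass_at_k(n=10, c=3, k=1):.3f}")  # 0.300
print(f"{pass_at_k(n=10, c=3, k=5):.3f}")  # 0.917
```

A benchmark-level score would then be the mean of this value over all 164 problems.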

MATH

12,500 competition math problems drawn from contests such as AMC 10, AMC 12, and AIME. Tests advanced mathematical reasoning.

SWE-bench Verified

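A human-validated set of real GitHub issues from popular open-source repositories. Measures end-to-end software engineering capability: whether a model can produce a patch that resolves the issue and passes the repository's tests.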