
AI Benchmark Center

Live leaderboards across reasoning, coding, math, expert knowledge, and agent performance.

Current Leaders by Category

Long-Context QA (LC-QA-256K) · Cognito AI · 93.8

Multi-Document Summarization (MD-Summ-10Doc) · AuraNet Prime · 88.5

Complex Document Understanding (CDU-Financial) · Vector Engine · 91.2

Legal Document Interpretation (LDI-Contract) · Genesis X · 92.5

Scientific Paper Synthesis (SPS-Interdisciplinary) · Quantum Core · 90.1

Flores-200 · Aurora Pro · 88.5

M3Exam · DeepMind Voyager

XNLI · Microsoft Chimera

IndicMT-Bench · Localized AI · 79.5

AfriTranslate · DeepMind Voyager · 73.1

Overall Model Ranking

(avg. normalized score across all benchmarks)

🥇 DeepMind Voyager · Google DeepMind · NEW · 100.0 avg. score

🥈 Microsoft Chimera · Microsoft · NEW · 100.0 avg. score

🥉 Localized AI · IndicAI Hub · NEW · 100.0 avg. score

4. Genesis X · Nova Labs · NEW · 99.7 avg. score

5. Anthropic Aether · Anthropic · NEW · 99.5 avg. score

6. Vector Engine · Quantum Mechanics Inc. · NEW · 99.2 avg. score

7. AfriLingua Pro · Ubuntu AI · NEW · 99.2 avg. score

8. Mindstream v4 · Cerebra Solutions · NEW · 99.0 avg. score
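
The page does not say how scores are normalized before averaging. One reading that is consistent with the listed figures is to divide each score by the top score on that benchmark, multiply by 100, and average over the benchmarks where the model has an entry. The sketch below applies that assumed rule to scores copied from the per-benchmark leaderboards; the function name `average_normalized` and the dictionary layout are illustrative choices, not part of the site.

```python
from collections import defaultdict

# Scores copied from the per-benchmark leaderboards on this page.
# Only the entries needed for the three models checked below are included,
# so averages for any other model in this dict would be incomplete.
SCORES = {
    "MD-Summ-10Doc": {"AuraNet Prime": 88.5, "Genesis X": 87.9, "Mindstream v4": 87.2},
    "LDI-Contract": {"Genesis X": 92.5, "Hyperion Pro": 91.8, "Vector Engine": 91.1},
    "CDU-Financial": {"Vector Engine": 91.2, "NeuraLink Max": 90.5},
    "SPS-Interdisciplinary": {"Quantum Core": 90.1, "Mindstream v4": 89.6},
}


def average_normalized(scores):
    """Average of (score / benchmark top score * 100) per model.

    ASSUMPTION: this normalization rule is inferred from the published
    averages, not documented by the page.
    """
    per_model = defaultdict(list)
    for entries in scores.values():
        top = max(entries.values())
        for model, score in entries.items():
            per_model[model].append(100.0 * score / top)
    return {m: round(sum(v) / len(v), 1) for m, v in per_model.items()}


result = average_normalized(SCORES)
for model in ("Genesis X", "Vector Engine", "Mindstream v4"):
    print(model, result[model])
```

Under this assumption the three models come out at 99.7, 99.2, and 99.0, matching the ranking above. It would also explain why a model that leads every benchmark it appears on shows 100.0 regardless of its raw scores.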

Long-Context QA (LC-QA-256K) · score

🥇 Cognito AI · Synapse Corp · NEW · Jun 15, 2026 · 93.8

🥈 OmniMind v3 · Apex Labs · NEW · Jun 12, 2026 · 92.1

🥉 Hyperion Pro · Stellar AI · NEW · Jun 10, 2026 · 91.5

4. Quantum Core · Infinity Systems · NEW · Jun 8, 2026 · 90.7

5. EchoMind 2.1 · Horizon Tech · NEW · Jun 5, 2026 · 89.9

Multi-Document Summarization (MD-Summ-10Doc) · score

🥇 AuraNet Prime · Lumina AI · NEW · Jun 16, 2026 · 88.5

🥈 Genesis X · Nova Labs · NEW · Jun 14, 2026 · 87.9

🥉 Mindstream v4 · Cerebra Solutions · NEW · Jun 11, 2026 · 87.2

4. Titan Forge · Atlas Innovations · NEW · Jun 9, 2026 · 86.8

5. Cognito AI · Synapse Corp · NEW · Jun 6, 2026 · 86.5

Complex Document Understanding (CDU-Financial) · score

🥇 Vector Engine · Quantum Mechanics Inc. · NEW · Jun 18, 2026 · 91.2

🥈 NeuraLink Max · BioMind AI · NEW · Jun 17, 2026 · 90.5

🥉 Sentinel 1.0 · Watchdog Tech · NEW · Jun 13, 2026 · 89.8

4. OmniMind v3 · Apex Labs · NEW · Jun 7, 2026 · 89.2

5. AuraNet Prime · Lumina AI · NEW · Jun 4, 2026 · 88.7

Legal Document Interpretation (LDI-Contract) · score

🥇 Genesis X · Nova Labs · NEW · Jun 20, 2026 · 92.5

🥈 Hyperion Pro · Stellar AI · NEW · Jun 19, 2026 · 91.8

🥉 Vector Engine · Quantum Mechanics Inc. · NEW · Jun 15, 2026 · 91.1

4. Sentinel 1.0 · Watchdog Tech · NEW · Jun 10, 2026 · 90.4

5. EchoMind 2.1 · Horizon Tech · NEW · Jun 7, 2026 · 89.7

Scientific Paper Synthesis (SPS-Interdisciplinary) · score

🥇 Quantum Core · Infinity Systems · NEW · Jun 22, 2026 · 90.1

🥈 Mindstream v4 · Cerebra Solutions · NEW · Jun 21, 2026 · 89.6

🥉 NeuraLink Max · BioMind AI · NEW · Jun 16, 2026 · 88.9

4. Titan Forge · Atlas Innovations · NEW · Jun 12, 2026 · 88.3

5. Cognito AI · Synapse Corp · NEW · Jun 9, 2026 · 87.7

Flores-200 · score

🥇 Aurora Pro · Quantum Labs · NEW · May 15, 2026 · 88.5

🥈 Nexus AI · Synapse Corp · NEW · Apr 22, 2026 · 87.9

🥉 Baidu Erlang v3 · Baidu · NEW · May 1, 2026

4. OmniLingua · GlobalTech Solutions · NEW · Mar 10, 2026 · 86.2

5. Transcend · AetherAI · NEW · Feb 28, 2026 · 85.1

M3Exam · score

🥇 DeepMind Voyager · Google DeepMind · NEW · Jun 1, 2026

🥈 Aurora Pro · Quantum Labs · NEW · May 15, 2026 · 92.1

🥉 Nexus AI · Synapse Corp · NEW · Apr 22, 2026 · 91.5

4. Meta Atlas · Meta Platforms · NEW · May 20, 2026

5. OmniLingua · GlobalTech Solutions · NEW · Mar 10, 2026 · 90.3

XNLI · score

🥇 Microsoft Chimera · Microsoft · NEW · Jun 5, 2026

🥈 Anthropic Aether · Anthropic · NEW · May 25, 2026 · 94.5

🥉 Nexus AI · Synapse Corp · NEW · Apr 22, 2026 · 94.2

4. OmniLingua · GlobalTech Solutions · NEW · Mar 10, 2026 · 93.8

5. Baidu Erlang v3 · Baidu · NEW · May 1, 2026 · 93.5

IndicMT-Bench · score

🥇 Localized AI · IndicAI Hub · NEW · May 8, 2026 · 79.5

🥈 OmniLingua · GlobalTech Solutions · NEW · Mar 10, 2026 · 78.1

🥉 Aurora Pro · Quantum Labs · NEW · May 15, 2026 · 77.8

4. Project Shakti · TechMahindra AI · NEW · Apr 15, 2026

5. Nexus AI · Synapse Corp · NEW · Apr 22, 2026 · 76.9

AfriTranslate · score

🥇 DeepMind Voyager · Google DeepMind · NEW · Jun 1, 2026 · 73.1

🥈 AfriLingua Pro · Ubuntu AI · NEW · Jun 10, 2026 · 72.5

🥉 TransAfrica · PanaGen AI · NEW · May 2, 2026 · 71.8

4. Aurora Pro · Quantum Labs · NEW · May 15, 2026 · 70.5

5. OmniLingua · GlobalTech Solutions · NEW · Mar 10, 2026

About These Benchmarks

MMLU (Massive Multitask Language Understanding)

Tests knowledge across 57 subjects including STEM, humanities, and social sciences. 14,000+ questions.

HumanEval (Coding)

164 hand-crafted programming challenges. Measures ability to produce correct code from docstrings.
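
To make the docstring-to-code format concrete, the sketch below shows what such a task and its functional check could look like. The task (`running_max`), the candidate completion, the test cases, and the helper `passes` are all hypothetical illustrations and are not taken from the benchmark.

```python
# Hypothetical task in the style described above: a signature plus a docstring.
PROMPT = '''def running_max(values):
    """Return a list where element i is the maximum of values[0..i]."""
'''

# A candidate completion, i.e. the function body a model might produce.
COMPLETION = '''    result = []
    best = None
    for v in values:
        best = v if best is None or v > best else best
        result.append(best)
    return result
'''

# Hypothetical unit tests: (positional arguments, expected return value).
TESTS = [
    (([1, 3, 2, 5, 4],), [1, 3, 3, 5, 5]),
    (([],), []),
    (([-2, -5],), [-2, -2]),
]


def passes(prompt, completion, entry_point, tests):
    """Return True if prompt + completion defines a function passing all tests."""
    namespace = {}
    try:
        # Executing untrusted code is unsafe; a real harness would sandbox this.
        exec(prompt + completion, namespace)
        fn = namespace[entry_point]
        return all(fn(*args) == expected for args, expected in tests)
    except Exception:
        return False


print(passes(PROMPT, COMPLETION, "running_max", TESTS))
```

A completion counts as correct only if every test passes, so a plausible-looking but wrong body (for example one that returns `values` unchanged) is scored as a failure.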

MATH

12,500 competition math problems drawn from contests such as AMC 10, AMC 12, and AIME. Tests advanced mathematical reasoning.

SWE-bench Verified

Real GitHub issues from popular open-source repos. Measures end-to-end software engineering capability.