S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

Dai 2026-06-18
Yalun Dai, Hao Li, Shulin Tian

Real-world spatial intelligence requires reasoning over a continuous and evolving 3D world, yet existing VLMs and tool-augmented agents largely remain tied to static, stateless inference from isolated visual observations. We introduce S-Agent, a spatial tool-use agentic paradigm for understanding and reasoning over continuous multi-view images and videos. By formulating spatial r

Data aggregated and editorially reviewed by TrendMing.

Research Themes

AI, Research