S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence
Dai 2026-06-18
Yalun Dai, Hao Li, Shulin Tian
Real-world spatial intelligence requires reasoning over a continuous and evolving 3D world, yet existing VLMs and tool-augmented agents largely remain tied to static, stateless inference from isolated visual observations. We introduce S-Agent, a spatial tool-use agentic paradigm for understanding and reasoning over continuous multi-view images and videos. By formulating spatial r