ResearcharXivNEW

SSD: Spatially Speculative Decoding Accelerates Autoregressive Image Generation

Xiang 2026-06-18
Shilong XiangZirui ZhangLijun Yu

Autoregressive models excel in visual generation by treating images as 1D sequences of discrete tokens, mirroring language modeling. However, this flattening discards the intrinsic 2D spatial locality of visual signals, creating severe computational bottlenecks during inference. We introduce Spatially Speculative Decoding (SSD), a framework that aligns the predictive objective with the natural geo

Read on arXiv
Data aggregated and editorially reviewed by TrendMing.

Key Contributions

  • Autoregressive models excel in visual generation by treating images as 1D sequences of discrete tokens, mirroring language modeling.
  • However, this flattening discards the intrinsic 2D spatial locality of visual signals, creating severe computational bottlenecks during inference.
  • We introduce Spatially Speculative Decoding (SSD), a framework that aligns the predictive objective with the natural geo

Research Themes

AIResearch