FlowEdit: Associative Memory for Lifelong Pronunciation Adaptation in Flow-Matching TTS

Singh 2026-06-18
Harshit Singh, Ayush Pratap Singh, Nityanand Mathur

Flow-matching text-to-speech systems achieve remarkable zero-shot quality but remain static after deployment: pronunciation errors on out-of-vocabulary proper nouns persist unless the model is retrained. We introduce FlowEdit, a lifelong adaptation framework for frozen flow-matching TTS that learns pronunciation corrections as latent conditioning edits rather than weight updates. When corrective

Data aggregated and editorially reviewed by TrendMing.

Research Themes

AI, Research