FlowEdit: Associative Memory for Lifelong Pronunciation Adaptation in Flow-Matching TTS
Singh 2026-06-18
Harshit Singh, Ayush Pratap Singh, Nityanand Mathur
Flow-matching text-to-speech systems achieve remarkable zero-shot quality but remain static after deployment: pronunciation errors on out-of-vocabulary proper nouns persist unless the model is retrained. We introduce FlowEdit, a lifelong adaptation framework for frozen flow-matching TTS that learns pronunciation corrections as latent conditioning edits rather than weight updates. When corrective
Data aggregated and editorially reviewed by TrendMing.
Key Contributions
- Flow-matching text-to-speech systems achieve remarkable zero-shot quality but remain static after deployment: pronunciation errors on out-of-vocabulary proper nouns persist unless the model is retrained.
- We introduce FlowEdit, a lifelong adaptation framework for frozen flow-matching TTS that learns pronunciation corrections as latent conditioning edits rather than weight updates.
Research Themes
AI, Research