海阔天空蓝

A speech synthesis algorithm engineer who has tried n kinds of sports

Summary of Vocoder Models

The development lineage of speech synthesis vocoders is summarized below. The list is updated on an ongoing basis.

| Order | Model | Date (year.month) | Institution | Conference | Inherited Model (Base model) | Corresponding Author (Team leader) | URL |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | WaveNet | 2016.9 | Google DeepMind | SSW 2016 | CNN | Nal Kalchbrenner | https://arxiv.org/pdf/1609.03499.pdf |
| 2 | WaveRNN | 2018.6 | DeepMind & Google Brain | ICML 2018 | RNN | Nal Kalchbrenner | https://arxiv.org/pdf/1802.08435.pdf |
| 3 | WaveGlow | 2018.10 | NVIDIA | ICASSP 2019 | WaveNet | Rafael Valle | https://arxiv.org/pdf/1811.00002.pdf |
| 4 | LPCNet | 2019.2 | Mozilla, Google | ICASSP 2019 | WaveRNN | Jean-Marc Valin | https://arxiv.org/pdf/1810.11846.pdf |
| 5 | WaveGAN | 2019.2 | UC San Diego | ICLR 2019 | GAN | Miller Puckette | https://arxiv.org/pdf/1802.04208.pdf |
| 6 | Multi-band WaveRNN | 2019.4 | Tencent AI Lab | Interspeech 2020 | DurIAN, WaveRNN | Dong Yu | https://arxiv.org/pdf/1909.01700.pdf |
| 7 | MelGAN | 2019.12 | University of Montreal, Mila, Lyrebird AI | NeurIPS 2019 | GAN | Yoshua Bengio | https://arxiv.org/pdf/1910.06711.pdf |
| 8 | SqueezeWave | 2020.1 | UC Berkeley | | WaveGlow | Bichen Wu | https://arxiv.org/pdf/2001.05685.pdf |
| 9 | Parallel WaveGAN (PWG) | 2020.2 | LINE Corp., NAVER Corp. | | GAN | Ryuichi Yamamoto | https://arxiv.org/pdf/1910.11480.pdf |
| 10 | Multi-band MelGAN | 2020.5 | Northwestern Polytechnical University, Sogou | | MelGAN, multi-band | Lei Xie | https://arxiv.org/pdf/2005.05106.pdf |
| 11 | FeatherWave | 2020.10 | Tencent | Interspeech 2020 | MB LP, WaveRNN | Shan Liu | https://isca-speech.org/archive/Interspeech_2020/pdfs/1156.pdf |
| 12 | WaveGrad | 2020.10 | Johns Hopkins University, Google Brain | | CNN | Heiga Zen | https://arxiv.org/pdf/2009.00713.pdf |

GAN Vocoder: Multi-Resolution Discriminator Is All You Need

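This paper tries to explain why the recent wave of GAN-based vocoders outperforms earlier flow-based and autoregressive vocoders. Based on ablation experiments, the authors argue that the main reason is the multi-resolution discriminator design, which lifts GAN-based vocoders to a new level of quality.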
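
To make the multi-resolution idea a bit more concrete, here is a minimal sketch of how one waveform can be turned into several views at different temporal resolutions, each of which would go to its own discriminator. This is not the paper's implementation. The helper name `multi_resolution_views`, the pooling factors (1, 2, 4), the plain non-overlapping average pooling, and the 16 kHz test tone are all assumptions chosen for illustration, loosely in the spirit of the multi-scale discriminator used by MelGAN.

```python
import numpy as np

def multi_resolution_views(wave, factors=(1, 2, 4)):
    """Return one downsampled copy of `wave` per factor (hypothetical helper)."""
    views = []
    for f in factors:
        # Drop trailing samples that do not fill a complete pooling window.
        usable = len(wave) - len(wave) % f
        # Non-overlapping average pooling: mean over each window of f samples.
        views.append(wave[:usable].reshape(-1, f).mean(axis=1))
    return views

# One second of a 440 Hz tone at an assumed 16 kHz sampling rate.
sr = 16000
wave = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)

views = multi_resolution_views(wave)
print([v.shape[0] for v in views])  # [16000, 8000, 4000]
```

In a real GAN vocoder each view would be scored by a separate discriminator network and the losses combined. The coarser views emphasize slowly varying structure, while the full-rate view keeps fine detail.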