- Speech Language #Summary
Index
Text-to-Speech / TTS
SPEAR-TTS / 2023
- Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision
- [2023]
- arxiv.org
Multilingual Shallow Fusion / 2023
- Massively Multilingual Shallow Fusion with Large Language Models
- [2023]
- arxiv.org
Imaginary Voice / 2023
- Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech
- [2023]
- arxiv.org
FoundationTTS / 2023
- FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
- [2023]
- arxiv.org
NaturalSpeech 2 / 2023
- NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Techniques and Tricks
pause insertion / 2023
- Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech
Applications and Services
Bark
- Bark