WaveNet and deep learning-based TTS systems
Across
- 2. Replicating a person’s voice using AI
- 3. Delay between input and speech output
- 6. Drawing audio values from a probability distribution
- 10. DeepMind model that generates speech one sample at a time
- 13. Converts acoustic features into waveform audio
- 14. RNN variant handling long-term dependencies
- 17. Computing system inspired by biological neurons
- 20. Predicts future samples from past outputs
- 22. Collection of text and audio samples
- 24. TTS system supporting many voices
- 26. Single system trained directly from text to speech
- 28. Written representation of a sound
- 30. Mechanism to focus on relevant input parts
- 31. Field that enables machines to mimic human intelligence
- 33. Converts encoded features into speech output
- 35. Technique to improve audio quality
- 37. Network effective in feature extraction
Down
- 1. Frequency-based audio representation for TTS
- 4. Mapping between text and audio
- 5. Raw audio signal representation
- 7. Intonation, stress, and rhythm in speech
- 8. Predicts length of each phoneme
- 9. Basic sound unit of spoken language
- 11. Converts written text into natural-sounding speech
- 12. Machine learning using multi-layer neural networks
- 15. Reducing precision of audio samples
- 16. Vector representing speaker identity
- 18. Neural architecture using self-attention
- 19. Artificial production of human speech
- 21. Number of audio samples per second
- 23. Network designed for sequence modeling
- 25. Generating speech from a trained model
- 27. How human-like synthesized speech sounds
- 29. Converts text into hidden representations
- 32. Process of learning model parameters
- 34. Large structured speech dataset
- 36. Simplified recurrent neural network