WaveNet and deep learning-based TTS systems

Across

2. Replicating a person’s voice using AI
3. Delay between input and speech output
6. Drawing audio values from a predicted probability distribution
10. DeepMind model that generates speech one sample at a time
13. Converts acoustic features into waveform audio
14. RNN variant handling long-term dependencies
17. Computing system inspired by biological neurons
20. Predicts future samples from past outputs
22. Collection of text and audio samples
24. TTS system supporting many voices
26. Single system trained from text to speech
28. Written representation of a sound
30. Mechanism to focus on relevant input parts
31. Field that enables machines to mimic human intelligence
33. Converts encoded features into speech output
35. Technique to improve audio quality
37. Network effective at feature extraction

Down

1. Frequency-based audio representation for TTS
4. Mapping between text and audio
5. Raw audio signal representation
7. Intonation, stress, and rhythm in speech
8. Predicts length of each phoneme
9. Basic sound unit of spoken language
11. Converts written text into natural-sounding speech
12. Machine learning using multi-layer neural networks
15. Reducing precision of audio samples
16. Vector representing speaker identity
18. Neural architecture using self-attention
19. Artificial production of human speech
21. Number of audio samples per second
23. Network designed for sequence modeling
25. Generating speech from a trained model
27. How human-like synthesized speech sounds
29. Converts text into hidden representations
32. Process of learning model parameters
34. Large structured speech dataset
36. Simplified recurrent neural network