WaveNet and deep learning-based TTS systems

Across
  2. Replicating a person’s voice using AI
  3. Delay between input and speech output
  6. Generating audio points from probability
  10. DeepMind model that generates speech one sample at a time
  13. Converts acoustic features into waveform audio
  14. RNN variant handling long-term dependencies
  17. Computing system inspired by biological neurons
  20. Predicts future samples from past outputs
  22. Collection of text and audio samples
  24. TTS system supporting many voices
  26. Single system trained from text to speech
  28. Written representation of a sound
  30. Mechanism to focus on relevant input parts
  31. Field that enables machines to mimic human intelligence
  33. Converts encoded features into speech output
  35. Technique to improve audio quality
  37. Network effective in feature extraction
Down
  1. Frequency-based audio representation for TTS
  4. Mapping between text and audio
  5. Raw audio signal representation
  7. Intonation, stress, and rhythm in speech
  8. Predicts length of each phoneme
  9. Basic sound unit of spoken language
  11. Converts written text into natural-sounding speech
  12. Machine learning using multi-layer neural networks
  15. Reducing precision of audio samples
  16. Vector representing speaker identity
  18. Neural architecture using self-attention
  19. Artificial production of human speech
  21. Number of audio samples per second
  23. Network designed for sequence modeling
  25. Generating speech from trained model
  27. How human-like synthesized speech sounds
  29. Converts text into hidden representations
  32. Process of learning model parameters
  34. Large structured speech dataset
  36. Simplified recurrent neural network