Across
- 4. A neural network architecture that relies on self-attention mechanisms.
- 6. Part of a transformer architecture that processes the input data.
- 7. Algorithm used to update the weights of a neural network during training.
- 9. A function applied to the weighted inputs of a neural network node to introduce non-linearity.
- 11. A process where a function calls itself, often used in algorithmic problem-solving.
- 13. The process of splitting text into smaller units such as words or subwords.
- 14. A type of RNN designed to better handle long-term dependencies in sequential data.
- 17. Part of a transformer that generates the output from encoded data.
- 18. Transformer-based model that excels at natural language understanding.
- 19. A type of machine learning model, often used in image recognition.
- 20. A multi-dimensional array of data, fundamental to deep learning computations.
Down
- 1. When a model learns the training data too well and fails to generalize to new data.
- 2. A learning paradigm where the model is trained on labeled data.
- 3. A regularization technique where randomly selected units are ignored during training to prevent overfitting.
- 5. A step-by-step procedure used for solving problems or performing computations.
- 8. A basic unit in a neural network, introduced in the 1950s.
- 10. A framework involving a generator and a discriminator, often used in image generation.
- 12. A supervised learning algorithm used for classification tasks.
- 15. An individual measurable property or characteristic of a phenomenon being observed.
- 16. Type of machine learning where an agent learns by interacting with an environment.
