Across
- 2. A trainable parameter that shifts a neuron’s activation independently of its inputs.
- 4. One full pass through the entire training dataset during learning.
- 7. A function that measures how far a model’s predictions are from targets.
- 8. A computing model whose interconnected layers mimic the connections between neurons in the brain.
- 11. The vector of partial derivatives indicating how to change parameters to reduce loss (sketched in code after this list).
- 14. The field where algorithms learn patterns from data without explicit programming.
- 16. When a model learns training noise and performs poorly on new data.
- 18. A nonlinear function applied to a neuron’s weighted sum of inputs to introduce complexity.
- 19. Techniques that reduce overfitting by constraining model complexity.
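
Several of the Across clues describe pieces of a single training update. As a minimal, self-contained sketch of how they fit together (the toy data, variable names, and learning rate below are invented for illustration, not part of the puzzle):

```python
# Minimal sketch: one neuron with a bias, a nonlinear activation,
# a loss, and a hand-derived gradient step. All values are toy choices.
import math

def sigmoid(z):                # activation: squashes z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

x, target = 2.0, 1.0           # one toy example (input, label)
w, b = 0.5, 0.0                # weight and bias: the trainable parameters
lr = 0.1                       # learning rate (arbitrary)

for epoch in range(3):         # each pass over the (one-item) dataset is an epoch
    z = w * x + b              # the bias shifts z independently of x
    y = sigmoid(z)
    loss = (y - target) ** 2   # squared-error loss: distance from the target
    # Gradient: partial derivatives of the loss w.r.t. w and b (chain rule).
    dz = 2 * (y - target) * y * (1 - y)
    w -= lr * dz * x           # move parameters against the gradient
    b -= lr * dz
    print(f"epoch {epoch}: loss={loss:.4f}")
```

Running it prints the loss shrinking over the three passes.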
Down
- 1. The process of breaking text into pieces (tokens) for model input.
- 2. The method for computing gradients to update neural network weights.
- 3. A learning paradigm that trains on labeled input–output pairs.
- 5. A learning paradigm that finds structure in unlabeled data.
- 6. A step-by-step procedure for solving a problem or performing a task.
- 9. A dense vector that represents discrete items (words, tokens) in continuous space (see the sketch after this list).
- 10. A model component that generates output from a latent representation.
- 12. A model component that converts raw input into a latent representation.
- 13. A neural architecture that uses self-attention to handle sequences.
- 15. A trainable parameter that scales an input in a neural network.
- 17. A structured collection of examples used to train or evaluate models.
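
A few of the Down clues are just as small in code. A minimal sketch of breaking text into pieces and looking up a dense vector for each one (the sentence, vocabulary, and vector size are made-up examples):

```python
# Minimal sketch: tokenization plus an embedding lookup table.
# The text, vector dimension, and random seed are illustrative choices.
import random

random.seed(0)
text = "the model learns the data"
tokens = text.split()                      # tokenization: text -> pieces
vocab = {tok: i for i, tok in enumerate(dict.fromkeys(tokens))}

dim = 4                                    # embedding dimension (arbitrary)
embeddings = [[random.uniform(-1, 1) for _ in range(dim)]
              for _ in vocab]              # one dense vector per token id

for tok in tokens:
    vec = embeddings[vocab[tok]]           # dense vector for a discrete item
    print(tok, [round(v, 2) for v in vec])
```

In a real model these vectors are trained along with the other parameters rather than left random.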
