Across
- 2. A memory-efficient fine-tuning method that introduces low-rank decomposition matrices to update only a fraction of model parameters
- 3. A dense vector representation of words or tokens that captures semantic meaning
- 6. A neural network architecture that revolutionized NLP by relying entirely on self-attention mechanisms
- 7. A transformer model that processes text bidirectionally, making it adept at tasks requiring context from both past and future tokens
- 8. A mechanism that helps models focus on relevant parts of the input sequence while processing text
- 10. One complete pass through the entire training dataset during the learning process
Down
- 1. The input text that guides a language model to generate a desired output
- 4. The stage where a trained model applies its learned parameters to predict or generate outputs for unseen data
- 5. A regularization technique that randomly deactivates a subset of neurons during training to prevent overfitting
- 9. The process of splitting text into smaller units, often words or subwords, for input to language models
