Across
- 1. When a model performs well on training data but poorly on new data.
- 3. A technique to normalize the inputs of a layer.
- 8. A type of neural network architecture with nested layers.
- 11. A basic building block of a neural network.
- 12. A technique to ignore certain timesteps in a sequence.
- 17. One complete pass through the entire training dataset.
- 19. A type of neural network architecture with residual connections.
- 22. A key operation in Convolutional Neural Networks (CNNs).
- 24. A pooling operation that selects the maximum value in each region.
- 25. A technique to stop training when the validation loss stops improving.
- 26. A type of learning where the model learns from unlabeled data.
- 27. A type of neural network architecture with parallel branches.
- 29. A layer where all nodes are connected to all nodes in the previous layer.
- 31. A CNN layer where the dimensions of the input are reduced.
- 35. A type of neural network architecture with self-attention.
- 37. The simplest form of a neural network, single-layer binary classifier.
- 39. A technique to normalize the inputs of a layer.
- 41. Adapts a pre-trained model for a new task.
- 44. Determines the output of a node in a neural network.
- 45. A popular activation function introducing non-linearity.
- 46. A layer to learn a low-dimensional representation of categorical data.
- 47. A type of operation that preserves the spatial dimensions of the input.
- 49. A type of neural network architecture with small convolutional filters.
- 50. A mechanism to attend to different parts of the same input.
- 51. Adding extra pixels to the input to preserve its dimensions.
- 52. A technique to prevent overfitting in neural networks.
- 56. A configuration setting external to the model, chosen before training.
- 58. A mechanism to focus on important parts of the input.
- 59. A technique to fine-tune a pre-trained model for a new task.
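
For solvers who want to see several of the Across concepts in working code, here is a minimal sketch. It assumes PyTorch; the class name `TinyCNN`, the layer sizes, and the 28x28 input are illustrative choices of ours, not part of the puzzle, and the comments name a few answers, so solve first if you want no hints.

```python
import torch
import torch.nn as nn

# A toy CNN wiring together several clued concepts: convolution, padding,
# batch normalization, max pooling, dropout, a fully connected (dense)
# layer, and the ReLU activation.
class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # 3x3 kernel, stride 1, padding 1 -> spatial dimensions are preserved
        self.conv = nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1)
        self.bn = nn.BatchNorm2d(8)      # normalizes the layer's inputs
        self.pool = nn.MaxPool2d(2)      # keeps the max value in each 2x2 region
        self.dropout = nn.Dropout(0.25)  # randomly zeroes activations
        self.fc = nn.Linear(8 * 14 * 14, num_classes)  # fully connected layer

    def forward(self, x):
        x = torch.relu(self.bn(self.conv(x)))  # activation adds non-linearity
        x = self.pool(x)                       # 28x28 -> 14x14
        x = torch.flatten(x, 1)                # tensor -> vector per sample
        x = self.dropout(x)
        return self.fc(x)

model = TinyCNN()
print(model(torch.randn(4, 1, 28, 28)).shape)  # torch.Size([4, 10])
```
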
Down
- 2. A neural network designed for sequential data.
- 4. A training algorithm for updating weights in a neural network.
- 5. A strategy for setting initial values in a neural network.
- 6. An activation function similar to sigmoid, ranges from -1 to 1.
- 7. A technique that prevents overfitting by randomly deactivating nodes in a neural network.
- 9. A type of learning where the model learns from labeled data.
- 10. Controls the size of steps in gradient descent.
- 13. A layer to concatenate multiple tensors along a specific axis.
- 14. A type of recurrent neural network with a memory cell.
- 15. A pooling operation that computes the average value in each region.
- 16. A type of neural network architecture for unsupervised learning.
- 18. A technique to prevent overfitting in neural networks.
- 20. A type of connection that bypasses one or more layers.
- 21. Number of samples processed before the model's weights are updated.
- 23. An additional parameter representing an offset in neural networks.
- 28. A type of operation that preserves the temporal dimension of the input.
- 30. A mechanism to attend to multiple parts of the input in parallel.
- 32. A layer to convert a multi-dimensional tensor into a vector.
- 33. An architecture where information flows in one direction.
- 34. A type of neural network architecture with local response normalization.
- 36. An activation function used in the output layer for classification.
- 38. An issue where gradients become very small during training.
- 40. A technique to artificially increase the size of the training dataset.
- 42. An optimization algorithm for finding the minimum.
- 43. The number of pixels to slide the kernel across the input.
- 48. A type of neural network architecture for generative modeling.
- 53. A matrix used for convolution operation.
- 54. A type of neural network architecture with gates.
- 55. A type of encoding for sequential data.
- 57. A type of neural network that handles sequential data.
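
Several Down clues describe the training loop itself, so here is a second sketch showing them in context. PyTorch is again assumed; the toy regression data, the patience of 5, and the other constants are arbitrary values of ours, and the comments name a few answers.

```python
import torch
import torch.nn as nn

# A toy training loop exercising several clued concepts: epochs, batch size,
# learning rate, backpropagation, gradient descent, and early stopping.
torch.manual_seed(0)
X, y = torch.randn(256, 20), torch.randn(256, 1)         # toy training data
X_val, y_val = torch.randn(64, 20), torch.randn(64, 1)   # toy validation data

model = nn.Sequential(nn.Linear(20, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # learning rate
loss_fn = nn.MSELoss()

batch_size = 32                    # samples processed before each weight update
best_val, patience, bad_epochs = float("inf"), 5, 0

for epoch in range(100):           # one epoch = one full pass over the data
    for i in range(0, len(X), batch_size):
        xb, yb = X[i:i + batch_size], y[i:i + batch_size]
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()            # backpropagation computes the gradients
        optimizer.step()           # gradient descent updates the weights

    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
    if bad_epochs >= patience:     # early stopping: validation loss stalled
        print(f"stopped at epoch {epoch}, best val loss {best_val:.4f}")
        break
```
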
