Across
- 1. When a model performs well on training data but poorly on new data.
- 3. A technique to normalize the inputs of a layer.
- 8. A type of neural network architecture with nested layers.
- 11. A basic building block of a neural network.
- 12. A technique to ignore certain timesteps in a sequence.
- 17. One complete pass through the entire training dataset.
- 19. A type of neural network architecture with residual connections.
- 22. A key operation in Convolutional Neural Networks (CNNs).
- 24. A pooling operation that selects the maximum value in each region.
- 25. A technique to stop training when the validation loss stops improving.
- 26. A type of learning where the model learns from unlabeled data.
- 27. A type of neural network architecture with parallel branches.
- 29. A layer where all nodes are connected to all nodes in the previous layer.
- 31. A CNN layer where the dimensions of the input are reduced.
- 35. A type of neural network architecture with self-attention.
- 37. The simplest form of a neural network, single-layer binary classifier.
- 39. A technique to normalize the inputs of a layer.
- 41. Adapts a pre-trained model for a new task.
- 44. Determines the output of a node in a neural network.
- 45. A popular activation function introducing non-linearity.
- 46. A layer to learn a low-dimensional representation of categorical data.
- 47. A type of operation that preserves the spatial dimensions of the input.
- 49. A type of neural network architecture with small convolutional filters.
- 50. A mechanism to attend to different parts of the same input.
- 51. Adding extra pixels to the input to preserve its dimensions.
- 52. A technique to prevent overfitting in neural networks.
- 56. A configuration setting external to the model, chosen before training.
- 58. A mechanism to focus on important parts of the input.
- 59. A technique to fine-tune a pre-trained model for a new task.
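
For solvers who want to see several of the Across concepts in working code, here is a minimal sketch. It assumes PyTorch; the class name `TinyCNN`, the layer sizes, and the 28x28 input are illustrative choices of ours, not part of the puzzle, and the comments name a few answers, so solve first if you want no hints.

```python
import torch
import torch.nn as nn

# A toy CNN wiring together several clued concepts: convolution, padding,
# batch normalization, max pooling, dropout, a fully connected (dense)
# layer, and the ReLU activation.
class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # 3x3 kernel, stride 1, padding 1 -> spatial dimensions are preserved
        self.conv = nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1)
        self.bn = nn.BatchNorm2d(8)      # normalizes the layer's inputs
        self.pool = nn.MaxPool2d(2)      # keeps the max value in each 2x2 region
        self.dropout = nn.Dropout(0.25)  # randomly zeroes activations
        self.fc = nn.Linear(8 * 14 * 14, num_classes)  # fully connected layer

    def forward(self, x):
        x = torch.relu(self.bn(self.conv(x)))  # activation adds non-linearity
        x = self.pool(x)                       # 28x28 -> 14x14
        x = torch.flatten(x, 1)                # tensor -> vector per sample
        x = self.dropout(x)
        return self.fc(x)

model = TinyCNN()
print(model(torch.randn(4, 1, 28, 28)).shape)  # torch.Size([4, 10])
```
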
Down
- 2. A neural network designed for sequential data.
- 4. A training algorithm for updating weights in a neural network.
- 5. A strategy for setting initial values in a neural network.
- 6. An activation function similar to sigmoid, ranges from -1 to 1.
- 7. A technique that prevents overfitting by randomly deactivating nodes in a neural network.
- 9. A type of learning where the model learns from labeled data.
- 10. Controls the size of steps in gradient descent.
- 13. A layer to concatenate multiple tensors along a specific axis.
- 14. A type of recurrent neural network with a memory cell.
- 15. A pooling operation that computes the average value in each region.
- 16. A type of neural network architecture for unsupervised learning.
- 18. A technique to prevent overfitting in neural networks.
- 20. A type of connection that bypasses one or more layers.
- 21. Number of samples processed before the model's weights are updated.
- 23. An additional parameter representing an offset in neural networks.
- 28. A type of operation that preserves the temporal dimension of the input.
- 30. A mechanism to attend to multiple parts of the input in parallel.
- 32. A layer to convert a multi-dimensional tensor into a vector.
- 33. An architecture where information flows in one direction.
- 34. A type of neural network architecture with local response normalization.
- 36. An activation function used in the output layer for classification.
- 38. An issue where gradients become very small during training.
- 40. A technique to artificially increase the size of the training dataset.
- 42. An optimization algorithm for finding the minimum.
- 43. The number of pixels to slide the kernel across the input.
- 48. A type of neural network architecture for generative modeling.
- 53. A matrix used for convolution operation.
- 54. A type of neural network architecture with gates.
- 55. A type of encoding for sequential data.
- 57. A type of neural network that handles sequential data.
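
Several Down clues describe the training loop itself, so here is a second sketch showing them in context. PyTorch is again assumed; the toy regression data, the patience of 5, and the other constants are arbitrary values of ours, and the comments name a few answers.

```python
import torch
import torch.nn as nn

# A toy training loop exercising several clued concepts: epochs, batch size,
# learning rate, backpropagation, gradient descent, and early stopping.
torch.manual_seed(0)
X, y = torch.randn(256, 20), torch.randn(256, 1)         # toy training data
X_val, y_val = torch.randn(64, 20), torch.randn(64, 1)   # toy validation data

model = nn.Sequential(nn.Linear(20, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # learning rate
loss_fn = nn.MSELoss()

batch_size = 32                    # samples processed before each weight update
best_val, patience, bad_epochs = float("inf"), 5, 0

for epoch in range(100):           # one epoch = one full pass over the data
    for i in range(0, len(X), batch_size):
        xb, yb = X[i:i + batch_size], y[i:i + batch_size]
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()            # backpropagation computes the gradients
        optimizer.step()           # gradient descent updates the weights

    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
    if bad_epochs >= patience:     # early stopping: validation loss stalled
        print(f"stopped at epoch {epoch}, best val loss {best_val:.4f}")
        break
```
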
