CS5243 - Data Science
Across
- 1. A feature vectorization process in NLP
- 4. Ratio of correctly predicted +ves to the total number of +ves
- 6. Lemmatization uses __________ to modify words
- 9. A collection of all documents is called a
- 11. Distance measure used by k-NN to train and test on categorical features
- 13. A classifier that performs well on training data but poorly on test data is referred to as
- 14. k-NN is an example of ____________ learning
- 16. A public corpus
- 19. P(X|C) in Naive Bayes is termed the
- 23. The sigmoid activation function transforms linearly combined data into ___________ form
- 25. Method used to train the model on N-1 samples and test on 1 sample
- 26. Type-I error in the confusion matrix refers to the _______ value
- 27. Compared with hard-margin SVM, soft-margin SVM additionally has a ________ variable
- 29. ID3 is sensitive to the number of ____________ attribute values
- 30. Eye color is an example of the ____________ datatype
Down
- 2. The Likert scale is an example of __________ data
- 3. Ratio of correctly predicted +ves to the total number of predicted +ves
- 5. Value of k in k-NN is determined using
- 7. Cancer vs. non-cancer is an example of the _________ binary datatype
- 8. A cosine score of zero indicates that the two vectors are
- 10. P(X) in Naive Bayes is termed the
- 12. Binary classifier sensitive to noise
- 15. Method used to handle continuous data in Naive Bayes
- 17. A constant used to increase/decrease the net input value in logistic regression
- 18. The entropy of a dataset with binary class labels can have a value > 1. State True or False
- 20. Generalized distance measure of the L1 and L2 norms
- 21. Size of the confusion matrix when a dataset with 600 features and N class labels is used for training and testing
- 22. TF-IDF uses a ________ matrix for a large vocabulary
- 24. Identifies unique or rare occurrences of words in the documents
- 28. The L1 norm is also called the