Across
- 4. The process of breaking text into words or terms for indexing.
- 7. Measures similarity between two sets as intersection divided by union.
- 8. Retrieval model that represents documents and queries as vectors in a multi-dimensional space for similarity computation.
- 10. Phonetic algorithm for indexing words by their sound rather than spelling.
- 12. Retrieval model based on logical operators AND, OR, and NOT.
- 13. Technique to identify hidden relationships between terms and documents using SVD.
- 14. Matrix factorization method used in Latent Semantic Indexing (LSI).
- 15. Harmonic mean of precision and recall; used to balance both.
Down
- 1. Minimum number of operations (insert, delete, substitute) needed to convert one string into another.
- 2. Commonly used words like “the”, “is”, “and” that are typically removed during text processing.
- 3. Data structure mapping terms to the list of documents that contain them.
- 5. Technique used to reduce words to their root or base form (e.g., “running” → “run”).
- 6. Fraction of relevant documents that are successfully retrieved.
- 9. Fraction of retrieved documents that are relevant.
- 11. Index that stores all rotations of a word to efficiently handle wildcard queries.
- 12. Index created using pairs of consecutive characters for tolerant retrieval.
