Across
- 2. Character-level tokenization
- 5. Byte Pair Encoding
- 7. Handling various data types
- 9. Splits text into tokens
- 10. Large Language Models
Down
- 1. Vocabulary size
- 3. Set of known tokens
- 4. End-of-text, start-of-message
- 6. Numerical representations of text
- 8. Efficient token compression
