Across
- 1. Model modification
- 3. Simple token type
- 4. SentencePiece handling
- 6. Rare token issue
- 8. Prompt compression
- 10. Delimiter tokens
- 11. Popular LLM
- 14. LLM input units
- 16. LLM struggle
- 17. GPT-4 improvement
- 19. BPE merging factor
- 21. Vocabulary size
- 23. Tokenization issues
- 24. Token representation
Down
- 2. BPE benefit
- 4. Adding new tokens
- 5. SentencePiece input
- 7. Early method
- 9. Token set
- 12. Text processor
- 13. Alternative tokenizer
- 15. Advanced method
- 18. BPE's main goal
- 20. Byte Pair Encoding
- 22. SentencePiece option