Across
- 2. A non-text modality used by image recognition and multimodal models
- 4. Task where text in images can be converted to another language
- 9. Type of learning used with human feedback to improve behavior
- 11. Model used to generate images from text prompts
- 12. A modality, mentioned in the article, involving moving visual information
- 15. Spoken sounds that multimodal models can learn to understand
- 17. The underlying neural network architecture used by LLMs and LMMs
- 18. Large language models mentioned as state-of-the-art AI systems
- 19. OpenAI's multimodal model that handles text and images
- 20. The first step where models learn from massive datasets
Down
- 1. Model used by ChatGPT to parse audio inputs
- 3. Anthropic model claimed to have strong vision capabilities
- 5. Unhealthy ideas models may learn from internet-scale data
- 6. Capable across multiple kinds of data instead of just one
- 7. The type of model large language models are based on
- 8. Google's multimodal AI models described as natively multimodal
- 10. Acronym for reinforcement learning from human feedback
- 13. Visual data that multimodal models can analyze and answer questions about
- 14. Different kinds of data, such as text, images, audio, or video
- 16. The main modality traditional large language models work with
