Describe parts of images using text prompts
Chat with an AI that understands text and images
Engage in multimodal conversations with images and videos
Generate responses to video or image inputs