(large multimodal model).
Here are the latest developments:
🤖 LLaVA: An open-source competitor to GPT-4V
🔗 LangChain for identifying images: RAG on images
🚀 MiniGPT-v2: Visual-language hybrid tasks
🎨 SEED-LLaMA: Simulates human seeing, reading, and imagining
