Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning.
Updated Jul 16, 2025 · Python
[CVPR 2025] 🔥 Official implementation of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
[CVPR 2025] 🔥 Official implementation of "Audio-Visual Instance Segmentation".
Sample project for multimodal decision-making and image generation using DeepSeek Janus Pro 7B, with Real-ESRGAN upscaling.