Qwen3-VL

Alibaba (Qwen Team) · October 15, 2025

Parameters

4B / 8B (2 sizes; dense; Instruct and Thinking variants)

License

Apache 2.0

Key Features

Vision-language family with a 256K-token context window, extensible to 1M; OCR in 32 languages; 2D and 3D spatial grounding; visual coding; GUI agent operation; FP8-optimized for lower-VRAM deployment; Thinking variants that strengthen multimodal reasoning and STEM performance; strong long-document and video comprehension.
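
To give a rough sense of what FP8 means for VRAM, the sketch below estimates weight storage alone for the two listed sizes. It assumes the nominal parameter counts (4e9 and 8e9, which may differ from the exact checkpoint sizes), 2 bytes per parameter for BF16, and 1 byte per parameter for FP8. It ignores the KV cache, activations, and framework overhead, so actual usage will be higher, particularly at long context lengths. The function name and table are illustrative and not part of any Qwen tooling.

```python
# Back-of-the-envelope, weights-only memory estimate.
# Assumptions: nominal parameter counts; 2 bytes/param (BF16), 1 byte/param (FP8).
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    nominal_sizes = {"4B": 4e9, "8B": 8e9}  # nominal, not exact, counts
    for name, params in nominal_sizes.items():
        for dtype in ("bf16", "fp8"):
            gib = weight_memory_gib(params, dtype)
            print(f"Qwen3-VL {name} {dtype}: ~{gib:.1f} GiB (weights only)")
```

Under these assumptions FP8 halves the weight footprint relative to BF16, which is the main reason it suits lower-VRAM GPUs.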