Qwen released QVQ, the first open-weight model for visual reasoning
Alibaba
December 25, 2024
Building on the foundation of Qwen2-VL-72B, QVQ integrates architectural improvements that enhance cross-modal reasoning. Its open-weight design underscores the team's commitment to making advanced AI more accessible.
Article
Tweets
🎄 Happy holidays, and we hope you have enjoyed this year. Before moving on to 2025, Qwen has one last gift for you: QVQ!
— Qwen (@Alibaba_Qwen) December 24, 2024
🎉 This may be the first open-weight model for visual reasoning. It is called QVQ, where V stands for vision. It just reads an image and an instruction, starts… pic.twitter.com/BX1ORiltIf
QVQ: A Milestone in Visual Intelligence - An Analysis of the Qwen Team's Latest Multimodal Reasoning Model
— meng shao (@shao__meng) December 24, 2024
TL;DR
QVQ is a groundbreaking AI model built on Qwen2-VL-72B. By combining visual and language capabilities, it delivers outstanding performance on complex tasks such as mathematical reasoning and scientific analysis, notably scoring 70.3 on the MMMU benchmark, marking a major breakthrough in AI's visual understanding and reasoning abilities.
Introduction
-… https://t.co/hhun89O3Qd pic.twitter.com/tvANoJo0O5
QvQ-72B-Preview now on MLX 🚀🎄
— Prince Canuma (@Prince_Canuma) December 24, 2024
TLDR
🏆SoTA open-source multimodal
🧠 Capable of step-by-step reasoning
💪🏾 Competitive MMMU score with o1, GPT-4o and Sonnet 3.5
🔥 Beats GPT-4o and Sonnet 3.5 on MathVista and MathVision
You can now run inference and fine-tune (QLoRA) locally on… https://t.co/qaVJ2AhoPA pic.twitter.com/hUq8EChYwW