DeepSeek open-sourced DeepSeek-V3-Base, a 685B-parameter model (671B main model weights plus 14B for the Multi-Token Prediction module)

December 25, 2024

LiveBench results, as reported on r/LocalLLaMA: DeepSeek V3 is the best open-weight LLM and the second-best non-reasoning LLM, after `gemini-exp-1206`

Video

DeepSeek-V3 (Free API) + Cline & Aider : This is The BEST AI Coding Setup Right Now! (Beats Cursor!)

Articles

DeepSeek-V3-Base on HuggingFace

Benchmark Results: DeepSeek V3 on LiveBench

DeepSeek's new AI model appears to be one of the best 'open' challengers yet | TechCrunch

DeepSeek-V3: Training 671 Billion Parameters with a $6 Million dollar Budget | ml-news – Weights & Biases

Tweets

‼ DeepSeek chat is powered by V3 and is powerful ‼

Here an MVP of Asteroids game with AI companies logos. Fully built with it in few minutes!

Sonnet 3.5 is not the King 👑 anymore 🤷‍♂️
Anthropic it's your turn!

🧵Artifact created in the comment pic.twitter.com/FCMZTb52fQ
— Ivan Fioravanti ᯅ (@ivanfioravanti) December 25, 2024

Resource constraints are a beautiful thing. Survival instinct in a cut-throat AI competitive land is a prime drive for breakthroughs.

I’ve been following DeepSeek for a long time. They had one of the best open coding models last year. Superior OSS models put huge pressure on… https://t.co/ARtRjAXiOJ
— Jim Fan (@DrJimFan) December 27, 2024

Cool things from DeepSeek v3's paper:

1. Float8 uses E4M3 for forward & backward - no E5M2
2. Every 4th FP8 accumulate adds to master FP32 accum
3. Latent Attention stores C cache not KV cache
4. No MoE loss balancing - dynamic biases instead

More details:
1. FP8: First large… pic.twitter.com/06AO8EFv4p
— Daniel Han (@danielhanchen) December 27, 2024
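
Point 4 in the tweet above (dynamic biases instead of a MoE balancing loss) can be illustrated with a toy simulation. This is only a sketch of the general idea, not DeepSeek's implementation: the function names, the uniform random scores, the skew toward expert 0, and the step size are all assumptions made for illustration. A per-expert bias is added to the routing score when picking the top-k experts, then nudged down for experts that received more than the average load and up for those that received less.

```python
import random


def route_top_k(scores, bias, k):
    """Pick k experts per token by (score + bias); the bias only affects selection."""
    chosen = []
    for token_scores in scores:
        ranked = sorted(
            range(len(bias)),
            key=lambda e: token_scores[e] + bias[e],
            reverse=True,
        )
        chosen.append(ranked[:k])
    return chosen


def update_bias(bias, chosen, step):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    load = [0] * len(bias)
    for experts in chosen:
        for e in experts:
            load[e] += 1
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, expert_load in zip(bias, load):
        if expert_load > mean_load:
            new_bias.append(b - step)
        elif expert_load < mean_load:
            new_bias.append(b + step)
        else:
            new_bias.append(b)
    return new_bias, load


def simulate(num_experts=8, k=2, tokens=512, rounds=200, step=0.01, seed=0):
    """Route random batches and return (first batch load, last batch load, final bias)."""
    rng = random.Random(seed)
    # Hypothetical skew: expert 0 starts out systematically preferred.
    skew = [0.5 if e == 0 else 0.0 for e in range(num_experts)]
    bias = [0.0] * num_experts
    first_load = None
    load = None
    for _ in range(rounds):
        scores = [
            [rng.random() + skew[e] for e in range(num_experts)]
            for _ in range(tokens)
        ]
        chosen = route_top_k(scores, bias, k)
        bias, load = update_bias(bias, chosen, step)
        if first_load is None:
            first_load = load
    return first_load, load, bias


if __name__ == "__main__":
    first, last, final_bias = simulate()
    print("load in first batch:", first)
    print("load in last batch: ", last)
    print("final biases:       ", [round(b, 2) for b in final_bias])
```

In this toy setup the favoured expert ends up with the lowest bias and the per-expert load evens out over the rounds, without any extra loss term being added to a training objective.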

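DeepSeek V3 hands-on test: coding ability compared with Claude 3.5 Sonnet and o1 Pro

This video takes an in-depth look at DeepSeek's newly released V3, including core specs such as its 671 billion parameters and 14.8T-token pre-training.

Over multiple rounds of testing, it is compared with Claude 3.5 Sonnet and o1 Pro across programming languages including Python, JavaScript, Swift, and Java.

Timestamps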

0:00 -… pic.twitter.com/KtvxViaqTZ
— nicekate (@nicekate8888) December 27, 2024

A full day (24h) of continuously generating with Deepseek V3 costs $1.50
— Tom Dörr (@tom_doerr) December 27, 2024
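
The arithmetic behind a figure like this is easy to check. The sketch below is a back-of-the-envelope calculation only: the tweet gives neither a generation speed nor a price, so the 60 tokens per second and $0.28 per million output tokens used here are illustrative assumptions, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_generation_cost(tokens_per_second, usd_per_million_output_tokens):
    """Cost in USD of generating continuously for 24 hours at a fixed rate."""
    tokens_per_day = tokens_per_second * SECONDS_PER_DAY
    return tokens_per_day / 1_000_000 * usd_per_million_output_tokens


# Assumed values for illustration; neither number comes from the tweet.
cost = daily_generation_cost(
    tokens_per_second=60,
    usd_per_million_output_tokens=0.28,
)
print(f"${cost:.2f} per day")  # prints "$1.45 per day"
```

Under these assumptions a single continuous stream produces about 5.2 million output tokens per day, which lands close to the quoted $1.50.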
