𝐃𝐞𝐞𝐩𝐒𝐞𝐞𝐤 𝐑𝟏 𝐈𝐬 𝐑𝐞𝐚𝐥. 𝐓𝐡𝐞 𝐌𝐲𝐭𝐡𝐬 𝐀𝐛𝐨𝐮𝐭 𝐈𝐭? 𝐍𝐨𝐭 𝐒𝐨 𝐌𝐮𝐜𝐡.⁣

2 min read · Feb 8, 2025

There’s a lot of misinformation going around about 𝐃𝐞𝐞𝐩𝐒𝐞𝐞𝐤 𝐑𝟏. Let’s set the record straight:
⁣
1. Training didn’t cost just ~$𝟔𝐌 🧐 The widely quoted $𝟓.𝟓𝐌 figure covers compute for the base model only; ablations, smaller runs, and data generation are not included.

2. It’s not a side project 🙂‍↕️ DeepSeek is owned by 𝐇𝐢𝐠𝐡-𝐅𝐥𝐲𝐞𝐫, a Chinese hedge fund managing $𝟕𝐁+, with a team that includes math, physics, and informatics olympiad competitors.

3. They don’t have “a few GPUs”; they reportedly have around 𝟓𝟎,𝟎𝟎𝟎 🙂‍↔️

4. The real 𝐃𝐞𝐞𝐩𝐒𝐞𝐞𝐤 𝐑𝟏 is a 𝟔𝟕𝟏𝐁 𝐌𝐨𝐄 model requiring 𝟏𝟔𝐱 𝟖𝟎𝐆𝐁 𝐆𝐏𝐔𝐬 (𝐇𝟏𝟎𝟎𝐬) to run 🫠

5. The smaller “distilled” versions (e.g., 1.5B) are not R1; 🤭 they’re just fine-tuned 𝐐𝐰𝐞𝐧/𝐋𝐥𝐚𝐦𝐚 models. Yes, they can run locally, but they’re nowhere near R1-level performance.⁣

6. The hosted version on their website may use your data to train new models 🤯 (check the ToS).
⁣
7. The exciting part? DeepSeek just announced 𝐉𝐚𝐧𝐮𝐬-𝐏𝐫𝐨-𝟕𝐁, an open-source model that generates images and, on DeepSeek’s reported benchmarks, outperforms OpenAI’s 𝐃𝐀𝐋𝐋-𝐄 𝟑 and Stability AI’s 𝐒𝐭𝐚𝐛𝐥𝐞 𝐃𝐢𝐟𝐟𝐮𝐬𝐢𝐨𝐧 🥺 The AI competition is heating up!
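
Points 1 and 4 are easy to sanity-check with back-of-the-envelope arithmetic. Here is a rough Python sketch, not an official calculation. The inputs are assumptions: roughly 2.788M H800 GPU-hours at an assumed $2 per GPU-hour rental price (the figures given in the DeepSeek-V3 technical report), about one byte per parameter for the weights (FP8), and servers with 8 GPUs each. The function names are made up for illustration.

```python
import math


def training_run_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rental-style cost of a single training run: hours times hourly price."""
    return gpu_hours * usd_per_gpu_hour


def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def min_gpus_for_weights(weights_gb: float, gpu_gb: float = 80.0) -> int:
    """Lower bound on GPU count: ignores KV cache, activations, and overhead."""
    return math.ceil(weights_gb / gpu_gb)


def round_up_to_nodes(n_gpus: int, gpus_per_node: int = 8) -> int:
    """Round a GPU count up to whole servers (assumed 8 GPUs per server)."""
    return math.ceil(n_gpus / gpus_per_node) * gpus_per_node


if __name__ == "__main__":
    # Assumed inputs, see the paragraph above.
    cost = training_run_cost_usd(gpu_hours=2.788e6, usd_per_gpu_hour=2.0)
    print(f"Base-model run: ${cost / 1e6:.3f}M")

    weights = weight_memory_gb(n_params=671e9, bytes_per_param=1.0)
    gpus = round_up_to_nodes(min_gpus_for_weights(weights))
    print(f"Weights: {weights:.0f} GB, at least {gpus} x 80GB GPUs")
```

Under these assumptions the single run comes to about $5.58M, which is where the headline number comes from and why it excludes everything else. The weights alone are about 671 GB, more than the 640 GB in one 8x 80GB server, so two servers (16 GPUs) is the natural minimum, and KV cache and activations need room on top of that.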
⁣
The good news? DeepSeek AI has been contributing to open source and science for 2+ years 🫡 Hugging Face is even building a fully open reproduction of the R1 pipeline (Open-R1). The future looks bright for everyone!

— Seeyafo


Written by Vishal P
