thread/1883686162709295541 HTML

DeepSeek has revolutionized AI model training by drastically reducing costs from $100M to $5M and hardware requirements from 100,000 GPUs to just 2,000, while maintaining competitive performance. Their innovative approach includes using less memory and specialized expert systems that activate only when needed, making AI development more accessible. This disruption poses a significant threat to Nvidia's business model, as it allows smaller players to compete without the need for expensive data centers.

threadreaderapp.com/thread/1883686162709295541.html

🧵 Finally had a chance to dig into DeepSeek’s R1…


Let me break down why DeepSeek's AI innovations are blowing people's minds (and possibly threatening
Nvidia's $2T market cap) in simple terms...
0/ First off, shout-out to @doodlestein, who wrote the must-read on this here:
youtubetranscriptoptimizer.com/blog/05_the_sh…

1/ First, some context: Right now, training top AI models is INSANELY expensive. OpenAI, Anthropic, etc.
spend $100M+ just on compute. They need massive data centers with thousands of $40K GPUs. It's like
needing a whole power plant to run a factory.

2/ DeepSeek just showed up and said "LOL what if we did this for $5M instead?" And they didn't just talk
- they actually DID it. Their models match or beat GPT-4 and Claude on many tasks. The AI world is (as
my teenagers say) shook.

3/ How? They rethought everything from the ground up. Traditional AI is like writing every number with 32
digits of precision (in practice, 32 bits per number). DeepSeek was like "what if we just used 8? It's still
accurate enough!" Boom - 75% less memory needed.
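
A quick back-of-envelope check of that 75% figure, assuming the "32" and "8" are bits per stored number (FP32 vs FP8). This is a sketch that counts weight storage only; the helper names are made up for illustration, and the 671B parameter count is the one quoted later in this thread.

```python
# Rough weight-storage math at different precisions.
# Assumption: memory = parameter count x bytes per parameter (weights only;
# activations, optimizer state, and values kept at higher precision are ignored).

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def savings_vs_fp32(dtype: str) -> float:
    """Fraction of memory saved relative to 32-bit storage."""
    return 1 - BYTES_PER_PARAM[dtype] / BYTES_PER_PARAM["fp32"]


params = 671_000_000_000  # total parameter count quoted in tweet 6/

print(weight_memory_gb(params, "fp32"))  # 2684.0
print(weight_memory_gb(params, "fp8"))   # 671.0
print(savings_vs_fp32("fp8"))            # 0.75
```

So the 75% is just 1 byte instead of 4 for each stored value. Real training setups mix precisions, so actual savings will likely be smaller than this idealized number.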

4/ Then there's their "multi-token" system. Normal AI reads like a first-grader: "The... cat... sat..."
DeepSeek reads in whole phrases at once. 2x faster, 90% as accurate. When you're processing billions
of words, this MATTERS.
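
One hedged way to see where a "roughly 2x" number could come from. This sketch assumes the model drafts one extra token per step and that the draft is kept with some probability. Treating the thread's "90%" as that acceptance probability is an assumption made here for illustration, not something the thread states, and the function name is hypothetical.

```python
def expected_tokens_per_step(accept_prob: float, extra_tokens: int = 1) -> float:
    """Expected tokens produced per decoding step.

    The regular next token is always produced. Each extra drafted token counts
    only if it and every drafted token before it were accepted.
    """
    expected = 1.0
    keep_chain = 1.0
    for _ in range(extra_tokens):
        keep_chain *= accept_prob
        expected += keep_chain
    return expected


# With one extra token accepted 90% of the time: about 1.9 tokens per step.
print(expected_tokens_per_step(0.9))
```

Under those assumptions you get close to, but not quite, 2 tokens per step, which lines up with the "2x faster" claim as an upper bound.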

5/ But here's the really clever bit: They built an "expert system." Instead of one massive AI trying to know
everything (like having one person be a doctor, lawyer, AND engineer), they have specialized experts that
only wake up when needed.

6/ Traditional models? All 1.8 trillion parameters active ALL THE TIME. DeepSeek? 671B total but only
37B active at once. It's like having a huge team but only calling in the experts you actually need for each
task.
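
Here's a toy sketch of the "only wake up the experts you need" idea from tweets 5/ and 6/. It uses top-k softmax gating, a common mixture-of-experts pattern; it is not claimed to be DeepSeek's actual router. The experts, scores, and function names are all made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores, top_k=2):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_forward(x, experts, router_scores, top_k=2):
    """Weighted sum over the selected experts only; the rest never run."""
    weights = route(router_scores, top_k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Hypothetical toy experts: each one is just a scalar function.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
scores = [0.1, 2.0, 1.5, -1.0]  # made-up router scores for one token

print(sorted(route(scores)))  # [1, 2]
print(moe_forward(3.0, experts, scores))

# Share of parameters active per token, using the thread's numbers.
total_params = 671e9
active_params = 37e9
active_fraction = active_params / total_params
print(f"{active_fraction:.1%}")  # 5.5%
```

With the thread's figures, only about 5.5% of the parameters do work on any given token, which is where the compute savings come from.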

7/ The results are mind-blowing:


- Training cost: $100M → $5M
- GPUs needed: 100,000 → 2,000
- API costs: 95% cheaper
- Can run on gaming GPUs instead of data center hardware
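
To put the first two bullets in percentage terms, here's the arithmetic using only the numbers quoted above. The helper name is hypothetical, and the inputs are the thread's claims rather than independently verified figures.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) * 100 / before


# Training cost: $100M -> $5M
print(percent_reduction(100e6, 5e6))      # 95.0
print(100e6 / 5e6)                        # 20.0 (times cheaper)

# GPUs needed: 100,000 -> 2,000
print(percent_reduction(100_000, 2_000))  # 98.0
print(100_000 / 2_000)                    # 50.0 (times fewer)
```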

8/ "But wait," you might say, "there must be a catch!" That's the wild part - it's all open source. Anyone
can check their work. The code is public. The technical papers explain everything. It's not magic, just
incredibly clever engineering.

9/ Why does this matter? Because it breaks the model of "only huge tech companies can play in AI." You
don't need a billion-dollar data center anymore. A few good GPUs might do it.

10/ For Nvidia, this is scary. Their entire business model is built on selling super expensive GPUs with
90% margins. If everyone can suddenly do AI with regular gaming GPUs... well, you see the problem.

11/ And here's the kicker: DeepSeek did this with a team of <200 people. Meanwhile, Meta has teams
where the compensation alone exceeds DeepSeek's entire training budget... and their models aren't as
good.

12/ This is a classic disruption story: Incumbents optimize existing processes, while disruptors rethink the
fundamental approach. DeepSeek asked "what if we just did this smarter instead of throwing more
hardware at it?"

13/ The implications are huge:


- AI development becomes more accessible
- Competition increases dramatically
- The "moats" of big tech companies look more like puddles
- Hardware requirements (and costs) plummet

14/ Of course, giants like OpenAI and Anthropic won't stand still. They're probably already implementing
these innovations. But the efficiency genie is out of the bottle - there's no going back to the "just throw
more GPUs at it" approach.

15/ Final thought: This feels like one of those moments we'll look back on as an inflection point. Like
when PCs made mainframes less relevant, or when cloud computing changed everything.

AI is about to become a lot more accessible, and a lot less expensive. The question isn't if this will disrupt
the current players, but how fast.

/end

P.S. And yes, all this is available open source. You can literally try their models right now. We're living in
wild times!🚀