DeepSeek Permanently Reduces The Price Of Its Flagship V4 Model By 75 Percent

jaykrown@lemmy.world · 12 hours ago

qaz@lemmy.world · 4 hours ago

FYI the flash model is ~158 GB

Tja@programming.dev · 2 hours ago

How are they running it? Doesn’t the model have to fit in (V)RAM? Do Nvidia’s H cards really have that much memory?

BlackLaZoR@lemmy.world · 1 minute ago

There’s tech for splitting a model to run across multiple cards, but it requires a really fast interconnect between the GPUs.
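
To make the splitting idea concrete, here’s a minimal pure-Python sketch of one common approach, column-wise tensor parallelism. It assumes each “device” is just a slice of a list; the function names are made up for illustration and no real GPU library is involved. Each shard computes part of the output, and gathering those parts is the step that needs the fast interconnect on real hardware.

```python
def matvec(x, w):
    """Multiply row vector x (length n) by matrix w (n rows, m columns)."""
    cols = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(cols)]


def split_columns(w, parts):
    """Split matrix w column-wise into up to `parts` shards, one per pretend device."""
    cols = len(w[0])
    step = -(-cols // parts)  # ceiling division
    return [[row[s:s + step] for row in w] for s in range(0, cols, step)]


def parallel_matvec(x, w, parts):
    """Each shard computes its slice of the output; the slices are concatenated."""
    out = []
    for shard in split_columns(w, parts):
        # On real hardware each shard lives on its own GPU and this
        # concatenation is a gather over the interconnect.
        out.extend(matvec(x, shard))
    return out


x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]

# The sharded result matches the single-device result.
assert parallel_matvec(x, w, parts=2) == matvec(x, w)
```

The point of the sketch is only that no single device ever holds the whole weight matrix; how real inference stacks shard a model is more involved.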

Taasz/Woof@lemmy.blahaj.zone · 2 minutes ago

Lots of GPUs together.

boonhet@sopuli.xyz · edit-2 36 minutes ago

For self-hosting, it essentially needs to fit in VRAM + RAM, but the part held in RAM will take a lot of CPU.

DeepSeek probably uses those big fancy H cards, and not just one but several together to get more total VRAM.
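
A rough back-of-the-envelope sketch of that arithmetic, assuming ~158 GB of weights (the figure quoted earlier in the thread), 80 GB per card (a common H100 configuration), and a made-up 20% headroom for KV cache and activations. The headroom value, the 24 GB consumer-card example, and the function names are illustrative assumptions, not measurements.

```python
import math


def gpus_needed(model_gb, vram_per_gpu_gb, headroom=0.2):
    """Smallest number of cards whose combined VRAM holds the weights plus headroom."""
    return math.ceil(model_gb * (1 + headroom) / vram_per_gpu_gb)


def ram_spill_gb(model_gb, vram_gb):
    """Weights that do not fit in VRAM and would sit in system RAM when self-hosting."""
    return max(0, model_gb - vram_gb)


# Data-center case: several 80 GB cards pooled together.
print(gpus_needed(158, 80))

# Self-hosting case: one hypothetical 24 GB card, the rest spills into RAM.
print(ram_spill_gb(158, 24))
```

Under these assumptions the weights alone need at least two 80 GB cards, and three once the headroom is included; on a single 24 GB card, most of the model would end up in system RAM.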

Mwa@thelemmy.club · 3 hours ago

The distilled models?