

- It’s not like companies train one model and use it for months until they need a new version. They train new models all the time to push updates and test new ideas.
- They don’t serve small models. The models behind ChatGPT or Claude are the big ones.
- They process thousands of queries per second, so their GPUs are maxed out all the time, not just for a few seconds.
A small model can only handle a limited set of tasks; to cover many different tasks you’d need many small models. What DeepSeek did is a mixture-of-experts: one big model built out of many small "expert" sub-networks, each good at some slice of the problem (more or less), with a router that activates only a few experts per token. That makes it cheaper to train, and also cheaper per query to run, because only a fraction of the parameters do any work on a given token, even though the whole model still has to sit in memory.
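The routing idea can be sketched in a few lines. This is a toy illustration with made-up sizes and random weights, not DeepSeek's actual architecture or config; real MoE layers are trained end-to-end with load-balancing tricks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 8-dim tokens, 4 experts, top-2 routing.
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is just a small weight matrix here.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
router = rng.normal(size=(d_model, n_experts))  # scores each expert per token

def moe_forward(x):
    """Route one token vector through only its top-k experts."""
    scores = x @ router                 # one score per expert
    top = np.argsort(scores)[-top_k:]   # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()            # softmax over the chosen experts only
    # Only top_k of n_experts actually run; the rest stay idle this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
out = moe_forward(token)
print(out.shape)  # (8,)
```

The point of the sketch: per token, compute scales with `top_k`, not `n_experts`, which is why a sparse model with lots of total parameters can still be cheap to run.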
What do people use LLMs for? Asking questions they would normally ask Google. Google search has gotten worse, so it’s often easier to ask ChatGPT. They’re also used for simple tasks like checking text for grammar errors, writing emails, and so on.