DeepSeek: Is this China’s ChatGPT moment and a wake-up call for the US?
https://indianexpress.com/article/technology/artificial-intelligence/deepseek-is-this-chinas-chatgpt-moment-and-a-wake-up-call-for-the-us-9799686/
Chinese AI lab DeepSeek released two new AI models this month, and their limited use of resources to achieve extraordinary results is making the world take notice. The Chinese lab has created something monumental: a powerful open-source AI model that rivals the best offered by US companies. Since AI companies typically require billions of dollars in investment to train their models, DeepSeek’s innovation is a masterclass in the optimal use of limited resources.
Even as AI companies in the US were harnessing the power of advanced hardware like NVIDIA H100 GPUs, DeepSeek relied on the less powerful H800, the export-compliant variant with reduced interconnect bandwidth. This would only have been possible by deploying inventive techniques to squeeze maximum efficiency out of these restricted GPUs.
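As a rough illustration of one class of such efficiency technique, lower-precision arithmetic, the sketch below shows a minimal PyTorch mixed-precision training step. This is a generic example only, not DeepSeek’s actual training code; the model shape, optimizer, and loss are placeholders chosen for illustration.

```python
# Illustrative sketch: mixed-precision training, one generic way to get more
# throughput and lower memory use from a given GPU. Not DeepSeek's code.
import torch
import torch.nn as nn

model = nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # rescales the loss to avoid fp16 underflow

x = torch.randn(32, 1024, device="cuda")
target = torch.randn(32, 1024, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    # Matrix multiplies run in half precision, which roughly doubles
    # throughput and halves activation memory on the same hardware.
    loss = nn.functional.mse_loss(model(x), target)

scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
optimizer.zero_grad()
```

The point of the sketch is simply that software-level choices (numeric precision, memory layout, communication scheduling) can recover much of the performance gap left by weaker hardware.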
https://www.youtube.com/watch?v=WEBiebbeNCA How China’s New AI Model DeepSeek Is Threatening U.S. Dominance
https://www.perplexity.ai/search/tell-me-about-yourself-xwLDeJHdTNCL5gRUyaYphQ
https://www.perplexity.ai/search/what-is-perplexity-ai-AWkRzywZRmKewduLu3tBqg
https://techcrunch.com/2025/01/27/deepseek-punctures-tech-spending-plans-and-what-analysts-are-saying/ While brokerage firm Jefferies warns that DeepSeek’s efficient approach “punctures some of the capex euphoria” following recent spending commitments from Meta and Microsoft — each exceeding $60 billion this year — Citi is questioning whether such results were actually achieved without advanced GPUs.
Goldman Sachs sees broader implications, suggesting the development could reshape competition between established tech giants and startups: 1) by lowering barriers to entry, especially with recent new models developed at a fraction of the cost of existing ones; 2) by shifting emphasis from training to more inferencing, with greater weight on post-training (including reasoning capabilities and reinforcement learning) that requires significantly lower computational resources than pre-training; and 3) through the potential for further global expansion.