DeepSeek: Key Insights into the Rising AI Chatbot App ‣ ATL FM NewsRoom

DeepSeek has taken the spotlight this week, as the Chinese AI lab’s chatbot app soared to the top of the Apple App Store and Google Play charts. The efficiency of DeepSeek’s AI models has prompted Wall Street analysts and technologists to question the sustainability of the U.S.’s lead in the AI sector and the ongoing demand for AI chips.

But what is the origin of DeepSeek, and how did it achieve such rapid international recognition?

Origins of DeepSeek
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions. Liang Wenfeng, an AI enthusiast who developed an interest in trading during his studies at Zhejiang University, co-founded High-Flyer in 2015. High-Flyer Capital Management was established as a hedge fund in 2019, with a focus on developing AI algorithms.

In 2023, High-Flyer launched DeepSeek as a lab dedicated to AI research, separate from its financial business. The lab was later spun off into an independent company, also named DeepSeek, with High-Flyer as an investor.

From its inception, DeepSeek built its own data center clusters for model training. However, like other Chinese AI firms, it has faced challenges due to U.S. export restrictions on hardware. As a result, the company had to resort to using Nvidia H800 chips, a less powerful alternative to the H100 accessible to U.S. companies.

DeepSeek’s technical team is composed primarily of young talent, and the company aggressively recruits PhD researchers from leading Chinese universities. It also hires people without computer science backgrounds to help its technology understand a wider range of subjects, as reported by The New York Times.

DeepSeek’s Impressive Models
DeepSeek introduced its initial models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. However, it was the launch of the next-generation DeepSeek-V2 models in spring 2024 that garnered significant attention in the AI industry.

DeepSeek-V2, a versatile system for text and image analysis, excelled in multiple AI benchmarks and was more cost-effective than comparable models, prompting competitors like ByteDance and Alibaba to reduce their pricing and even offer some models for free.

The release of DeepSeek-V3 in December 2024 further elevated the company’s profile. According to DeepSeek’s own internal benchmarks, DeepSeek-V3 outperforms both openly available models like Meta’s Llama and proprietary models that can only be accessed via API, such as OpenAI’s GPT-4o.

Notably, DeepSeek claims that its R1 “reasoning” model, launched in January 2025, performs on par with OpenAI’s o1 model on key benchmarks. As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that commonly trip up AI models. Although reasoning models take longer to produce solutions, they tend to be more reliable in fields like physics, science, and math.

However, DeepSeek’s models are subject to scrutiny by China’s internet regulator, which checks that their responses align with “core socialist values.” For instance, the R1 model does not answer questions about sensitive topics such as Tiananmen Square or Taiwan’s autonomy.

A Disruptive Business Approach
DeepSeek’s business model is somewhat ambiguous, as it prices its products below market value and offers some services for free. The company does not seek investment, despite significant interest from venture capitalists.

DeepSeek attributes its cost competitiveness to breakthroughs in efficiency, although some experts question the accuracy of its claims.

Regardless, developers have embraced DeepSeek’s models, which, while not truly open source, are available under permissive licenses that allow commercial use. Clem Delangue, CEO of Hugging Face, noted that developers have created over 500 derivative models of R1, collectively achieving 2.5 million downloads.

DeepSeek’s success against larger, more established rivals is seen as transformative in the AI landscape. It contributed to a sharp drop in the stock price of chipmaker Nvidia and prompted responses from industry leaders, including OpenAI’s CEO Sam Altman. In March, U.S. Commerce Department officials indicated that DeepSeek would be banned on government devices.

Microsoft has made DeepSeek available through its Azure AI Foundry service, a platform that consolidates AI services for enterprises. During a recent earnings call, Meta’s CEO Mark Zuckerberg emphasized that investments in AI infrastructure remain a strategic advantage.

Some companies and governments, including South Korea and New York state, have banned DeepSeek, but the full implications of DeepSeek’s innovations are still unfolding. Improved models are anticipated, while the U.S. government appears increasingly wary of perceived foreign influence, with discussions around potentially banning DeepSeek from government devices.

SOURCE: TechCrunch