
DeepSeek AI: What you need to know about the ChatGPT rival



In a mere week, DeepSeek’s R1 large language model has dethroned ChatGPT on the App Store, shaken up the stock market, and posed a serious threat to OpenAI and, by extension, U.S. dominance of the AI industry.

Last Monday, Chinese AI company DeepSeek released an open-source LLM called DeepSeek R1, which has become the buzziest AI chatbot since ChatGPT. It’s purportedly as good as, if not better than, OpenAI’s models, cheaper to use, and allegedly developed with far fewer chips than its competitors. Here’s what you need to know about DeepSeek R1 and why everyone is suddenly talking about it.

DeepSeek R1 claims to surpass OpenAI models in key benchmarks

With the release of DeepSeek R1, the company published a report on its capabilities, including performance on industry-standard benchmarks. DeepSeek claims its LLM beat OpenAI’s reasoning model o1 on advanced math and coding tests (AIME 2024, MATH-500, SWE-bench Verified) and scored just below o1 on another programming benchmark (Codeforces), graduate-level science (GPQA Diamond), and general knowledge (MMLU).

Mashable’s Stan Schroeder put DeepSeek R1 to the test by asking it to “code a fairly complex web app which needed to parse publicly available data, and create a dynamic website with travel and weather information for tourists,” and came away impressed with its capabilities.

At this point, several LLMs exist that perform comparably to OpenAI’s models, like Anthropic Claude, Meta’s open-source Llama models, and Google Gemini. But DeepSeek R1’s performance, combined with other factors, is what makes it such a strong contender.


Unlike OpenAI models, DeepSeek R1 is open source

Because DeepSeek R1 is open source, anyone can access and tweak it for their own purposes. It also allows programmers to look under the hood and see how it works. Open-source models are considered critical for scaling AI use and democratizing AI capabilities, since programmers can build on them instead of needing millions of dollars’ worth of computing power to build their own.

Meta took this approach by releasing Llama as open source, in contrast to Google and OpenAI, which open-source advocates criticize as gatekeepers. Google’s Gemini model is closed source, though the company does have an open-source model family called Gemma.

It’s cheap to use and was cheap to build

DeepSeek R1 has a free web app version, accessible via chat.deepseek.com, and an API that costs significantly less than OpenAI’s API access to its most advanced model. Its reasoning model costs $0.14 for one million cached input tokens, compared to $7.50 per one million cached input tokens for OpenAI’s o1 model. That’s an absolute steal that unsurprisingly has programmers flocking to it.
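To put that price gap in concrete terms, here is a rough back-of-the-envelope sketch in Python. It uses only the two cached-input prices quoted above; the 50-million-token monthly workload and the function names are hypothetical, chosen purely for illustration, and real bills would also include uncached input and output tokens, which are not covered here.

```python
# Prices quoted in this article, in USD per one million cached input tokens.
DEEPSEEK_R1_PER_MILLION = 0.14
OPENAI_O1_PER_MILLION = 7.50


def cached_input_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of `tokens` cached input tokens at the given rate."""
    return tokens / 1_000_000 * price_per_million


def price_ratio() -> float:
    """Return how many times more o1 costs than R1 for cached input tokens."""
    return OPENAI_O1_PER_MILLION / DEEPSEEK_R1_PER_MILLION


if __name__ == "__main__":
    # Hypothetical workload: 50 million cached input tokens in a month.
    monthly_tokens = 50_000_000
    r1 = cached_input_cost(monthly_tokens, DEEPSEEK_R1_PER_MILLION)
    o1 = cached_input_cost(monthly_tokens, OPENAI_O1_PER_MILLION)
    print(f"DeepSeek R1: ${r1:.2f}")
    print(f"OpenAI o1:   ${o1:.2f}")
    print(f"o1 costs about {price_ratio():.1f}x as much")
```

At the quoted rates, o1’s cached input tokens work out to roughly 54 times the price of R1’s, whatever the size of the workload.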

For AI industry insiders and tech investors, DeepSeek R1’s most significant accomplishment is how little computing power was (allegedly) required to build it. According to DeepSeek engineers via The New York Times, the R1 model required only 2,000 Nvidia chips. By comparison, OpenAI’s models reportedly required 10,000 Nvidia GPUs as of 2023, a figure that is undoubtedly higher now.

That’s quite a bold claim, but if true, it calls into question how much investment is needed to develop data centers like the $500 billion Stargate project currently underway. The stock market certainly noticed DeepSeek R1’s alleged cost efficiency, with Nvidia taking a 13 percent dip in stock price on Monday.

DeepSeek R1 is the new king on Apple’s App Store

Clearly, users have noticed DeepSeek R1’s prowess. By Monday, the new kid on the block had topped the Apple App Store as the number one free app, displacing ChatGPT from the top spot.

Who knows if DeepSeek R1’s momentum will continue, but it has definitely reignited the AI race and taken the competition global.




