Nvidia Becomes a Major Model Maker With Nemotron 3


Nvidia has made a fortune supplying chips to companies working on artificial intelligence, but today the chipmaker took a step toward becoming a more serious model maker itself by releasing a series of cutting-edge open models, along with data and tools to help engineers use them.

The move, which comes at a moment when AI companies like OpenAI, Google, and Anthropic are developing increasingly capable chips of their own, could be a hedge against these firms veering away from Nvidia’s technology over time.

Open models are already a crucial part of the AI ecosystem, with many researchers and startups using them to experiment, prototype, and build. While OpenAI and Google offer small open models, they do not update them as frequently as their rivals in China do. For this reason and others, open models from Chinese companies are currently much more popular, according to data from Hugging Face, a hosting platform for open-source projects.

Nvidia’s new Nemotron 3 models are among the best that can be downloaded, modified, and run on one’s own hardware, according to benchmark scores shared by the company ahead of release.

“Open innovation is the foundation of AI progress,” CEO Jensen Huang said in a statement ahead of the news. “With Nemotron, we’re transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale.”

Nvidia is taking a more fully transparent approach than many of its US rivals by releasing the data used to train Nemotron, which should help engineers modify the models more easily. The company is also releasing tools to help with customization and fine-tuning. Alongside these comes a new hybrid latent mixture-of-experts model architecture, which Nvidia says is especially good for building AI agents that can take actions on computers or the web. The company is also launching libraries that let users train agents with reinforcement learning, which involves giving models simulated rewards and punishments.
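For readers unfamiliar with the term, the general idea behind a mixture-of-experts layer can be sketched in a few lines. This is a generic top-k routing illustration, not Nvidia's hybrid latent design, whose details are not described here. The function names, the choice of k=2, and the toy experts are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k=2):
    """Pick the k highest-scoring experts and normalize their weights.

    Hypothetical helper: real routers compute the scores from the input
    with a learned layer; here the scores are passed in directly.
    """
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_scores[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_layer(x, experts, router_scores, k=2):
    """Run only the chosen experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Toy experts (assumption): each just scales its input.
toy_experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
toy_scores = [0.1, 2.0, -1.0, 2.0]
blended = moe_layer(1.0, toy_experts, toy_scores)
```

The point of the design is that only a few experts run for any given input, so a model can hold many parameters while using a fraction of them per query.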

Nemotron 3 models come in three sizes: Nano, which has 30 billion parameters; Super, which has 100 billion; and Ultra, which has 500 billion. A model's parameter count loosely corresponds to how capable it is, as well as how unwieldy it is to run. The largest models are so cumbersome that they need to run on racks of expensive hardware.
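A rough back-of-the-envelope calculation shows why size matters for hardware. The sketch below multiplies each parameter count quoted above by an assumed number of bytes per parameter. The precisions (2 bytes for 16-bit, 1 byte for 8-bit) are assumptions, not figures from Nvidia, and the result covers weights only, ignoring the extra memory a running model needs.

```python
# Parameter counts as quoted in the article.
PARAMS = {"Nano": 30e9, "Super": 100e9, "Ultra": 500e9}

# Assumed storage precisions; the article does not say which one Nvidia uses.
BYTES_PER_PARAM = {"16-bit": 2, "8-bit": 1}

def weight_memory_gb(num_params, bytes_per_param):
    """Estimate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9

for name, count in PARAMS.items():
    for precision, size in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(count, size)
        print(f"{name} at {precision}: about {gb:.0f} GB of weights")
```

Under these assumptions, the 500-billion-parameter model needs on the order of a terabyte for its weights at 16-bit precision, far more than any single consumer graphics card holds.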

Model Foundations

Kari Ann Briski, vice president of generative AI software for enterprise at Nvidia, said open models are important to AI builders for three reasons: Builders increasingly need to customize models for particular tasks; it often helps to hand queries off to different models; and it is easier to squeeze more intelligent responses from these models after training by having them perform a kind of simulated reasoning. “We believe open source is the foundation for AI innovation, continuing to accelerate the global economy,” Briski said.

The social media giant Meta released the first advanced open models under the name Llama in February 2023. As competition has intensified, however, Meta has signaled that its future releases might not be open source.

Meta's shift is part of a larger trend in the AI industry. Over the past year, US firms have moved away from openness, becoming more secretive about their research and more reluctant to tip off their rivals about their latest engineering tricks.


