Nvidia has released Nemotron 3 Super, a new open AI model built for faster inference and very long prompts. Nvidia is aiming it at developers building AI agents, where costs rise quickly when models need to reason through many steps.
Good to Know
Nemotron 3 Super does not use all of its parameters every time it answers. Instead, it uses a Mixture of Experts (MoE) design, in which only part of the model activates for each task. Nvidia says this lowers inference costs and makes the model better suited to AI agents, which often burn through large numbers of tokens.
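The general sparse-MoE idea can be sketched in a few lines. This is a generic top-k gating illustration, not Nvidia's actual implementation; the function and parameter names are made up for the example:

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=2):
    """Route one token through only top_k of the experts.

    x: (d,) token vector; experts: list of (d, d) weight matrices
    (stand-ins for expert FFNs); gate_w: (d, n_experts) router weights.
    """
    logits = x @ gate_w                      # router score for each expert
    top = np.argsort(logits)[-top_k:]        # keep only the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen k
    # only the selected experts run, so most parameters stay idle per token
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))
```

Because only `top_k` experts execute per token, compute per token stays close to that of a much smaller dense model even as total parameter count grows.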
The model interleaves Mamba and Transformer layers across an 88-layer stack. In simple terms, the Mamba layers let it handle very long inputs more efficiently, while the Transformer attention layers help it stay accurate. Nvidia says this setup gives the model a native context window of up to 1 million tokens.
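A hybrid stack like this is usually described as a layout pattern. The sketch below shows the general idea of interleaving; the actual ratio and placement of attention layers in Nemotron 3 Super are not specified here, so `attention_every` is an illustrative assumption:

```python
def hybrid_layout(n_layers=88, attention_every=8):
    """Hypothetical interleaving: mostly Mamba-style sequence-mixing
    layers, with a full-attention Transformer layer inserted periodically.
    The real model's layout may differ."""
    return ["attention" if (i + 1) % attention_every == 0 else "mamba"
            for i in range(n_layers)]
```

The appeal of the mix is cost: Mamba layers scale roughly linearly with sequence length, so keeping attention layers sparse in the stack is what makes a 1-million-token context practical.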
Nvidia also added a routing system called LatentMoE. It sends each task to a smaller group of experts inside the model rather than consulting the full set. According to Nvidia, this allows more specialization without inflating inference costs the way conventional MoE scaling does.
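One plausible reading of "routing to a smaller group of experts" is a two-stage scheme: first pick a group, then route within it. This is purely an illustration of hierarchical routing in general; Nvidia has not published LatentMoE in the detail this sketch implies, and all names here are invented:

```python
import numpy as np

def grouped_route(x, group_w, expert_w, experts_per_group=4):
    """Two-stage routing sketch (hypothetical): choose one expert group,
    then score only the experts inside that group.

    group_w: (d, n_groups) group-router weights
    expert_w: (d, n_groups * experts_per_group) per-expert scores
    """
    g = int((x @ group_w).argmax())                  # stage 1: pick a group
    start = g * experts_per_group
    local = x @ expert_w[:, start:start + experts_per_group]
    e = start + int(local.argmax())                  # stage 2: expert in group
    return g, e
```

The point of such a hierarchy is that the router only ever scores a small slice of the expert pool per task, which is how specialization can grow without a matching growth in routing and activation cost.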
The company says Nemotron 3 Super delivers 2.2x the throughput of GPT OSS 120B and 7.5x the throughput of Qwen3.5 122B A10B in the stated test setup. Nvidia also says it offers more than 5x the throughput and up to 2x the accuracy of the earlier Nemotron Super version.
Training was done on 25 trillion tokens, followed by an additional phase on 51 billion tokens to extend the context length to 1 million tokens. Nvidia then applied supervised fine-tuning and reinforcement learning to improve performance.
Benchmark results were also strong. Nvidia reports scores of 83.73 on MMLU Pro, 90.21 on AIME25, 60.47 on SWE-Bench with OpenHands, 85.6 on PinchBench, and 91.64 on RULER 1M. The model also powers Nvidia AI-Q, a research agent that reached the top of the DeepResearch Bench leaderboard.
Nvidia trained the model in NVFP4, a 4-bit floating-point format built for its Blackwell GPUs. On B200 hardware, Nvidia says inference can run up to 4x faster than FP8 on H100, with no reported loss in accuracy.
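To make the 4-bit idea concrete, here is a simplified sketch of block-scaled FP4 quantization: each block of values shares one scale, and each value snaps to the nearest representable E2M1 magnitude (0, 0.5, 1, 1.5, 2, 3, 4, 6). NVFP4's exact block size and scale encoding are Nvidia-specific details not reproduced here:

```python
import numpy as np

# Signed 4-bit E2M1 value set: eight magnitudes, mirrored for sign
E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
GRID = np.concatenate([-E2M1[::-1], E2M1])

def quantize_fp4_block(block):
    """Quantize one block to FP4 with a shared scale (simplified sketch).

    Returns the dequantized values and the scale, so you can see the
    rounding error introduced by the 4-bit grid.
    """
    scale = np.abs(block).max() / 6.0 or 1.0   # map the largest value to +/-6
    idx = np.abs(GRID - (block / scale)[:, None]).argmin(axis=1)
    return GRID[idx] * scale, scale
```

The shared per-block scale is what keeps accuracy tolerable at 4 bits: outliers set the scale for only their own small block instead of the whole tensor.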
Nemotron 3 Super is available under the Nvidia Nemotron Open Model License. Developers can get checkpoints in BF16, FP8, and NVFP4 on Hugging Face. Nvidia also supports inference through Nvidia NIM, build.nvidia.com, Perplexity, OpenRouter, Together AI, Google Cloud, AWS, Azure, CoreWeave, Dell Enterprise Hub, and HPE. More guides and recipes are available through NeMo.