Welcome to the Nemotron 3 Super NVFP4 Deployment Guide

This guide explains how to deploy and host NVIDIA Nemotron 3 Super in NVFP4 (4-bit floating point) format across the supported platforms and cloud providers.

🤖 Nemotron 3 Super

Nemotron 3 Super is NVIDIA's latest open-source LLM. It combines a hybrid Mamba-Transformer architecture with Mixture of Experts (MoE) routing and is aimed at agentic reasoning.

Core Features

Learn More

🚀 Getting Started

Select a section below to begin:

📖 About This Guide

This documentation covers:

  • NVFP4-specific configuration for optimal performance on Blackwell architecture GPUs
  • Step-by-step deployment commands for each supported platform
  • Comparative analysis of different inference engines
  • Cloud provider integration patterns for transient GPU workloads