Llama 3.1 Nemotron Ultra 253B

by nvidia

Chat

Description

Llama 3.1 Nemotron Ultra 253B is NVIDIA's flagship reasoning model, derived from Meta's Llama 3.1 405B using Neural Architecture Search (NAS) to shrink the network while keeping a dense (non-MoE) architecture. It has 253B parameters and supports Dynamic Thinking with toggleable reasoning, tool calling, and structured outputs.
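As a rough illustration of the toggleable reasoning mentioned above: NVIDIA's model card describes switching reasoning on or off through the system prompt, using the strings "detailed thinking on" and "detailed thinking off". The sketch below builds a messages array for the OpenAI-compatible chat format used in the API example on this page. The helper name buildMessages is hypothetical, and whether this endpoint forwards the system prompt unchanged is an assumption.

```javascript
// System-prompt strings as described in NVIDIA's model card for this model.
// Assumption: the endpoint passes the system message through unchanged.
const THINKING_ON = "detailed thinking on";
const THINKING_OFF = "detailed thinking off";

// Hypothetical helper: build a messages array with reasoning on or off.
function buildMessages(userText, thinking) {
  return [
    { role: "system", content: thinking ? THINKING_ON : THINKING_OFF },
    { role: "user", content: userText },
  ];
}

// Print the messages array that would be sent with reasoning enabled.
console.log(JSON.stringify(buildMessages("What is 17 * 23?", true), null, 2));
```

The returned array can be passed as the messages field of a chat completion request.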

Providers

Nebius

View provider details and policies.

Pricing

Input: $0.60 / 1M tokens

Output: $1.80 / 1M tokens
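To make the per-token pricing concrete, here is a minimal sketch that estimates the cost of a single request from the listed rates. The function name estimateCostUsd is hypothetical, and the token counts in the example are made up for illustration.

```javascript
// Prices from the listing above, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.60;
const OUTPUT_USD_PER_MILLION = 1.80;

// Hypothetical helper: estimate the cost of one request in USD.
function estimateCostUsd(inputTokens, outputTokens) {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("Token counts must be non-negative");
  }
  const total =
    inputTokens * INPUT_USD_PER_MILLION +
    outputTokens * OUTPUT_USD_PER_MILLION;
  return total / 1e6;
}

// Example: a 2,000-token prompt with a 500-token reply.
console.log(estimateCostUsd(2000, 500).toFixed(4));
```

For that example the arithmetic is 2,000 x $0.60 / 1M + 500 x $1.80 / 1M, or about $0.0021.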

Specifications

Context: 128K tokens

Parameters—

License—

Released—

Capabilities

Tools, JSON Mode, Streaming
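The sketch below shows how the Tools and JSON Mode capabilities are typically requested on an OpenAI-compatible chat endpoint, using the tools and response_format fields. That this endpoint accepts those fields for this model is an assumption based on the capability list and the OpenAI-compatible claim on this page. The get_weather tool and both helper names are hypothetical.

```javascript
// Model id as used in the API example on this page.
const MODEL_ID = "llama-3.1-nemotron-ultra-253b";

// Hypothetical tool definition, for illustration only.
const weatherTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up the current weather for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

// Request body that offers the model one callable tool.
function buildToolRequest(userText) {
  return {
    model: MODEL_ID,
    messages: [{ role: "user", content: userText }],
    tools: [weatherTool],
  };
}

// Request body that asks for a JSON object as the reply (JSON mode).
function buildJsonModeRequest(userText) {
  return {
    model: MODEL_ID,
    messages: [
      { role: "system", content: "Reply with a single JSON object." },
      { role: "user", content: userText },
    ],
    response_format: { type: "json_object" },
  };
}

console.log(JSON.stringify(buildToolRequest("Weather in Berlin?"), null, 2));
```

Either body can be serialized with JSON.stringify and sent as the POST body of a chat completion request.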

EU-Compliant API Access

Use Llama 3.1 Nemotron Ultra 253B with a simple API call. OpenAI-compatible endpoint, EU data residency guaranteed.

JavaScript

const response = await fetch("https://api.eurouter.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.EUROUTER_API_KEY}`,
  },
  body: JSON.stringify({
    model: "llama-3.1-nemotron-ultra-253b",
    messages: [
      { role: "user", content: "Hello!" }
    ],
  }),
});

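// Surface HTTP errors (bad key, rate limit, etc.) instead of failing on a missing field.
if (!response.ok) {
  throw new Error(`Request failed: ${response.status} ${await response.text()}`);
}

const data = await response.json();
console.log(data.choices[0].message.content);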
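Since Streaming is listed among the capabilities, here is a minimal sketch of consuming a streamed reply. It assumes the endpoint follows the OpenAI streaming convention: server-sent events with "data: {json}" lines, text in choices[0].delta.content, and a final "data: [DONE]" line. The helper names extractDeltas and streamChat are hypothetical, and the code relies on the global fetch available in Node 18+ and browsers.

```javascript
// Parse complete SSE lines and return the text fragments they carry.
function extractDeltas(lines) {
  const parts = [];
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blanks and comments
    const payload = trimmed.slice(5).trim();
    if (payload === "[DONE]") break; // end-of-stream marker
    const chunk = JSON.parse(payload);
    const text = chunk.choices?.[0]?.delta?.content;
    if (text) parts.push(text);
  }
  return parts;
}

// Send a streaming request and call onText for each text fragment.
async function streamChat(prompt, onText) {
  const response = await fetch("https://api.eurouter.ai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${process.env.EUROUTER_API_KEY}`,
    },
    body: JSON.stringify({
      model: "llama-3.1-nemotron-ultra-253b",
      messages: [{ role: "user", content: prompt }],
      stream: true,
    }),
  });
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep the trailing partial line for the next read
    for (const text of extractDeltas(lines)) onText(text);
  }
  // Flush whatever remains once the stream has closed.
  for (const text of extractDeltas([buffer])) onText(text);
}
```

A call such as streamChat("Hello!", (t) => process.stdout.write(t)) would print the reply as it arrives.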

More from nvidia

Nemotron Nano V2 12B: $0.07 input / $0.20 output per 1M tokens

Nemotron 3 Nano 30B A3B: $0.06 input / $0.24 output per 1M tokens

Integrate AI without GDPR risk.

You need AI that won’t create compliance headaches. Your data stays in the EU, GDPR is enforced by default, and every request is routed for the best balance of cost, latency, and uptime, reducing risk while improving performance.

GDPR by default

EU data residency

Smart routing