Six stages. Raw documents in. Shipping, signed, auditable AI product out. Built by an NVIDIA Inception member to push the full NVIDIA stack as far as it will go.
Not a training tool. Not a CLI with a Gradio URL. A factory — double-click and it works.
LLaMA-Factory, Unsloth Studio, and every other competitor assume you already have clean data, an ML engineer, and a GitHub account. Forge ships the whole factory, and it runs on the full NVIDIA stack, not just the GPU. Every stage uses NVIDIA-native infrastructure: not as a plugin, but as the engine.
| Capability | Forge | LLaMA-Factory | Unsloth |
|---|---|---|---|
| Raw doc ingest (PDF/DOCX) | ✓ | ✗ | ✗ |
| Data curation + synthetic gen | ✓ | ✗ | ✗ |
| GPU auto-selection | ✓ | ✗ | ✗ |
| NeMo training engine | ✓ | ✗ | ✗ |
| Eval gates (block bad models) | ✓ | ✗ | ✗ |
| Air-gap Docker packaging | ✓ | ✗ | ✗ |
| Signed governance receipts | ✓ | ✗ | ✗ |
| Desktop app (double-click) | ✓ | ✗ | ✗ |
LLaMA-Factory and Unsloth are excellent training tools. Forge ships the entire factory.
Every stage is integrated, not bolted together. Data flows through the pipeline automatically. No context-switching between tools, no shell scripts, no YAML archaeology.
PDF, DOCX, HTML, and web sources → chunked, embedded, stored in LanceDB. Powered by NVIDIA NeMo Retriever extractors — captures tables, diagrams, and page elements that generic loaders lose.
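The chunk-and-embed step can be sketched like this. This is an illustration only: `chunk_text` and its window sizes are assumptions, not Forge's actual NeMo Retriever extraction code.

```python
def chunk_text(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Split extracted document text into overlapping windows.

    The overlap preserves context across chunk boundaries, so the
    embedding model never loses a sentence to a hard cut.
    """
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    # Each chunk would then be embedded and written to a LanceDB table.
    return chunks
```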
Curate, score, deduplicate, and generate synthetic data. Built-in human review queue for flagged samples. Produces a clean, scored training dataset before a single GPU cycle is spent.
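The shape of that stage, filter, score, keep, can be sketched in a few lines. The function name, the exact-hash dedup, and the toy length-based score are all invented for illustration; real curation uses fuzzy dedup and learned quality scorers.

```python
import hashlib

def dedupe_and_score(samples: list[str], min_len: int = 20) -> list[tuple[str, float]]:
    """Drop exact duplicates and too-short samples, then attach a crude score."""
    seen, kept = set(), []
    for s in samples:
        key = hashlib.sha256(s.strip().lower().encode()).hexdigest()
        if key in seen or len(s) < min_len:
            continue  # duplicate or too short: route to the review queue, not training
        seen.add(key)
        score = min(1.0, len(s) / 200)  # toy proxy: longer samples score higher
        kept.append((s, score))
    return kept
```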
GPU profiler detects your hardware and sets memory budgets automatically. Recommends the optimal base architecture for your task, data volume, and hardware — no spreadsheets required.
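A memory-budget check of this kind boils down to bytes-per-parameter arithmetic. The figures below are ballpark heuristics (4-bit weights plus adapter and optimizer overhead for QLoRA, 16-bit for LoRA), not Forge's real profiler:

```python
def fits_in_vram(params_b: float, vram_gb: float, mode: str = "qlora") -> bool:
    """Rough check: can a model with params_b billion parameters train on this GPU?"""
    bytes_per_param = {"qlora": 0.8, "lora": 2.5, "sft": 16.0}[mode]
    needed_gb = params_b * bytes_per_param  # billions of params * bytes each ~= GB
    return needed_gb * 1.2 <= vram_gb      # keep ~20% headroom for activations
```

Under these assumptions a 7B model trains with QLoRA on an 8 GB card (7 × 0.8 × 1.2 ≈ 6.7 GB), while full SFT of the same model does not come close.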
NeMo is the primary engine — SFT, LoRA, QLoRA. Engine registry automatically selects Unsloth or HuggingFace PEFT when they're the right fit. One UI. Multiple backends. You don't choose the plumbing.
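An engine registry of this shape is just a routing function over the job's method and hardware. The rules below are invented for illustration; Forge's actual selection logic is not public.

```python
def pick_engine(method: str, vram_gb: float) -> str:
    """Toy registry: route a training job to a backend."""
    if method == "sft":
        return "nemo"              # full fine-tunes stay on the primary engine
    if method == "qlora" and vram_gb < 16:
        return "unsloth"           # memory-tight adapter runs
    return "nemo" if vram_gb >= 24 else "hf_peft"
```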
Seven-dimension benchmarks that block bad models from shipping. Not just metrics — gates. A checkpoint either passes or it doesn't. Produces a qualified checkpoint, not just a checkpoint.
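A gate in this sense is a hard boolean, not a weighted average. The dimension names and thresholds below are placeholders, not Forge's published benchmark suite:

```python
THRESHOLDS = {  # illustrative gate thresholds, one per eval dimension
    "accuracy": 0.85, "faithfulness": 0.90, "toxicity_free": 0.99,
}

def gate(scores: dict[str, float]) -> bool:
    """A checkpoint ships only if every dimension clears its bar.

    There is no averaging: one failing dimension blocks the release,
    no matter how strong the others are.
    """
    return all(scores.get(dim, 0.0) >= bar for dim, bar in THRESHOLDS.items())
```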
Air-gappable Docker Compose, cloud NIM, or Terraform deployment. Every artifact is Ed25519-signed with a governance receipt. Ships as a Windows MSI installer; macOS support is coming.
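A governance receipt binds artifact metadata to a content digest. In Forge that digest is then signed with an Ed25519 key; this stdlib-only sketch stops at the digest so it runs without a crypto dependency, and the field names are assumptions:

```python
import hashlib

def make_receipt(artifact: bytes, meta: dict) -> dict:
    """Build a receipt tying metadata to the artifact's SHA-256 digest."""
    return {"sha256": hashlib.sha256(artifact).hexdigest(), **meta}

def verify_receipt(artifact: bytes, receipt: dict) -> bool:
    """Recompute the digest and compare: any tampering flips this to False."""
    return hashlib.sha256(artifact).hexdigest() == receipt["sha256"]
```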
Cortex, our regulatory intelligence API, was built end-to-end with Onyx Forge. It is the product that Forge built: every stage, every artifact, every governance receipt.
Regulatory Intelligence API
Cortex ingests 20+ regulatory frameworks, trains on supervised compliance examples, passes the seven-dimension eval gates, and ships as a signed Docker package with Ed25519 governance receipts. Every stage ran through Forge.
You're past calling an LLM API. You want to run your own models on your own hardware with the same infrastructure NVIDIA uses for enterprise deployments. Forge gives you the full NVIDIA toolchain without the integration work.
You have a domain, documents, and a product to ship. You don't have time to assemble an ingest pipeline, a training workflow, an eval framework, and a packaging system. Forge is the whole stack, double-click to start.
You need more than a fine-tuned checkpoint. You need signed artifacts, eval reports, governance receipts, and packaging your security team can review. Forge ships production-grade outputs — not prototypes.
Air-gappable Docker Compose. Ed25519 signed artifacts. No cloud dependency at runtime. Designed for environments where the model, the data, and the audit trail must stay inside your perimeter.
Forge is in limited early access. We're onboarding teams building serious AI products on NVIDIA hardware.
Pricing available on request. No commitments required.
We'll be in touch within one business day.