AI Product Factory

The NVIDIA-native
AI product factory.

Six stages. Raw documents in. A signed, auditable, shipping AI product out. Built by an NVIDIA Inception member to push the full NVIDIA stack as far as it will go.

Not a training tool. Not a CLI with a Gradio URL. A factory — double-click and it works.

onyx-forge — pipeline
$ forge run --input ./docs/ --target h100
Stage 1 · Ingest
nv-ingest: nemotron-ocr-v1 + page-elements-v3
2,847 chunks → LanceDB
Stage 2 · Data Studio
dedup + score + synthetic gen
4,201 training pairs curated
Stage 3 · Model Selection
GPU: H100 80GB · budget: 72GB VRAM
Recommended: Llama-3.1-8B-Instruct
Stage 4 · Training
engine: NeMo · LoRA rank 64
Checkpoint saved · loss: 0.0812
Stage 5 · Eval Gates
7 dimensions · threshold: 0.85
QUALIFIED checkpoint · score: 0.91
Stage 6 · Package
Docker Compose + Ed25519 sign
forge-product-v1.0.0.msi · signed
6
Integrated stages
NVIDIA-native
NeMo · NIM · TensorRT-LLM
Ed25519
Signed governance receipts
BYOK
MIT/Apache clean · NGC runtime
NVIDIA Integration

Deeper NVIDIA integration
than any other tool in the space.

LLaMA-Factory, Unsloth Studio, and every competitor assume you already have clean data, an ML engineer, and a GitHub account. Forge ships the whole factory — and it runs on the full NVIDIA stack, not just the GPU.

NVIDIA Inception Member
Onyx AI Labs

Built specifically to push the full NVIDIA stack as far as it will go. Every stage of Forge uses NVIDIA-native infrastructure — not as a plugin, but as the engine.

nv-ingest pipeline
nemotron-ocr-v1, nemotron-page-elements-v3, nemotron-table-structure-v1 — extracts structured data that generic loaders miss
NIM reranker + NeMo Retriever
Semantic search and reranking using NVIDIA's production microservices, not open-source approximations
NeMo training + TensorRT-LLM serving
The same pipeline NVIDIA uses for enterprise model customization — fine-tuning to production-ready inference
DGX-aware GPU profiler
Detects your GPU (DGX, RTX, H100, H200, B100), sets memory budgets automatically, recommends optimal architecture
BYOK — MIT/Apache clean runtime
Ships MIT/Apache-clean. Pulls NVIDIA-proprietary assets at runtime under your own NGC credentials — you own the license

How Forge compares

Capability · Forge · LLaMA-Factory · Unsloth
Raw doc ingest (PDF/DOCX)
Data curation + synthetic gen
GPU auto-selection
NeMo training engine
Eval gates (block bad models)
Air-gap Docker packaging
Signed governance receipts
Desktop app (double-click)

LLaMA-Factory and Unsloth are excellent training tools. Forge ships the entire factory.

NVIDIA NeMo · NIM Microservices · nv-ingest · nemotron-ocr-v1 · nemotron-page-elements-v3 · nemotron-table-structure-v1 · NIM Reranker · TensorRT-LLM · DGX-aware GPU Profiler · Unsloth · HuggingFace PEFT · LanceDB
Pipeline

Six stages. One product out the other end.

Every stage is integrated, not bolted together. Data flows through the pipeline automatically. No context-switching between tools, no shell scripts, no YAML archaeology.

Stage 01

Ingest

PDF, DOCX, HTML, and web sources → chunked, embedded, stored in LanceDB. Powered by NVIDIA NeMo Retriever extractors — captures tables, diagrams, and page elements that generic loaders lose.

nv-ingest · nemotron-ocr-v1
nemotron-page-elements-v3 · table-structure-v1
LanceDB vector store
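The chunking step above can be sketched in a few lines. This is a minimal pure-Python illustration of fixed-size chunking with overlap — it is not Forge's implementation (Forge uses nv-ingest and the NeMo Retriever extractors), and `chunk_text` and its parameters are hypothetical names chosen for the example.

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks that overlap, so context spanning
    a chunk boundary is never lost. Illustrative only."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last chunk reached the end of the document
    return chunks

# A 2,000-character document yields three overlapping chunks:
doc = "".join(chr(97 + i % 26) for i in range(2000))
chunks = chunk_text(doc)
```

Each chunk would then be embedded and written to the LanceDB vector store.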
Stage 02

Data Studio

Curate, score, deduplicate, and generate synthetic data. Built-in human review queue for flagged samples. Produces a clean, scored training dataset before a single GPU cycle is spent.

Quality scoring · deduplication
Synthetic data generation
Human review queue
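Exact-duplicate removal, one of the curation steps above, is simple to sketch. This is a hedged pure-Python illustration using content hashes — Forge's Data Studio also does quality scoring and near-duplicate detection, which this does not attempt; `dedup` is a hypothetical name.

```python
import hashlib

def dedup(pairs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop exact-duplicate (prompt, answer) training pairs by content
    hash, preserving first-seen order. Illustrative only."""
    seen, out = set(), []
    for prompt, answer in pairs:
        key = hashlib.sha256(f"{prompt}\x00{answer}".encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            out.append((prompt, answer))
    return out

# Two identical pairs collapse to one:
pairs = [("Q1", "A1"), ("Q1", "A1"), ("Q2", "A2")]
clean = dedup(pairs)
```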
Stage 03

Model Selection

GPU profiler detects your hardware and sets memory budgets automatically. Recommends the optimal base architecture for your task, data volume, and hardware — no spreadsheets required.

DGX · RTX · H100 · H200 · B100
Auto VRAM budget · architecture selection
Task-specific model registry
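The profiler's core question — does a candidate model fit the detected VRAM budget — reduces to arithmetic. The back-of-envelope check below is an assumption-laden sketch, not Forge's actual profiler: real sizing must also account for KV cache, activations, and optimizer state.

```python
def fits_budget(params_b: float, bytes_per_param: int, budget_gb: float,
                overhead: float = 1.2) -> bool:
    """Rough fit check: weight memory times a safety overhead vs. the
    VRAM budget. 1e9 params * bytes-per-param is approximately GB.
    Illustrative only."""
    weights_gb = params_b * bytes_per_param
    return weights_gb * overhead <= budget_gb

# An 8B model in bf16 (2 bytes/param) against a 72 GB budget:
fits_budget(8, 2, 72)   # ~16 GB of weights, ~19.2 GB with overhead
fits_budget(70, 2, 72)  # a 70B model in bf16 does not fit
```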
Stage 04

Training

NeMo is the primary engine — SFT, LoRA, QLoRA. Engine registry automatically selects Unsloth or HuggingFace PEFT when they're the right fit. One UI. Multiple backends. You don't choose the plumbing.

NVIDIA NeMo (primary)
Unsloth · HuggingFace PEFT
Engine registry · SFT · LoRA · QLoRA
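LoRA's appeal is how few parameters it trains: two small matrices per adapted projection instead of the full weight. The count below is a rough illustration under assumed dimensions (hidden size 4096, 32 layers, four equal-size attention projections) — actual counts vary with target modules and grouped-query attention.

```python
def lora_params(hidden: int, rank: int, n_layers: int, n_proj: int = 4) -> int:
    """Trainable LoRA parameters: two (hidden x rank) matrices per adapted
    projection, per layer. Illustrative approximation only."""
    return 2 * hidden * rank * n_proj * n_layers

# Llama-3.1-8B-like dims at the rank-64 setting shown in the demo:
n = lora_params(4096, 64, 32)  # tens of millions, vs. 8 billion full params
```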
Stage 05

Eval Gates

Seven-dimension benchmarks that block bad models from shipping. Not just metrics — gates. A checkpoint either passes or it doesn't. Produces a qualified checkpoint, not just a checkpoint.

7 eval dimensions · configurable thresholds
QUALIFIED / BLOCKED output status
Citation verification · hallucination rate
Stage 06

Package

Air-gappable Docker Compose, cloud NIM, or Terraform. Every artifact is Ed25519 signed with a governance receipt. Ships as a Windows MSI installer. macOS support coming.

Air-gap Docker Compose · cloud NIM · Terraform
Ed25519 signed governance receipts
Windows MSI · macOS coming
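A governance receipt is, at its core, a signed statement binding an artifact digest to a name and version. The sketch below stops at the digest and the canonical bytes to be signed — the Ed25519 signature itself would be produced by a signing library (e.g. PyNaCl, an assumption; the source does not name one), and `make_receipt` is a hypothetical helper, not Forge's API.

```python
import hashlib
import json

def make_receipt(artifact: bytes, name: str, version: str) -> tuple[dict, bytes]:
    """Build a governance receipt over an artifact's SHA-256 digest and
    return (receipt, canonical_bytes). In a real system an Ed25519 key
    signs the canonical bytes and the signature ships with the receipt.
    Illustrative only."""
    receipt = {
        "artifact": name,
        "version": version,
        "sha256": hashlib.sha256(artifact).hexdigest(),
    }
    to_sign = json.dumps(receipt, sort_keys=True).encode()
    return receipt, to_sign

receipt, payload = make_receipt(b"model-bytes", "forge-product", "1.0.0")
```

Canonical JSON (sorted keys) matters here: signer and verifier must serialize the receipt identically, or the signature will not verify.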
Raw Docs → Ingest → Curate → Select → Train → Eval → Shipping Product
Proof

We built our own product with Forge.

Cortex — our regulatory intelligence API — was built end-to-end using Onyx Forge. It is the product that Forge built. Every stage, every artifact, every governance receipt.

Built with Forge

Cortex

Regulatory Intelligence API

Cortex ingests 20+ regulatory frameworks, trains on supervised compliance examples, passes 7-dimension eval gates, and ships as a signed Docker package with Ed25519 governance receipts. Every stage ran through Forge.

73K+ cited obligations · Ed25519 signed · Air-gappable · 20+ frameworks
View Cortex →
Stage 1: 73K+ chunks ingested
Stage 2: Compliance training pairs
Stage 3: H100 profile selected
Stage 4: NeMo SFT training
Stage 5: All 7 gates passed
Stage 6: Docker + MSI signed
Who it's for

Built for serious AI builders.

AI Builders

Pushing the NVIDIA stack

You're not just using an LLM API. You want to run your own models on your own hardware with the same infrastructure NVIDIA uses for enterprise deployments. Forge gives you the full NVIDIA toolchain without the integration work.

Startups

Shipping domain-specific AI products

You have a domain, documents, and a product to ship. You don't have time to assemble an ingest pipeline, a training workflow, an eval framework, and a packaging system. Forge is the whole stack, double-click to start.

Enterprise AI Teams

Shipping-grade, auditable outputs

You need more than a fine-tuned checkpoint. You need signed artifacts, eval reports, governance receipts, and packaging your security team can review. Forge ships production-grade outputs — not prototypes.

Sovereign AI / Government

Air-gapped, auditable, sovereign

Air-gappable Docker Compose. Ed25519 signed artifacts. No cloud dependency at runtime. Designed for environments where the model, the data, and the audit trail must stay inside your perimeter.

Early Access

Request Early Access

Forge is in limited early access. We're onboarding teams building serious AI products on NVIDIA hardware.

Pricing available on request. No commitments required.


forge@onyxailabs.com  ·  Response within one business day