Repositories list
641 repositories
- CUDA Core Compute Libraries
- TensorRT LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs. TensorRT LLM also contains components to build Python and C++ runtimes that orchestrate inference execution in a performant way.
- Ongoing research training transformer models at scale
- C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
- Accelerated Computer Vision Lab (ACCV-Lab) is a systematic collection of packages with the common goal of facilitating end-to-end efficient training in the ADAS domain, each package offering tools and best practices for a specific aspect or task in this domain.
- A unified library of SOTA model optimization techniques such as quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment frameworks such as TensorRT-LLM, TensorRT, and vLLM to optimize inference speed.
- A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
- NVIDIA device plugin for Kubernetes
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. It uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts, and images for use in downstream generative applications.
- GPU accelerated decision optimization