CUDA examples and exercises focused on performance optimization, parallel algorithms, and their application to fundamental Deep Learning components.
This repository serves as an interactive learning environment to master key parallel computing concepts:
- CUDA Fundamentals: Core CUDA concepts, including threads, synchronization, shared memory, and tiling.
- Thrust Proficiency: Use NVIDIA's Thrust library for highly-optimized parallel patterns (e.g., sort, reduce, transform).
- Application: Apply CUDA to Matrix Multiplication (GEMM) and basic Neural Network architectures.
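The three Thrust patterns named above can be sketched in a few lines. This is a hypothetical standalone example (not a file from this repository) showing `sort`, `reduce`, and `transform` on a `device_vector`:

```cuda
// Sketch of the core Thrust patterns: sort, reduce, transform.
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/transform.h>
#include <thrust/functional.h>
#include <cstdio>

int main() {
    float h[] = {3.f, 1.f, 4.f, 1.f, 5.f};
    thrust::device_vector<float> d(h, h + 5);   // copy host data to the GPU

    thrust::sort(d.begin(), d.end());           // parallel sort on the device

    float sum = thrust::reduce(d.begin(), d.end(), 0.f);   // parallel reduction

    // Element-wise transform: negate every element in place.
    thrust::transform(d.begin(), d.end(), d.begin(), thrust::negate<float>());

    std::printf("sum = %.1f\n", sum);           // 3+1+4+1+5 = 14.0
    return 0;
}
```

Each call dispatches a tuned parallel kernel under the hood, which is why Thrust code is both shorter and usually faster than a hand-rolled first attempt.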
| File/Area | Concept Learned | Primary Task |
|---|---|---|
| `optimized_max_displacement.cu` | Fused Operations | Analyze the memory access pattern of the zip iterator. |
| `performance_comparison.cu` | Benchmarking | Benchmark naive vs. optimized code across varying data sizes. |
| `matmul/` | Tiled Kernels | Implement and test a tiled GEMM kernel for cache reuse. |
| `neural_nets/` | Element-wise Transforms | Use `thrust::transform` to implement custom ReLU/Sigmoid activation functions. |
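The tiled GEMM idea behind `matmul/` can be sketched as follows. This is an illustrative kernel, not the repository's own implementation; the name `matmul_tiled` and the simplifying assumption that `N` is a multiple of `TILE` are mine:

```cuda
#define TILE 16  // tile width; 16x16 = 256 threads per block

// C = A * B for square N x N matrices, N assumed divisible by TILE.
__global__ void matmul_tiled(const float* A, const float* B, float* C, int N) {
    __shared__ float As[TILE][TILE];  // tile of A staged in shared memory
    __shared__ float Bs[TILE][TILE];  // tile of B staged in shared memory

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.f;

    for (int t = 0; t < N / TILE; ++t) {
        // Each thread loads one element of each tile, then the block syncs
        // so every thread sees fully populated tiles.
        As[threadIdx.y][threadIdx.x] = A[row * N + (t * TILE + threadIdx.x)];
        Bs[threadIdx.y][threadIdx.x] = B[(t * TILE + threadIdx.y) * N + col];
        __syncthreads();

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];

        __syncthreads();  // finish reads before the next iteration overwrites the tiles
    }
    C[row * N + col] = acc;
}
```

The payoff is cache reuse: each global-memory element is loaded once per tile and then read `TILE` times from fast shared memory, cutting global traffic by a factor of `TILE` versus the naive kernel.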
- CUDA Toolkit 11.0 or higher
- A CUDA-capable NVIDIA GPU
- A C++14-compatible host compiler (invoked by `nvcc` during compilation)
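With the prerequisites in place, a typical build-and-run looks like the following (flags and the chosen file are illustrative; adjust to the file you are working on):

```shell
# Compile one of the exercises with optimizations and C++14 enabled
nvcc -std=c++14 -O2 performance_comparison.cu -o performance_comparison

# Run it on the GPU
./performance_comparison
```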