Model Parallelism: Building and Deploying Large Neural Networks

About This Course

Very large deep neural networks (DNNs), whether applied to natural language processing (e.g., GPT-3), computer vision (e.g., huge Vision Transformers), or speech AI (e.g., Wav2Vec 2), have properties that set them apart from their smaller counterparts. As DNNs grow larger and are trained on progressively larger datasets, they can adapt to new tasks with just a handful of training examples, accelerating the route toward artificial general intelligence. Training models that contain tens to hundreds of billions of parameters on vast datasets isn't trivial and requires a unique combination of AI, high-performance computing (HPC), and systems knowledge.

In this workshop, participants will learn how to:
- Train neural networks across multiple servers
- Use techniques such as activation checkpointing, gradient accumulation, and various forms of model parallelism to overcome the challenges associated with large-model memory footprints
- Capture and understand training performance characteristics to optimize model architecture
- Deploy very large multi-GPU models to production using NVIDIA Triton™ Inference Server

The goal of this course is to demonstrate how to train the largest of neural networks and deploy them to production.

Requirements

Familiarity with:
- Good understanding of PyTorch
- Good understanding of deep learning and data parallel training concepts
- Hands-on practice with deep learning and data parallel training is useful, but optional

Tools, libraries, and frameworks used: PyTorch, Megatron-LM, DeepSpeed, Slurm, Triton Inference Server

Related Training
- Building Transformer-Based Natural Language Processing Applications: Learn how to use Transformer-based natural language processing models for text classification tasks, such as categorizing documents.
- Fundamentals of Deep Learning for Multi-GPUs: Techniques for training deep neural networks on multi-GPU technology to shorten the training time required for data-intensive applications.
For additional hands-on training through the NVIDIA Deep Learning Institute, visit www.nvidia.com/dli .
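To give a flavor of two of the memory-saving techniques the workshop covers, here is a minimal PyTorch sketch combining gradient accumulation (splitting an effective batch across several backward passes) with activation checkpointing (recomputing activations during backward instead of storing them). The model, shapes, and hyperparameters are illustrative assumptions, not course material; the course itself uses much larger models and frameworks such as Megatron-LM and DeepSpeed.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Toy model standing in for a much larger network.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

accum_steps = 4  # effective batch = accum_steps x micro-batch size
opt.zero_grad()
for step in range(accum_steps):
    x = torch.randn(8, 16, requires_grad=True)  # micro-batch of inputs
    y = torch.randn(8, 4)                       # micro-batch of targets
    # Activation checkpointing: activations inside `model` are not stored
    # during forward; they are recomputed during backward to save memory.
    out = checkpoint(model, x, use_reentrant=False)
    # Scale the loss so the accumulated gradient matches a full-batch step.
    loss = loss_fn(out, y) / accum_steps
    loss.backward()  # gradients accumulate across micro-batches
opt.step()           # one optimizer update for the whole effective batch
opt.zero_grad()
```

The division by `accum_steps` keeps the summed gradients equivalent to a single large-batch backward pass; without it, the effective learning rate would scale with the number of accumulation steps.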

Generative AI, LLM, Machine Learning, Computer Vision, Multimodal
Provider
NVIDIA DLI
Duration
8 hrs
Mode
live
Pricing
Paid

Catalog checked Mar 16, 2026. Enrollment happens on the provider website; progress tracking happens here.


What you will cover

Deep learning, generative AI, LLM systems, machine learning, computer vision, multimodal systems

Recommended next

LLM Foundations for Builders
A free, self-paced introduction to modern large language model systems.
Review course
Machine Learning Refresher
Refresh the statistics and ML foundations needed for advanced GenAI work.
Review course
Fine-Tuning and MLOps
Bridge experimentation and operations for adapted language models.
Review course
Related

Keep the path moving

LLM Foundations for Builders (verified, free, basic)
A free, self-paced introduction to modern large language model systems.
Tags: LLM, Generative AI, Prompt Engineering
5 hrs, self-paced, checked Mar 1, 2026

Machine Learning Refresher (verified, free, basic)
Refresh the statistics and ML foundations needed for advanced GenAI work.
Tags: Machine Learning, Python Foundations, Statistics
12 hrs, self-paced, checked Feb 22, 2026

Fine-Tuning and MLOps (verified, free, professional)
Bridge experimentation and operations for adapted language models.
Tags: LLM, Fine-Tuning, MLOps
10 hrs, live, checked Mar 10, 2026