Course

CUDA at Scale for the Enterprise

Johns Hopkins University

This course, offered by Johns Hopkins University, equips students with the skills to develop software that uses multiple CPUs and GPUs. Across its modules, students learn to manage asynchronous workflows, sort data, process images, and implement their own software using CUDA techniques and libraries.

Topics include parallel programming, a C/C++ refresher, and the CUDA computational model. The course shows how to develop software that handles asynchronous data using CUDA and how to implement interactive GPU processing kernels. It also covers using CUDA, hardware memory capabilities, and algorithms and libraries to solve programming challenges, including image processing.

This course is designed for software developers and data scientists working in high-performance computing, data processing, and machine learning. By the end of the course, students will be able to develop software that uses multiple CPUs and GPUs, create asynchronous workflows with CUDA events and streams, and solve programming challenges using the CUDA computational model.

Certificate Available ✔

Course Modules

This course comprises five modules: an overview of the GPU specialization, multiple CPU/GPU systems, CUDA events and streams, sorting using GPUs, and image processing using NVIDIA Performance Primitives (NPP).

Course Overview

This module provides an overview of the GPU specialization, including course expectations, development tools, and a refresher on C/C++. Students will also engage in assignments and discussions on enterprise data processing and canonical algorithms.

Multiple CPU/GPU Systems

Module 2 covers multiple CPU/GPU systems, including multi-CPU architectures, the CUDA multi-GPU programming model, and a comparison between multiple CPUs and multiple GPUs. Students will also engage in activities, assignments, and discussions related to this topic.

CUDA Events and Streams

Module 3 focuses on CUDA events and streams, covering their syntax and use cases along with an assignment walkthrough. Students will also engage in discussions and a lab activity on CUDA streams and events.

Sorting Using GPUs

Module 4 covers sorting using GPUs, presenting pseudocode for sorting algorithms, memory usage, and GPU pseudocode, along with lab activities and assignments on sorting algorithms. Students will also explore a reading list on GPU sort algorithms.

Image Processing using NVIDIA Performance Primitives

The final module covers image processing using NVIDIA Performance Primitives (NPP), including syntax demonstrations and an overview of the independent project. Students will also engage in discussions and a lab activity on the NPP box filter.
