This course, offered by Johns Hopkins University, equips students with the skills to develop software that utilizes multiple CPUs and GPUs in computational environments. Through a series of comprehensive modules, students will learn to manage asynchronous workflows, sort data, process images, and implement their own software using CUDA techniques and libraries.
Students will cover parallel programming, a C/C++ refresher, and the CUDA computational model. The course shows how to develop software that handles asynchronous data with CUDA and how to implement interactive GPU processing kernels. It also covers using CUDA, hardware memory capabilities, and algorithms and libraries to solve programming challenges, including image processing.
This course is designed for software developers and data scientists working in high-performance computing, data processing, and machine learning fields. By the end of the course, students will be equipped to develop software that can use multiple CPUs and GPUs, create asynchronous workflows with CUDA’s events and streams capability, and solve programming challenges using the CUDA computational model.
Certificate Available ✔
This course comprises modules covering GPU specialization, multiple CPU/GPU systems, CUDA events and streams, sorting using GPUs, and image processing using Nvidia programming primitives.
Module 1 provides an overview of the GPU specialization, including course expectations, development tools, and a refresher on C/C++. Students will also engage in assignments and discussions on enterprise data processing and canonical algorithms.
Module 2 delves into multiple CPU/GPU systems, covering multiple CPU architectures, the CUDA multiple GPU programming model, and the comparison between multiple CPUs and GPUs. Students will also engage in activities, assignments, and discussions related to this topic.
Module 3 focuses on CUDA events and streams, exploring their syntax and use cases and walking through the related assignment. Students will also engage in discussions and a lab activity related to CUDA streams and events.
Module 4 covers sorting using GPUs, presenting sorting-algorithm pseudocode, memory considerations, and GPU pseudocode, along with lab activities and assignments on sorting algorithms. Students will also explore a reading list on GPU sort algorithms.
The final module covers image processing using Nvidia programming primitives, including syntax demonstrations and an independent project overview. Students will also engage in discussions and a lab activity on the NPP box filter.