CPEG455/655: High-Performance Computing with Commodity Hardware

Course Description

New commodity parallel computing devices, including graphics processing units (GPUs) and the IBM Cell processor, bring high-performance computing, once available only to an elite few, within reach of the general public. To program and accelerate applications on these devices, we must understand both their computational architecture and the principles of program optimization. This course discusses GPU and IBM Cell hardware, as well as concepts and techniques for optimizing general-purpose computing on the new architectures.

Modern GPUs are high-performance parallel computing devices. Their floating-point performance has far outpaced that of conventional CPUs, stimulating interest in using them for general-purpose computing. However, general-purpose computing on GPUs (GPGPU) has traditionally been hard, requiring a mix of graphics-specific languages and unfamiliar programming paradigms. To lower the barrier to entry, NVIDIA and AMD have released CUDA (Compute Unified Device Architecture) and CTM (Close To Metal), respectively, both of which are C APIs for programming GPUs. This course will introduce the fundamental organization of GPU hardware, discuss the architectures and programming models defined by CUDA or CTM, and compare the new architectures with general-purpose processors.

In addition to GPU hardware and architecture, another major theme of this course is program optimization. We will discuss general optimization concepts, including mapping an algorithm to the hardware's computational resources, using the unconventional memory hierarchy efficiently, and profiling and optimizing a design. Moreover, this course emphasizes using CUDA and CTM to program and optimize computationally demanding applications on the new platforms.
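As a rough illustration of what "mapping an algorithm to the hardware's computational resources" can mean, the sketch below emulates, in plain C on the CPU, the block/thread decomposition that CUDA-style programming models use for an element-wise vector addition. It is not CUDA code and not course material; the function names `blocks_needed` and `emulated_vector_add` and the parameter `threads_per_block` are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of blocks needed so that
 * blocks * threads_per_block >= n (ceiling division). */
size_t blocks_needed(size_t n, size_t threads_per_block)
{
    assert(threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

/* CPU emulation of a block/thread decomposition for out = a + b.
 * On a GPU the two loops would not exist: each (block, thread) pair
 * would run concurrently and handle one element. */
void emulated_vector_add(const float *a, const float *b, float *out,
                         size_t n, size_t threads_per_block)
{
    size_t num_blocks = blocks_needed(n, threads_per_block);

    for (size_t block = 0; block < num_blocks; ++block) {
        for (size_t thread = 0; thread < threads_per_block; ++thread) {
            /* Global element index owned by this (block, thread) pair. */
            size_t i = block * threads_per_block + thread;

            /* The last block may be only partially full, so guard
             * against indices past the end of the arrays. */
            if (i < n) {
                out[i] = a[i] + b[i];
            }
        }
    }
}
```

The bounds check matters whenever the problem size is not a multiple of the block size: with 5 elements and 2 threads per block, 3 blocks are needed and the final block has one idle thread.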

The coursework will include homework, exams, and a final project. Students must have a strong understanding of programming in C or C++ to do well in this course. By taking this course, students will deepen their understanding of the interaction between software and hardware and gain hands-on experience using and designing cutting-edge technology at the frontier of high-performance computing.

Course Information
Course Number: ELEG 455/655
Course Title: Programming Modern Graphics Cards
Time: MWF 10:10am-11:00am
Location: 122 Sharp Lab
Prerequisites: Computer architecture. Parallel programming is recommended but not required

Instructor Information
Instructor:

Office Hours: Mon/Wed 11:00am-12:00pm and by appointment
Office Location: 308 DuPont

Text Books

There are no required textbooks. However, the following books are recommended and will help with your projects.

Topics Covered

Grading

Late Policy

Late submissions are penalized on an hourly schedule:
Up to 1 hour late, -15%
Up to 2 hours late, -40%
Up to 3 hours late, -70%
Zero grade after 3 hours.
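
For concreteness, the schedule above can be written as a small C function. This is only a sketch under two assumptions that the policy itself does not state: that each threshold is inclusive (exactly 60 minutes late still counts as "up to 1 hour"), and that the penalty is a percentage of the score earned rather than of the maximum score. The names `late_percent_kept` and `apply_late_penalty` are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: percentage of the earned score that is kept
 * for a submission `minutes_late` minutes past the deadline.
 * Assumes each threshold is inclusive. */
int late_percent_kept(int minutes_late)
{
    if (minutes_late <= 0)   return 100;  /* on time: no penalty */
    if (minutes_late <= 60)  return 85;   /* up to 1 hour late:  -15% */
    if (minutes_late <= 120) return 60;   /* up to 2 hours late: -40% */
    if (minutes_late <= 180) return 30;   /* up to 3 hours late: -70% */
    return 0;                             /* more than 3 hours: zero */
}

/* Assumes the penalty scales the earned score. */
double apply_late_penalty(double raw_score, int minutes_late)
{
    assert(raw_score >= 0.0);
    return raw_score * late_percent_kept(minutes_late) / 100.0;
}
```

Under these assumptions, a submission that earned 80 points and arrived 30 minutes late would receive 68 points, and the same submission arriving 90 minutes late would receive 48.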

Project
The class will break up into groups of two, and each group will implement a single computationally intensive algorithm on a GPU. The algorithm must be approved by the instructor. Specific deliverables for this project are as follows:

Suggested Project Topics

Application tuning:

Architecture research: