This course provides a comprehensive overview of artificial intelligence (AI) accelerator design, focusing on the computational methods and hardware architectures required for efficient implementation. Students are first introduced to key AI concepts and models, then move on to hardware design topics. The course includes hands-on exercises in hardware implementation using Verilog. In the final project, students design and implement a core computation unit of a real AI accelerator.
Lecture: Monday, 10:30 AM-12:00 PM (Room C481-1, Engineering Bldg. or Virtual)
Lecture: Wednesday, 9:00-10:30 AM (Recorded)
Office hours: Wednesday, 3:00-4:00 PM, or by appointment (Room C383-1, Engineering Bldg.)
Week | Overview | Lecture Contents | Remark |
---|---|---|---|
1 | Introduction | AI Model, Application and Hardware History | |
2 | Hardware Description Language | Introduction to Verilog and Design Examples | |
3 | Parallelism | Parallel Computing Hardware and Instruction-Level Parallelism | Assignment 1 |
4 | Multicore CPU | Modern CPU Architecture and Data-Level Parallelism | |
5 | Graphics Processing Unit (GPU) | GPU Architecture and Optimization Techniques | Assignment 2 |
6 | Tensor Processing Unit (TPU) | TPU Architecture and Systolic Array Concept | |
7 | Parallel Computation | Hardware-Efficient Matrix Multiplications and Tiling Concept | |
8 | Midterm Exam | | |
9 | Basic AI Models I | Multi-Layer Perceptrons and Convolutional Neural Networks | Assignment 3 |
10 | Basic AI Models II | Recurrent Neural Networks and Transformers | |
11 | AI Model Optimization I | AI Model Specifications and Pruning | Assignment 4 |
12 | AI Model Optimization II | Quantization | |
13 | Neural Processing Unit (NPU) | Eyeriss and EIE | Final Project |
14 | Advanced NPU | Other Recent NPUs | |
15 | Hardware Performance Model | Roofline Analysis Model | |
16 | Final Exam | | |