AI Accelerator Design (Spring 2025)

This course provides a comprehensive overview of artificial intelligence (AI) accelerator design, focusing on the computational methods and hardware architectures required for efficient implementation. Students will be introduced to key AI concepts and models together with the hardware design techniques needed to accelerate them. The course also includes hands-on exercises in hardware implementation using Verilog. In the final project, students will design and implement a core computation unit of a real AI accelerator.

Announcements

Class Time

Monday, 10:30 AM-12:00 PM (Room C481-1, Engineering Bldg. or Virtual)
Wednesday, 9:00-10:30 AM (Recorded)

Office Hour

Wednesday 3:00-4:00 PM, or by appointment (Room C383-1, Engineering Bldg.)

Syllabus

Week | Overview | Lecture Contents | Remark
1 | Introduction | AI Models, Applications, and Hardware History |
2 | Hardware Description Language | Introduction to Verilog and Design Examples |
3 | Parallelism | Parallel Computing Hardware and Instruction-Level Parallelism | Assignment 1
4 | Multicore CPU | Modern CPU Architecture and Data-Level Parallelism |
5 | Graphics Processing Unit (GPU) | GPU Architecture and Optimization Techniques | Assignment 2
6 | Tensor Processing Unit (TPU) | TPU Architecture and Systolic Array Concept |
7 | Parallel Computation | Hardware-Efficient Matrix Multiplication and Tiling Concept |
8 | Midterm Exam | |
9 | Basic AI Models I | Multi-Layer Perceptrons and Convolutional Neural Networks | Assignment 3
10 | Basic AI Models II | Recurrent Neural Networks and Transformers |
11 | AI Model Optimization I | AI Model Specifications and Pruning | Assignment 4
12 | AI Model Optimization II | Quantization |
13 | Neural Processing Unit (NPU) | Eyeriss and EIE | Final Project
14 | Advanced NPU | Other Recent NPUs |
15 | Hardware Performance Model | Roofline Analysis Model |
16 | Final Exam | |