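Course Title |
高等編譯器設計 Advanced Compiler Design |
Semester |
113-1 |
Offered to |
College of Electrical Engineering and Computer Science, Graduate Institute of Networking and Multimedia |
Instructor |
廖世偉 |
Course Number |
CSIE5054 |
Course Identifier |
922EU1220 |
Class Section |
|
Credits |
3.0 |
Full/Half Year |
Half year |
Required/Elective |
Elective |
Class Time |
Friday, periods 1, 2, 3 (8:10~11:10) |
Classroom |
資104 (CSIE Building, Room 104) |
Remarks |
Not open during preliminary course selection. This course is taught in English. Enrollment cap: 100 students |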
|
|
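Course Introduction Video |
|
Core Competencies |
Chart of core competencies and curriculum planning |
Course Syllabus
|
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
|
Course Description |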
Future compiler techniques will take advantage of AI extensively. We have upgraded the compiler syllabus accordingly.
Many students asked me, "Do I need to take the introductory compiler course before taking the advanced one?" The answer is NO. Besides the trend toward self-guided learning in the GPT era, a basic compiler course is about parsing and simple code generation, while this course is about data-flow analysis, optimization, and code generation. You can grasp the scope of Advanced Compilers by answering the following questions:
* AI needs a lot of cycles, and machines such as GPUs are complex to program. How do we automatically generate efficient code for these machines? (Parallelism and Locality)
* Can we use program analysis to automatically detect security bugs in programs? (Pointer analysis)
* How do we automatically manage memory efficiently so users do not have to manage it themselves? (Garbage collection)
* How do we make high-level programming languages efficient by optimizing the code? (Data-flow analysis)
* The highest-level programming language is obviously natural language. Can we program our virtual assistant to perform compound tasks in natural language? We use AI techniques to map natural language into formal languages. Note: no prior knowledge of machine learning is needed. (Neural networks, Satisfiability Modulo Theories)
Note that the scope of CSIE5054 is only a subset of the scope of Advanced Compilers. After all, NTU has shortened the length of a semester. In 16 weeks we'll show that the theory, techniques and algorithms in a compiler are applicable to a wide range of problems in software design and development. The #1 goal of this course is to provide students with the theory and techniques in program analysis and optimization.
Compiler teaching should take advantage of AI too. In the GPT era, every class should support self-guided learning. So starting this year we will reference Cornell's "CS 6120: Advanced Compilers: The Self-Guided Online Course". For example, 11/22 is a school holiday and you don't need to come to class. But we'll provide self-guided materials for 11/22 so you can study them in your dorm that day, with GPT as your personal tutor.
Next, many future systems are built for AI. Instead of focusing on Model AI, we focus on System AI. Jensen Huang points out the direction for Taiwan: doing AI on semiconductors. This System AI approach matches the globalization trend. To all aspiring founders: an AI startup does not have to be about models only. Do not underestimate the power of capital, especially global capital. Under the global division of labor, Taiwan has been assigned System AI, while the Mag-7 (such as my former employer, Google) hold the dominant voice in Model AI. What does the P in "GPT" stand for? Pre-trained. As Eric Schmidt, ex-CEO of Google, pointed out in August, an AI data center today costs US$300B, so in Taiwan each model has to start from a pre-trained one (sometimes aided by small-scale tinkering). Future Model AI breakthroughs outside the Mag-7 have to start with mathematical insight (the Mathematics department), while System AI breakthroughs belong to the Computer Science department.
If you want to pursue the System AI path, this course is a MUST. Mediatek executives told us that they want to hire more System AI people, but few of our candidates are really qualified these days. Tsing Hua and Chiao Tung universities (清交) offer extensive systems training and benefit from their proximity to the Science Park and Taiwan's leading semiconductor companies, but NTU didn't offer the key Compiler course for the past 2 years during the COVID hiatus. In the same period, model-related courses abounded. We have to fix it!
After all, the computer is 0s and 1s. Who controls the 0s and 1s? It's the compiler. Your "fancy" model still needs to map to the machine as 0s and 1s. Who generates them? If you have ever sat in front of your screen and wondered how things really work behind the screen, you are a computer science student. Then this course belongs to you.
The compiler is the crown jewel of computer systems education, with its solid math foundation and its complex yet practical system. Learning compilers equips you with a real software engineering background. It's no toy model: a compiler's output needs to work, both running to completion correctly and speeding up, rather than slowing down, the system. If you can excel in this course, you can excel in any other part of both theoretical and practical computer science. Industry loves to hire compiler students because industry knows they can excel in any CS discipline. E.g., the Google Brain team was founded by Jeff Dean, a compiler person at heart who got his PhD in compilers from U. of Washington. Modular AI was founded by Chris Lattner, who got a PhD in compilers from UIUC and founded the LLVM project. LLVM was inspired by SUIF (Stanford University Intermediate Format). My PhD thesis from Stanford is exactly on the exploration of SUIF.
In short, both to return to fundamentals as a CS educator and in response to industry's request, we will not dwell on theoretical computer science only, and NTU will restart its regular compiler courses. Industry is urging NTU to truly prepare students for industry work. Please do not stigmatize Taiwan as merely doing contract manufacturing: "doing AI on semiconductors" already contributes to TSMC, with its NT$25 trillion market capitalization, and Mediatek, with its share price of NT$1,500 (market capitalization NT$2 trillion). The value-add is tremendous and appreciated worldwide. Jensen Huang appreciates Taiwan's System AI capability. Let's together refocus NTU to be well-grounded. It's a MUST.
Course Enrollment
To enroll in this course, please fill out the application form.
For EECS students:
1. We will send you the authorization code soon.
2. After receiving the code and enrolling, please still complete HW0 by 9/13 23:59, because it's 7% of your final grade.
For students from departments outside EECS:
1. Please complete HW0 before 9/13 23:59 to receive the authorization code.
2. After receiving the code, enroll in the course.
Application Form: https://forms.gle/j6nAddEFcw3Duwcj7
Homework 0:
https://drive.google.com/file/d/1QjBw3fxZ_gupIweGJMzhrWSncwsbS6X6/view?usp=sharing
https://classroom.github.com/a/9j5CMziv
TA Email: llvm@csie.ntu.edu.tw
Week 1 Slide:
https://docs.google.com/presentation/d/1hoxaPSFcVRfLJVGFpWrlnSkczzvOiobMroTYaa3k2Uc/edit?usp=sharing
Google Meet: meet.google.com/aph-bymp-dak |
Course Objectives |
For research-type students: To provide students with the theory and techniques in program analysis and optimization.
For industry-type students: To equip students with the much-needed System AI capability.
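Overall, this course is at the core of the Computer Science department. (We are called the Department of Computer Science and Information Engineering, not the Department of Applied Mathematics or the Department of Information Management. This course belongs to the CS department or an AI department.) E.g., this course is not offered in the Applied Math department |
Course Requirements |
In the GPT era, there is no prerequisite course. If you are an EECS student, you can code and you can take this course. If not, please show you can code by completing HW0. |
Expected weekly study hours after class |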
|
Office Hours |
Every Friday 10:10~10:40; every Friday 07:30~08:00 |
Required Reading |
Cornell: CS 6120: Advanced Compilers: The Self-Guided Online Course |
References |
Compilers: Principles, Techniques, and Tools (the Dragon Book)
Engineering a Compiler |
Grading (for reference only) |
No. |
Item |
Percentage |
Description |
1. |
Homework 0 |
7% |
1. Students from departments outside EECS must complete this homework after filling out the form to receive the authorization code.
2. Students from EECS departments still need to complete this homework after enrolling in this course. |
2. |
Homework 1~4 |
32% |
1. Following Homework 0, there are 4 assignments, each accounting for 8% of the final grade.
2. Late submissions are not allowed and will receive zero points. |
3. |
Midterm |
20% |
1. In the GPT era, assessing students' learning is challenging. Everyone must take the exam in person without using GPT.
2. Good teaching without the good assessment described above still cannot be called good education.
3. Please follow the exam rules as announced in the course guidelines. |
4. |
Mock Final Exam |
5% |
1. It is important to take this Mock Exam 2 weeks before the Final, so you know what to expect.
2. We always do this. In our experience, you probably WON'T excel in the Final if you skip the Mock Exam.
3. You can take the Mock Exam either in person or online. The key is that we'll immediately explain the key points after the exam. |
5. |
Final |
30% |
1. In the GPT era, assessing students' learning is challenging. Everyone must take the exam in person without using GPT.
2. Good teaching without the good assessment described above still cannot be called good education.
3. Please follow the exam rules as announced in the course guidelines. |
6. |
Class Participation |
6% |
|
|
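Accommodations for students with difficulties |
Class format |
|
Assignment submission |
|
Exam format |
|
Other |
To be agreed upon by the instructor and students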
|
Week |
Date |
Topic |
Week 1 |
9/6 |
1. Overview: What is a low-level VM, a virtual machine (VM), a compiler, etc.
2. Course overview: From high-level (September) to middle-end (October) to low-level (November). How to use the online resources.
3. Representing Programs, and Introduction to Bril (HW0) |
Week 2 |
9/13 |
Data Flow, Global Analysis, and Optimization |
Week 3 |
9/20 |
Loop Optimization and Interprocedural Analysis |
Week 4 |
9/27 |
Affine Partitioning (HW1) |
Week 5 |
10/4 |
Local Optimization, Dead Code Elimination, and Local Value Numbering |
Week 6 |
10/11 |
Holiday |
Week 7 |
10/18 |
Static Single Assignment (HW2: modify Bril and integrate it with LLVM) |
Week 8 |
10/25 |
Midterm Exam. Introduction to LLVM and Writing an LLVM Pass. |
Week 9 |
11/1 |
Alias Analysis (HW3: implement an LLVM pass) |
Week 10 |
11/8 |
Introduction to RISC-V and LLVM backend |
Week 11 |
11/15 |
Holiday |
Week 12 |
11/22 |
School Holiday (Self-Guided Materials) |
Week 13 |
11/29 |
Introduction to RISC-V and LLVM backend |
Week 14 |
12/6 |
Mock Final Exam. Memory Management and Dynamic Compilers (HW4: build a complete compiler that emits RISC-V backend instructions, with the relevant optimizations enabled) |
Week 15 |
12/13 |
Advanced Topic |
Week 16 |
12/20 |
Final Exam |
|