Course information
Course title
Dimension Reduction for Data Science (資料科學中的維度縮減)
Semester
108-1 
Intended for
College of Science, Graduate Institute of Mathematics
Instructor
李克昭 
Course number
MATH5189
Course identifier
221 U8550
Section

Credits
2.0 
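Full/half year
Half year
Required/elective
Elective
Class time
Monday, periods 1-2 (8:10-10:00)
Classroom
Astronomy-Mathematics Building, Room 102 (天數102)
Remarks
Enrollment limit: 30 students
Ceiba course website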
http://ceiba.ntu.edu.tw/1081MATH5189_ 
Course introduction video
 
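Core competency mapping
Core competency mapping has not yet been established for this course.
Course syllabus
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
Course description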

Dimension reduction for data science
The discipline of data science requires transparency to thrive. Problem-solving in data science often employs multiple methods packaged in one or more bundles by algorithm inventors or software developers. Because the data are complex in volume, dimension, structure, and mode, new methods tend to contain existing methods as inner layers. This creates the danger of a gradual loss of transparency for both users and future developers.
 
Dimension reduction (DR) is a key component of many data-analytic algorithms and software packages used in data science. Principal component analysis (PCA), implemented as an inner layer in numerous packages, has become a household name that stands in for dimension reduction across a variety of scientific disciplines. Many other methods for dimension reduction have been developed in addition to PCA. The goal of this course is to present a general statistical framework for dimension reduction and to discuss the shared and unique properties of different DR methods.

This is a graduate course for students who have adequate undergraduate-level training in statistics, applied mathematics, data science, data engineering, or a related field. Exceptionally strong undergraduate students are also welcome to take this course.

The class will meet once a week for one and a half hours.
 

Course objectives
To be added
Course requirements
This is a graduate course for students who have adequate undergraduate-level training in statistics, applied mathematics, data science, data engineering, or a related field. Exceptionally strong undergraduate students are also welcome to take this course.
The class will meet once a week for one and a half hours.
Grading is based on class participation (20%) and a term project (80%).
The term project requires a written report with about 10 pages of main content; any additional material should go in an appendix as supplementary information.
The last lecture will be given on December 16, 2019.
The final report should be submitted before January 5, 2020.
There are two options for the term project:
Track 1: selected homework problems, review of reference papers, etc.
Track 2: data analysis project, new ideas to contribute, new algorithms, etc.
For Track 2, teamwork with no more than 3 participants is allowed.
 
Expected weekly study hours outside class
 
Office Hours
 
References
To be added
Required reading
To be added
Grading
(for reference only)
   
Course schedule
Week
Date
Topic
Week 1
9/09  The topic of dimension reduction in data science
Week 2
9/16  The key to dimension reduction: symmetry
Week 3
9/23  A vibrant world of regression models devoid of the original flavor
Week 4
9/30  No class this week
Week 5
10/07  Bias-variance tradeoff: the intertwining relationship between model selection criteria, regularization methods (Tikhonov), LASSO, and cross-validation, compounded by the issue of honesty
Week 7
10/21  A wide world of classification tasks: the issue of support separability
Week 8
10/28  Classification Part II: SVM
Week 9
11/04  Classification Part III: additional notes
Week 10
11/11  Where MA (multivariate analysis) meets RL (representation learning): DR (dimension reduction)
Week 11
11/18  A comparison of several matrix decompositions for dimension reduction
Week 12
11/25  Sliced inverse regression for dimension reduction: how much information is preserved?
Week 13
12/02  Principal Hessian Directions (PHD): curvature pursuit
Week 14
12/09  Liquid association
Week 15
12/16  Data visualization, deep learning, and transfer learning
Week 16
12/23  No lecture
Week 17
12/30  No lecture