Course title |
Statistical Foundations of Data Science (II) |
Semester |
106-2 |
Intended for |
College of Science, Institute of Applied Mathematical Sciences |
Instructor |
陳素雲 |
Course number |
MATH5079 |
Course identifier |
221 U8260 |
Section |
|
Credits |
3.0 |
Full/half year |
Half year |
Required/elective |
Elective |
Class time |
Wednesday, periods 6, 7, 8 (13:20~16:20) |
Location |
Astronomy-Mathematics Building, Room 201 (天數201) |
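Remarks |
Data Science Program course. Restricted to students of this department/institute (including minors and double majors) or students of the College of Electrical Engineering and Computer Science (including minors and double majors). Enrollment cap: 30 students. |
Ceiba course web page |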
http://ceiba.ntu.edu.tw/1062MATH5079_SFDS2 |
Course introduction video |
|
Core competency associations |
No core competency associations have been established for this course. |
Course syllabus
|
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
|
Course overview |
This course will cover the following topics: unsupervised dimension reduction (SVD, PCA), supervised dimension reduction (sliced inverse regression, SIR), tensor methods (high order SVD, multilinear PCA, multilinear algebra, tensor SIR, tensor regression), clustering analysis (k-means, self-updating process), discriminant analysis (LDA, logistic regression), kernel machines (kernel PCA, kernel SIR, kernel Fisher discriminant analysis, support vector machine, reproducing kernel Hilbert space), robust loss functions (Kullback-Leibler divergence, Bregman divergence, gamma-divergence, etc.), neural networks (universal approximation theory, back-propagation, activation functions, dropout regularization). |
Course objectives |
To introduce selected statistical methods and theory for machine learning and data science |
Course requirements |
1. Grading: 80% assignments (homework, mini projects), 20% class participation (attendance, in-class discussions)
2. Ability to write scripts for data analysis in a package such as Matlab, R, or Python
3. A good understanding of the theory is required |
Expected weekly study hours outside class |
|
Office Hours |
By appointment |
Required reading |
To be assigned in class |
References |
An Introduction to Statistical Learning by James, Witten, Hastie and Tibshirani. A web link to the ebook is posted on Ceiba.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Hastie, Tibshirani and Friedman. A web link to the ebook is posted on Ceiba. |
Grading (for reference only) |
|
Week |
Date |
Topic |
Week 1 |
2/28 |
national holiday, no class |
Week 2 |
3/07 |
1. course overview;
2. clustering analysis: k-means, hierarchical clustering and self-updating process for clustering
ebooks:
An Introduction to Statistical Learning
http://www-bcf.usc.edu/~gareth/ISL/
The Elements of Statistical Learning
|
Week 3 |
3/14 |
unsupervised dimension reduction: SVD, PCA, stability analysis |
Week 4 |
3/21 |
unsupervised dimension reduction (continued) |
Week 5 |
3/28 |
high order SVD |
Week 6 |
4/04 |
national holiday, no class |
Week 7 |
4/11 |
Tensor Toolbox. Tensor methods. |
Week 8 |
4/18 |
tensor methods;
supervised dimension reduction, sliced inverse regression, KSIR Toolbox |
Week 9 |
4/25 |
linear discriminant analysis, logistic regression, maximum margin linear classifier |
Week 10 |
5/02 |
Department of Mathematics self-directed learning week.
Review of HOSVD, MPCA, LDA, logistic regression, and linear SVM, plus an introduction to software |
Week 11 |
5/09 |
robust loss functions and various divergence measures |
Week 12 |
5/16 |
kernel machines: kernel trick, kernel PCA, kernel Fisher discriminant analysis |
Week 13 |
5/23 |
kernel machines: SVM |
Week 14 |
5/30 |
1. kernel machines (continued)
2. Bring a laptop to class. |
Week 15 |
6/06 |
introduction to neural networks, universal approximation theory, back-propagation algorithm |
Week 16 |
6/13 |
neural networks (continued) |
Week 17 |
6/20 |
mini-project presentation (each person 8-10 minutes)
Presentation order: 黃聖堯, 曾華廷, 邱郁軒, 劉妍君, 鄭宗哲, 森元俊成, 朱傑韜, 盧俊澎, 林致弘, 何文劭, 洪嘉鴻, 呂融昇, 林伯儒, 謝君宥, 顏惠萱
Check your grades for assignments 1, 2, 3, 6, 7, and 8 on Ceiba |
|