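Course title |
Big Data Statistics and Mining (巨量資料統計與探勘) |
Semester |
102-1 |
Intended audience |
College of Electrical Engineering and Computer Science, Graduate Institute of Computer Science and Information Engineering |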
Instructor |
Yen-Jen Oyang (歐陽彥正) |
Course number |
CSIE7120 |
Course identifier |
922 U4130 |
Section |
|
Credits |
3 |
Full/half year |
Half year |
Required/elective |
Elective |
Class time |
Friday, periods 3, 4, @ (10:20~) |
Location |
CSIE Building, Room 101 (資101) |
Remarks |
Restricted to third-year undergraduates and above. Enrollment cap: 50 students |
Ceiba course webpage |
http://ceiba.ntu.edu.tw/1021bdsm |
Course introduction video |
|
Core competency mapping |
Diagram relating core competencies to course planning |
Course syllabus
|
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
|
Course overview |
This course covers the fundamental theory of statistical analysis and mining for handling big data. |
Course objectives |
Students enrolled in this course will learn statistical and data mining
approaches that feature low time complexity and therefore offer significant
advantages when applied to big data. Students will also develop practical
insight into working with data at this scale.
Statistical analysis of big data (9 weeks)
* Review of probability and statistics
* Convergence rates of statistical estimators
* Challenges introduced by high dimensionality
* Statistical approaches with low time-complexity
Data mining of big data (9 weeks)
* Supervised learning algorithms
* Unsupervised learning algorithms
* Regression and function approximation with regularization networks
* Optimization algorithms |
Course requirements |
|
Expected weekly study hours outside class |
|
Office Hours |
|
Required reading |
|
References |
Reference:
Probability and Statistical Inference, Hogg, R.V. and E.A. Tanis |
Grading (for reference only) |
|
Week |
Date |
Topic |
Week 1 |
9/13 |
Introduction |
Week 2 |
9/20 |
Mid-Autumn Festival holiday (no class) |
Week 3 |
9/27 |
Continuous Distribution |
Week 4 |
10/04 |
Parametric estimation |
Week 5 |
10/11 |
Kernel density estimation |
Week 6 |
10/18 |
Kernel density estimation |
Week 9 |
11/08 |
Midterm exam |
Week 11 |
11/15 |
Feature Selection |
Week 14 |
12/13 |
Clustering |
Week 15 |
12/20 |
Optimization.web &
Microarray overfit |
Week 17 |
1/03 |
Decision Tree |
|