Course Title |
機器學習 Machine Learning |
Semester |
112-1 |
Intended Students |
College of Engineering, Institute of Applied Mechanics |
Instructor |
舒貽忠 |
Course Number |
AM7192 |
Course Identifier |
543 M1180 |
Class Section |
|
Credits |
3.0 |
Full/Half Year |
Half year |
Required/Elective |
Elective |
Class Time |
Wed. 6 (13:20~14:10), Fri. 7, 8 (14:20~16:20) |
Classroom |
應113 |
Remarks |
Maximum enrollment: 60 |
Course Outline
|
|
Course Description |
The course provides a comprehensive introduction to machine learning, with a primary emphasis on the fundamental principles governing learning algorithms. It covers a wide range of topics, including: (1) Supervised Learning: generative and discriminative probabilistic classifiers (Bayes/logistic regression), least-squares regression, and neural networks (Convolutional Neural Networks, Recurrent Neural Networks); (2) Probabilistic Graphical Models: the Hidden Markov Model (HMM); (3) Basic Learning Theory: PAC learning and model selection. This course aims to provide students with a robust foundation essential for conducting research in machine learning. |
Course Objectives |
Upon completion, students will be proficient in utilizing calculus, linear algebra, optimization, probability, and statistics to create learning models for diverse real-world challenges. Moreover, they will be well-prepared for advanced research in machine learning and related domains. |
Course Requirements |
The course is taught in Chinese, using the blackboard to present and explain the mathematical principles of machine learning algorithms. |
Expected Weekly Study Hours After Class |
|
Office Hours |
By appointment |
Required Reading |
To be announced |
References |
1. C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
2. S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
3. O. Calin, Deep Learning Architectures: A Mathematical Approach, Springer, 2020.
4. K. P. Murphy, Probabilistic Machine Learning: An Introduction, MIT Press, 2022.
5. Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T. Lin, Learning From Data, AMLbook, 2012.
6. E. Alpaydin, Introduction to Machine Learning, MIT Press, 2020. |
Grading (for reference only) |
|
Week |
Date |
Topic |
Week 1 |
9/06,9/08 |
Mathematical formulation of a learning problem, Evaluation of a model (loss function), Generalization error, Empirical Risk Minimization (ERM), ERM with inductive bias, Bayes optimal classifier |
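The Week 1 formulation can be summarized in standard notation (a sketch only; the symbols follow common textbook conventions, not necessarily the lecture's):

```latex
% Empirical Risk Minimization. S = {(x_i, y_i)}_{i=1}^m is an i.i.d.
% training sample from distribution D, H is the hypothesis class, and
% \ell is the loss function (notation is illustrative).
\[
  L_D(h) = \mathbb{E}_{(x,y)\sim D}\big[\ell(h(x), y)\big]
  \quad\text{(generalization error)}
\]
\[
  L_S(h) = \frac{1}{m}\sum_{i=1}^{m} \ell\big(h(x_i), y_i\big)
  \quad\text{(empirical risk)}
\]
\[
  \mathrm{ERM}_{\mathcal{H}}(S) \in \operatorname*{argmin}_{h\in\mathcal{H}} L_S(h)
\]
```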
Week 2 |
9/13,9/15 |
Example of Bayes optimal classifiers, Polynomial Threshold Functions, Overfitting, Generalization/Empirical errors vs model complexity |
Week 3 |
9/20,9/22 |
Example for explaining No-Free-Lunch Theorem, Perceptron Learning Algorithm (PLA) for linearly separable data |
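The PLA covered this week admits a very short sketch; the function name and the toy dataset below are illustrative, not the in-class example:

```python
# Minimal sketch of the Perceptron Learning Algorithm (PLA) for
# linearly separable data. Names and data are illustrative.
import numpy as np

def pla(X, y, max_iters=1000):
    """X: (n, d) features; y: labels in {-1, +1}.
    Returns weights w with the bias folded in as the first component."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias feature
    w = np.zeros(Xb.shape[1])
    for _ in range(max_iters):
        mistakes = np.where(np.sign(Xb @ w) != y)[0]
        if len(mistakes) == 0:       # all points correctly classified
            return w
        i = mistakes[0]              # pick any misclassified point
        w = w + y[i] * Xb[i]         # PLA update rule
    return w

# Toy linearly separable data: label +1 iff x1 + x2 > 1
X = np.array([[0.0, 0.0], [2.0, 2.0], [0.0, 2.0], [2.0, 0.0]])
y = np.array([-1, 1, 1, 1])
w = pla(X, y)
```

For linearly separable data the loop is guaranteed to terminate (the classical perceptron convergence theorem).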
Week 4 |
9/27,9/29 |
Mean, standard deviation, Bernoulli distribution, examples of Bayes' Theorem |
Week 5 |
10/04,10/06 |
Naive Bayes Classifier based on the Bernoulli distribution, Maximum Likelihood Estimation (MLE), algorithm, example (classification of handwritten digits) |
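The Bernoulli Naive Bayes algorithm from this week can be sketched as follows; the Laplace smoothing term and the toy binary features stand in for the handwritten-digit example and are assumptions, not the lecture's code:

```python
# Sketch of a Bernoulli Naive Bayes classifier fit by MLE, with Laplace
# smoothing (alpha) to avoid zero probabilities. Data are illustrative.
import numpy as np

def fit_bernoulli_nb(X, y, alpha=1.0):
    """X: (n, d) binary features; y: integer class labels."""
    classes = np.unique(y)
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    # Smoothed MLE of theta_cj = P(x_j = 1 | class c)
    theta = np.array([(X[y == c].sum(axis=0) + alpha) /
                      ((y == c).sum() + 2 * alpha) for c in classes])
    return classes, log_prior, theta

def predict_bernoulli_nb(X, classes, log_prior, theta):
    # log P(x | c) = sum_j [x_j log theta_cj + (1 - x_j) log(1 - theta_cj)]
    ll = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    return classes[np.argmax(log_prior + ll, axis=1)]

# Toy data: class 0 tends to switch on feature 0, class 1 feature 1
X = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 0]])
y = np.array([0, 0, 0, 1, 1, 1])
model = fit_bernoulli_nb(X, y)
```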
Week 6 |
10/11,10/13 |
Naive Bayes Classifier based on the Gaussian distribution, Maximum Likelihood Estimation (MLE), algorithm, decision boundary |
Week 7 |
10/18,10/20 |
Confusion matrix, ROC curve, discriminative probabilistic model (Logistic Regression) |
Week 8 |
10/25,10/27 |
Logistic Regression (sentiment example), comparison between generative and discriminative models, MLE for learning parameters |
Week 9 |
11/01,11/03 |
Optimization, gradient descent, example from logistic regression (in-class coding), stochastic gradient descent, comparison with PLA, nonlinear classifiers using nonlinear transformation |
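The gradient-descent training of logistic regression mentioned for the in-class coding session can be sketched like this; the learning rate, iteration count, and 1D toy data are assumptions for illustration:

```python
# Sketch: logistic regression trained by batch gradient descent on the
# mean negative log-likelihood. Hyperparameters and data are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.5, iters=2000):
    """X: (n, d); y: labels in {0, 1}. Returns weights with bias folded in."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = sigmoid(Xb @ w)              # predicted P(y = 1 | x)
        grad = Xb.T @ (p - y) / len(y)   # gradient of the mean NLL
        w -= lr * grad
    return w

# Toy 1D data, separable with a boundary near x = 1.5
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
w = train_logreg(X, y)
preds = (sigmoid(np.hstack([np.ones((4, 1)), X]) @ w) > 0.5).astype(int)
```

Replacing the full-batch gradient with the gradient of a single randomly chosen example turns this into stochastic gradient descent.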
Week 10 |
11/08,11/10 |
Neural Networks: abstract neuron, AND, OR, and XOR problems, multi-layer perceptron (MLP), mathematical definition, revisiting XOR via Boolean operations |
Week 11 |
11/15,11/17 |
Neural Networks: why direct computation of the gradient of the loss function with respect to the weights is inefficient; introduction and derivation of the Backpropagation algorithm |
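The backpropagation derivation can be checked numerically; the tiny one-hidden-layer MLP, sigmoid activations, and squared loss below are illustrative choices, not the lecture's exact setup:

```python
# Sketch: backpropagation for a one-hidden-layer MLP, verified against
# a central-difference numerical gradient. Setup is illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    h = sigmoid(W1 @ x)       # hidden activations
    yhat = sigmoid(W2 @ h)    # output
    return h, yhat

def loss(x, y, W1, W2):
    return 0.5 * np.sum((forward(x, W1, W2)[1] - y) ** 2)

def backprop(x, y, W1, W2):
    """Return dL/dW1 and dL/dW2 via the chain rule (backpropagation)."""
    h, yhat = forward(x, W1, W2)
    delta2 = (yhat - y) * yhat * (1 - yhat)  # error at the output layer
    delta1 = (W2.T @ delta2) * h * (1 - h)   # error propagated backwards
    return np.outer(delta1, x), np.outer(delta2, h)

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), np.array([1.0])
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
g1, _ = backprop(x, y, W1, W2)

# Central-difference check on one weight: backprop matches the slope.
eps = 1e-5
Wp, Wm = W1.copy(), W1.copy()
Wp[0, 0] += eps
Wm[0, 0] -= eps
num = (loss(x, y, Wp, W2) - loss(x, y, Wm, W2)) / (2 * eps)
```

The point of the week's argument is efficiency: backprop reuses the forward pass and the `delta` terms, whereas differencing each weight separately would require one extra forward pass per parameter.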
Week 12 |
11/22,11/24 |
Convolutional Neural Network (CNN), convolution of 1D and 2D signals, cross-correlation, convolution layer vs. fully-connected layer, characteristics of CNNs: sparse connectivity and weight sharing, receptive field |
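The convolution vs. cross-correlation distinction from this week shows up directly in NumPy; the signal and kernel below are made up for illustration:

```python
# 1D convolution vs. cross-correlation: true convolution flips the
# kernel before sliding; deep-learning "convolution" layers actually
# compute cross-correlation. Signal and kernel are illustrative.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # 1D signal
k = np.array([1.0, 0.0, -1.0])       # kernel

conv = np.convolve(x, k, mode='valid')    # flips k, then slides
xcorr = np.correlate(x, k, mode='valid')  # slides k as-is

print(conv)   # [2. 2.]
print(xcorr)  # [-2. -2.]
```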
Week 13 |
11/29,12/01 |
Why convolution? An example: the Sobel operator for edge detection; CNN architecture |
Week 14 |
12/06,12/08 |
Pooling, CNN Explainer, backpropagation in CNN (derivation) |
Week 15 |
12/13,12/15 |
Information Entropy, Shannon's source coding theorem, cross-entropy loss, Kullback-Leibler divergence |
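The Week 15 quantities for discrete distributions can be sketched in a few lines, using the identity H(p, q) = H(p) + D_KL(p || q); the two distributions are made up:

```python
# Entropy, cross-entropy, and KL divergence for discrete distributions
# (in bits). Distributions are illustrative.
import numpy as np

def entropy(p):
    """Shannon entropy H(p)."""
    return -np.sum(p * np.log2(p))

def cross_entropy(p, q):
    """H(p, q): expected code length coding p with a code built for q."""
    return -np.sum(p * np.log2(q))

def kl_divergence(p, q):
    """D_KL(p || q) = H(p, q) - H(p) >= 0."""
    return np.sum(p * np.log2(p / q))

p = np.array([0.5, 0.25, 0.25])
q = np.array([0.25, 0.25, 0.5])
print(entropy(p))           # 1.5
print(cross_entropy(p, q))  # 1.75
print(kl_divergence(p, q))  # 0.25 = 1.75 - 1.5
```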
Week 16 |
12/20,12/22 |
Final Exam |
|