課程資訊

High-dimensional probability

110-1

MATH5269

221 U9140

3.0

We will focus on probability theory in high dimensions with applications in data science. This course is intended for senior undergraduate students and graduate students who are interested in mathematical tools used in data science and high-dimensional statistics. Examples of high-dimensional probabilistic problems include random matrices, estimation for high-dimensional data, randomized algorithms, optimization in a disordered system etc.

This course will be divided into two parts. In the first part, we will focus on R. Vershynin's book "High-Dimensional Probability". We will talk about concentration inequalities for random variables, random vectors in high dimensions and some basic random matrix theory. If time allows, we will also cover some other interesting (and important) topics in this book.

In the second part, we will more or less follow the book "Information, Physics, and Computation" by M. Mezard and A. Montanari, and discuss how methods in information theory and statistical physics can be applied to random optimization problems. This book is a physics book, but we will try to make things rigorous. Tentative topics include the Random Energy Model, the random code ensemble, number partitioning, factor graphs and graph ensembles.

Our aim is to provide a brief introduction to some mathematical tools used in data science (in particular, the importance of these tools is rising very rapidly) that are not covered in usual undergraduate probability courses. Some applications in data science will also be discussed.

Undergraduate probability theory, analysis and linear algebra. Knowledge of measure theory is not essential but would be helpful.

Office Hours

R. Vershynin, High-Dimensional Probability. (Available at author's website: https://www.math.uci.edu/~rvershyn/papers/HDP-book/HDP-book.html)
M. Mezard and A. Montanari, Information, Physics, and Computation.

(僅供參考)

 No. 項目 百分比 說明 1. Homework 60% There will be about 6-8 assignments throughout the semester. 2. Take home exams 40% There will be 2 take home exams.

 課程進度
 週次 日期 單元主題 第1週 9/23, 9/28 Introduction, random variables, basic inequalities, tail probabilities, limit theorems 第2週 9/30, 10/5 Hoeffding's inequality, Chernoff's inequality 第3週 10/7, 10/12 Erdos-Renyi graph, subgaussian distribution 第4週 10/14, 10/19 Subexponential distribution, Orlicz spaces, Bernstein's inequality, examples of random vectors, concentration of the norm 第5週 10/21, 10/26 Isotropic random vectors, more examples of random vectors, frames, subgaussian vectors in higher dimensions 第6週 10/28, 11/2 Grothendieck's inequality, semidefinite programming, maximum cut, Grothendieck's identity 第7週 11/4, 11/9 Second proof of Grothendieck's inequality, basic linear algebra 第8週 11/11, 11/16 Midterm, covering numbers, packing numbers, error correcting codes 第9週 11/18, 11/23 Random subgaussian matrices, low rank matrix detection, community detection 第10週 11/25, 11/30 Two-sided bound on operator norm, estimating covariance matrix, Gaussian mixture model, Wigner's semicircle law 第11週 12/2, 12/7 Proof of Wigner's semicircle law, Marchenko-Pastur distribution, Brunn-Minkowski inequality 第12週 12/9, 12/14 Isoperimetric inequality on R^n and on S^{n-1}, concentration of Lipschitz function on the sphere, Grassmannian, Johnson-Lindenstrauss Lemma 第13週 12/16, 12/21 Matrix Bernstein's inequality and its applications, the Sherrington-Kirkpatrick model 第14週 12/23, 12/28 Random energy model, replica method, replica symmetry breaking, Parisi's formula 第15週 12/30, 1/4 Bayesian inference in Gaussian noise, Nishimori's identity, Low-rank symmetric matrix estimation, BBP transition 第16週 1/6 Discussion of the replica symmetric formula: examples, applications and approximate message passing