Statistical Learning: Algorithmic and Nonparametric Approaches



On this web page you will find:

Class outline:
Lecture | Title | Description | Notes | Code
NA | Review | Stuff you should know: basics of probability, the central limit theorem, and inference. | PDF | NA
1 | Introduction to Regression and Prediction | We will describe linear regression in the context of a prediction problem. | PDF | R
2 | Overview of Supervised Learning | Regression for predicting bivariate data, K nearest neighbors (KNN), bin smoothers, and an introduction to the bias-variance tradeoff. | PDF | R
3-4 | Linear Methods for Regression | Subset selection and ridge regression. We will use singular value decomposition (SVD) and principal component analysis (PCA) to understand these methods. | PDF | R
5 | Linear Methods for Classification | Linear regression, linear discriminant analysis (LDA), and logistic regression. | PDF | R
6 | Kernel Methods | Kernel smoothers, including loess. We will briefly describe two-dimensional smoothers. We will also define degrees of freedom in the context of smoothing and learn about density estimators. | PDF | R
7 | Model Assessment and Selection | We revisit the bias-variance tradeoff. We describe how Monte Carlo simulations can be used to assess bias and variance. We then introduce cross-validation, AIC, and BIC. | PDF | R
8 | The Bootstrap | We give a short introduction to the bootstrap and demonstrate its utility in smoothing problems. | PDF | R
9-10 | Splines, Wavelets, and Friends | We give intuitive and mathematical descriptions of splines and wavelets. We use the SVD to understand these better and to see connections with signal processing methods. | PDF | R
11-12 | Additive Models, GAM and Neural Networks | We move back to cases with many covariates. We introduce projection pursuit, additive models, and generalized additive models. We briefly describe neural networks and explain the connection to projection pursuit. | PDF | NA
13-14 | CART, Boosting and Additive Trees | We introduce classification and regression trees (CART) as well as more modern versions such as random forests. | PDF | archive for CART, archive for others
15 | Model Averaging | Bayesian statistics, boosting, and bagging. | PDF | NA
16 | Clustering Algorithms | Notes and code taken from my microarray class. | PDF | R

Homework:


Data sets:


Recommended Books
Resources
Class General Info

Last updated: 4/18/2006