Introduction to pattern recognition, introduction to classifier design and supervised learning from data, classification and regression, basics of Bayesian decision theory, Bayes and nearest neighbour classifiers, parametric and non-parametric estimation of density functions, linear discriminant functions, Perceptron, linear least-squares regression, LMS algorithm.
Fisher linear discriminant, introduction to statistical learning theory and empirical risk minimization, non-linear methods for classification and regression, artificial neural networks for pattern classification and regression, multilayer feedforward networks, backpropagation, RBF networks, optimal separating hyperplanes, Support Vector Machines and some variants, assessing the generalization abilities of a classifier, bias-variance trade-off, cross-validation, bagging and boosting, AdaBoost algorithm, brief discussion of feature selection and dimensionality reduction methods.
The course is designed for graduate students (i.e., first-year ME or research students). It is intended to give students a fairly comprehensive view of the fundamentals of classification and regression. However, not all topics are covered. For example, we do not discuss decision tree classifiers. Also, the course treats neural network models only from the point of view of classification and regression; no recurrent neural network models (e.g., the Boltzmann machine) are included. The main reason for leaving out some topics is to keep the content suitable for a one-semester course.
Course details:
- Module 1 - Overview of Pattern classification and regression: Introduction to Statistical Pattern Recognition, Overview of Pattern Classifiers.
- Module 2 - Bayesian decision making and Bayes Classifier: The Bayes Classifier for minimizing Risk, Estimating Bayes Error; Minimax and Neyman-Pearson classifiers. (A minimal Bayes-classifier sketch appears after this list.)
- Module 3 - Parametric Estimation of Densities: Implementing Bayes Classifier; Estimation of Class Conditional Densities, Maximum Likelihood estimation of different densities, Bayesian estimation of parameters of density functions, MAP estimates, Bayesian Estimation examples; the exponential family of densities and ML estimates, Sufficient Statistics; Recursive formulation of ML and Bayesian estimates.
- Module 4 - Mixture Densities and EM Algorithm: Mixture Densities, ML estimation and EM algorithm, Convergence of EM algorithm. (An EM sketch for a two-component Gaussian mixture appears after this list.)
- Module 5 - Nonparametric density estimation: Overview of Nonparametric density estimation, Parzen Windows, nearest neighbour methods. (A Parzen-window sketch appears after this list.)
- Module 6 - Linear models for classification and regression: Linear Discriminant Functions; Perceptron - Learning Algorithm and convergence proof, Linear Least Squares Regression; Adaline and the LMS algorithm; general nonlinear least-squares regression, Logistic Regression; statistics of the least squares method; Regularized Least Squares, Fisher Linear Discriminant, Linear Discriminant functions for the multi-class case; multi-class logistic regression. (A Perceptron-learning sketch appears after this list.)
- Module 7 - Overview of statistical learning theory, Empirical Risk Minimization and VC-Dimension: Learning and Generalization; PAC learning framework, Overview of Statistical Learning Theory; Empirical Risk Minimization, Consistency of Empirical Risk Minimization; VC-Dimension, Complexity of Learning problems and VC-Dimension, VC-Dimension Examples; VC-Dimension of hyperplanes.
- Module 8 - Artificial Neural Networks for Classification and Regression: Overview of Artificial Neural Networks, Multilayer Feedforward Neural Networks with sigmoidal activation functions, Backpropagation Algorithm; Representational abilities of feedforward networks, Feedforward networks for Classification and Regression; Backpropagation in Practice, Radial Basis Function Networks; Gaussian RBF networks, Learning Weights in RBF networks; K-means clustering algorithm. (A backpropagation sketch appears after this list.)
- Module 9 - Support Vector Machines and Kernel based methods: Support Vector Machines - Introduction, obtaining the optimal hyperplane, SVM formulation with slack variables; nonlinear SVM classifiers, Kernel Functions for nonlinear SVMs; Mercer and positive definite Kernels, Support Vector Regression and the ε-insensitive Loss function, examples of SVM learning, Overview of SMO and other algorithms for SVM; ν-SVM and ν-SVR; SVM as a risk minimizer, Positive Definite Kernels; RKHS; Representer Theorem. (An SVM usage sketch appears after this list.)
- Module 10 - Feature Selection, Model assessment and cross-validation: Feature Selection and Dimensionality Reduction; Principal Component Analysis, No Free Lunch Theorem; Model selection and model estimation; Bias-variance trade-off, Assessing Learnt classifiers; Cross Validation; Bootstrap, Bagging and Boosting; Classifier Ensembles; AdaBoost, Risk minimization view of AdaBoost. (An AdaBoost sketch appears after this list.)
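To make the Module 2 material concrete, here is a minimal sketch of a two-class Bayes classifier with univariate Gaussian class-conditional densities; the parameters, priors and sample sizes are illustrative assumptions, not course data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed class-conditional parameters and priors (illustrative only).
mu0, sigma0, prior0 = 0.0, 1.0, 0.5
mu1, sigma1, prior1 = 2.0, 1.0, 0.5

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def bayes_classify(x):
    """Assign the class with the larger posterior, p(class) * p(x | class)."""
    return np.where(prior1 * gauss_pdf(x, mu1, sigma1) >
                    prior0 * gauss_pdf(x, mu0, sigma0), 1, 0)

# Empirical estimate of the Bayes error on data drawn from the model itself.
x0 = rng.normal(mu0, sigma0, 10000)
x1 = rng.normal(mu1, sigma1, 10000)
err = prior0 * np.mean(bayes_classify(x0) == 1) + prior1 * np.mean(bayes_classify(x1) == 0)
print(f"empirical error of the Bayes rule: {err:.3f}")
```

Since this rule minimizes the probability of error when the model is correct, the printed error approximates the Bayes error for this pair of densities.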
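For Module 4, a minimal EM sketch for a two-component univariate Gaussian mixture; the synthetic data, initial guesses and iteration count are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data from a known two-component mixture.
x = np.concatenate([rng.normal(-2.0, 0.7, 300), rng.normal(1.5, 1.0, 700)])

w = np.array([0.5, 0.5])     # mixing weights
mu = np.array([-1.0, 1.0])   # component means
var = np.array([1.0, 1.0])   # component variances

for _ in range(200):
    # E-step: responsibilities gamma[i, k] = p(component k | x_i).
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the responsibilities.
    nk = gamma.sum(axis=0)
    w = nk / len(x)
    mu = (gamma * x[:, None]).sum(axis=0) / nk
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print("weights:", np.round(w, 2), "means:", np.round(mu, 2), "variances:", np.round(var, 2))
```

Each iteration cannot decrease the data log-likelihood, which is the essence of the convergence argument discussed in the module.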
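For Module 5, a Parzen-window density estimate with a Gaussian window; the bandwidth h and the sample are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(0.0, 1.0, 500)   # samples from the unknown density
h = 0.3                            # window width (an assumed value)

def parzen_estimate(x, data, h):
    """p_hat(x) = (1/(n*h)) * sum_i K((x - x_i) / h), K = standard Gaussian."""
    u = (x[:, None] - data) / h
    return (np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)).mean(axis=1) / h

grid = np.linspace(-4.0, 4.0, 9)
print(np.round(parzen_estimate(grid, data, h), 3))
```

Smaller h gives a spikier estimate and larger h a smoother one; the effect of this choice on bias and variance is exactly the convergence question the module examines.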
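For Module 6, the Perceptron learning algorithm on linearly separable data; the two clusters and the epoch limit are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
# Two separable clusters with labels in {-1, +1}; append 1 for the bias term.
X = np.vstack([rng.normal([2, 2], 0.5, (50, 2)), rng.normal([-2, -2], 0.5, (50, 2))])
y = np.array([1] * 50 + [-1] * 50)
Xa = np.hstack([X, np.ones((100, 1))])

w = np.zeros(3)
for epoch in range(100):
    mistakes = 0
    for xi, yi in zip(Xa, y):
        if yi * (w @ xi) <= 0:   # misclassified (or on the boundary)
            w += yi * xi         # Perceptron update: move w toward the example
            mistakes += 1
    if mistakes == 0:            # a full pass with no mistakes: converged
        break
print("weights:", np.round(w, 2), "epochs used:", epoch + 1)
```

On separable data the loop is guaranteed to terminate after finitely many updates, which is the content of the convergence proof covered in the module.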
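For Module 8, a backpropagation sketch: one hidden layer of sigmoidal units trained on XOR by plain gradient descent on the squared error; the layer sizes, learning rate and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # hidden -> output
eta = 1.0

for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    yhat = sigmoid(h @ W2 + b2)
    # Backward pass: deltas for the squared-error loss.
    d2 = (yhat - t) * yhat * (1 - yhat)   # output-layer delta
    d1 = (d2 @ W2.T) * h * (1 - h)        # hidden-layer delta
    # Gradient-descent updates.
    W2 -= eta * h.T @ d2;  b2 -= eta * d2.sum(axis=0)
    W1 -= eta * X.T @ d1;  b1 -= eta * d1.sum(axis=0)

print(np.round(yhat.ravel(), 2))   # should approach [0, 1, 1, 0]
```

With an unlucky initialization the network can settle in a poor local minimum, one of the practical points the "Backpropagation in Practice" lectures address.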
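For Module 9, a brief usage sketch of a soft-margin nonlinear SVM, assuming scikit-learn is available; the circular toy problem and the values of C and gamma are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(0.0, 1.0, (200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)   # circular boundary

# Soft-margin SVM with an RBF kernel; C controls the slack-variable penalty.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)
print("support vectors:", len(clf.support_), "training accuracy:", clf.score(X, y))
```

The RBF kernel is one example of a Mercer (positive definite) kernel, the class of functions for which the nonlinear SVM formulation in the module is valid.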
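For Module 10, an AdaBoost sketch with decision stumps (single-feature thresholds) as weak learners; the toy data and the number of boosting rounds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-1.0, 1.0, (200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)   # target concept

def best_stump(X, y, w):
    """Return the stump (error, feature, threshold, sign) of minimum weighted error."""
    best = None
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for sign in (1, -1):
                pred = np.where(X[:, feat] > thresh, sign, -sign)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, feat, thresh, sign)
    return best

w = np.full(len(y), 1.0 / len(y))   # uniform initial example weights
ensemble = []
for _ in range(20):
    err, feat, thresh, sign = best_stump(X, y, w)
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))   # this stump's vote
    pred = np.where(X[:, feat] > thresh, sign, -sign)
    w *= np.exp(-alpha * y * pred)   # up-weight the examples this stump missed
    w /= w.sum()
    ensemble.append((alpha, feat, thresh, sign))

F = sum(a * np.where(X[:, f] > t, s, -s) for a, f, t, s in ensemble)
print("training accuracy:", np.mean(np.sign(F) == y))
```

The exponential reweighting step is what links AdaBoost to the risk-minimization view mentioned at the end of the module: the algorithm performs stagewise minimization of an exponential loss.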