Introduction to Machine Learning

203.4770 (3770)

Semester B

General Course Information


Meeting Times: Monday 10-12; Thursday 10-12

Location: Room

Instructor: Dr. Rita Osadchy

e-mail: rita [at]cs [dot]haifa.ac.il
Office: Jacobs 408
______________________________________________________________________

Course Description

Machine learning is concerned with developing computer algorithms that learn to solve tasks from a set of examples of those tasks and some prior knowledge about them. Machine learning has a wide spectrum of applications, including handwriting and speech recognition, image classification, medical diagnosis, stock market analysis, and bioinformatics. The goal of this course is to present the main concepts of modern machine learning methods, including some theoretical background.

Recommended Prerequisites

The course assumes some basic knowledge of probability theory and linear algebra;
for example, you should be familiar with

  • Joint and marginal probability distributions
  • Normal (Gaussian) distribution
  • Expectation and variance
  • Statistical correlation and statistical independence

Tutorials on the above topics are listed below.

Problems, Concepts, Methods, and Tools Covered in the Course

The list is partial and may change.

Problems

  • Regression and classification
  • Feature selection
  • Density estimation
  • Clustering
  • Model selection
  • Inference

Concepts

  • Estimation, bias, variance, loss, empirical risk, maximum likelihood
  • Generalization, overfitting
  • Regularization
  • Capacity, VC-dimension
  • Generative/discriminative models
  • Minimum description length

Models and Methods

  • Linear regression
  • Generalized Linear Models
  • Non-parametric Models
  • Neural networks
  • Support Vector Machine (SVM)
  • Decision Trees
  • Boosting

Tools

  • Cross-validation
  • Gradient descent
  • Quadratic programming
  • Forward-backward algorithm

 

The course will also use several real-life applications to illustrate the usefulness of statistical machine learning.

_____________________________________________________________________

Announcements

Grades for the home exam are available here.

“hazara” TBD

Both assignments should be submitted on 17/08!

The home exam will be sent out on 25/08 (morning) and should be submitted by 26/08, 23:55.

Home Assignment 2 is available. The data needed for this assignment can be downloaded here.

Home Assignment 1 is available. The problem set will be distributed by email; to receive it, send a request to rita [at]cs [dot]haifa.ac.il. The data needed for this assignment can be downloaded here.
 

______________________________________________________________________

Lecture Notes

Date  Topic                                                         Notes
12.5  Introduction                                                  PDF
15.5  Probability Tutorial                                          PDF
      Introduction to Classification                                PDF
19.5  Bayesian Decision Theory, ML, MAP classifiers                 PDF
22.5  Tutorial on Bayesian Decision Theory, ML, MAP classifiers     PDF
26.5  Normal Variables and their discriminant functions             PDF
29.5  Parametric density estimation - MLE                           PDF
      MLE - tutorial                                                PDF
2.6   Parametric density estimation - Bayesian Estimation           PDF
5.6   Naïve Bayes                                                   PDF
      Non-parametric density estimation, Histogram, Parzen Window   PDF
9.6   Shavuot (holiday)
12.6  Non-parametric density estimation, nearest neighbors, KNN     PDF
16.6  LDF                                                           PDF
19.6  MSE                                                           PDF
23.6  SVM (guest lecture)                                           PDF
26.6  Intro to Neural Networks (guest lecture)                      PDF
30.6  Dimensionality Reduction PCA                                  PDF
3.7   FDA, MDA                                                      PDF
7.7   Linear Regression, Regression with Shrinkage                  PDF
10.7  Bias-Variance Decomposition Linear                            PDF
      Error Estimators
      Cross-Validation
14.7  Computational Learning Theory                                 PDF
17.7  Decision Trees                                                PDF
      Clustering                                                    PDF
21.7  Clustering, EM                                                PDF
25.7  Boosting                                                      PDF

______________________________________________________________________

Home Assignments:

- All home assignments will be distributed by email. To receive them, send a request to rita [at]cs [dot]haifa.ac.il.

- Each assignment should be formatted as a report (outputs, plots, etc. only; no code).

- Submit by email or in print (do not print your code).

Data for Assignment 2:

The data consists of a face dataset, two m-files for visualization, and a plant dataset. All files are archived in Data.zip.

Data for Assignment 1:

file1, file2

______________________________________________________________________

Textbooks:

______________________________________________________________________

Probability tutorials:

 

http://www.autonlab.org/tutorials/prob18.pdf (first half)

 

http://www-stat.stanford.edu/~susan/courses/s116/

 

Linear Algebra tutorial:

Eigenvalue decomposition

_______________________________________________________________

MATLAB resources:

  Introductory Tutorials

MATLAB tutorial from University of Utah

MATLAB tutorial from Carnegie Mellon University

MATLAB tutorial from Indiana University

  Slightly more advanced Tutorials

  More complete references/tutorials/FAQs

______________________________________________________________________