CS6140 Machine Learning

HW5

Make sure you check the syllabus for the due date.



Hand written digit recognition. Training data, labels.  Testing data, labels.

PROBLEM 1 [50 points]

 Run an SVM implementation by training over the training set and testing over the testing set. Try several kernels, including the polynomial and the RBF ones. Report the results. Use one of these packages: SVMlight, SGDSVM, osu SVMLIBSVM or other software  (here, here).  If you choose an SVM packge that does not provide multi-class implementations, you should write a wrapper that will make the code run for each class versus the others, separately.
This problem have been well researched, and there is a comprehensive archive of results here; take a look at the several SVM papers listed before you start working on it. Data is in the IDX format, described on the same page; you will have to write your own code to read the data, or to find some code that does just that.

PROBLEM 2 [50 points]

Same as problem 1. Instead of using a SVM package, implement your own using SMO optimization (do not have implement the quadratic solver). Compare with Problem 1 results

PROBLEM 3 [Extra Credit]

Same problem, but dont use SMO. Instead use a built in  (or existing library) quadratic solver from Matlab, Python, Java, C etc.

PROBLEM 4 [Extra Credit]

Same data as in problem 1. Use a neural network to make the classification. You can perhaps adapt the NN code from hw2.

The repository page points to many NN approaches; you should do a bit of research first. Report findings and be ready to give a short demo of your code.

PROBLEM 5 [20 points]

Explain why 0 α C/m is a constraint in the dual optimization with slack variables. (HINT: read Chris Burges tutorial first) Distinguish three cases, and explain them in terms of the classification and constraints: a) 0 = α ; b) 0 < α < C/m ; c) α = C/m. This has been discussed in class and in SVM notes; a detailed rigurous explanation is expected.

PROBLEM 6 [20 points]

Consider the following 6 points in 2D, for two classes:

class 0:   (1,1)   (2,2)    (2,0)

class 1:   (0,0)   (1,0)    (0,1)

a) Plot these 6 points, construct the optimal hyperplane by inspection and intuition (give the W,b) and calculate the margin.

b) Which points are support vectors ?

c) Construct the hyperplane by solving the dual optimization problem using the Lagrangian. Compare with part (a).


PROBLEM 5 [Extra Credit]

Prove of the harmonic functions property discussed in class based on this paper. Specifically, prove that to minimize the energy function

f must be harmonic, i.e. for all unlabeled datapoints j, it must satisfy