Make sure you check the syllabus for the due date.
Hand written digit recognition. Training data, labels. Testing data, labels.
A) Run an SVM implementation by training over the training set and testing over
the testing set. Try several kernels, including the polynomial and the RBF
ones. Report the results. Use one of these packages: SVMlight, osu SVM, LIBSVM or other
software (here, here). If you
choose an SVM packge that does not provide multi-class implementations, you
should write a wrapper that will make the code run for each class versus the
This problem have been well researched, and there is a comprehensive archive of results here; take a look at the several SVM papers listed before you start working on it. Data is in the IDX format, described on the same page; you will have to write your own code to read the data, or to find some code that does just that.
B) Same as problem 1. Instead of using a SVM package, implement your own. You do not have to implement the quadratic solver. Report your findings and prepare a 5 minutes demo.
Same data as in problem 1. Use a neural network to make the classification. You can perhaps adapt the NN code from hw2.
The repository page points to many NN approaches; you should do a bit of research first. Report findings and be ready to give a short demo of your code.
Explain why 0 is a constraint in the dual optimization with slack variables. (HINT: read Chris Burges tutorial first) Distinguish three cases, and explain them in terms of the classification and constraints: a) 0 ; b) 0 ; c)
Consider the following 6 points in 2D, for two classes:
class 0: (1,1) (2,2) (2,0)
class 1: (0,0) (1,0) (0,1)
a) Plot these 6 points, construct the optimal hyperplane by inspection and intuition (give the W,b) and calculate the margin.
b) Which points are support vectors ?
c) Construct the hyperplane by solving the dual optimization problem using the Lagrangian. Compare with part (a).
Prove of the harmonic functions property discussed in class based on this paper. Specifically, prove that to minimize the energy function
f must be harmonic, i.e. for all unlabeled datapoints j, it must satisfy