Make sure you check the syllabus for the due date. Please use the notations adopted in class, even if the problem is stated in the book using a different notation.
SpamBase-Poluted dataset:
      the same datapoints as in the original Spambase dataset, only with
      a lot more columns (features) : either random values, or somewhat
      loose features, or duplicated original features.
      
      SpamBase-Poluted with missing values dataset: train,
      test.
      Same dataset, only some values (picked at random) have been
      deleted. 
Extract Harr features for each image on the Digits Dataset
      (Training data,
      labels. 
      Testing data,
      labels).
      
      Train 10-class ECOC-Boosting on the extracted features and report
      performance.
      (HINT: For parsing MNIST dataset, please see python Code:mnist.py;   
      MATLAB code: MNIST_Dataset)
    
    
A) Run Boosting (Adaboost or Rankboost or Gradient Boosting) to
      text documents from 20 Newsgroups without extracting features in
      advance. Extract features for each round of boosting based on
      current boosting weights. 
    
B) Run Boosting (Adaboost or Rankboost or Gradient Boosting) to
      image datapints from Digit Dataset without extracting features in
      advance. Extract features for each round of boosting based on
      current boosting weights. You can follow this paper. 
    
    
Prove of the harmonic functions property discussed in class based on this paper. Specifically, prove that to minimize the energy function
f must be harmonic, i.e. for all unlabeled datapoints j, it must satisfy