Class Schedule: Tuesday and Friday 9:50-11:30am, Ryder Hall 155.
Office Hours: Tuesday, 11:30am-12:30pm, WVH 348
“Big-data” analytics has enabled a number of compute-intensive applications (such as machine translation, speech recognition, and precision medicine) with a large positive impact on our daily lives. Not surprisingly, “security analytics”, the application of machine learning and data mining to cyber security, is also effective at learning and predicting attacker behavior, detecting malicious infrastructure, and designing more effective defensive techniques. This class will cover various practical applications of machine learning techniques in network security, web security, malware detection, and usable authentication.
Compared to other areas that benefit from machine learning, security applications pose additional challenges: limited availability of attack datasets, difficulty of validating new findings, the high cost of false positives, and the risk of adversarial tampering with datasets and models. The course will also discuss directions for addressing these challenges and will include advanced topics in adversarial machine learning and privacy-preserving analytics.
We will read and discuss recent research papers from security and machine learning conferences. A major component of the class is a research project conducted in a small team of 1-2 students. A detailed project report, suitable for a workshop submission, is expected at the end of the class.
Prerequisites:
· Fundamental Networking
· Introductory security preferable
· Basic data mining preferable
The grade will be based on:
- Class participation - 20%
· Participation in discussing the papers in class
· Leading the discussion for several papers
- Paper summaries - 20%
· Submit paper summaries before class
· Detailed comments on weaknesses, strengths and contributions
- Research project - 60%
· 10% project proposal (due 10/04)
· 30% final project report
· 20% presentation in class
Reading will be assigned for each lecture. By midnight on the day before each lecture, every student must submit a report for each assigned paper. The report should contain a one-paragraph summary of the paper, a description of three strong points and three weak points of the paper, and a discussion of its data collection and machine learning methodology. The instructor will provide a template for paper summaries.
Please submit the reports on Piazza.
The project proposal should include:
- Problem addressed by the project
- Proposed approach
- Milestones (main steps and timeline)
- References: additional literature survey that you intend to do
- Tools: software, packages
- Data sources: publicly available datasets for your research
- Deliverable items: implementation, simulation results, graphs, visualizations, etc.
The final project report should include:
- Motivation of addressed problem
- Description of public dataset used
- Proposed solution/algorithm including technical details
- Comparison with related work
- Experimental results
Anomaly Detection: A Survey. V. Chandola, A. Banerjee, and V. Kumar
[ESL] The Elements of Statistical Learning: Data Mining, Inference, and Prediction. T. Hastie, R. Tibshirani, and J. Friedman