Module / Live Stream |
Topic / Recorded Lecture |
Other Reading |
Assignment |
Module 1: Data Basics, Similarity, KNN
Week 1 : Intro, Data Features, Mining Rules
|
|
|
|
|
Slides: Distance and Similarity
Paper: Distance / Similarity Measures
|
|
|
Module 2: Clustering
Week 3 : KMeans
Lecture 4 Notes
|
Slides: Intro to Clustering
Cluster Evaluation (Aggarwal)
Cluster Evaluation (Stanford NLP)
|
|
|
Week 4 : soft KMeans / Gaussian Mixture EM
Lecture 5 Notes
Lecture 6 Notes
|
Notes: Gaussian Mixtures
Mixture Matlab code
|
|
|
Week 5 : Hierarchical, DBSCAN
Lecture 7 Notes
Ward distance
|
|
|
|
Module 3: Dim Reduction, Feature Selection
Week 6 : PCA, Feature Selection
Lecture 8 (PCA, kernelPCA)
|
Notes: PCA
Class notes (handwritten + DHS book): PCA
PCA demo (Matlab): PCA
|
Kernel PCA (slides)
Kernels for ML (article)
optional:
UMAP Dimension Reduction, UMAP-paper
|
|
Juneteenth - NO CLASS
|
|
|
|
Week 7 : tSNE, Feature Selection
Lecture 9 Notes
|
Paper: Haar Features
Notes: Chi-Square Feature Selection
Wikipedia: Mutual Information
Slides: tSNE / paper / implementation
(optional) tSNE gradient calculation
|
StanfordNLP: ChiSquare Feature Selection
StanfordNLP: Mutual Information Feature Selection
Paper: Feature Selection for Gaussian Mixtures
|
|
Week 8 : Supervised Classification
Lecture 10 Notes
Linear Regression
|
Notes: Linear Regression
Notes: Logistic Regression
Notes: Regression Regularization
|
|
|
Module 3: Classification
Week 8 : Supervised Classification
Neural Networks
Decision Trees
Lecture 12 Notes
Decision Notes (Virgil)
Boosting Notes
|
Notes: Decision Trees
Notes: Perceptrons, Neural Networks
Slides (Mitchell book): Neural Networks
TF Visualizer (toy data)
NN interactive tutorial
Word2Vec Tutorial
|
Word2Vec paper 1
Word2Vec paper 2
|
|
Week 11 : Summarization
Lecture 15 NMF
Lecture 16 Summarization
|
Paper: Text Summarization Survey
Paper: Topic Modeling Summarization
Paper: ROUGE Evaluation for Summaries
Slides: ROUGE
|
IR/Linguistics old paper: Automatic Abstracts
Summarization basics
|
|
Module 4: Text Modeling
Week 9 : Topic Models, LDA
Lecture 13 Notes
Lecture 14 Notes
Lecture 15 NMF
|
Slides: NMF
paper: NMF
Slides: LDA
paper: LDA simplified
|
paper: LDA
More Slides: LDA
paper: Bayesian Parameter Estimation for text
Book on Implementing LDA with R code
paper: LDA vs NMF
|
|
Week 10 : Sampling
Lecture 17 Markov chains
Lecture 18: Sampling
Stevens Method: Sample Non-uniform Without Repetition
|
Sampling Basics (Matlab)
Rejection Sampling
Inverse Transform Sampling
Book: Non-uniform Sampling Procedures
Gibbs Sampling for LDA
Sampling MC/ Gibbs Demo
|
paper: Gibbs explained
|
|
Module 5: Graphs/ Social Mining
Week 12 : Social Graphs
Lecture on PageRank, Markov Chain
Lecture 19 Graph Intro/Communities
Lecture 20 Graph Communities
|
Textbook: Aggarwal, Data Mining, ch 18-19
Slides: Girvan - Newman Algorithm
Python Community Visualization
|
Paper1: Girvan - Newman Algorithm
Paper2: Girvan - Newman Algorithm
Paper3: Girvan - Newman Algorithm
|
|
Week 13 : Social Mining
Lecture 18 Collab Filtering
Lecture 21 KB-QA
|
Textbook: Aggarwal, Data Mining, ch 18-19
Notes: collaborative filtering basic formula
Slides: Netflix User Profiles
|
|
|
|
FINAL EXAM 8/14 4pm-8pm in class, WVH212
You will need a computer for the exam problems, and you might be called on to explain or demo your code afterward.
Submit a copy of your code on Gradescope, together with instructions for running it.
|
|
|