DS 4400: Machine Learning and Data Mining 1


GENERAL INFORMATION

  • Instructor: Prof. Ehsan Elhamifar
  • Instructor Office Hours: Fridays, 4:30pm-5:15pm, 310E WVH
  • Class: Tuesdays and Fridays, 1:35pm-3:15pm, Behrakis Health Sciences Center 325
  • TA: Dat Huynh (huynh.dat [at] husky.neu.edu), Office Hours: Mon (10:45-11:30am), Thu (4:00-4:45pm), 162 WVH
  • Discussions, Lectures, Homeworks on Piazza
DESCRIPTION

    This course covers practical algorithms for supervised machine learning from a variety of perspectives. Topics include generative/discriminative learning, parametric/non-parametric learning, deep neural networks, support vector machines, decision trees, and learning theory. The course will also discuss recent applications of machine learning in areas such as computer vision, data mining, natural language processing, speech recognition, and robotics.

    SYLLABUS
    1. Linear regression, Overfitting, Regularization, Sparsity

    2. Maximum likelihood estimation

    3. Logistic regression

    4. Naive Bayes

    5. Perceptron

    6. Convex optimization, SGD

    7. SVM and kernels

    8. Neural networks and deep learning: DNNs, CNNs

    9. Decision trees

    10. Hidden Markov Models

    11. Bayesian learning

    12. Ethics and Fairness in ML

    GRADING

    Homeworks are due at the beginning of the class on the specified dates. No late homeworks or projects will be accepted.

    • Homeworks: 4 HWs (40%)

    • Project (20%)

    • Two Midterm Exams (40%)

    Homeworks consist of both analytical questions and programming assignments. Programming assignments must be done in Python. Both the code and the results of running the code on data must be submitted.

    Each exam consists of analytical questions on topics covered in class. Students are allowed to bring a single cheat sheet to the exam.

    TEXTBOOKS

    • [CB] Christopher Bishop, Pattern recognition and machine learning. [Optional]

    • [KM] Kevin P. Murphy, Machine Learning: A Probabilistic Perspective. [Optional]

    READINGS

      Lecture 1: Introduction to ML, Linear Algebra Review

      Lecture 2: Linear Algebra Review, Introduction to Regression

      • Chapter 3 from CB book.

      Lecture 3: Linear Regression: Convexity, Closed-form Solution, Gradient Descent

      • Chapter 3 from CB book.

      Lecture 4: Robust Regression, Overfitting, Regularization

      • Chapter 3 from CB book.

      Lecture 5: Basis Function Expansion, Hyper-parameter Tuning, Cross Validation, Probability Review

      Lecture 6: Maximum Likelihood Estimation

      • Chapter 2 from CB book.

      Lecture 7: Bayesian Learning, Maximum A Posteriori (MAP) Estimation, Classification

      • Chapter 3 and Section 4.3 from CB book.

      Lecture 8: Logistic Regression, Parameter Learning via Maximum Likelihood, Overfitting

      • Section 4.3 from CB book.

      Lecture 9: Softmax Regression, Discriminative vs. Generative Modeling, Generative Classification

      • Section 4.2 from CB book.

      Lecture 10: Generative Classification, Naive Bayes

      • Section 4.2 from CB book.

      Lecture 11: Generative Classification, Naive Bayes

      • Section 4.2 from CB book.

      Lecture 12: Convex Optimization, Lagrangian Function, KKT Conditions

      • See lecture notes on Piazza.

      Lecture 13: Support Vector Machines

      • Chapter 7 from CB book.

      Lecture 14: Support Vector Machines: Vanilla SVM, Dual SVM

      • Chapter 7 from CB book.

      Lecture 15: Support Vector Machines: Soft-Margin SVM, Kernel SVM, Multi-Class SVM

      • Chapter 7 from CB book.

      Lecture 16: Neural Networks

      • Chapter 5 from CB book.

      Lecture 17: Neural Networks: Training, Forward and Back Propagation

      • Chapter 5 from CB book.

      Lecture 18: Convolutional Neural Networks

      • See slides on Piazza.

      Lecture 19: Sequential Data Modeling and HMMs

      • See slides on Piazza.

      Lecture 20: Ethics and Fairness in ML, part 1

      • See slides on Piazza.

      Lecture 21: Ethics and Fairness in ML, part 2

      • See slides on Piazza.

    ADDITIONAL RESOURCES

    ETHICS

    All students in the course are subject to Northeastern University's Academic Integrity Policy. Any report, homework, or project submitted by a student in this course for academic credit must be the student's own work. Collaboration is allowed only if explicitly permitted. Per CCIS policy, violations of the rules, including cheating, fabrication, and plagiarism, will be reported to the Office of Student Conduct and Conflict Resolution (OSCCR). This may result in deferred suspension, suspension, or expulsion from the university.