Welcome

Welcome to our Knowledge Discovery and Data Mining (KDD) course. In this course, we’ll delve into the world of data mining and learn to uncover valuable insights from large datasets. You will explore techniques for identifying meaningful patterns, correlations, and trends, and apply them to real-world and synthetic data. Topics span all stages of knowledge discovery, from association rules to cluster analysis, classification, and regression. Through hands-on coding, students will implement essential data mining algorithms and use existing tools to expand their skills in practical applications.

Course Information

Instructor: Dr. Yong Zhuang

Class Schedule

Section 03

  • Midterm: The week starting Monday, February 17

  • Final Exam: The week starting Monday, April 21

Reference Books

There is no main textbook for the class. However, you may use materials from the following books as a reference. Lecture slides and additional reading materials will be provided on the class website.

  • Data Mining: Concepts and Techniques (4th Edition) by Jiawei Han, Jian Pei, and Hanghang Tong. Publication Date: 2023. (Available free through the GVSU library.)

Tentative Schedule

  • To run the sample Jupyter Notebook code, click the rocket icon at the top of the page. This opens the notebook in Google Colab for interactive use.

Week        Content                                                             Reading

1 (01/06)   Syllabus                                                            resources
            What is Data Mining: slides | video
            Data Mining Tasks: slides | video
            Introduction to Python: code

2 (01/13)   Descriptive Statistics: slides | code | video                       resources
            Data Visualization: slides | video
            Introduction to NumPy: code
            Introduction to Pandas: code

3 (01/20)   Data Cleaning & Integration: slides | video | code                  resources
            Data Transformation: slides | video | code
            Data Compression & Sampling: slides | video | code

4 (01/27)   Object Analysis: Similarity and distance measures: slides | video   resources

5 (02/03)   Feature Analysis: Relationships: slides                             resources

6 (02/10)   Midterm Exam: topics | practice                                     resources
            Data Transformation II: slides

7 (02/17)   Midterm Exam: questions                                             resources

8 (02/24)   Feature Extraction: slides | code | video                           resources
            Feature Selection: slides | code
            Markov Blanket: slides

9 (03/03)   Spring Break (No Class)                                             resources

10 (03/10)  Decision Tree: slides                                               resources

11 (03/17)  Classifier Evaluation, Model Selection: slides                      resources
            Bayesian Classification: slides
            Quiz 5 – Due: 03/26

12 (03/24)  Linear Regression, Logistic Regression, and Perceptron: slides      resources
            Lazy Learning: slides
            Clustering: slides | video

13 (03/31)  Neural Network: slides | video                                      resources
            CNN: slides | video

14 (04/07)  RNN: slides | video                                                 resources
            Attention: slides | video
            Transformer: slides | video | code

15 (04/14)  Project Presentation                                                resources
            Final Exam: topics | practice

16 (04/21)  Final Exam