COMP7103B - Data mining

Semester 2, 2020-21

Instructor
Professor Mauro Sozio
Teaching assistant
Mr. Zhiyong Wu
Syllabus Data mining is the automatic discovery of statistically interesting and potentially useful patterns from large amounts of data. The goal of the course is to study the main methods used today for data mining and on-line analytical processing. Topics include Data Mining Architecture; Data Preprocessing; Mining Association Rules; Classification; Clustering; On-Line Analytical Processing (OLAP); Data Mining Systems and Languages; Advanced Data Mining (Web, Spatial, and Temporal data).
Introduction by Instructor Advances in data collection and generation technologies are producing massive amounts of data from which valuable information and knowledge can be derived. In this course we study various data mining techniques, which are powerful tools that data analysts use to process data and extract interesting patterns and models from it. These models allow new scientific discoveries and intelligent business decisions to be made.
Learning Outcomes
Course Learning Outcomes Relevant Programme Learning Outcome
CLO1. Understand the knowledge discovery process, which includes data collection, data cleaning, model building, model testing and evaluation. PLO.6, 7, 8, 9, 10, 11, 12
CLO2. Understand the various data mining tasks and the principal algorithms for addressing those tasks. PLO.5, 6, 7, 8, 9, 16
Pre-requisites Nil
Compatibility Nil
Topics covered
Course Content No. of Hours Course Learning Outcomes
1. Data Cleaning 4 CLO1
2. Data Exploration 4 CLO1
3. Ranking 4 CLO1
4. Clustering 6 CLO2
5. Association Rules 4 CLO2
6. Recommender Systems 3 CLO2
7. Advanced applications 5 CLO1, CLO2
 
Assessment
Description Type Weighting * Examination Period ^ Course Learning Outcomes
Lab Session Continuous Assessment 25% - CLO1, CLO2
Lab Session Continuous Assessment 25% - CLO1, CLO2
Written exam Written Examination 50% May 10 to May 29, 2021 CLO1, CLO2
* The weighting of coursework and examination marks is subject to approval
^ The exact examination date will be released by the Examinations Office once all enrolments are confirmed after the add/drop period. Students must abide by the examination schedule. Students should NOT enrol in the course if they are not certain that they will be in Hong Kong during the examination period. Absence from the examination may result in failure in the course. There is no supplementary examination for any MSc curriculum in the Faculty of Engineering.
Course materials Prescribed textbook:
  • Introduction to Data Mining, by Tan, Steinbach, and Kumar, Addison Wesley, 2006.
  • Mining of Massive Datasets, J. Leskovec, A. Rajaraman, J. D. Ullman, Cambridge 2014 (Optional).
Session dates
Date Time Venue Remark
Session 1 28 Jan 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 2 4 Feb 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 3 25 Feb 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 4 4 Mar 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 5 18 Mar 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 6 25 Mar 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 7 1 Apr 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 8 8 Apr 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 9 15 Apr 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Session 10 22 Apr 2021 (Thu) 7:00pm - 10:00pm Online Zoom
Add/drop 18 January, 2021 - 4 February, 2021
Quota 110   [For Engineering TPG students]