COMP7103A - Data mining

Semester 2, 2017-18

Professor Ben C.M. Kao
Teaching assistant Mr. Kevin Y.K. Lam
Syllabus Data mining is the automatic discovery of statistically interesting and potentially useful patterns from large amounts of data. The goal of the course is to study the main methods used today for data mining and on-line analytical processing. Topics include Data Mining Architecture; Data Preprocessing; Mining Association Rules; Classification; Clustering; On-Line Analytical Processing (OLAP); Data Mining Systems and Languages; Advanced Data Mining (Web, Spatial, and Temporal data).
Introduction by Instructor Advances in data collection and generation technologies are producing massive amounts of data from which valuable information and knowledge can be derived. In this course we study various data mining techniques, which are powerful tools that data analysts use to process data and to extract interesting patterns and models from it. These models allow new scientific discoveries and intelligent business decisions to be made.
Learning Outcomes
Course Learning Outcomes Relevant Programme Learning Outcome
CLO1. Understand the knowledge discovery process, which includes data collection, data cleaning, model building, model testing and evaluation. PLO.6, 7, 8, 9, 10, 11, 12
CLO2. Understand the various data mining tasks and the principal algorithms for addressing those tasks. PLO.5, 6, 7, 8, 9, 16
Pre-requisites Nil
Compatibility Nil
Topics covered
Course Content              No. of Hours    Course Learning Outcomes
1. Data Cleaning            4               CLO1
2. Data Exploration         4               CLO1
3. Ranking                  4               CLO1
4. Clustering               6               CLO2
5. Association Rules        4               CLO2
6. Recommender Systems      3               CLO2
7. Advanced applications    5               CLO1, CLO2
Description     Type                     Weighting *    Examination Period ^    Course Learning Outcomes
Lab Session     Continuous Assessment    25%            -                       CLO1, CLO2
Lab Session     Continuous Assessment    25%            -                       CLO1, CLO2
Written exam    Written Examination      50%            May 7 to 26, 2018       CLO1, CLO2
* The weighting of coursework and examination marks is subject to approval
^ The exact examination date is normally released by the Examinations Office once all enrolments are confirmed after the add/drop period. Students must adhere to the examination schedule. Students should NOT enrol in the course if they are not certain that they will be in Hong Kong during the examination period. Absence from the examination may result in failure in the course. There is no supplementary examination for any MSc curriculum in the Faculty of Engineering.

For reference:
Course materials Prescribed textbook:
  • Introduction to Data Mining, by P.-N. Tan, M. Steinbach, and V. Kumar, Addison Wesley, 2006.
  • Mining of Massive Datasets, by J. Leskovec, A. Rajaraman, and J. D. Ullman, Cambridge University Press, 2014 (optional).
Session dates
Session       Date                 Time                Venue     Remark
Session 1     15 Jan 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 2     22 Jan 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 3     29 Jan 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 4     5 Feb 2018 (Mon)     9:30am - 12:30pm    MB-G07
Session 5     12 Feb 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 6     26 Feb 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 7     12 Mar 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 8     19 Mar 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 9     26 Mar 2018 (Mon)    9:30am - 12:30pm    MB-G07
Session 10    9 Apr 2018 (Mon)     9:30am - 12:30pm    MB-G07
MB - Main Building
Add/drop 15 January, 2018 - 28 January, 2018
Quota 100