COMP7104A - Advanced database systems

Semester 1, 2023-24

Professor
Bogdan Cautis
Teaching assistants
Yaozhu Sun
Tianle Wang
Mingruo Yuan
Syllabus The course studies advanced topics and techniques in database systems, with a focus on database system design and algorithms and on big data processing for structured data. Traditional topics include query optimization, physical database design, transaction management, crash recovery, and parallel databases. The course also surveys some of the recent developments in selected areas such as NoSQL databases and SQL-based big data management systems for relational (structured) data.

Prerequisites: An introductory course on databases and basic programming skills.
Introduction by Professor We are living in the Big Data era, with virtually all domains of innovation (medicine, finance, sports, physics, astronomy, etc.) becoming data rich. Traditional database management systems (originally designed for efficiently processing business data) have been evolving rapidly, and new ones have emerged as well, in order to support a rich variety of data-intensive applications, each with specific characteristics and requirements. These applications are often characterized by the so-called four "Vs" of Big Data, namely Volume (data size), Variety (of data formats), Velocity (e.g., streaming sources), and Veracity (uncertain data, data reliability). Their requirements may correspond to different perspectives on aspects such as data storage architecture (e.g., cloud-based), fault tolerance, or flexibility of the data or computation model. This course reviews data processing architectures and algorithms for managing large quantities of data, covering both transactional processing and data analytics. While the focus is on core aspects of database architectures and algorithms, the discussions are relevant to adjacent domains, such as knowledge management, recommender systems, or machine learning at scale.

Special Note: This course is co-coded with DASC7104, which is for MDASC students.

Learning Outcomes
Course Learning Outcomes | Relevant Programme Learning Outcome
CLO1. Understand trade-offs in database systems techniques; be able to apply the acquired knowledge of the state of the art in modern data management to design holistic solutions based on database systems / Big Data systems and techniques; justify design decisions in the context of a data management solution. | PLO 1, 2, 4, 5, 7, 8, 9, 10, 11, 14, 15, 16
CLO2. Be able to implement and evaluate complex, scalable database systems solutions; develop new methods in databases based on the acquired knowledge of existing techniques. | PLO 1, 2, 4, 5, 7, 8, 9, 10, 11, 14, 15, 16
Pre-requisites The course assumes basic prior knowledge of database systems (e.g., relational databases and SQL). Students should be familiar with at least one programming language, such as Java, C++, or Python, and be able to pick up other similar languages.
Compatibility Nil
Topics covered
Course Content | No. of Hours | Course Learning Outcomes
1. Query processing and optimization for relational database systems | 6 | CLO1, CLO2
2. Transaction management and crash recovery | 6 | CLO1, CLO2
3. Indexing methods | 3 | CLO1, CLO2
4. Parallel database systems | 4.5 | CLO1, CLO2
5. Big Data systems (with Hadoop, Spark) | 6 | CLO1, CLO2
6. NoSQL systems | 4.5 | CLO1, CLO2
 
Assessment
Description | Type | Weighting * | Examination Period ^ | Course Learning Outcomes
Laboratory | Continuous Assessment | 25% | - | CLO1, CLO2
Mid-term quiz | Continuous Assessment | 30% | - | CLO1, CLO2
Written exam, covering all taught content in the course. | Written Examination | 45% | 8 - 23 December 2023 | CLO1, CLO2
* The weighting of coursework and examination marks is subject to approval
^ The exact examination date is usually released by the Examinations Office once all enrolments are confirmed after the add/drop period. Students are obliged to follow the examination schedule. Students should NOT enrol in the course if they are not certain that they will be in Hong Kong during the examination period. Absence from the examination may result in failure in the course. There is no supplementary examination for any MSc curriculum in the Faculty of Engineering.
Course materials No textbook is required. Recommended readings:
  • Database Management Systems (3rd edition), by R. Ramakrishnan, J. Gehrke
  • Database Systems: The Complete Book, by H. Garcia-Molina et al.
  • Hadoop: The Definitive Guide, by T. White
  • Cassandra: The Definitive Guide, by E. Hewitt
  • Graph Databases, by I. Robinson et al.
  • Learning Spark, by H. Karau et al.
Session dates
Session | Date | Time | Venue | Remark
Session 1 | 12 Sep 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 2 | 19 Sep 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 3 | 26 Sep 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 4 | 10 Oct 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 5 | 17 Oct 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 6 | 24 Oct 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 7 | 7 Nov 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 8 | 14 Nov 2023 (Tue) | 7:00pm - 10:00pm | HW-311 & HW-312 |
Session 9 | 21 Nov 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
Session 10 | 28 Nov 2023 (Tue) | 7:00pm - 10:00pm | CYC-P1 |
CYC - Chong Yuet Ming Chemistry Building
HW - Haking Wong Building
Add/drop 1 September, 2023 - 19 September, 2023
Maximum class size 150