The course will study some advanced topics and techniques in database
systems, with a focus on database system design and algorithms and on
big data processing for structured data. Traditional topics include
query optimization, physical database design, transaction management, crash
recovery, and parallel databases. The course will also survey some of the recent
developments in selected areas such as NoSQL databases and SQL-based big
data management systems for relational (structured) data.
Prerequisites: An introductory course on databases and basic programming
|Introduction by Professor
We are living in the Big Data era, with virtually all domains of innovation
(medicine, finance, sports, physics, astronomy, etc.) becoming data rich.
Traditional database management systems (originally designed to process
business data efficiently) have been evolving rapidly, and new
ones have emerged as well, in order to support a rich variety of
data-intensive applications, with their specific characteristics and
requirements. These applications are often characterized by the
so-called four “Vs” of Big Data, namely Volume (data size), Variety (of data
formats), Velocity (e.g., streaming sources), and Veracity (uncertain data,
data reliability). Their requirements may correspond to different
perspectives on aspects such as data storage architecture (e.g., cloud-based),
fault tolerance, or flexibility of the data or computation model. This
course reviews data processing architectures and algorithms for managing
large quantities of data, covering both transactional processing and data
analytics. While the focus is on core aspects of database
architectures and algorithms, the discussions are relevant to adjacent
domains, such as knowledge management, recommender systems, or machine
learning at scale.
Special Note: This course is co-coded with DASC7104 which is for MDASC students.
||The course assumes basic prior knowledge of
database systems (e.g., relational databases and SQL). Students should be
familiar with at least one programming language, such as Java, C++, or Python,
and should be able to pick up other similar languages.
||No textbook is required. Recommended:
- Database Management Systems (3rd edition), by R. Ramakrishnan and J. Gehrke
- Database Systems: The Complete Book, by H. Garcia-Molina et al.
- Hadoop: The Definitive Guide, by T. White
- Cassandra: The Definitive Guide, by E. Hewitt
- Graph Databases, by I. Robinson et al.
- Learning Spark, by H. Karau et al.
||1 September, 2023 - 19 September, 2023
|Maximum class size
|Moodle course website
(Login using your HKU Portal UID and PIN)
- Please note that the professor maintains the Moodle teaching website and controls when it is released to students.
- Enrolled students should visit the Moodle teaching website regularly for the latest announcements, course materials, assignment submission, discussion forum, etc.