| Instructor |
Professor Bogdan Cautis
|
| Teaching assistant |
Mr. Kevin Y.K. Lam
|
| Syllabus |
The course will study some advanced topics and techniques in database
systems, with a focus on the aspects of big data analytics, algorithms, and
system design & organisation. It will also survey recent developments and
progress in selected areas. Topics include: query optimization,
spatial-spatiotemporal data management, multimedia and time-series data
management, information retrieval and XML, and data mining. |
| Introduction by Instructor |
We are living in the Big Data era, with virtually all domains of innovation
(medicine, finance, sports, physics, astronomy, etc.) becoming data rich.
The traditional database management systems (originally designed for
efficiently processing business data) have been rapidly evolving, and
new ones have emerged as well, in order to support a rich variety of
data-intensive applications, with their specific characteristics and
requirements. These applications are oftentimes characterized by the
so-called four “Vs” of Big Data, namely Volume (data size), Variety (of
data formats), Velocity (e.g., streaming sources), and Veracity
(uncertain data, data reliability). Their requirements may correspond to
different perspectives on aspects such as data storage architecture (e.g.,
cloud-based), fault tolerance, or flexibility of the data or computation
model. This course reviews data processing architectures and algorithms
for managing large quantities of data, covering both transactional
processing and data analytics. While the focus is on core aspects of
database architectures and algorithms, the discussions are relevant to
adjacent domains, such as knowledge management, recommender systems, or
machine learning at scale.
Special Note: This course is co-coded with DASC7104 and is mainly for MDASC students. The quota for MSc(CompSc) students is limited to 20.
|
| Learning Outcomes |
|
| Pre-requisites |
The course assumes basic prior knowledge of
database systems (e.g., relational databases and SQL). Students should be
familiar with at least one programming language, such as Java, C++, or Python,
and be able to pick up other similar languages. |
| Compatibility |
Nil |
| Topics covered |
|
| Assessment |
|
| Course materials |
No textbook is required. Recommended
readings:
- Database Management Systems, by R. Ramakrishnan
and J. Gehrke
- Database Systems: The Complete Book, by H. Garcia-Molina
et al.
- Hadoop: The Definitive Guide, by T. White
- Cassandra: The Definitive Guide, by E. Hewitt
- Graph Databases, by I. Robinson et al.
- Learning Spark, by H. Karau et al.
|
| Session dates |
|
| Add/drop |
14 January, 2019 - 9 April, 2019 |
| Quota |
37 [For MSc(CompSc) & other Engineering TPG students] |