COMP7607A - Natural language processing

Semester 1, 2022-23

Professor
Lingpeng Kong
Teaching assistant
Yiheng Xu
Syllabus Natural language processing (NLP) is the study of human language from a computational perspective. The course focuses on machine learning and corpus-based methods and algorithms. We will cover syntactic, semantic, and discourse processing models, and describe how these methods and models are used in applications including syntactic parsing, information extraction, statistical machine translation, dialogue systems, and summarization. The course starts with language models (LMs), which are front and center in NLP, and then introduces key machine learning (ML) ideas that students should grasp (e.g., feature-based models, log-linear models, and then neural models). We will arrive at modern general-purpose meaning representation methods (e.g., BERT/GPT-3) and the idea of pretraining / fine-tuning.
Introduction by Professor Natural language processing (NLP) is the study of human language from a computational perspective. The course will investigate modern NLP algorithms from the machine learning perspective. We will cover syntactic, semantic, and other useful models for language, and describe how these methods and models are used in applications including syntactic parsing, information extraction, statistical machine translation, dialogue systems, and summarization. The course starts with language models (LMs), which are front and center in NLP, and then introduces key machine learning (ML) ideas that students should grasp (e.g., feature-based models, log-linear models, and then neural models). We will arrive at modern general-purpose meaning representation methods (e.g., BERT/GPT-3) and the idea of pretraining / fine-tuning / prompt-based learning.
Learning Outcomes
Course Learning Outcomes Relevant Programme Learning Outcome
CLO1. Able to understand the motivations and principles for building natural language processing systems PLO. 1,3,4,5,6,7
CLO2. Able to master a set of key machine learning / statistical methods which are widely used in and beyond NLP PLO. 1,2,3,4,8,9,10
CLO3. Able to implement practical applications of NLP using tools such as NLTK, PyTorch and DyNet PLO. 2,3,11,14,15,16
Pre-requisites -
Compatibility Nil
Prior knowledge expected Basic knowledge about Machine Learning, Probability, Statistics, and Programming
Topics covered
Course Content No. of Hours Course Learning Outcomes
1. Introduction to NLP, Language Models, RNNLMs 3 CLO1
2. BERT, Pretraining + Fine-tuning 3 CLO1, CLO2, CLO3
3. Computational Graphs and Sequence to Sequence Model 3 CLO1, CLO2
4. Attention Mechanism and Transformers 3 CLO1, CLO2
5. Parsing, Context-free Grammars, Probabilistic Context-free Grammars 3 CLO1, CLO2
6. Recursive Neural Networks, Shift-reduce Parsing and Stack-LSTMs, Dependency Parsing, Recurrent Neural Network Grammars 3 CLO1, CLO2
7. Large Pretrained Models, Prompting, Prefix-Tuning and Adapters 3 CLO1, CLO2
8. Natural Language Generation, Controllable Text Generation 3 CLO1, CLO2, CLO3
9. Question Answering 2 CLO2, CLO3
10. Multilinguality, Multimodality, NLP + Vision 2 CLO2, CLO3
11. Model Interpretability, Social NLP 2 CLO2, CLO3
 
Assessment
Description Type Weighting * Tentative Assessment Period / Examination Period ^ Course Learning Outcomes
Quiz-based Assignment Continuous Assessment 25% - CLO1, CLO2
Programming-based Assignment Continuous Assessment 25% - CLO1, CLO2, CLO3
Project Continuous Assessment 50% - CLO1, CLO2
Written exam covering all course contents Written Examination 0% - -
* The weighting of coursework and examination marks is subject to approval
^ The exact examination date will be released by the Examinations Office when all enrolments are confirmed after the add/drop period. Students are obliged to follow the examination schedule. Students should NOT enrol in the course if they are not certain that they will be in Hong Kong during the examination period. Absence from the examination may result in failure in the course. There is no supplementary examination for any MSc curriculum in the Faculty of Engineering.
Course materials Prescribed textbook:
  •  Jurafsky, Daniel, and James H. Martin. "Speech and Language Processing."
Session dates
Date Time Venue Remark
Session 1 5 Sep 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 2 19 Sep 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 3 26 Sep 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 4 3 Oct 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 5 17 Oct 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 6 24 Oct 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 7 31 Oct 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 8 7 Nov 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 9 14 Nov 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
Session 10 21 Nov 2022 (Mon) 7:00pm - 10:00pm CPD-LG.09 Face-to-face
CPD - Central Podium Levels (Centennial Campus)
Add/drop 1 September, 2022 - 19 September, 2022
Maximum class size 150