This course offers a comprehensive study of Natural Language Processing (NLP). We will delve into word representations, the construction of language models, the application of deep learning in NLP, and Large Language Models (LLMs). Specifically, we will discuss the architectural engineering, data engineering, prompt engineering, training techniques, and efficiency enhancements of LLMs. Students will gain insights into the application of NLP and LLMs in various domains, such as text mining, search engines, human-machine interaction, medical and legal consulting, low-resource languages, and AI for Science, as well as how to handle issues of data privacy, bias, and ethics. The course will also investigate the limitations of NLP and LLMs, such as challenges in alignment. The curriculum includes guest lectures on advanced topics and in-class presentations to stimulate practical understanding. This course is ideal for anyone seeking to master the use of NLP and LLMs in their field.

Teaching team


Instructor
Benyou Wang

Benyou Wang is an assistant professor in the School of Data Science, The Chinese University of Hong Kong, Shenzhen. He has received several notable awards, including the Best Paper Nomination Award at SIGIR 2017, the Best Explainable NLP Paper award at NAACL 2019, the Best Paper award at NLPCC 2022, a Marie Curie Fellowship, and a Huawei Spark Award. His primary focus is on large language models.

Leading TA
Juhao Liang
TA
Fei Yu

Logistics


Course Information


This comprehensive course on Natural Language Processing (NLP) offers a deep dive into the field, equipping students with the knowledge and skills to understand, design, and implement NLP systems. Starting with an overview of NLP and foundational linguistic concepts, the course moves on to word representation and language modeling, which are essential for understanding text data. It then explores how deep learning, from basic neural networks to advanced transformer models, has revolutionized NLP and its diverse applications, such as text mining, information extraction, and machine translation. The course emphasizes large language models (LLMs): their scaling laws, emergent abilities, training strategies, and the associated knowledge representation and reasoning. Students will apply their learning in final projects, for example by exploring NLP beyond text with multi-modal LLMs, AI for Science, vertical applications, and agents. Guest lectures and in-class paper discussions expose students to cutting-edge research. The course concludes with an examination of NLP's limitations and ethical considerations. In particular, the topics include:

  • Introduction to NLP
  • Basics of Linguistics
  • Word Representation and Language Modeling
  • Deep Learning in NLP
  • Applications of NLP
  • Large Language Models (LLMs)
  • Prompt Engineering
  • Training Large Language Models
  • Final Projects: Custom or Default Topics and Practical Tips
  • NLP Beyond NLP
  • Guest Lectures
  • In-class Paper Sharing
  • Limitations and Ethics in NLP
  • Exam: Final Projects
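As a small taste of the "Word Representation and Language Modeling" topic above, here is a minimal sketch of a count-based bigram language model. This is an illustrative toy example (the corpus, function name, and `<s>`/`</s>` boundary tokens are our own choices, not course material):

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Estimate P(word | previous word) from bigram counts."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        # Pad each sentence with start/end markers before counting bigrams
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    # Normalize raw counts into conditional probabilities
    return {
        prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
        for prev, nxt in counts.items()
    }

corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram(corpus)
# "cat" follows "the" in 2 of the 3 sentences
print(model["the"]["cat"])
```

Real language models replace these counts with neural networks (Lecture 3's word2vec and BERT readings show how), but the modeling question — predicting the next word from context — is the same.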

Prerequisites

Learning Outcomes

Grading Policy (CSC6052/DDA6307/MDS6002)

Assignments (40%)

Final project (55%)

The project may be done in a group, but each individual is evaluated separately. You need to write a project report (max 6 pages) for the final project. Here is the report template. You are also expected to give a project poster presentation. After the final project deadline, feel free to make your project open source; we would appreciate an acknowledgement of this course.

Participation (5%)

Here are some ways to earn the participation credit, which is capped at 5%.

Late Policy

The penalty is 0.5% off the final course grade for each late day.

Schedule


Date Topics Recommended Reading Pre-Lecture Questions Lecture Note Coding Events Deadlines
Jan 8-12 self-study; do not come to the classroom Tutorial 0: GitHub, LaTeX, Colab, and ChatGPT API OpenAI's blog
LaTeX and Overleaf
Colab
GitHub
Jan. 12th Lecture 1: Introduction to NLP Hugging Face NLP Course
Course to get into NLP with roadmaps and Colab notebooks.
LLM-Course
On the Opportunities and Risks of Foundation Models
Sparks of Artificial General Intelligence: Early experiments with GPT-4
What is NLP? [slide] [Phoenix]
Jan. 19th Lecture 2: Basics of Linguistics Universal Stanford Dependencies: A cross-linguistic typology
Insights between NLP and Linguistics
End-to-end Neural Coreference Resolution
What is the structure of language (a string of words)? [slide] [Linguistics repo]
Jan. 26th Lecture 3: Word Representation and Language Modeling Efficient Estimation of Word Representations in Vector Space (original word2vec paper)
A Neural Probabilistic Language Model
Evaluation methods for unsupervised word embeddings
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
How to model language and the words inside it? [slide] [ word2vec ] Assignment 1 out
Feb. 2nd Tutorial 1: Practice for word vectors
Feb. 2nd Lecture 4: Deep Learning in NLP Attention Is All You Need
HuggingFace's course on Transformers
Scaling Laws for Neural Language Models
The Transformer Family Version 2.0
On Position Embeddings in BERT
How to better compose words semantically as language? [slide] [Transformer]
Feb. 2nd Tutorial 2: Backpropagation in neural networks [ slide ] [ Code ]
Mar. 1st Lecture 5: Large Language Models (LLMs) Training language models to follow instructions with human feedback
LLaMA: Open and Efficient Foundation Language Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
OpenAI's blog
What are LARGE language models and why LARGE? [slide] [Fine-tune Llama 2] Assignment 1 due (11:59pm)
Mar. 8th Lecture 6: Prompt Engineering Best practices for prompt engineering with OpenAI API
prompt engineering
How to better prompt LLMs? [slide] [Prompt_engineer] Assignment 2 out
Mar. 8th Tutorial 3: Prompt engineering [ slide ] [ Code ]
Mar. 15th Lecture 7: Training Large Language Models FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
How to train LLMs from scratch? [slide] [HuatuoGPT]
[LLMZoo]
Assignment 3 out
Mar. 15th Tutorial 4: Train your own LLMs Are you ready to train your own LLMs? [LLMZoo], [nanoGPT], [LLMFactory]
Mar. 22nd Lecture 8: Final Projects: Custom or Default Topics and Practical Tips Assignment 2 due (11:59pm)
Final Project out
Mar. 29th Lecture 9: NLP Beyond NLP Blog post: Generalized Visual Language Models
Can large models speak, see, and perform actions? [NExT-GPT]
Apr. 12th Lecture 10: LLM agents ToolBench
AgentBench
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
LLM Powered Autonomous Agents
[slide] Final Project Proposal due (11:59pm)
Apr. 19th Lecture 11: Guest lecture How NLP evolves? [HuatuoGPT-II]
Apr. 26th Lecture 12: Practical Tips [slide]
Apr. 28th What are LLMs' limitations? [ Improve ChatGPT with Knowledge Graphs ] Assignment 3 due (11:59pm)
May 6th A whole-day Final Project Supervision Q&A
May 10th Lecture 14: Future of NLP Large Language Models Encode Clinical Knowledge
Survey of Hallucination in Natural Language Generation
Superalignment
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
May 17th Lecture 15: Exam: Final Projects (Poster presentation)
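The scaled dot-product attention at the heart of the Transformer readings for Lecture 4 ("Attention Is All You Need") can be sketched in a few lines. This is a minimal single-query, single-head illustration with made-up toy vectors, not code from the course materials:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    query: list[float] of dimension d
    keys, values: lists of vectors (list[list[float]])
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the attention-weighted sum of the value vectors
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

# A query that matches the first key attends mostly to the first value.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

Production implementations batch this over many queries and heads as matrix multiplications (as in the Transformer paper's softmax(QKᵀ/√d)V), but the mechanism is exactly this weighted average.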

Acknowledgement

We borrowed some concepts and the website template from [CSC3160/MDS6002], where Prof. Zhizheng Wu is the instructor.

The website's GitHub repo is [here].