AB 114
Mondays 9-11am
Nick Koudas
of final grade
10% each
Proposal due November 3 2025
Final report due end of year
This course explores advanced topics in data systems with a focus on vector databases and unstructured data query processing. Students will learn about information retrieval, embeddings, indexing techniques for high-dimensional data, and modern approaches to multimodal query processing. We will also cover recent techniques for querying unstructured data and touch upon numerous research directions in this area.
Lecture | Title | Readings & Materials |
---|---|---|
1 | Introduction / Fundamentals of Information Retrieval | Stanford IR Book (Chapters 1-4) |
2 | Information Retrieval and Ranking | Stanford IR Book (Chapters 6, 11, 12) |
3 | Introduction to Embeddings | Chapter 5 (Embeddings); GloVe: Global Vectors for Word Representation; Efficient Estimation of Word Representations in Vector Space |
4 | BERT / Transformers | Reading materials to be added |
5 | Multimodal Embeddings | Reading materials to be added |
6 | Indexing in High Dimensions | Reading materials to be added |
7 | Tree / Table Based Indexing | Reading materials to be added |
8 | Graph Indexing | Reading materials to be added |
9 | Graph Indexing (Continued) | Reading materials to be added |
10 | Filtered Search | Reading materials to be added |
11 | Document Query Processing | Reading materials to be added |
12 | Document Query Processing (Continued) | Reading materials to be added |
The research project is a significant component of this course. Students are encouraged to propose their own projects related to vector databases and unstructured data processing. The instructor will also provide suggested topics and will highlight open problems during the lectures.