Prof. Annie Lee

Annie En-Shiun Lee, PhD

Ontario Tech University (Assistant Professor)
University of Toronto (Status-only Assistant Professor)

Biography

Annie En-Shiun Lee is an Assistant Professor at Ontario Tech University (OTU) and a Status-only Assistant Professor at the University of Toronto. Her goal is to make language technology inclusive and accessible to as many people as possible. She directs the Lee Language Lab (L³), which focuses on language diversity, multilinguality, and multiculturalism, in line with OTU’s vision of “Tech with a Conscience”. Her research has been published in Nature Digital Medicine, ACM Computing Surveys, *ACL, SIGCSE, IEEE TKDE, and Bioinformatics.

Dr. Lee was the demo co-chair for NAACL 2024 and has received numerous recognitions, including the Outstanding Paper Award and the Best Theme Paper Award at NAACL 2025 and a recognition for MScAC 2024, as well as nominations for the Tim McTiernan Student Mentorship Award 2025 and the Women in AI Researcher of the Year Award 2025.

She was an Assistant Professor (teaching stream) at the University of Toronto for the MScAC. She earned her PhD from the University of Waterloo, was a visiting researcher at the Fields Institute and the Chinese University of Hong Kong, and worked as a research scientist at VerticalScope (Research lead) and Stradigi AI.

Research Interests

Multilinguality, Language Diversity, and Low-Resource Languages

Multicultural Bias and Multimodal Applications

Pedagogy for Natural Language Processing and Machine Learning

Projects

ProxyLM (Findings NAACL 2025)

A lightweight performance proxy that predicts LM accuracy using ~30× less compute. Enables faster model selection, fine-tuning, and prompt iteration while maintaining high predictive reliability.

AlignFreeze (NAACL 2025)

Freezes early transformer layers to preserve syntactic knowledge during fine-tuning. Boosts zero-shot and cross-domain performance with minimal additional training, improving stability and efficiency.

WorldCuisines (NAACL 2025 – Best Theme Paper)

1.2M image–question pairs across 30 languages, capturing global culinary knowledge. A benchmark for cross-cultural multimodal reasoning with applications in cultural AI research.

Teaching NLP

Award-winning Teaching NLP workshop on empowering multilinguality in NLP education. Showcases teaching strategies, open resources, and collaborative projects funded by NSERC USRA and the Fields Institute.

URIEL+ World Language Database (COLING 2025)

Expanded typological and geographic language database with improved NLP integration. Includes a Python package for multilingual benchmarking, cross-lingual transfer, and dataset alignment.

AiTaigi Hokkien Learning App

Multimodal app for Taiwanese Hokkien featuring speech, text, and audio examples. Developed as a student-led project and awarded the Student Engagement Award by U of T Computer Science.

TranslationCorrect (*ACL 2025)

Interactive translation quality assistant that detects and corrects machine translation errors. Enhances translator efficiency while maintaining linguistic fluency and semantic accuracy.

Multilingual Understanding and Reasoning of LLMs (Findings of EMNLP 2024)

This project aims to strengthen the multilingual understanding and reasoning capabilities of large language models (LLMs), with a focus on low-resource languages.

Full list on Google Scholar

Browse the complete, up-to-date publication list, citations, and co-authors.

Join Us

Choose the path that fits you. Please carefully follow the instructions below:

Collaborations (Academic & Industry)

Propose joint research, co-supervision, visiting positions, R&D projects, evaluations, or joint award submissions with L³.

Undergraduate Students

Research and project opportunities for undergraduate students with L³.

MSc / PhD Applicants

Application process and requirements for graduate study with L³.

Letters of Recommendation (Former Students)

For former L³ students requesting an academic reference letter.

Teaching Experience

Ontario Tech University (OTU)

University of Toronto

York University

Students

David Anugraha

  • Lead author: WorldCuisines & ProxyLM (Multilingual VQA, LM performance prediction).
  • Co-author: URIEL+ Typological Knowledge Base.
  • Co-author: MT performance on low-resource languages.
Enrique David Guzman Ramírez

  • MScAC student, University of Toronto.
  • Vector Scholarship in AI (2022–23).
Kosei Uemura

  • Focus: Multilingual NLP & reasoning in LLMs.
  • Lead author: AfriInstruct (instruction tuning for African languages).
  • Co-author: Empowering the Future with Multilinguality & Language Diversity.
Mason Shipton

Labib Rahman

  • ExploRIEL — UI with chatbot for URIEL+ language distances & vectors.
  • SoulsBot+ — LLM-powered tutorial chatbot for Dark Souls: The Board Game.
  • LinguaQuest — RPG-style educational game for linguists.
  • Master’s student, Ontario Tech; researcher at Lee Lab & UXRLab.
Quang Phuoc Nguyen

  • Data Selection for Multilingual Alignment — selects optimal languages for LM fine-tuning.
  • Merlin: Curriculum Alignment — encoder–decoder stacking to improve multilingual alignment.
  • Game Dialogue Translation — survey of LLM performance in game localization.
Malikeh Ehghaghi

Amane Takeuchi

  • Research Project Lead & RA (ML model interpretation in clinical apps, NLP, CS education & EDI; PyTorch).
  • BSc Applied Math; Specialist Data Science; Major CS; Minor Math — University of Toronto (Dean’s List 2023).
  • Vice-Chair & Career Event Director, UofT Japan Network; TA for MAT135/136/235.
Tong Su

  • MSc Advanced Computer Science, University of Oxford (2024–2025).
  • Former Full-Stack Developer, Northbridge Financial (Angular, Django, .NET; 10,000+ users).
  • TA & Course Supporter, University of Toronto (Python, Unix/Git, Research Software).
  • Research Assistant (Lee Lab & AI for Justice): PEFT for low-resource NMT; first author — NAACL 2024.
  • Passed CFA Program Level I (Oct 2024).
Syed Mekhael Wasti

  • Lead author, *ACL 2025 demo: TranslationCorrect.
  • Co-author, TeachNLP 2024: Multilinguality paper.
  • MSc, Queen’s University (Vector Scholar), Fall 2025.
Hasti Toossi

  • Research: NLP; Programming Languages (Type Theory).
  • Recent graduate, University of Toronto.
Aditya Khan

Vincent Shuai

  • Student Engagement Award (2024), University of Toronto CS.
Shou-Yi Hung (Ray)

Awards
