Multilinguality, Language Diversity, and Low-Resource Languages
Annie En-Shiun Lee is an Assistant Professor at Ontario Tech University (OTU) and a Status-only Assistant Professor at the University of Toronto. Her goal is to make language technology inclusive and accessible to as many people as possible. She directs the Lee Language Lab (L³), which focuses on language diversity, multilinguality, and multiculturalism, aligning with OTU's vision of "Tech with a Conscience". Her research has been published in Nature Digital Medicine, ACM Computing Surveys, *ACL venues, SIGCSE, IEEE TKDE, and Bioinformatics.
Dr. Lee was the demo co-chair for NAACL 2024 and has received numerous recognitions, including an Outstanding Paper Award and the Best Theme Paper Award at NAACL 2025 and the ARIA Spotlight Award for the MScAC program in 2024, as well as nominations for the 2025 Tim McTiernan Student Mentorship Award and the 2025 Women in AI Researcher of the Year Award.
Previously, she was an Assistant Professor (teaching stream) at the University of Toronto for the MScAC program. She earned her PhD from the University of Waterloo, was a visiting researcher at the Fields Institute and the Chinese University of Hong Kong, and worked as a research scientist at VerticalScope (as research lead) and at Stradigi AI.
Expanded typological and geographic language database with improved NLP integration. Includes a Python package for multilingual benchmarking, cross-lingual transfer, and dataset alignment.
A comprehensive benchmark for evaluating large language models across African languages. Recognized with the Outstanding Paper Award at NAACL 2024 for advancing inclusive NLP evaluation.
1.2M image–question pairs across 30 languages, capturing global culinary knowledge. A benchmark for cross-cultural multimodal reasoning with applications in cultural AI research.
Multilingual multimodal retrieval-augmented generation framework extending RAG to cross-lingual and multimodal settings, enabling richer knowledge retrieval across languages and modalities.
Multilingual multimodal reasoning research examining LLM reasoning capabilities across languages and modalities, with a focus on cross-lingual generalization and low-resource settings.
Interactive translation quality assistant that detects and corrects machine translation errors. Enhances translator efficiency while maintaining linguistic fluency and semantic accuracy.
Strengthens the multilingual understanding and reasoning capabilities of large language models, with a focus on low-resource languages.
Freezes early transformer layers to preserve syntactic knowledge during fine-tuning. Boosts zero-shot and cross-domain performance with minimal additional training, improving stability and efficiency.
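The core idea of layer freezing can be sketched in a few lines. The snippet below is a minimal illustration in PyTorch, not the project's actual code: the encoder stack, layer sizes, and the choice of freezing the first three layers are all assumptions for demonstration.

```python
import torch.nn as nn

def freeze_early_layers(encoder_layers, num_frozen):
    # Disable gradients for the first `num_frozen` layers so their
    # (syntax-heavy) early representations stay fixed during fine-tuning,
    # while later layers continue to adapt to the downstream task.
    for layer in list(encoder_layers)[:num_frozen]:
        for param in layer.parameters():
            param.requires_grad = False

# Toy stand-in for a pretrained transformer encoder stack
# (dimensions and depth are illustrative only).
encoder = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
    for _ in range(6)
)
freeze_early_layers(encoder, num_frozen=3)

# Only the last three layers remain trainable.
trainable = [
    any(p.requires_grad for p in layer.parameters()) for layer in encoder
]
```

Because frozen parameters receive no gradient updates, fine-tuning touches fewer weights, which is where the stability and efficiency gains come from.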
A lightweight performance proxy that predicts LM accuracy using ~30× less compute. Enables faster model selection, fine-tuning, and prompt iteration while maintaining high predictive reliability.
Predicts machine translation performance on low-resource languages using language typology and similarity features, enabling efficient model selection without exhaustive evaluation.
Award-winning Teaching NLP workshop on empowering multilinguality in NLP education. Showcases teaching strategies, open resources, and collaborative projects funded by NSERC USRA and the Fields Institute.
An RPG-style educational game for linguists, blending gamified learning with language exploration. Makes linguistic concepts engaging and accessible for learners at all levels.
Browse the complete, up-to-date publication list, citations, and co-authors.
Choose the path that fits you, and please follow the relevant instructions below carefully:
Propose joint research, co-supervision, visiting positions, R&D projects, evaluations, or joint award submissions with L³.
Research and project opportunities for undergraduate students with L³.
Application process and requirements for graduate study with L³.
For former L³ students requesting an academic reference letter.