About Me
Hey! I am Younes Hourri, currently pursuing an MSc in Computer Science at the University of Toronto, advised by Professor Maryam Mehri Dehnavi. Before this, I completed my undergraduate studies in Computer Science at McGill University.
My research focuses on making machine learning models more efficient and deployable. I’m particularly interested in model compression techniques such as structured sparsity, quantization, and low-rank approximation, and in how these techniques can enable faster inference for large-scale models.
Publications
PATCH: Learnable Tile-Level Pruning of Large Models
Y. Hourri*, M. Mozaffari*, M. Mehri Dehnavi (* Equal contribution)
Under review
- Introduced a hybrid tile-level mask that learns whether each tile of LLM weights uses a 2:4 sparse pattern or stays dense.
- Improved accuracy by up to 3.68% over MaskLLM, the state-of-the-art 2:4 pruning method, across LLM families and architectures.
- Integrated with the STOICC compiler (built on Triton) to reach a 1.18×–1.38× inference speedup on LLaMA-2 7B.