Younes Hourri

About Me

Hey! I am Younes Hourri, currently pursuing an MSc in Computer Science at the University of Toronto, advised by Professor Maryam Mehri Dehnavi. Before this, I completed my undergraduate studies in Computer Science at McGill University.

My research focuses on making machine learning models more efficient and deployable. I’m particularly interested in model compression techniques such as structured sparsity, quantization, and low-rank approximation, and in how these techniques can enable faster inference for large-scale models.

Publications

PATCH: Learnable Tile-Level Pruning of Large Models

Y. Hourri*, M. Mozaffari*, M. Mehri Dehnavi (* Equal contribution)

Code · Triton Code

Under review

Introduced a hybrid tile-level mask that learns, for each tile of an LLM weight matrix, whether to apply a 2:4 sparse pattern or keep the tile dense.
Achieved up to 3.68% higher accuracy than MaskLLM, the state-of-the-art 2:4 pruning method, across LLM families and architectures.
Integrated with the STOICC compiler (built on Triton) to reach 1.18×–1.38× inference speedup on LLaMA-2 7B.
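
For readers unfamiliar with 2:4 sparsity, here is a minimal sketch of what a hybrid tile-level mask looks like. It is an illustration only, not the PATCH implementation: the per-tile dense/sparse choice is passed in as a fixed input rather than learned, the 2:4 pattern is picked by weight magnitude as a simple stand-in, and the names `two_four_mask_row`, `hybrid_tile_mask`, and `tile_is_dense` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def two_four_mask_row(row: List[float]) -> List[int]:
    """Keep the 2 largest-magnitude entries in each group of 4 consecutive values."""
    assert len(row) % 4 == 0, "row length must be a multiple of 4"
    mask: List[int] = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest magnitudes within this group of four.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        mask.extend(1 if i in keep else 0 for i in range(4))
    return mask


def hybrid_tile_mask(weights: Matrix,
                     tile_is_dense: List[List[bool]],
                     tile_size: int = 4) -> List[List[int]]:
    """Build a 0/1 mask where each tile is either fully dense or 2:4 sparse.

    Assumption: tile_is_dense is given; a learnable method would train
    this choice instead of taking it as input.
    """
    assert tile_size % 4 == 0, "tiles must hold whole groups of 4"
    rows, cols = len(weights), len(weights[0])
    mask = [[1] * cols for _ in range(rows)]
    for r in range(rows):
        sparse_row = two_four_mask_row(weights[r])
        for c in range(cols):
            if not tile_is_dense[r // tile_size][c // tile_size]:
                mask[r][c] = sparse_row[c]
    return mask


# Toy 4x8 weight matrix: one dense tile (left) and one 2:4 tile (right).
demo_weights = [[0.1, -0.9, 0.5, 0.2, 0.3, -0.1, 0.8, -0.7] for _ in range(4)]
demo_mask = hybrid_tile_mask(demo_weights, tile_is_dense=[[True, False]])
demo_density = sum(map(sum, demo_mask)) / (4 * 8)
```

In this toy case the left tile keeps all of its weights while the right tile keeps two of every four, so the overall density falls between the 50% of pure 2:4 pruning and the 100% of a dense matrix.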