MSc. Applied Computing (2017-2018), University of Toronto
A former Master of Engineering student at Ecole Centrale Paris (now CentraleSupelec), I am a Master of Science in Applied Computing student at the University of Toronto.
My goal is to apply recent research breakthroughs to real-life problems and make a difference.
I am interested in all applications of AI. My research interests include:
- Unsupervised learning, since I first worked with autoencoders in Fall 2016
- NLP, since I discovered word2vec in Spring 2017
- Sequence models. I am fascinated by building models that can achieve good performance over more than one prediction step.
Courses taken at University of Toronto:
- Learning Discrete Latent Structure (STA4273/CSC2547, Winter 2018)
- Natural Language Computing (CSC2511, Winter 2018)
- Topics in Computational Social Science (CSC2552, Winter 2018)
- Machine Learning (CSC2515, Fall 2017)
- Computational Linguistics (CSC2501, Fall 2017)
- Computational Techniques in Statistics (STA2102, Fall 2017)
- Time Series Analysis (STA457, Fall 2018)
- Non-linear Optimization (APM462, Summer 2018)
- Machine Learning for Computer Vision (CSC2548, Winter 2018)
- Topics in Algorithms: Fast Algorithms via Continuous Methods (CSC2421, Fall 2017)
- 2016-2017: One year in industry (two long research internships)
- 2014-2016: Ecole Centrale Paris
    - 2015-2016: First year of Master's degree in Engineering
    - 2014-2015: BSc in Engineering
- 2011-2014: Undergraduate studies (Prepa MPSI/MP* in Montaigne Bordeaux, France)
Online Conflicting Communities (CSC2552, Winter 2018)
In this study, we analyze pairs of Reddit communities. We cluster these pairs into three groups: random, similar, and clashing; we then analyze graph properties
(number of users active in both communities, degree distribution, etc.). Our goal is to exhibit features characterizing conflict.
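A minimal sketch of the kind of pair feature involved (the data and helper names here are hypothetical; our actual pipeline computed many more graph properties):

```python
def overlap_features(users_a, users_b):
    """Simple pair features: shared-user count and Jaccard similarity."""
    shared = users_a & users_b
    union = users_a | users_b
    return {
        "n_shared": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }

# Toy example: sets of usernames active in each community.
a = {"u1", "u2", "u3", "u4"}
b = {"u3", "u4", "u5"}
print(overlap_features(a, b))  # {'n_shared': 2, 'jaccard': 0.4}
```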
ReGAN: GANs for Sequence Generation via Gradient Estimators. (CSC2547, Winter 2018)
GANs have been very successful at generating images in the last few years. However, less work has been done regarding sequence generation.
SeqGAN (Yu et al., 2017) uses REINFORCE to train a GAN to generate text, after pre-training the generator. In this paper, we go beyond this and
apply REINFORCE as well as state-of-the-art gradient estimation techniques REBAR and RELAX to train a GAN for sequence generation on toy data.
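A rough sketch of the basic score-function (REINFORCE) estimator underlying all three techniques, on a single-token toy "generator" (the names and the reward stand-in are hypothetical; REBAR and RELAX add control variates on top of this idea):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def reinforce_grad(logits, reward_fn, n_samples=1000):
    """Score-function (REINFORCE) estimate of d E[reward] / d logits."""
    probs = softmax(logits)
    grad = np.zeros_like(logits)
    for _ in range(n_samples):
        token = rng.choice(len(probs), p=probs)
        # grad of log p(token) wrt logits is one_hot(token) - probs
        score = -probs.copy()
        score[token] += 1.0
        grad += reward_fn(token) * score
    return grad / n_samples

# Toy "generator": uniform logits over a vocabulary of 4 tokens.
# Discriminator stand-in: rewards only token 2.
g = reinforce_grad(np.zeros(4), lambda t: 1.0 if t == 2 else 0.0)
# Ascending along g pushes probability mass toward token 2.
```

REBAR and RELAX keep this estimator unbiased while reducing its (often high) variance, which is the main obstacle to training GANs over discrete sequences.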
Gradient descent revisited with an adaptive learning rate (CSC2515, Fall 2017)
We explored gradient descent with an online adaptive learning rate. At each step, we pick the learning rate that decreases the loss the most.
For this purpose, we compared a first-order method (gradient descent) and a second-order method (Newton-Raphson) both approximated by finite differences.
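The idea can be sketched as follows (a toy quadratic and hypothetical helper names; the second-order variant applies a finite-difference Newton-Raphson step to the one-dimensional function eta -> loss(w - eta * grad)):

```python
import numpy as np

def adaptive_step(loss, w, grad, eps=1e-4):
    """One gradient step whose learning rate is chosen by a finite-difference
    Newton-Raphson update on phi(eta) = loss(w - eta * grad), at eta = 0."""
    phi = lambda eta: loss(w - eta * grad)
    # Finite-difference first and second derivatives of phi at eta = 0.
    d1 = (phi(eps) - phi(-eps)) / (2 * eps)
    d2 = (phi(eps) - 2 * phi(0.0) + phi(-eps)) / eps ** 2
    eta = -d1 / d2 if d2 > 0 else 1e-2  # fall back to a small fixed rate
    return w - eta * grad

# Toy quadratic problem: loss(w) = 0.5 * w^T A w, minimized at w = 0.
A = np.diag([1.0, 10.0])
loss = lambda w: 0.5 * w @ A @ w
w = np.array([1.0, 1.0])
for _ in range(20):
    w = adaptive_step(loss, w, A @ w)
# On a quadratic, phi is itself quadratic, so the Newton step recovers
# the exact line-search learning rate and convergence is fast.
```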
Exploration of gradient descent's limits (Dec. 2017)
Although widely used in deep learning, stochastic gradient descent is not an ideal optimizer. In particular, on ill-conditioned problems,
gradient descent struggles to converge, even after learning-rate tuning.
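A toy illustration of the effect, assuming a badly conditioned quadratic (all numbers here are hypothetical): any fixed learning rate is either unstable along the steep direction or very slow along the flat one.

```python
import numpy as np

# Ill-conditioned quadratic: curvatures 1 and 1000 (condition number 1000).
A = np.diag([1.0, 1000.0])
loss = lambda w: 0.5 * w @ A @ w

def run_gd(lr, steps=100):
    w = np.array([1.0, 1.0])
    for _ in range(steps):
        w = w - lr * (A @ w)
    return loss(w)

# Stability requires lr < 2/1000, so the flat direction barely moves:
slow = run_gd(lr=1e-3)      # stable, but the flat coordinate has hardly decayed
unstable = run_gd(lr=3e-3)  # oscillates and blows up along the steep axis
```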
Machine Learning Research Intern at Layer 6 AI, Toronto (May 2018 - Dec. 2018), supervised by Tomi Poutanen and working with Maks Volkovs.
Academic supervisor: Assistant Prof. David Duvenaud.
I conducted research on sequential models and worked on two main projects:
- A new regularization approach for neural language modeling.
- Clustering of a large-scale population health cohort of patients with diabetes, where the data mostly comprises diagnosis codes.
My internship report can be found here: [pdf].
As part of the MScAC program, I also presented my results at the Applied Research in Action (ARIA) event. The poster is here: [pdf]
TA at University of Toronto for STA130 Introduction to Statistical Reasoning and Data Science (Prof. Gibbs and Prof. Taback, Winter 2018).
I conducted a weekly 2-hour tutorial using R.
Machine Learning Research Intern at I2R, Singapore (Feb. 2017 - Jul. 2017)
Supervisor: Dr. Vijay Chandrasekhar
I was part of the deep learning team in the Visual Computing Lab for 5 months. I contributed to 3 projects:
- Adding word features to the existing video and audio features of YouTube videos (YouTube-8M dataset). By doing so, we improved on the classification state of the art on the validation set.
- Exploring GAN architectures for semi-supervised image classification.
- 3D lung cancer detection. We worked with radiologists to help automate lung cancer detection. This work started with the Kaggle Data Science Bowl 2017.
Machine Learning Research Intern at Thales Solutions Asia, Singapore (Aug. 2016 - Feb. 2017)
Supervisor: Dr. Antoine Fagette
I was part of the Smart Cities team for 6 months. I contributed to 2 projects:
Internship report: [pdf]
- Underwater mine detection. Mines pose a real security threat to ships in the Strait of Malacca. We built a mine detector trained on synthetic data.
- Crowd monitoring. We used a one-class classification approach based on autoencoders to build crowd detectors on CCTV camera images.
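A minimal sketch of the one-class idea (with a linear autoencoder solved in closed form via SVD as a stand-in for the models we actually trained; the data and threshold are hypothetical): the reconstruction error on "normal" training data sets a threshold, and inputs that reconstruct poorly are flagged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "normal" features: points near a 2-D subspace of R^10, plus noise.
train = (rng.normal(size=(500, 2)) @ rng.normal(size=(2, 10))
         + 0.1 * rng.normal(size=(500, 10)))

# A linear autoencoder with a 2-unit bottleneck has the top-2 PCA subspace
# as its optimum, so SVD gives its solution in closed form.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
code = vt[:2]  # tied encoder/decoder weights

def reconstruction_error(x):
    z = (x - mean) @ code.T      # encode into the 2-D bottleneck
    x_hat = z @ code + mean      # decode back to input space
    return np.sum((x - x_hat) ** 2, axis=-1)

# Threshold from the training (one-class) data; larger errors are anomalies.
threshold = np.percentile(reconstruction_error(train), 99)
anomaly = 5.0 * rng.normal(size=10)  # far from the learned subspace
is_anomalous = reconstruction_error(anomaly) > threshold
```

The appeal of the one-class setup is that only "normal" footage is needed for training, which matters when anomalous crowd events are rare in the CCTV data.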