Sergio Casas

Research Scientist @ Uber ATG

PhD Student @ University of Toronto

About me

I’m a Research Scientist on Uber ATG’s R&D team, where I apply my research to the development of self-driving vehicle technology, focusing on autonomy algorithms ranging from perception to motion planning.

I am also a PhD student at the University of Toronto, and a member of the Machine Learning Group and the Vector Institute.


  • Machine Learning
  • Computer Vision
  • Robotics - Autonomous Driving
  • Generative Models
  • Imitation Learning


  • PhD in Computer Science, 2020 - Present

    University of Toronto

  • MSc in Computer Science, 2018 - 2020

    University of Toronto

  • BSc in Computer Science, 2013 - 2017

    Universitat Politècnica de Catalunya

  • BSc in Industrial Tech. Engineering, 2012 - 2017

    Universitat Politècnica de Catalunya


MP3: A Unified Model to Map, Perceive, Predict and Plan

Interpretable end-to-end neural motion planning without high-definition maps

LookOut: Diverse Multi-Future Prediction and Planning for Self-Driving

Contingency planning from diverse joint trajectory samples for all actors in the scene

TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors

Realistic long-term vehicle behavior simulation learned from imitation and common sense

AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

Critical scenario generation by modifying the actors' trajectories in a physically plausible manner and updating the LiDAR sensor data to create realistic observations of the perturbed world

Deep Multi-Task Learning for Joint Localization, Perception, and Prediction

Efficient end-to-end joint localization, perception, and prediction that can correct localization errors

Diverse Complexity Measures for Dataset Curation in Self-driving

Model-agnostic approach to dataset curation for autonomy tasks

Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting

Hybrid instance-based and instance-free approach to pedestrian behavior prediction

StrObe: Streaming Object Detection from LiDAR Packets

@ Conference on Robot Learning (CoRL), 2020
Existing LiDAR perception systems wait 100 ms just to build a sweep. StrObe instead performs streaming detection from LiDAR packets and achieves an end-to-end latency of 21 ms.

Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

@ European Conference on Computer Vision (ECCV), 2020
ILVM characterizes the joint distribution over multiple actors' future trajectories

Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations

@ European Conference on Computer Vision (ECCV), 2020
End-to-end neural motion planner based on interpretable semantic scene occupancies

RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects

@ European Conference on Computer Vision (ECCV), 2020
Multi-level fusion of LiDAR & Radar for object detection and velocity estimation

The Importance of Prior Knowledge in Precise Multimodal Prediction

@ International Conference on Intelligent Robots and Systems (IROS), 2020
Incorporating non-differentiable prior knowledge into behavior forecasting

PnPNet: End-to-End Perception and Prediction with Tracking in the Loop

@ Computer Vision and Pattern Recognition (CVPR), 2020
Tracking in the loop in joint perception and prediction

Spatially-Aware Graph Neural Networks for Relational Behavior Forecasting from Sensor Data

@ International Conference on Robotics and Automation (ICRA), 2020
Relational reasoning for multi-agent behavior prediction from sensors

Discrete Residual Flow for Probabilistic Pedestrian Behavior Prediction

@ Conference on Robot Learning (CoRL), 2019
Long-term pedestrian forecasting with occupancy grid maps

End-to-end Interpretable Neural Motion Planner

@ Computer Vision and Pattern Recognition (CVPR), 2019
Neural motion planner from LiDAR and HD maps

IntentNet: Learning to Predict Intention from Raw Sensor Data

@ Conference on Robot Learning (CoRL), 2018
Joint perception and prediction from LiDAR point clouds and HD maps



Research Scientist

Uber Advanced Technologies Group

Oct 2017 – Present Toronto, Canada
Research in Autonomous Driving: Perception, Prediction and Motion Planning systems.

Research Assistant

University of Toronto

Feb 2017 – Jul 2017 Toronto, Canada
Research in spatio-temporal reasoning for sports analytics. Worked on automating NBA Play-by-Play reports. Supervised by Prof. Urtasun.

Data Analytics Consultant

Arcvi Big Data Agency

Jun 2016 – Jan 2017 Toronto, Canada
Creation of strategy solutions using simple Machine Learning techniques. Advised multiple retail, insurance and credit companies.

Software Engineering Intern

Psycle Interactive Ltd.

Jun 2016 – Jan 2017 Toronto, Canada
Mobile application development and UI/UX design. Research project on document topic classification and information retrieval.