This is a graduate course in visual perception for autonomous driving. The class will briefly cover topics in localization, ego-motion estimation, free-space estimation, and visual recognition (classification, detection, segmentation), among others.

Prerequisites: A good knowledge of statistics, linear algebra, and calculus is necessary, as well as good programming skills. A good knowledge of computer vision and machine learning is strongly recommended.


  • August 12th: Course webpage has been created

    When emailing me, please put CSC2541 in the subject line.


    Each student will need to write two paper reviews each week, present once or twice in class (depending on enrollment), participate in class discussions, and complete a project (done individually or in pairs).


The final grade will consist of the following:
Participation (attendance, participation in discussions, reviews): 20%
Presentation (presentation of papers in class): 20%
Project (proposal, final report, presentation): 60%
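As a small illustration, the final grade is simply a weighted sum of the three components above. A minimal sketch (the example scores and the function name are hypothetical, not part of the course materials):

```python
# Course grading weights, per the breakdown above.
WEIGHTS = {"participation": 0.20, "presentation": 0.20, "project": 0.60}

def final_grade(scores):
    """Combine per-component scores (each on a 0-100 scale) using the course weights."""
    assert set(scores) == set(WEIGHTS), "missing or extra grade components"
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# Hypothetical example scores:
print(final_grade({"participation": 90, "presentation": 85, "project": 80}))
```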

    Paper reviewing

Every week (except for the first two) we will read 2 to 3 papers. The success of the in-class discussion will thus depend on how prepared the students come to class. Each student is expected to read all the papers that will be discussed and write two detailed reviews of the selected two papers. Depending on enrollment, each student may also need to present a paper in class. When you present, you do not need to hand in a review.

    Deadline: The reviews will be due one day before the class.

    Structure of the review
    Short summary of the paper
    Main contributions
Positive and negative points
    How strong is the evaluation?
    Possible directions for future work

    Presentation

    Depending on enrollment, each student will need to present a few papers in class. The presentation should be clear and practiced and the student should read the assigned paper and related work in enough detail to be able to lead a discussion and answer questions. Extra credit will be given to students who also prepare a simple experimental demo highlighting how the method works in practice.

A presentation should be roughly 45 minutes long (please time it beforehand so that you do not go overtime); this typically corresponds to about 30 slides. You may reuse some material from presentations on the web as long as you clearly cite the source. In the presentation, also provide citations to the papers you present and to any other related work you reference.

    Deadline: The presentation should be handed in one day before the class (or before if you want feedback).

    Structure of presentation:
    High-level overview with contributions
    Main motivation
    Clear statement of the problem
    Overview of the technical approach
    Strengths/weaknesses of the approach
    Overview of the experimental evaluation
    Strengths/weaknesses of evaluation
    Discussion: future direction, links to other work

    Project

Each student will need to write a short project proposal at the beginning of the course (in January). The projects will be research oriented. In the middle of the semester you will need to hand in a progress report. The final project report will be due one week before the end of the course and presented in the last lecture (April). This will be a short, roughly 15-20 minute, presentation.

Students can work on projects individually or in pairs. The project can be an interesting topic that the student comes up with on their own or with the help of the instructor. The grade will depend on the ideas, how well you present them in the report, how well you position your work in the related literature, how thorough your experiments are, and how thoughtful your conclusions are.


    Coming soon



Schedule

Jan 12, 19: Introduction. Presenter: Raquel Urtasun. Slides: intro
Feb 2: Stereo
Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches, CVPR 2015 [PDF] [code]
J. Zbontar and Y. LeCun

Stereo Processing by Semi-Global Matching and Mutual Information, PAMI 2008 [PDF]
H. Hirschmueller

Efficient Joint Segmentation, Occlusion Labeling, Stereo and Flow Estimation, ECCV 2014 [PDF] [code]
K. Yamaguchi, D. McAllester and R. Urtasun

Presenter: Wenjie Luo. Slides: stereo
Feb 2, 9: Optical Flow
Non-Local Total Generalized Variation for Optical Flow Estimation, ECCV 2014 [PDF]
R. Ranftl, K. Bredies and T. Pock

Large displacement optical flow: Descriptor matching in variational motion estimation, PAMI 2011 [PDF]
T. Brox and J. Malik

FlowNet: Learning Optical Flow with Convolutional Networks, ICCV 2015 [PDF]
A. Dosovitskiy, P. Fischer, E. Ilg, P. Häusser, C. Hazırbaş, V. Golkov, P. Smagt, D. Cremers and T. Brox

A Quantitative Analysis of Current Practices in Optical Flow Estimation and The Principles Behind Them, IJCV 2011 [PDF]
D. Sun, S. Roth and M. Black

Presenter: Shenlong Wang. Slides: motion
Feb 9, 16: Optical Flow / Scene Flow
EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow, CVPR 2015 [PDF] [code]
J. Revaud, P. Weinzaepfel, Z. Harchaoui and C. Schmid

DeepFlow: Large displacement optical flow with deep matching, ICCV 2013 [PDF]
P. Weinzaepfel, J. Revaud, Z. Harchaoui and C. Schmid

Robust Monocular Epipolar Flow Estimation, CVPR 2013 [PDF]
K. Yamaguchi, D. McAllester and R. Urtasun

3D Scene Flow Estimation with a Piecewise Rigid Scene Model, CVPR 2015 [PDF]
C. Vogel, K. Schindler and S. Roth

Presenter: Min Bai. Slides: scene_flow
Feb 16: Visual Odometry
Visual-lidar Odometry and Mapping: Low-drift, Robust, and Fast, ICRA 2015 [PDF]
J. Zhang and S. Singh

StereoScan: Dense 3d Reconstruction in Real-time, IV 2011 [PDF]
A. Geiger, J. Ziegler and C. Stiller

Real-time stereo visual odometry for autonomous ground vehicles, IROS 2008 [PDF]
A. Howard

Presenter: Patric McGarey. Slides: visual odometry
Feb 23: SLAM
DTAM: Dense tracking and mapping in real-time, ICCV 2011 [PDF]
R. Newcombe, S. Lovegrove and A. Davison

Large-scale direct SLAM with stereo cameras, IROS 2015 [PDF]
J. Engel, J. Stuckler and D. Cremers

LSD-SLAM: Large-scale direct monocular SLAM, ECCV 2014 [PDF]
J. Engel, T. Schöps and D. Cremers

Relative continuous-time SLAM, International Journal of Robotics Research 2015 [PDF]
S. Anderson, K. MacTavish and T. Barfoot
Full STEAM ahead: Exactly sparse Gaussian process regression for batch continuous-time trajectory estimation on SE(3), IROS 2015 [PDF]
S. Anderson and T. Barfoot

Long-term 3D map maintenance in dynamic environments, ICRA 2014 [PDF]
F. Pomerleau, P. Krusi, F. Colas, P. Furgale and R. Siegwart

Presenters: Kirk MacTavish, Lingzhu Xiang. Slides: SLAM
March 1: Free-Space Estimation
Chapter 9 of the Probabilistic Robotics book [PDF]
S. Thrun, W. Burgard and D. Fox

Free Space Computation Using Stochastic Occupancy Grids and Dynamic Programming, Workshop on Dynamical Vision, ICCV 2007 [PDF]
H. Badino, U. Franke and R. Mester

The Stixel World - A Compact Medium Level Representation of the 3D-World, DAGM 2009 [PDF]
H. Badino, U. Franke and D. Pfeiffer

Presenter: Hao Wu. Slides: Free-Space
March 8: 2D Object Detection
Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR 2014 [PDF]
R. Girshick, J. Donahue, T. Darrell and J. Malik

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS 2015 [PDF]
S. Ren, K. He, R. Girshick and J. Sun

Deep Residual Learning for Image Recognition, ArXiv Dec 2015 [PDF]
K. He, X. Zhang, S. Ren and J. Sun

Presenter: Renjie Liao. Slides: 2D Detection
March 8: 3D Object Detection
Data-Driven 3D Voxel Patterns for Object Category Recognition, CVPR 2015 [PDF]
Y. Xiang, W. Choi, Y. Lin and S. Savarese

3D Object Proposals for Accurate Object Class Detection, NIPS 2015 [PDF]
X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun

Presenter: Zhen Li. Slides: 3D Detection
March 15: Semantic Segmentation
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, ICLR 2015 [PDF] [Code]
L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. L. Yuille

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, ArXiv 2015 [PDF] [Project]
V. Badrinarayanan, A. Kendall and R. Cipolla

Joint Semantic Segmentation and 3D Reconstruction from Monocular Video, ECCV 2014 [PDF] [Project]
A. Kundu, Y. Li, F. Dellaert, F. Li and J. M. Rehg

Presenter: Stefania Raimondo. Slides: Semantic Segmentation
March 15: Instance-level Segmentation
Instance Segmentation of Indoor Scenes using a Coverage Loss, ECCV 2014 [PDF]
N. Silberman, D. Sontag and R. Fergus

Instance-Level Segmentation with Deep Densely Connected MRFs, ArXiv Dec 2015 [PDF]
Z. Zhang, S. Fidler and R. Urtasun

Presenter: Mengye Ren. Slides: Instance-level Segmentation
March 22: Tracking
Global Data Association for Multi-Object Tracking Using Network Flows, CVPR 2008 [PDF]
L. Zhang and R. Nevatia

Multiple Object Tracking using K-Shortest Paths Optimization, PAMI 2011 [PDF]
J. Berclaz, F. Fleuret, E. Turetken and P. Fua

Multi-target tracking by discrete-continuous energy minimization, PAMI 2016 [PDF]
A. Milan, K. Schindler and S. Roth

Presenter: Wenjie Luo. Slides: Tracking
March 29: Place Recognition
FAB-MAP: Probabilistic localization and mapping in the space of appearance, IJRR 2008 [PDF]
M. Cummins and P. Newman

Place recognition with ConvNet landmarks: Viewpoint-robust, condition-robust, training-free, RSS 2015 [PDF]
N. Sünderhauf, S. Shirazi and A. Jacobson

Convolutional networks for real-time 6-DOF camera relocalization, ArXiv 2015 [PDF]
A. Kendall, M. Grimes and R. Cipolla

Presenter: Valentin Peretroukhin. Slides: Place Recognition