Jun Gao

I am a PhD student in the Machine Learning Group at the University of Toronto, supervised by Professor Sanja Fidler, and affiliated with the Vector Institute. I am also a Research Scientist at the NVIDIA Toronto AI Lab.

My research focuses on the intersection of 3D computer vision and computer graphics, particularly on developing machine learning tools that enable 3D content creation at scale and make an impact in real-world applications.

I graduated from Peking University in 2018 with a Bachelor's degree. I have also interned at Stanford, MSRA, and NVIDIA.

Email  /  CV  /  Google Scholar  /  Twitter  /  Github  /  Pro Bono

  • Pro Bono: I have decided to commit 1-2 hours per week (mostly on Sundays) to host pro bono office hours and provide guidance, suggestions, and mentorship for students, especially people from underrepresented groups. Please fill out this form if you are interested; more details are here.
  • May 2023: I will serve as an Area Chair for NeurIPS 2023.
  • April 2023: I gave talks at PKU, THU, MSRA, and CUHK (Shenzhen) on Machine Learning for 3D Content Creation.
  • April 2023: One paper accepted to SIGGRAPH 2023 on flexible and differentiable iso-surfacing.
  • March 2023: Two papers accepted to CVPR 2023 on text-to-3D generation and inverse rendering of urban scenes.
  • March 2023: Invited talks at ETH, Oxford, JHU, and UofT.

    Selected Publications

    My research addresses fundamental problems in 3D content creation, from 3D representations, to learning algorithms, to interactive control with text. It enables generating high-quality 3D meshes with textures and arbitrary topologies from multi-view images, with generative models, or from text prompts. Much of my work has been successfully deployed in real-world applications, including NVIDIA Picasso, GANVerse3D, Neural DriveSim, and the Toronto Annotation Suite.

    Representative papers are highlighted; the full publication list is on Google Scholar.

    3D Generation
    Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes
    Zian Wang, Tianchang Shen, Jun Gao, Shengyu Huang, Jacob Munkberg, Jon Hasselgren, Zan Gojcic, Wenzheng Chen, Sanja Fidler
    CVPR, 2023
    Project page / PDF / Video / bibtex
    Combined with other NVIDIA technologies, FEGR is one component of the Neural Reconstruction Engine announced in the GTC September 2022 keynote.
    Magic3D: High-Resolution Text-to-3D Content Creation
    Chen-Hsuan Lin*, Jun Gao*, Luming Tang*, Towaki Takikawa*, Xiaohui Zeng*, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, and Tsung-Yi Lin (* denotes equal contribution)
    CVPR, 2023   (Highlight)
    Project page / arXiv
    We create high-quality 3D meshes with textures from text prompts, using a two-stage pipeline with different diffusion models for fast, high-resolution text-to-3D generation.
    GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images
    Jun Gao, Tianchang Shen, Zian Wang, Wenzheng Chen, Kangxue Yin, Daiqing Li, Or Litany, Zan Gojcic and Sanja Fidler
    NeurIPS, 2022   (Spotlight Presentation)
    Project page / Code / PDF / arXiv / Two-Minute Paper Video (330k+ views)
    We develop a 3D generative model that generates meshes with textures, bridging the success of differentiable surface modeling, differentiable rendering, and 2D GANs.
    Extracting Triangular 3D Models, Materials, and Lighting From Images
    Jacob Munkberg, Jon Hasselgren, Tianchang Shen, Jun Gao, Wenzheng Chen, Alex Evans, Thomas Müller and Sanja Fidler
    CVPR, 2022   (Oral Presentation)
    Project page / arXiv / PDF / Code / Two-Minute Paper Video (370k+ views)
    Nvdiffrec reconstructs 3D meshes with materials from multi-view images by combining differentiable surface modeling with a differentiable renderer. The method supports NVIDIA Neural DriveSim.
    Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering
    Yuxuan Zhang*, Wenzheng Chen*, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, and Sanja Fidler (* denotes equal contribution)
    ICLR, 2021   (Oral Presentation)
    Project page / arXiv
    We exploit a pretrained 2D StyleGAN to generate "pseudo" multi-view images for training a single-image 3D reconstruction network. We deployed it in the Omniverse AI Toybox (I led the effort).

    2D/3D Representation
    Flexible Isosurface Extraction for Gradient-Based Mesh Optimization
    Tianchang Shen, Jacob Munkberg, Jon Hasselgren, Kangxue Yin, Zian Wang, Wenzheng Chen, Zan Gojcic, Sanja Fidler, Nicholas Sharp*, Jun Gao*
    ACM Transactions on Graphics (SIGGRAPH), 2023
    Project page / Paper / Supplement / arXiv / ACM
    FlexiCubes is our updated differentiable iso-surfacing method: it produces a more uniform mesh tessellation, aligns more flexibly with sharp features, and allows gradients from geometric, physical, or mesh-energy objectives to be backpropagated into the mesh generation pipeline.
    Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis
    Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu and Sanja Fidler
    NeurIPS, 2021
    Project page / arXiv / Video
    DMTet differentiably extracts the iso-surface from an implicit field via Marching Tetrahedra. We demonstrate its effectiveness in synthesizing surface details from coarse voxels.
    Learning Deformable Tetrahedral Meshes for 3D Reconstruction
    Jun Gao, Wenzheng Chen, Tommy Xiang, Alec Jacobson, Morgan Mcguire, and Sanja Fidler
    NeurIPS, 2020  
    Project page / Code / arXiv
    We introduce a tetrahedral mesh representation into 3D vision to support generating meshes with arbitrary topologies. A differentiable renderer is also devised for DefTet.
    Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid
    Jun Gao, Zian Wang, Jinchen Xuan, and Sanja Fidler
    ECCV, 2020
    Project page / arXiv / Code / Presentation Video (short, long)
    We introduce a deformable grid representation for 2D images to better capture the high-frequency image content. We showcase applications including: replacing standard pooling methods with learnable geometric downsampling, semantic segmentation, object annotation, and unsupervised image partitioning.
    Fast Interactive Object Annotation with Curve-GCN
    Huan Ling*, Jun Gao*, Amlan Kar, Wenzheng Chen, and Sanja Fidler (* denotes equal contribution)
    CVPR, 2019  
    arXiv / Code / Video / Demo
    We utilize polygons to represent object boundaries, facilitating interactive object annotation. Our work has been successfully integrated into the Toronto Annotation Suite, making real-world annotation faster!
    DeepSpline: Data-Driven Reconstruction of Parametric Curves and Surfaces
    Jun Gao, Chengcheng Tang, Vignesh Ganapathi-Subramanian, Jiahui Huang, Hao Su and Leonidas J. Guibas
    arXiv preprint (not peer-reviewed; undergraduate work)
    arXiv / Code
    We predict spline curves to represent images (e.g., MNIST digits), and further extend the approach with surfaces of revolution to represent 3D objects (e.g., vases).

    Differentiable Rendering
    DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer
    Wenzheng Chen, Joey Litalien, Jun Gao, Zian Wang, Clement Fuji Tsang, Sameh Khamis, Or Litany and Sanja Fidler
    NeurIPS, 2021
    Project page / arXiv / bibtex
    DIB-R++ combines rasterization and ray tracing to differentiably render images with specular reflections, enabling us to predict geometry, reflectance, and lighting from a single image.
    Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer
    Wenzheng Chen, Jun Gao*, Huan Ling*, Edward J. Smith*, Jaakko Lehtinen, Alec Jacobson, and Sanja Fidler (* denotes equal contribution)
    NeurIPS, 2019  
    Project page / arXiv / Code / Two-Minute Paper Video (240k+ views)
    DIB-R is an interpolation-based differentiable renderer for 3D meshes that supports vertex positions, vertex colors, texture mapping, and multiple lighting models, and can be easily embedded in a neural network.
    Academic Service
    Area Chair: NeurIPS 2023

    Conference Reviewer: CVPR, ICCV, ECCV, SIGGRAPH, NeurIPS, ICML, ICLR

    Research Experience
    Research Scientist at NVIDIA Dec. 2019 - Present

    Research Intern at NVIDIA Oct. 2018 - Dec. 2019

    Research Intern at Microsoft Research Asia Sep. 2017 - Jun. 2018

    Research Intern at Stanford University Jun. 2017 - Aug. 2017

    Research Assistant at Peking University Apr. 2016 - Jun. 2018

    Invited Talks
    Towards Generative Modeling of 3D Objects Learned from Images. ETH, Oxford, JHU, UofT, PKU, BAAI (2023)

    Learning Geometric Representations for Computer Vision. GAMES-CN (in Chinese) [Link] (2020)

    Learning Geometric Representation from Images. University of Alberta (2020)

    Mentored Students/Interns
    Tianchang Shen (PhD, CS UofT) Working on differentiable isosurfacing methods.

    Weiwei Sun (PhD, CS UBC) Working on 3D generative models.

    Zian Wang (PhD, CS UofT) Working on lighting decomposition with differentiable renderers.

    Jinchen Xuan (Undergrad, CS PKU) Working on geometric image representation.

    Gary Leung (MSc, CS UofT) Working on Transformers for images.

    Pro bono office hours
    As a fifth-year PhD student, I have often observed the information asymmetry between junior students and senior students, researchers, or professors on questions related to research topics and directions, future careers, and failure (and also excitement!) in research. This problem is more severe for people from underrepresented groups.

    Following Krishna, Wei-Chiu, and Shangzhe, I have decided to commit 1-2 hours per week (mostly on Sundays) to host pro bono office hours and provide guidance, suggestions, and mentorship to reduce the information asymmetry mentioned above. Please fill out this form if you are interested.
