Jun Gao
I am a PhD student in the Machine Learning Group at the University of Toronto, supervised by Professor Sanja Fidler. I am also a research scientist at the NVIDIA Toronto AI Lab.
My research interests lie at the intersection of 3D computer vision and computer graphics, particularly in developing machine learning tools that enable 3D content creation at scale and make an impact in real-world applications.
I graduated from Peking University in 2018 with a Bachelor's degree, where I was fortunate to work with Professor Liwei Wang. I also interned at Stanford, MSRA, and NVIDIA.
Email / CV / Google Scholar / Twitter / Github
Featured Research
My research addresses fundamental problems in 3D content creation, from 3D representations, to learning algorithms, to interactive control with text. It enables generating high-quality 3D meshes with textures and arbitrary topologies from multi-view images, within a generative model, or from text prompts. Much of my work has been successfully deployed in real-world applications, including GANVerse3D, Neural DriveSim, and the Toronto Annotation Suite.
Representative papers are highlighted; a full publication list is available on Google Scholar.
3D Generation
Magic3D: High-Resolution Text-to-3D Content Creation
Chen-Hsuan Lin*,
Jun Gao*,
Luming Tang*,
Towaki Takikawa*,
Xiaohui Zeng*,
Xun Huang,
Karsten Kreis,
Sanja Fidler†,
Ming-Yu Liu†, and
Tsung-Yi Lin
(*, † denote equal contribution)
CVPR, 2023
Project page /
arXiv
We create high-quality 3D meshes with textures from text prompts, utilizing a two-stage pipeline with different diffusion models for fast and high-resolution text-to-3D generation.
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images
Jun Gao,
Tianchang Shen,
Zian Wang,
Wenzheng Chen,
Kangxue Yin,
Daiqing Li,
Or Litany,
Zan Gojcic, and
Sanja Fidler
NeurIPS, 2022
Project page / Code / PDF /
arXiv / Two-Minute Paper Video (330k+ views)
We develop a 3D generative model that generates meshes with textures, bridging advances in differentiable surface modeling, differentiable rendering, and 2D GANs.
Extracting Triangular 3D Models, Materials, and Lighting From Images
Jacob Munkberg,
Jon Hasselgren,
Tianchang Shen,
Jun Gao,
Wenzheng Chen,
Alex Evans,
Thomas Müller, and
Sanja Fidler
CVPR, 2022   (Oral Presentation)
Project page /
arXiv /
PDF / Code / Two-Minute Paper Video (370k+ views)
Nvdiffrec reconstructs 3D meshes with materials from multi-view images by combining differentiable surface modeling with a differentiable renderer. The method supports NVIDIA Neural DriveSim.
Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering
Yuxuan Zhang*,
Wenzheng Chen*,
Huan Ling,
Jun Gao,
Yinan Zhang,
Antonio Torralba, and
Sanja Fidler
(* denotes equal contribution)
ICLR, 2021 (Oral Presentation)
Project page /
arXiv
We exploit a pretrained 2D StyleGAN to generate "pseudo" multi-view images for training a single-image 3D reconstruction network. We deployed it in the Omniverse AI Toybox (I led the effort).
2D/3D Representation
Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis
Tianchang Shen,
Jun Gao,
Kangxue Yin,
Ming-Yu Liu, and
Sanja Fidler
NeurIPS, 2021
Project page /
arXiv /
Video
DMTet differentiably extracts the iso-surface from an implicit field via Marching Tetrahedra. We demonstrate its effectiveness in generating surface details from coarse voxels.
Learning Deformable Tetrahedral Meshes for 3D Reconstruction
Jun Gao,
Wenzheng Chen,
Tommy Xiang,
Alec Jacobson,
Morgan McGuire, and
Sanja Fidler
NeurIPS, 2020
Project page /
Code /
arXiv
We introduce a tetrahedral mesh representation into 3D vision to support generating meshes with arbitrary topologies. A differentiable renderer is also devised for DefTet.
Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid
Jun Gao,
Zian Wang,
Jinchen Xuan, and
Sanja Fidler
ECCV, 2020
Project page /
arXiv /
Code /
Presentation Video (short, long)
We introduce a deformable grid representation for 2D images to better capture the high-frequency image content. We showcase applications including: replacing standard pooling methods with learnable geometric downsampling, semantic segmentation, object annotation, and unsupervised image partitioning.
Fast Interactive Object Annotation with Curve-GCN
Huan Ling*,
Jun Gao*,
Amlan Kar,
Wenzheng Chen, and
Sanja Fidler
(* denotes equal contribution)
CVPR, 2019
arXiv /
Code /
Video / Demo
We utilize polygons to represent object boundaries, facilitating interactive object annotation. Our work has been successfully integrated into the Toronto Annotation Suite, making real-world annotation faster!
DeepSpline: Data-Driven Reconstruction of Parametric Curves and Surfaces
Jun Gao,
Chengcheng Tang,
Vignesh Ganapathi-Subramanian,
Jiahui Huang,
Hao Su, and
Leonidas J. Guibas
arXiv preprint (non-peer-reviewed undergraduate work)
arXiv / Code
We predict spline curves to represent images (e.g., MNIST digits), and further extend the approach with surfaces of revolution to represent 3D objects (e.g., vases).
Differentiable Rendering
DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer
Wenzheng Chen,
Joey Litalien,
Jun Gao,
Zian Wang,
Clement Fuji Tsang,
Sameh Khamis,
Or Litany, and
Sanja Fidler
NeurIPS, 2021
Project page /
arXiv /
bibtex
DIB-R++ combines rasterization and ray-tracing to differentiably render images with specular reflections, enabling us to predict geometry, reflectance, and lighting from a single image.
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer
Wenzheng Chen,
Jun Gao*,
Huan Ling*,
Edward J. Smith*,
Jaakko Lehtinen,
Alec Jacobson, and
Sanja Fidler
(* denotes equal contribution)
NeurIPS, 2019
Project page /
arXiv /
Code /
Two-Minute Paper Video (240k+ views)
DIB-R is an interpolation-based differentiable renderer for 3D meshes that supports vertex positions, vertex colors, multiple lighting models, and texture mapping, and can be easily embedded in a neural network.
Academic Service
Area Chair: NeurIPS 2023
Conference Reviewer: CVPR, ICCV, ECCV, SIGGRAPH, NeurIPS, ICML, ICLR
Invited Talks
Towards Generative Modeling of 3D Objects Learned from Images.
ETH, Oxford, JHU, UofT, PKU, BAAI (2022)
Learning Geometric Representation for Computer Vision. GAMES-CN (in Chinese) [Link] (2020)
Learning Geometric Representation from Images. University of Alberta (2020)
Mentored Students/Interns
Tianchang Shen (PhD, CS UofT) Working on differentiable isosurfacing methods.
Weiwei Sun (PhD, CS UBC) Working on 3D generative models.
Zian Wang (PhD, CS UofT) Working on lighting decomposition with differentiable renderers.
Jinchen Xuan (Undergrad, CS PKU) Working on geometric image representation.
Gary Leung (MSc, CS UofT) Working on Transformers for images.
I borrowed the template from ✩, ✩, ✩, ✩.