Alexander Gao
CV | Google Scholar | GitHub
I am a second-year PhD student in Computer Science at the University of Maryland, College Park, advised by Ming C. Lin.
My research draws from a variety of computer science subfields, including computer graphics, machine learning & optimization, scientific computing, and computational geometry, with applications in areas such as AR/VR and robotics. Specifically, I aim to develop novel algorithms for simulation and rendering by bridging classical geometric techniques and physically-based modeling with learning-based paradigms and neural representations.
I previously interned on the Google GeoAR team and worked as an Applied Scientist at AWS Robotics. I received an MS in Computer Science from NYU and a BA in Film Production from the USC School of Cinematic Arts.
Don't hesitate to reach out at gaoalexander [at] gmail dot com.
News:
Jan 2023: Learning Simultaneous Navigation and Construction was accepted to ICLR 2023. Congrats to lead author Wenyu Han and all.
Dec 2022: We presented our paper NeuPhysics at NeurIPS 2022 in New Orleans!
Learning Simultaneous Navigation and Construction in Grid Worlds
International Conference on Learning Representations (ICLR), 2023
Wenyu Han, Haoran Wu, Eisuke Hirota, Alexander Gao, Lerrel Pinto, Ludovic Righetti, and Chen Feng.
Paper / Code / Website
We propose to study a new learning task, mobile construction, to enable an agent to build
designed structures in 1/2/3D grid worlds while navigating in the same evolving environments.
Unlike existing robot learning tasks such as visual navigation and object manipulation, this
task is challenging because of the interdependence between accurate localization and
strategic construction planning. Seeking generic and adaptive deep reinforcement learning (RL) solutions to this partially observable Markov decision process (POMDP), we design a Deep Recurrent Q-Network (DRQN) with explicit recurrent position estimation in this dynamic
grid world. Our extensive experiments show that pre-training this position estimation module
before Q-learning can significantly improve the construction performance measured by the
intersection-over-union score, achieving the best results in our benchmark of various
baselines including model-free and model-based RL, a handcrafted SLAM-based policy, and human
players.
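The core architecture is compact enough to sketch. Below is a minimal, illustrative PyTorch version of a DRQN with an explicit recurrent position-estimation head, not the released code: the layer sizes, the GRU choice, the (row, col) pose parameterization, and all default dimensions are assumptions made for concreteness.

```python
# Illustrative sketch only (not the paper's released code). A conv encoder
# feeds a GRU; two heads share the recurrent state: Q-values and an explicit
# position estimate that can be pre-trained supervised before Q-learning.
import torch
import torch.nn as nn

class DRQNWithPose(nn.Module):
    def __init__(self, grid_channels=2, grid_size=11, hidden_dim=128, num_actions=6):
        super().__init__()
        # Convolutional encoder for the grid observation (sizes are assumed).
        self.encoder = nn.Sequential(
            nn.Conv2d(grid_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * grid_size * grid_size, hidden_dim), nn.ReLU(),
        )
        # Recurrence aggregates observations over time (partial observability).
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        # Explicit position estimate; pre-trained with a supervised loss on
        # ground-truth agent positions before the TD objective takes over.
        self.pose_head = nn.Linear(hidden_dim, 2)        # predicted (row, col)
        self.q_head = nn.Linear(hidden_dim, num_actions)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, channels, H, W)
        b, t = obs_seq.shape[:2]
        feats = self.encoder(obs_seq.flatten(0, 1)).view(b, t, -1)
        out, h = self.gru(feats, h0)
        return self.q_head(out), self.pose_head(out), h

# Pre-training phase: fit the pose head alone (dummy targets shown here).
model = DRQNWithPose()
obs = torch.randn(4, 8, 2, 11, 11)                       # 4 episodes, 8 steps
q_values, pose, _ = model(obs)
pose_loss = nn.functional.mse_loss(pose, torch.zeros_like(pose))
```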
NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos
Conference on Neural Information Processing Systems (NeurIPS), 2022
Alexander Gao*, Yi-Ling Qiao*, and Ming C. Lin.
Paper / Code / Website
We present a method for learning 3D geometry and physics parameters of a dynamic scene from
only a monocular RGB video input. To decouple the learning of underlying scene geometry from
dynamic motion, we represent the scene as a time-invariant signed distance function (SDF)
which serves as a reference frame, along with a time-conditioned deformation field. We
further bridge this neural geometry representation with a differentiable physics simulator by
designing a two-way conversion between the neural field and its corresponding hexahedral
mesh, enabling us to estimate physics parameters from the source video by minimizing a cycle
consistency loss. Our method also allows a user to interactively edit 3D objects from the
source video by modifying the recovered hexahedral mesh, and propagating the operation back
to the neural field representation. Experiments show that our method achieves superior mesh
and video reconstruction of dynamic scenes compared to other competitive neural field
approaches, and we provide extensive examples that demonstrate its ability to extract useful
3D representations from videos captured with consumer-grade cameras.
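To make the representation concrete, here is a minimal PyTorch sketch of the idea described above, a time-invariant SDF queried through a time-conditioned deformation field. This is not the released NeuPhysics code: the MLP widths and depths, the residual formulation of the deformation, and the omission of positional encoding are simplifying assumptions.

```python
# Illustrative sketch only (not the released NeuPhysics code).
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, width=256, depth=4):
    """Plain ReLU MLP; widths/depths are assumed, not the paper's."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

class DeformableSDF(nn.Module):
    def __init__(self):
        super().__init__()
        # Time-conditioned deformation: maps an observed point at time t
        # into the shared canonical (time-invariant) reference frame.
        self.deform = mlp(3 + 1, 3)
        # Geometry lives entirely in the canonical frame as an SDF.
        self.sdf = mlp(3, 1)

    def forward(self, x, t):
        # x: (N, 3) query points; t: (N, 1) normalized timestamps.
        x_canonical = x + self.deform(torch.cat([x, t], dim=-1))
        return self.sdf(x_canonical)

# Query the same points at two times; motion comes from deformation alone,
# so the underlying geometry is shared across the whole video.
net = DeformableSDF()
pts = torch.rand(1024, 3) * 2 - 1
d0 = net(pts, torch.zeros(1024, 1))
d1 = net(pts, torch.full((1024, 1), 0.5))
```

Because all frames share one canonical SDF, an edit applied to the recovered hexahedral mesh in the canonical frame propagates to every timestep, which is what enables the interactive editing described above.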
*Denotes equal contribution.