Research Focus


Multi-view Reconstruction and Visual SLAM

Illustration of visual SLAM

Simultaneous Localization and Mapping (SLAM) involves inferring the camera position and scene structure from an input image sequence. Our recent work in this area has focused on bridging the gap between geometric methods and deep learning. The overarching goal is to unify geometry and learning in a way that combines the accuracy of traditional geometric approaches with the robustness of deep learning, with the aim of making 3D reconstructions both more accurate and more robust.


Scene Representations and Generative Models

Generative scene representations

Scene representations are an essential part of machine perception, as they give a computational interpretation of how scenes are structured. Two of our relevant contributions in this area are Volumetric Bundle Adjustment and CodeSLAM. CodeSLAM introduces a latent variable model that compresses scene geometry into a compact code, allowing for efficient and detailed reconstructions. Volumetric Bundle Adjustment uses a sparse data structure to create an efficient volumetric representation of space, providing photorealistic scene reconstruction with reduced memory and computational requirements.


Pose and Semantics

Photo showing human pose estimation with 3D skeleton

As robots and devices become more capable, there’s a growing need to build a higher-level understanding of the environment in order to perform ever more complex tasks. One example is understanding what humans are doing: our work on Orientation Keypoints aims to discern and predict human body configurations from images or video. Other examples of our work in this area include Fusion++, which builds graph-structured representations of scenes based on objects. This intersection of geometric vision and semantics paves the way for a more holistic understanding of visual content.


Physical Science Applications

Digital twin of cloud fields

Beyond traditional computer vision applications, there’s much promise in using computer vision in the physical sciences, where its techniques prove to be versatile. One notable example of such interdisciplinary collaboration is the use of stereo camera pairs to reconstruct cloud fields. This method allows for a detailed 3D representation of atmospheric cloud structures, providing valuable data for meteorological analysis and climate studies. It shows the potential of combining computer vision methodologies with physical science research and opens the door to further applications.
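
As a rough illustration of the geometry behind stereo reconstruction, the sketch below converts pixel disparities into depth using the standard rectified-stereo relation Z = f · B / d. It is a minimal example under simplifying assumptions (an ideal, rectified camera pair with known focal length and baseline), not a description of the actual cloud reconstruction pipeline; the function name, parameter names, and numbers are hypothetical.

```python
import numpy as np


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d.

    disparity_px: disparities in pixels (scalar or array-like)
    focal_px:     focal length in pixels (assumed equal for both cameras)
    baseline_m:   distance between the two camera centers in meters
    """
    d = np.atleast_1d(np.asarray(disparity_px, dtype=float))
    # Zero or negative disparity means no valid match (or a point at infinity).
    depth = np.full(d.shape, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth


# Hypothetical numbers: a wide baseline makes distant clouds measurable.
depths = depth_from_disparity([10.0, 25.0, 0.0], focal_px=1000.0, baseline_m=50.0)
```

Because depth is inversely proportional to disparity, distant structures such as clouds produce small disparities, which is why a wide baseline between the two cameras helps in this kind of setting.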