Going Out with KinectFusion

KinectFusion: Realtime 3D Reconstruction and Interaction Using a Moving Depth Camera

-Shahram Izadi, David Kim, Otmar Hilliges, David Molyneaux, Richard Newcombe, Pushmeet Kohli, Jamie Shotton, Steve Hodges, Dustin Freeman, Andrew Davison, Andrew Fitzgibbon

Simply amazing. In this phenomenal paper Izadi and colleagues describe novel approaches to 3D scene reconstruction, and to interaction with the reconstructed scene, using commodity hardware. The first part of the paper describes techniques for obtaining a depth map from the Kinect sensor and reconstructing the scene from it. Each depth map is used to update a voxel grid representing the virtual scene. The reconstruction is initially grainy because of noise in the depth map. As the Kinect sensor moves around the scene, the newly acquired depth data, together with an estimate of the sensor's 6DOF pose, is fused into the partially built model. This process iteratively refines the scene, filling holes and averaging out the noise present in any single depth map. Secondly, the authors describe novel GPU algorithms that register and render the scene in real time; remarkably, all of the published results were obtained on a commodity NVIDIA GTX 470 GPU. Lastly, the paper describes an approach for distinguishing a user from the reconstructed scene and using this knowledge for AR interactions. The authors figured out a way to segment foreground, such as a user's hand, from the background. This allows the hand to be tracked as it comes into contact with the various scene surfaces, as shown by the 3D multi-touch paint application.
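To make the fusion idea concrete, here is a minimal sketch in Python of fusing depth maps into a voxel grid as a running weighted average, assuming a simple truncated signed distance (TSDF) representation. The grid size, truncation distance, camera parameters, and all names are my own illustrative assumptions, not the authors' GPU implementation (which parallelizes this work per voxel).

```python
import numpy as np

# Illustrative sketch of volumetric depth-map fusion into a voxel grid.
# All constants and names here are assumptions for demonstration only.

GRID = 64          # voxels per side of the cubic grid
VOXEL_SIZE = 0.02  # metres per voxel
TRUNC = 0.06       # truncation distance for the signed distance

tsdf = np.zeros((GRID, GRID, GRID), dtype=np.float32)   # fused signed distances
weights = np.zeros_like(tsdf)                           # per-voxel confidence

def integrate(depth, K, cam_pose):
    """Fuse one depth map into the grid, given the camera's 6DOF pose.

    depth:    HxW depth image in metres (0 = no measurement)
    K:        3x3 camera intrinsics
    cam_pose: 4x4 camera-to-world transform for this frame
    """
    world_to_cam = np.linalg.inv(cam_pose)

    # World coordinates of every voxel centre, transformed into the camera frame.
    idx = np.indices((GRID, GRID, GRID)).reshape(3, -1).T
    pts_w = (idx + 0.5) * VOXEL_SIZE
    pts_c = (world_to_cam[:3, :3] @ pts_w.T).T + world_to_cam[:3, 3]
    z = pts_c[:, 2]

    # Project voxel centres into the depth image.
    uv = (K @ pts_c.T).T
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    h, w = depth.shape
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    d = np.zeros_like(z)
    d[valid] = depth[v[valid], u[valid]]
    valid &= d > 0
    valid &= (d - z) >= -TRUNC   # skip voxels far behind the observed surface

    # Signed distance along the viewing ray, truncated to [-1, 1].
    sdf = np.clip((d - z) / TRUNC, -1.0, 1.0)

    # Running weighted average: noise cancels out as more frames are fused.
    flat_t, flat_w = tsdf.reshape(-1), weights.reshape(-1)
    flat_t[valid] = (flat_t[valid] * flat_w[valid] + sdf[valid]) / (flat_w[valid] + 1)
    flat_w[valid] += 1
```

The point of the running average is that per-voxel weights grow with every fused frame, so the random noise of any single depth map is averaged away while holes get filled in from new viewpoints.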

The results obtained in this work are truly breathtaking. I am pleased to see the accuracy of the reconstruction as demonstrated by Figure 4, in which the reconstructed model is printed on a 3D printer and closely resembles the original object. The project also breaks new ground in the AR domain by tracking 3D interactions with real-world objects.


 

Going out: Robust Model-based Tracking for Outdoor Augmented Reality

-Gerhard Reitmayr, Tom W. Drummond

In this paper Reitmayr and Drummond describe a robust visual tracking system for augmented reality applications intended for outdoor use. The system relies on the availability of textured 3D models of the real-world objects to be tracked. Data from gyroscopes and magnetic sensors provide an estimate of the camera pose, from which the system renders the textured models already available to it. The rendering is then matched against the live video feed using edge tracking to refine the pose. The system also has a fallback mechanism based on previously recorded camera poses and video frames, which ensures robust recovery when tracking is lost due to occlusions.
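The recovery mechanism is the part I found most practical. Below is a minimal sketch, in Python, of how such a keyframe-based fallback might be structured: while tracking confidence is high the loop stores (pose, downsampled frame) pairs, and when confidence drops it reinitializes from the best-matching stored frame. The thumbnail descriptor, the thresholds, and the `track_frame` callback standing in for the paper's sensor-primed edge tracker are all my own assumptions, not the authors' implementation.

```python
import numpy as np

# Illustrative sketch of a keyframe-based recovery loop for model-based tracking.

class RecoveryBuffer:
    def __init__(self, max_frames=100):
        self.frames = []             # list of (4x4 pose, thumbnail) pairs
        self.max_frames = max_frames

    @staticmethod
    def thumbnail(frame, size=16):
        """Downsample a grayscale frame to a small, blur-tolerant descriptor."""
        h, w = frame.shape
        cropped = frame[:h - h % size, :w - w % size]
        return cropped.reshape(size, h // size, size, w // size).mean(axis=(1, 3))

    def store(self, pose, frame):
        if len(self.frames) >= self.max_frames:
            self.frames.pop(0)
        self.frames.append((pose.copy(), self.thumbnail(frame)))

    def relocalize(self, frame):
        """Return the stored pose whose thumbnail best matches the frame."""
        query = self.thumbnail(frame)
        scores = [np.sum((query - t) ** 2) for _, t in self.frames]
        return self.frames[int(np.argmin(scores))][0]


def tracking_loop(frames, track_frame, buffer, confidence_threshold=0.5):
    """Toy main loop: track each frame; fall back to relocalisation on failure.

    track_frame(frame, prior_pose) -> (pose, confidence) is a hypothetical
    stand-in for the sensor-primed, edge-based tracker described in the paper.
    """
    pose = np.eye(4)
    for frame in frames:
        pose, confidence = track_frame(frame, pose)
        if confidence >= confidence_threshold:
            buffer.store(pose, frame)          # tracking healthy: remember it
        elif buffer.frames:
            pose = buffer.relocalize(frame)    # tracking lost: recover
    return pose
```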

Although the system works well in its described use-case scenarios, I am a bit skeptical about its deployment. The unavailability of textured 3D models for many buildings and objects of interest will definitely hamper the practical use of such a system. Also, some of the open questions mentioned by the authors, such as initializing the system from inaccurate GPS coordinates, need to be answered before any AR applications are built on top of it.
