Week 5 Summaries: KinectFusion and Going Out

KinectFusion: Realtime 3D Reconstruction and Interaction Using a Moving Depth Camera

This paper describes a currently trending technology: KinectFusion. KinectFusion creates 3D reconstructions of an indoor scene in real time, within seconds, using only the depth data from a moving Kinect camera. The raw depth maps are noisy and contain numerous “holes”; KinectFusion deals with this by continuously tracking the 6DOF pose of the camera and fusing new viewpoints into a global model in real time, so the reconstruction is refined over time as depth data from different viewpoints is gathered. What sets KinectFusion apart from existing reconstruction systems, which are either offline or non-interactive, is that it supports real-time interactive rates for both camera tracking and 3D reconstruction. It can also be used as a low-cost object scanner, with the reconstructed 3D models imported into a modeling application or used for 3D printing.
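As a rough illustration of how fusing repeated, noisy depth measurements refines the model, here is a minimal sketch of a weighted running average for a single voxel, in the spirit of volumetric fusion; the function name, weights, and values are my own for illustration, not taken from the paper.

```python
def fuse_voxel(sdf, weight, new_sdf, new_weight=1.0, max_weight=64.0):
    """Weighted running average of signed-distance measurements for one voxel
    (illustrative only; not the paper's GPU implementation)."""
    fused = (weight * sdf + new_weight * new_sdf) / (weight + new_weight)
    return fused, min(weight + new_weight, max_weight)

# A voxel near a surface, observed from several noisy viewpoints (metres).
sdf, weight = 0.0, 0.0
for measurement in [0.012, -0.004, 0.007, 0.001]:
    sdf, weight = fuse_voxel(sdf, weight, measurement)
print(sdf, weight)  # the fused value settles close to the true surface
```

Averaging many views in this way is what lets the noise and “holes” of any single depth map wash out over time.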

The features of KinectFusion that appeal to me are that it is small, portable, and low cost; it can deal with dynamically changing scenes; it supports whole-room reconstruction and interaction; and it can segment the foreground from the background, enabling direct interaction between the user and virtual objects. This opens up a whole new range of applications, such as real-time multi-touch interaction on any surface, geometry-aware AR, and physics-based interactions.

Beyond discussing the features and applications of KinectFusion, the paper also describes the underlying GPU pipeline and algorithms in detail. The pipeline consists of four stages: Depth Map Conversion, Camera Tracking, Volumetric Integration, and Raycasting.
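As a concrete example of the first stage, here is a sketch of depth map conversion, back-projecting each depth pixel into a camera-space vertex under a pinhole camera model; the intrinsics and image size below are made up for illustration rather than taken from the paper.

```python
import numpy as np

def depth_to_vertices(depth, fx, fy, cx, cy):
    """Back-project a depth map (metres) into a camera-space vertex map,
    as in the depth-map-conversion stage (intrinsics here are assumed)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.dstack((x, y, depth))  # h x w x 3 vertex map

# Toy 4x4 depth image; real Kinect frames are 640x480.
depth = np.full((4, 4), 1.5)
vertices = depth_to_vertices(depth, fx=525.0, fy=525.0, cx=2.0, cy=2.0)
print(vertices.shape)  # (4, 4, 3)
```

The later stages then align each new vertex map against the model (camera tracking), merge it into the volume (volumetric integration), and render the fused surface back out (raycasting).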

I am amazed by KinectFusion’s capabilities and would like to know what prevents it from being used outdoors. The paper mentions that it can be used for indoor scenes but does not describe the problems that would be faced outdoors.


Going out: Robust Model-based Tracking for Outdoor Augmented Reality

Augmented Reality involves overlaying virtual information and objects on the real world, so accurate position and orientation information is required. Traditional outdoor AR systems rely on GPS for position measurements and on magnetic compasses and inertial sensors for orientation. However, in urban settings, shadowing from buildings and signal reflections degrade GPS performance, and the inertial and magnetic sensors are also more prone to errors and drift.

To overcome these problems, this paper describes a model-based hybrid tracking system for AR in urban environments. It is a computer-vision-based approach that relies on existing textured 3D models of the world. The advantages of using a textured 3D model over a pure edge model are reduced complexity and the elimination of wrong data associations and incorrect pose estimates. The overall tracking framework consists of an edge-based tracking system combined with a sensor providing inertial and magnetic field measurements; a Kalman filter fuses the measurements from both components. The tracking process consists of rendering the view from a prior camera pose and extracting edges from the resulting grayscale image; these edges are then overlaid on the video image, and the resulting measurements update the prior pose to yield a posterior pose.
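To make the fusion step concrete, here is a minimal scalar Kalman measurement update of the kind such a filter performs; the paper's filter operates on the full 6DOF camera pose, and the headings and variances below are invented purely for illustration.

```python
def kalman_update(x, p, z, r):
    """Standard scalar Kalman measurement update: fuse the current estimate
    (mean x, variance p) with a measurement z of variance r."""
    k = p / (p + r)                 # Kalman gain
    return x + k * (z - x), (1 - k) * p

# Illustrative only: fuse a heading (degrees) predicted from the inertial and
# magnetic sensors with a heading measured by the edge-based visual tracker.
heading, var = 90.0, 25.0                                   # sensor-based prior
heading, var = kalman_update(heading, var, z=84.0, r=4.0)   # vision measurement
print(round(heading, 1), round(var, 1))  # posterior leans toward the vision estimate
```

The point of the fusion is that the drift-prone but always-available sensors keep the estimate reasonable, while the more accurate vision measurements pull it back toward the truth whenever the tracker has a good lock.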

The system was demonstrated with a simple location-based game in which the goal is to deliver a love letter by finding a ladder and placing it at the right window in the environment within an allotted time. For the demonstration, two sites were modeled and the performance of the system was measured; it operated at 15-17 frames per second. The system was quite accurate, with two main flaws: buildings appeared closer than they really were, and the estimated camera trajectory deviated from the true line. It was also quite robust, with a recovery mechanism in place to handle occlusions of the view by passing vehicles or people.

Although their approach worked for the game, one question that bothers me is whether this technique is practical for outdoor AR, since it requires textured 3D models of an environment before it can be used there. Given the complexity and size of the world around us, I wonder whether 3D models of the entire world can ever be made available.
