Week 5 Summaries

Week 5 KinectFusion

KinectFusion is an awesome tool for 3D surface reconstruction using the Kinect camera. It works in real time, fusing every depth frame it receives into a single reconstruction. The depth maps are a little noisy, but the pipeline averages and smooths them to overcome that. Depth cameras have been around for a while, but Kinect made them accessible and affordable, which opens up a lot of options. Existing methods are either offline or non-interactive, whereas KinectFusion is both interactive and real-time: one can carry the camera around the room and reconstruct the whole scene in 3D as the system gathers more data.

Adding extensions to the core GPU pipeline allowed for object segmentation and user interaction directly in front of the camera, without degrading camera tracking and reconstruction. This also enabled multi-touch interactions on any surface. The pipeline consists of Depth Map Conversion, Camera Tracking, Volumetric Integration and Raycasting. The authors also explored geometry-aware AR, where a 3D virtual world is overlaid onto, and interacts with, the reconstructed real-world geometry. Combined with physics simulation, this can be used to create a more immersive VR or AR experience.
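
To make the Volumetric Integration stage a bit more concrete, here is a minimal Python sketch of fusing one depth frame into a truncated signed distance function (TSDF) volume with a running weighted average, which is the core idea behind that stage. The volume resolution, truncation distance and all variable names are assumptions for illustration, not the paper's actual GPU implementation.

```python
import numpy as np

# Hypothetical volume parameters (the real system uses a GPU volume).
RES = 64             # voxels per axis
VOXEL_SIZE = 0.03    # metres per voxel
TRUNC = 0.09         # truncation distance in metres

tsdf = np.ones((RES, RES, RES), dtype=np.float32)      # signed distances, init to +1
weights = np.zeros((RES, RES, RES), dtype=np.float32)  # per-voxel fusion weights

def integrate(depth, K, cam_pose):
    """Fuse one depth map (H x W, metres) into the global TSDF volume.

    K is the 3x3 camera intrinsic matrix; cam_pose is the 4x4 camera-to-world
    transform produced by the camera-tracking stage.
    """
    # Voxel centres in world coordinates.
    idx = np.indices((RES, RES, RES)).reshape(3, -1).T
    world = (idx + 0.5) * VOXEL_SIZE

    # Transform voxel centres into the camera frame and project into the image.
    world_to_cam = np.linalg.inv(cam_pose)
    cam = world @ world_to_cam[:3, :3].T + world_to_cam[:3, 3]
    z = cam[:, 2]
    u = np.round(K[0, 0] * cam[:, 0] / np.maximum(z, 1e-6) + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * cam[:, 1] / np.maximum(z, 1e-6) + K[1, 2]).astype(int)

    # Keep voxels that project onto a valid depth measurement.
    h, w = depth.shape
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.zeros_like(z)
    d[valid] = depth[v[valid], u[valid]]
    valid &= d > 0

    # Signed distance along the ray; only fuse voxels in front of,
    # or just behind, the observed surface.
    sdf_raw = d - z
    valid &= sdf_raw >= -TRUNC
    sdf = np.clip(sdf_raw, -TRUNC, TRUNC) / TRUNC

    ix, iy, iz = idx[valid, 0], idx[valid, 1], idx[valid, 2]
    w_old = weights[ix, iy, iz]
    # Running weighted average fuses the new frame with everything seen so far.
    tsdf[ix, iy, iz] = (tsdf[ix, iy, iz] * w_old + sdf[valid]) / (w_old + 1)
    weights[ix, iy, iz] = w_old + 1
```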

The Kinect dev-kit being made openly available has led to some really interesting ideas, and this could be one of the best. To be able to reconstruct and interact with a virtual world in real time using a cheap depth camera like the Kinect is an awesome achievement.

Question: How is the scene affected if you move really fast or throw something from one side to another?

 

Week 5 Going Out: Robust Model-based Tracking for Outdoor Augmented Reality

KinectFusion is awesome indoors, but it does not work outdoors. This paper does not directly address or even use KinectFusion; it is an attempt at providing a better AR experience outdoors. Traditionally, outdoor AR systems use GPS, magnetic compasses, inertial sensors, etc. for tracking, and even though advances are being made in each of these, none of them alone provides a good answer to high-resolution tracking. These technologies especially struggle in an urban setting. So the authors combine several of them (an edge-based visual tracker, a gyroscope, and gravity and magnetic sensors) into a hybrid tracking system which enables real-time, accurate overlays on mobile devices.
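
As a toy illustration of why such a hybrid helps, here is a small Python sketch of fusing a drifting gyroscope heading with an occasional drift-free absolute heading (for example from the visual tracker or compass). The paper fuses its sensors with a proper Kalman-style filter; this simple complementary filter, with an assumed gain ALPHA, only shows the basic principle.

```python
import numpy as np

ALPHA = 0.02  # assumed gain: how strongly an absolute measurement corrects the gyro estimate

def fuse_heading(heading, gyro_rate, dt, absolute_heading=None):
    """Return an updated heading estimate in radians."""
    # Predict: integrate the gyroscope rate (fast, but drifts over time).
    heading = heading + gyro_rate * dt
    # Correct: nudge the estimate toward the drift-free measurement when one is available.
    if absolute_heading is not None:
        error = np.arctan2(np.sin(absolute_heading - heading),
                           np.cos(absolute_heading - heading))
        heading = heading + ALPHA * error
    return heading
```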

The system is an edge-based tracker extended to textured 3D models. For each frame, it renders the model from the predicted camera pose, extracts edges from that rendering, overlays them on the video image to find matching edges in the greyscale frame, and updates the pose estimate with those measurements. The recovery component overcomes the problem of dynamic occluders, which cause the tracking system to fail: it compares the current video frame to older, stored frames. If there is a match, the system re-estimates the current position and orientation from it; otherwise the frame is treated as a failure and the velocities are set to zero. Combining these visual measurements with the sensor data gives a more accurate and “correct” AR experience. For testing, the authors focused on accuracy, robustness, performance and dynamic behavior.
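
As a rough sketch of that recovery idea, the snippet below keeps a small store of earlier frames together with the poses that were valid for them and, when tracking fails, searches for the stored frame most similar to the current one. The similarity measure (normalised cross-correlation on downsampled greyscale images) and the threshold are my assumptions for illustration, not necessarily the paper's exact matching method.

```python
import numpy as np

# (small_grey_image, pose) pairs recorded while tracking is healthy.
keyframes = []

def ncc(a, b):
    """Normalised cross-correlation between two equal-sized greyscale images."""
    a = (a - a.mean()) / (a.std() + 1e-6)
    b = (b - b.mean()) / (b.std() + 1e-6)
    return float((a * b).mean())

def recover(current_small_grey, threshold=0.7):
    """Return a stored pose to restart tracking from, or None to declare failure
    (in which case the caller sets the motion-model velocities to zero)."""
    best_score, best_pose = -1.0, None
    for img, pose in keyframes:
        score = ncc(current_small_grey, img)
        if score > best_score:
            best_score, best_pose = score, pose
    return best_pose if best_score >= threshold else None
```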

The interesting realization at the end of this paper, for me, is how different the two worlds, indoors and outdoors, really are. The systems required for each are completely different because the problems they face are completely different.

Question: Is there a limitation with edge tracking if some place has 2 identical features? Will the tracking system alone be able to make up for this lack of information?
