Many people have experimented with using the Kinect for more than just user interaction. One thing that I have been very interested in is extracting point clouds from the device.
People at the ROS (ros.org) project have gone to some trouble to determine empirical calibration parameters for the depth camera (disparity to real-world depth) here, and Nicolas Burrus has posted parameters for the RGB camera (relationship between the depth image and the RGB image) here.
Putting those together, one can take the depth image from the Kinect and turn it into a metric point cloud with real distances. Those points can then be projected back to the RGB camera centre to determine which RGB pixel corresponds to each depth point, and hence arrive at a colour for each point in the cloud. This lets the surfaces captured in the image appear textured. With a bit of coding I came up with this:
A coloured metric point cloud taken inside my house. (That’s my hand in the foreground.)
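The core of that code boils down to two steps: convert each raw disparity value to a metric depth using the empirical fit, then back-project the pixel into 3D and reproject it into the RGB camera to pick up its colour. Here is a minimal sketch of the idea in Python (my actual code is .NET); the intrinsics and the depth-to-RGB transform below are rough placeholders for the calibration values on the pages linked above:

```python
import numpy as np

# Rough placeholder calibration values -- the real ones come from the ROS and
# Nicolas Burrus pages linked above.
FX_D, FY_D, CX_D, CY_D = 594.2, 591.0, 339.3, 242.7          # depth camera intrinsics
FX_RGB, FY_RGB, CX_RGB, CY_RGB = 529.2, 525.6, 328.9, 267.5  # RGB camera intrinsics
R = np.eye(3)                     # depth -> RGB rotation (close to identity on the Kinect)
T = np.array([0.025, 0.0, 0.0])   # depth -> RGB translation in metres (approximate)

def disparity_to_depth(raw):
    """Empirical fit: raw 11-bit Kinect disparity -> depth in metres."""
    return 1.0 / (raw * -0.0030711016 + 3.3309495161)

def depth_pixel_to_point(u, v, z):
    """Back-project depth pixel (u, v) at depth z into the depth camera frame."""
    return np.array([(u - CX_D) * z / FX_D, (v - CY_D) * z / FY_D, z])

def colour_for_point(p, rgb_image):
    """Project a 3D point into the RGB image and return the pixel under it (or None)."""
    x, y, z = R @ p + T
    u = int(round(x * FX_RGB / z + CX_RGB))
    v = int(round(y * FY_RGB / z + CY_RGB))
    h, w = rgb_image.shape[:2]
    return rgb_image[v, u] if 0 <= u < w and 0 <= v < h else None
```

Iterating that over every pixel of a 640×480 depth frame gives a coloured cloud like the one above.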
One thing that I haven’t seen explored much is how the Kinect fares when collecting point clouds outdoors. The claimed maximum range of the Kinect is 4m, but my device has been able to reach more than 5.5m indoors.
Because the Kinect operates using infrared structured light, infrared interference can reduce the range significantly or even result in no depth image at all. This is a problem when using the device outside, as daytime sunlight plays havoc with the depth image returned by the Kinect. Of course, in the dark you will get a great depth image but no RGB image to colour the cloud!
There is a YouTube video posted by some robotics students showing how the sensor operates in sunlight:
Inspired by the video I decided to try it for myself – so I attached a Kinect to my car…
Using the software I had already written I could capture point clouds with metric distances relative to the Kinect. However, since the Kinect itself is moving I wanted a different output: a real-world point cloud that spans many depth frames. Collecting all the information needed to reconstruct a spatially located 3D world meant I had to write a bit more software…
To spatially locate my car (and hence the Kinect itself) I used the GPS in my Google Nexus One along with BlueNMEA. This allowed me to get NMEA strings from the GPS in the phone via a TCP connection and log them. Using that information I could locate each depth frame and image frame and build a point cloud in a real-world coordinate system (so every point has the equivalent of a latitude, longitude, and altitude).
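A sketch of the NMEA side, again in Python: it connects to BlueNMEA over TCP, picks out the $GPGGA sentences, and yields a time, latitude, longitude, and altitude for tagging frames. The port number is a guess (use whatever the app reports) and checksums are not validated:

```python
import socket

def nmea_to_degrees(raw, hemisphere):
    """Convert an NMEA ddmm.mmmm / dddmm.mmmm field to signed decimal degrees."""
    value = float(raw)
    degrees = int(value // 100)
    minutes = value - degrees * 100
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal

def gga_fixes(host, port=4352):  # port is an assumption -- check what BlueNMEA reports
    """Yield (utc_time, lat, lon, altitude_m) from $GPGGA sentences on a TCP stream."""
    with socket.create_connection((host, port)) as sock:
        buf = b""
        while True:
            data = sock.recv(4096)
            if not data:
                return
            buf += data
            while b"\r\n" in buf:
                line, buf = buf.split(b"\r\n", 1)
                fields = line.decode("ascii", "ignore").split(",")
                if fields[0].endswith("GGA") and len(fields) > 9 and fields[6] != "0":
                    yield (fields[1],                                # UTC time
                           nmea_to_degrees(fields[2], fields[3]),    # latitude
                           nmea_to_degrees(fields[4], fields[5]),    # longitude
                           float(fields[9]))                         # altitude in metres
```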
My software talks to the Kinect and the phone in real time and logs all the data needed to export a point cloud. I wrote an exporter for the PLY format so I could easily view the data in the awesome open-source MeshLab.
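The ASCII flavour of PLY is simple enough that an exporter is only a few lines. Something along these lines (not my exact exporter) produces files that MeshLab opens directly:

```python
def write_ply(path, points):
    """Write coloured points as ASCII PLY; `points` is a list of (x, y, z, r, g, b) tuples."""
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        for x, y, z, r, g, b in points:
            f.write(f"{x:.4f} {y:.4f} {z:.4f} {r} {g} {b}\n")
```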
In the end I was able to capture some pretty cool looking things like this nice white picket fence:
Combining a sequence of depth frames gives an idea of the power of the method. Here is a 26m section of road, captured while travelling at speed:
These points are all in real-world coordinates and could be placed, say, on Google Earth and would appear in the right place. The point cloud is a bit messy because I did not have easy access to gyroscopes or accelerometers to track the motion of the car. Perhaps this is a good excuse to purchase a Nexus S! I did not bother to access the accelerometers in the Nexus One because it doesn’t have a gyro, so the outputs are of limited use for dead reckoning.
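For completeness, here is roughly how a Kinect-relative point ends up with world coordinates, ignoring the pitch and roll that a gyro would have provided. The sketch uses pyproj as a Python stand-in for the spatial transforms, and the UTM zone, axis conventions, and heading source are all assumptions:

```python
import math
from pyproj import Transformer

# WGS84 lat/lon -> metric UTM; EPSG:32756 (UTM zone 56S) is only an example --
# pick the zone covering your location.
to_metric = Transformer.from_crs("EPSG:4326", "EPSG:32756", always_xy=True)

def georeference(points_kinect, lat, lon, alt, heading_deg):
    """Shift Kinect-relative points (x right, y up, z forward, in metres) into
    projected world coordinates, given the GPS fix and the car's heading."""
    east0, north0 = to_metric.transform(lon, lat)
    h = math.radians(heading_deg)            # heading measured clockwise from north
    world = []
    for x, y, z in points_kinect:
        # rotate the local right/forward axes into east/north by the heading
        east = east0 + z * math.sin(h) + x * math.cos(h)
        north = north0 + z * math.cos(h) - x * math.sin(h)
        world.append((east, north, alt + y))
    return world
```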
The project uses libfreenect and its .NET wrapper, along with OpenTK for the matrix maths and Proj.NET for the spatial transforms. All amazing libraries, and I’m in awe of the developers who spend their time maintaining them.
The code will live on GitHub here. It’s very hacky and mostly useful as a proof-of-concept. If you’re going to do anything with it, please wear a helmet.
Update #1: Reaction to Slashdotting