MIT research could turn your camera into the ultimate spy gadget


posted Wednesday, August 6, 2014 at 11:07 AM EST


Famous British sci-fi author Arthur C. Clarke once wrote that "any sufficiently advanced technology is indistinguishable from magic". Well, dear readers, we have a little piece of magic for your delectation today, courtesy of research from MIT, Microsoft, and Adobe. Together, the trio of tech titans have developed algorithms capable of reconstructing sound from a video feed that entirely lacks audio -- even if there is soundproof glass between the source of the sound and the camera recording it. Bizarrely, the key could be something as simple as a bag of potato chips.

In layman's terms, the algorithms detect minute vibrations in the scene being filmed that were caused by sound waves, and then use this nearly invisible subject motion to reconstruct a facsimile of the sound that caused it in the first place. And when we say nearly invisible, we mean it: from a human perspective, the scene appears completely static. We're talking about motions here that are on the order of a hundredth of a pixel, according to the researchers behind the tech. It's similar to a technique actually used in spying, where a laser is bounced off a window and its motion recorded to detect sounds beyond the pane of glass. The difference here is that regular video is enough to recreate the sound at your leisure, given the presence of an object in the scene that can be affected by the passing of sound waves.
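To make the idea of measuring hundredth-of-a-pixel motion a little more concrete, here is a minimal Python sketch. It is not the researchers' algorithm, which this article does not detail; it swaps in a simple gradient-based least-squares estimate on a simulated one-dimensional edge. The edge shape, the 2200 fps frame rate, the 440 Hz tone, and every function name are assumptions made up for illustration.

```python
import math

def edge_profile(x, shift=0.0):
    """Brightness of a smooth edge at pixel x, displaced by `shift` pixels (made-up scene)."""
    return 1.0 / (1.0 + math.exp(-(x - 32.0 - shift) / 3.0))

def render_frame(shift, width=64):
    """One simulated row of pixels with the edge moved by `shift` pixels."""
    return [edge_profile(x, shift) for x in range(width)]

def estimate_shift(reference, frame):
    """Least-squares sub-pixel shift from a first-order (gradient) approximation.

    For a tiny shift d, frame[x] is roughly reference[x] - d * gradient[x],
    so d is roughly sum(gradient * (reference - frame)) / sum(gradient ** 2).
    """
    num = 0.0
    den = 0.0
    for x in range(1, len(reference) - 1):
        grad = (reference[x + 1] - reference[x - 1]) / 2.0  # central difference
        num += grad * (reference[x] - frame[x])
        den += grad * grad
    return num / den

def recover_signal(frames):
    """Per-frame displacement relative to the first frame: a stand-in for the 'sound'."""
    reference = frames[0]
    return [estimate_shift(reference, frame) for frame in frames]

# Assumed test signal: a 440 Hz vibration of 0.01 pixel, filmed at 2200 frames per second.
FPS = 2200
TONE_HZ = 440
AMPLITUDE_PX = 0.01
true_motion = [AMPLITUDE_PX * math.sin(2 * math.pi * TONE_HZ * n / FPS) for n in range(50)]

frames = [render_frame(d) for d in true_motion]
recovered = recover_signal(frames)
```

In this noise-free toy, `recovered` tracks `true_motion` closely even though no pixel changes brightness by more than a fraction of a percent. Real footage adds sensor noise, which is presumably why an object that responds readily to sound waves matters so much.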

Research from MIT, Microsoft and Adobe makes it possible to recreate sound from video shot through a soundproof barrier -- or even without a soundtrack at all.

The really interesting thing is that, although you obviously get better results when using specialized, high-speed cameras that are capable of capturing higher frequencies of motion in the first place, the technique can be extended to work at the frame rates typical of video from many consumer cameras. As you'll see in the video above, it's possible to get recognizable results from 60p DSLR video, recovering frequencies more than five times higher than the capture frame rate. The trick here is that in addition to looking at subject motion in each distinct frame of video, the algorithms use knowledge of the rolling shutter effect typical of consumer cameras, in which the sensor is read out row by row rather than all at once, to generate data at an even finer granularity.
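Some back-of-the-envelope arithmetic shows why rolling shutter helps. The Python sketch below assumes that sensor rows are read top to bottom at a constant rate, so each row is captured at a slightly different instant. The 1080-row sensor height and the 80% readout fraction are hypothetical values chosen for illustration, not figures from the research.

```python
def row_capture_times(frame_rate_hz, rows, readout_fraction, num_frames):
    """Capture time in seconds of every sensor row across several frames.

    Assumes rows are read top to bottom at a constant rate, and that reading
    one full frame takes `readout_fraction` of the frame period.
    """
    frame_period = 1.0 / frame_rate_hz
    line_delay = readout_fraction * frame_period / rows
    return [
        frame * frame_period + row * line_delay
        for frame in range(num_frames)
        for row in range(rows)
    ]

FRAME_RATE_HZ = 60       # "60p" video, as in the article
ROWS = 1080              # hypothetical sensor height
READOUT_FRACTION = 0.8   # hypothetical: readout occupies 80% of each frame period

times = row_capture_times(FRAME_RATE_HZ, ROWS, READOUT_FRACTION, num_frames=2)
line_delay = times[1] - times[0]   # time between consecutive rows
row_rate_hz = 1.0 / line_delay     # rows sampled per second during readout

frame_nyquist_hz = FRAME_RATE_HZ / 2   # highest frequency whole frames alone can represent
row_nyquist_hz = row_rate_hz / 2       # idealized limit if every row counted as a sample

# Dead time between the last row of one frame and the first row of the next.
gap_between_frames = times[ROWS] - times[ROWS - 1]
```

Under these assumptions, whole frames alone top out at 30 Hz, while row-by-row timing allows far finer sampling in principle. The `gap_between_frames` value is a reminder that no rows are captured between readouts, so the idealized row-rate limit overstates what can actually be recovered.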

It's truly astounding stuff that wouldn't seem out of place in a spy novel -- and who would have thought that rolling shutter would turn out to be a useful tool, rather than the annoying artifact we've long believed it to be? Find more details on the research in "The Visual Microphone: Passive Recovery of Sound from Video".

(via nofilmschool. 'Utz'-brand Crab Chips image courtesy of Dug Song / Flickr; used under a Creative Commons CC-BY-2.0 license. Image has been modified from the original.)