The Computer That Can See

Dear friend of CIFAR,

I will never get lost in a strange city again.

Once I manage to train computers to see as humans do, that is. Imagine this: You’re disoriented in a new city. You take out your cell phone and photograph a nearby building. Then you email the image to a server that identifies its contents, tells you where you are and gives you directions for where you want to go. This same service will also act as a tour guide – it will offer architectural and historical information about city sights, and point out the closest place to grab a coffee or stay the night.

The technology that will drive this ease of orientation is called “computer vision.”

Currently, a search for pictures with a tool like Google, Yahoo or Ask relies on the keywords that surround images. Computer vision uses algorithms to teach computers to search by the actual content of the images, rather than caption information. By “training” computers with these algorithms, we hope to give them the human-like ability to understand what they see in the world.

Today’s most sophisticated computers cannot match a five-year-old’s ability to correctly parse and identify the contents of a picture they have never seen before. When the human brain looks at a picture, its neuronal circuits process the lines, patterns and shapes to make sense of the scene. My research attempts to construct computer algorithms that mimic this process, to help explain how the brain processes visual information. Those algorithms can also then be used to teach computers how to see the way we see.

One of my colleagues in the Neural Computation and Adaptive Perception program has developed another means to teach computers to see – by memory. He trains computers to search images by their actual content, rather than by their captions, using a database of 80 million images. This database categorizes pictures using a “nearest neighbour” algorithm. A computer can then identify an image it has never seen before by comparing it to the many similar pictures it has already seen.

The potential uses for machines capable of computer vision are many.

Helping people improve their sense of direction is a major use. Another is boosting the accuracy of image search engines. Currently, keyword searches are powerful but flawed – enter “airplane” and the search engine will spit out pictures that have been labeled as airplanes. But because the search relies on words rather than images, it also returns images of airplane engines, posters from the movie Airplane!, fathers swinging their children in circles, Jefferson Airplane, etc.

Computer vision could drive computational photography as well. This type of photography uses computers to overcome the limitations of traditional cameras, producing a richer, less blurry and more informative representation of the visual world. The computer in the camera will know whether you’re taking a photo of a sports car or a movie star, and whether your model’s eyes are open or closed.

When you travel in the future, you can look forward to better photographs and a richer sightseeing experience – all thanks to a better understanding of human sight.

Best wishes from the frontiers of human knowledge.

Bruno Olshausen
Fellow, Neural Computation and Adaptive Perception
Canadian Institute for Advanced Research