SciTech

Google taps 16k computers to look for cats – for Science!


Scientists at Google have built a neural network out of 16,000 computer processors to simulate the human brain’s learning process. And the first thing it learned was... to look for cats.
 
But who can blame Google’s latest creation, when it was set loose on an Internet full of cat videos, with over 10 million digital images on YouTube alone?
 
The New York Times reported that the neural network taught itself to recognize cats, a feat that surpassed previous efforts at recognizing objects from a list of 20,000 items.

Here, kitty, kitty
 
“We never told it during the training, ‘This is a cat,’” said Google fellow Jeff Dean, who helped design the software that breaks programs into many tasks that can be computed simultaneously.
 
“It basically invented the concept of a cat. We probably have other ones that are side views of cats,” he added.
 
Potential applications include improvements to image search, speech recognition and machine language translation, the NYT said.
 
The researchers are to present the results of their work at a conference in Edinburgh, Scotland.
 
In a separate blog post, Dean and visiting faculty member Andrew Ng said they believe machine learning could become far more accurate, and that smarter computers could make everyday tasks much easier.

Learning how to learn
 
They said recent research on self-taught learning and deep learning suggests that researchers can rely on unlabeled data, such as random images fetched off the web or pulled from YouTube videos, instead of painstakingly labeled training sets.
 
These algorithms work by building artificial neural networks, which loosely simulate the way neurons in the brain learn.
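
To make the building block concrete: the sketch below shows, in Python, a single artificial “neuron” of the kind such networks wire together by the millions. This is a minimal illustration, not Google’s code, and every number in it is a made-up stand-in.

```python
# A single artificial neuron: it sums its weighted inputs and passes
# the result through a nonlinear activation. Networks like Google's
# chain millions of these units together; this one stands alone.
import numpy as np

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus a bias, squashed by a sigmoid."""
    z = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-z))  # output lies in (0, 1)

# Hypothetical values: 3 inputs and 3 learned connection weights.
x = np.array([0.5, -1.2, 0.3])
w = np.array([0.8, 0.1, -0.4])
print(neuron(x, w, bias=0.1))  # one activation, about 0.56 here
```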
 
“Neural networks are very computationally costly, so to date, most networks used in machine learning have used only 1 to 10 million connections. But we suspected that by training much larger networks, we might achieve significantly better accuracy. So we developed a distributed computing infrastructure for training large-scale neural networks. Then, we took an artificial neural network and spread the computation across 16,000 of our CPU cores (in our data centers), and trained models with more than 1 billion connections,” they said in their blog post.
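
As a rough illustration of the distributed idea in that quote, the toy Python sketch below splits a computation across 16 local worker processes and combines the partial results. It is an assumed stand-in only: the loss function and data are invented, and Google’s actual training infrastructure is vastly more elaborate than anything shown here.

```python
# Toy data parallelism: shard the data, compute partial gradients in
# parallel worker processes, then combine them. Not Google's system.
import numpy as np
from multiprocessing import Pool

def partial_gradient(shard):
    """Gradient of the toy loss 0.5*(x - 1)^2, averaged over one shard."""
    return np.mean(shard - 1.0)

if __name__ == "__main__":
    data = np.random.randn(1_000_000)     # pretend training set
    shards = np.array_split(data, 16)     # 16 workers, not 16,000 cores
    with Pool(16) as pool:
        grads = pool.map(partial_gradient, shards)
    # Shards are equal-sized here, so the mean of means is exact.
    print("combined gradient:", np.mean(grads))
```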
 
They also said they are working with other groups within Google on applying this artificial neural network approach to other areas such as speech recognition and natural language modeling.
 
"Someday this could make the tools you use every day work better, faster and smarter," they said.
 
New generation of computer science
 
Google’s research represents a new generation of computer science exploiting the falling cost of computing and the availability of huge clusters of computers in giant data centers, the NYT said.
 
It said this is leading to significant advances in areas such as machine vision and perception, speech recognition and language translation.
 
Only last year, Microsoft scientists presented research showing that the techniques could be applied as well to build computer systems that understand human speech.
 
“This is the hottest thing in the speech recognition field these days,” said Yann LeCun, a computer scientist who specializes in machine learning at the Courant Institute of Mathematical Sciences at New York University.
 
Where are all the cats?
 
The Google research team, led by Stanford University computer scientist Andrew Ng and Google fellow Jeff Dean, used an array of 16,000 processors to create a neural network with more than one billion connections.
 
It was fed 10 million image thumbnails, each randomly extracted from a different YouTube video.
 
The NYT said that during the study, the software-based neural network “appeared to closely mirror theories developed by biologists that suggest individual neurons are trained inside the brain to detect significant objects.”
 
In the Google research, the machine was given no help in identifying features.
 
“The idea is that instead of having teams of researchers trying to find out how to find edges, you instead throw a ton of data at the algorithm and you let the data speak and have the software automatically learn from the data,” Ng said.
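
One standard way to “let the data speak” is an autoencoder: a network given no labels at all, trained only to reconstruct its own input, so that whatever features help it do so emerge on their own in a hidden layer. The Python sketch below is a deliberately tiny, hypothetical version of that idea; the sizes, random data, and learning rate are arbitrary stand-ins, not the configuration Google used.

```python
# A tiny autoencoder trained on unlabeled data: no hand-coded edge
# detectors, just a network learning to reproduce its input.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((500, 64))                # 500 unlabeled "images", 64 pixels each

n_hidden = 16
W1 = rng.normal(0, 0.1, (64, n_hidden))  # encoder weights
W2 = rng.normal(0, 0.1, (n_hidden, 64))  # decoder weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.05
for step in range(2000):
    H = sigmoid(X @ W1)                  # hidden features, learned not hand-built
    X_hat = H @ W2                       # reconstruction of the input
    err = X_hat - X
    # Backpropagate the squared reconstruction error through both layers
    # (constant factors folded into the learning rate).
    grad_W2 = H.T @ err / len(X)
    grad_H = (err @ W2.T) * H * (1 - H)
    grad_W1 = X.T @ grad_H / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print("mean squared reconstruction error:", np.mean(err ** 2))
```

After training, the columns of W1 act as learned feature detectors; in Google’s vastly larger experiment, some such units ended up responding to cats.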
 
Digital cat image
 
The NYT said the Google brain assembled a dreamlike digital image of a cat by using memory locations to cull out general features after being exposed to millions of images.
 
But the scientists also said it appeared they had developed a cybernetic cousin to what takes place in the brain’s visual cortex.
 
Neuroscientists have referred to the “grandmother neuron,” specialized cells in the brain that fire when repeatedly exposed to, and trained to recognize, the face of a particular individual.
 
“You learn to identify a friend through repetition,” said Gary Bradski, a neuroscientist at Industrial Perception, in Palo Alto, Calif.
 
Still, Ng said he was cautious about drawing parallels between his software system and biological life.
 
“A loose and frankly awful analogy is that our numerical parameters correspond to synapses,” he said.

No comparison

Still, the scientists admitted there is no accepted way to compare artificial neural networks to biological brains, as an adult human brain typically has many times more connections, around 100 trillion on average.
 
"So we still have lots of room to grow," they said.
A key difference was that, despite the immense computing capacity that the scientists used, it was still dwarfed by the number of connections found in the brain.
 
“It is worth noting that our network is still tiny compared to the human visual cortex, which is a million times larger in terms of the number of neurons and synapses,” NYT quoted the researchers as saying.
 
Still, the Google research provides new evidence that existing machine learning algorithms improve greatly as the machines are given access to large pools of data.

Done within the decade?
 
“The Stanford/Google paper pushes the envelope on the size and scale of neural networks by an order of magnitude over previous efforts,” said David Bader, executive director of high-performance computing at the Georgia Tech College of Computing.
 
Bader added that rapid advances in computer technology would close the gap within a relatively short time.
 
He said the scale of modeling the full human visual cortex “may be within reach before the end of the decade.”
 
Out of Google X
 
Google scientists said the research project had now moved out of the Google X laboratory and was being pursued in the division that houses the company’s search business and related services.
 
Despite their success, the Google researchers remained cautious about whether they have indeed created machines that can teach themselves.
 
“It’d be fantastic if it turns out that all we need to do is take current algorithms and run them bigger, but my gut feeling is that we still don’t quite have the right algorithm yet,” said Ng. — TJD, GMA News