I have a major love/hate affair with Google. This MindBlog uses Google's Blogger service and so utterly depends on it; all my email addresses forward to my Gmail account; and I use Google to synchronize my calendar, documents, spreadsheets, and contacts across multiple devices. I use Google Voice for phoning, Google+ Hangouts for video chats, etc. etc. Google's services have become such a prosthesis for me that I am quite helpless away from its cloud. At the same time, I resist its 'connectivity' efforts as much as I can. I emphatically do not want to know whether a friend is nearby, and I don't want people following me. I think we are constantly flirting with an 'uncanny valley' effect, where what might be useful suddenly becomes very spooky.
In this vein, a recent article noting Google's efforts to model the human brain made me excited, interested, and terrified all at once. Google's 'brain' used a cluster of 16,000 processor cores to create a neural network with more than one billion connections, and presented it with 10 million digital images taken from YouTube videos. Without any instructions or labels, it learned to detect human faces, human bodies, and cats! This suggests that the human brain, which has at least a million times more connections than this model, could learn significant classes of stimuli with minimal genetic nudging beyond instructions for making nerve cells whose connections can be shaped by the sensory input they receive.
Here is the abstract from Le et al. (PDF here):
We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also found that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art.
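For readers curious what "learning features without labels" looks like in practice, here is a toy sketch of a sparse autoencoder in Python using PyTorch. To be clear, this is my own minimal single-layer illustration, not the 9-layer, billion-connection architecture from the paper (which adds pooling, local contrast normalization, and massive parallelism across 1,000 machines), and the random patches are placeholders standing in for real image data. The point is simply that no labels ever enter the training loop:

```python
import torch
import torch.nn as nn

# Toy sparse autoencoder: it learns features from unlabeled inputs by
# reconstructing them, with an L1 penalty that encourages sparse codes.
class SparseAutoencoder(nn.Module):
    def __init__(self, n_inputs=64, n_hidden=32):
        super().__init__()
        self.encoder = nn.Linear(n_inputs, n_hidden)
        self.decoder = nn.Linear(n_hidden, n_inputs)

    def forward(self, x):
        code = torch.sigmoid(self.encoder(x))  # hidden "feature" activations
        recon = self.decoder(code)             # reconstruction of the input
        return recon, code

# Placeholder "dataset": random 8x8 patches standing in for unlabeled images.
patches = torch.rand(1000, 64)

model = SparseAutoencoder()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sparsity_weight = 1e-3

for epoch in range(20):
    recon, code = model(patches)
    # No labels anywhere: the loss is reconstruction error plus a sparsity term.
    loss = nn.functional.mse_loss(recon, patches) + sparsity_weight * code.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, each row of the encoder weights is a learned feature
# detector; in Le et al.'s vastly larger and deeper network, detectors for
# faces, cat faces, and human bodies emerged in just this unsupervised way.
```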