For decades, the dream of building a machine that can think and learn like a human seemed perpetually out of reach, until deep learning arrived. Like a brain, a deep neural network has layers of neurons. When a neuron fires, it sends signals to connected neurons in the layer above. Deep neural networks learn by adjusting the strengths of their connections so that input signals are conveyed through successive layers to the neurons associated with the right concepts. Machines powered by deep neural networks have learned to converse, drive cars, beat video games, paint pictures, and more. They have also confounded their human creators, who never expected so-called deep-learning algorithms to work so well. Experts wonder: what is it about deep learning that makes it work so well?
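The learning mechanism described above can be sketched in a few lines of code. This is a minimal illustration, not the networks discussed in the article: a tiny two-layer network (using NumPy, with made-up sizes and a toy XOR task) whose connection strengths are repeatedly adjusted so that input signals are conveyed to the right output neuron.

```python
import numpy as np

# Minimal sketch (not the article's networks): a two-layer network whose
# connection strengths (weights) are adjusted by gradient descent so that
# input signals are conveyed through the layers to the correct output.
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy task: learn the XOR of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8))   # input-to-hidden connections
W2 = rng.normal(size=(8, 1))   # hidden-to-output connections

def forward(X):
    h = sigmoid(X @ W1)        # hidden neurons fire on the input signals
    return h, sigmoid(h @ W2)  # and send signals to the layer above

_, out = forward(X)
loss_before = float(np.mean((out - y) ** 2))

for _ in range(5000):
    h, out = forward(X)
    # Backpropagate the error, strengthening or weakening each connection.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    W1 -= 0.5 * X.T @ d_h

_, out = forward(X)
loss_after = float(np.mean((out - y) ** 2))
```

After training, the network's error on the toy task is far lower than it was at the start: the adjusted connections now route each input pattern to the right answer.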
Recently, Naftali Tishby of the Hebrew University of Jerusalem presented evidence in support of a new theory explaining how deep learning works. Tishby argues that deep neural networks learn according to a procedure called the “information bottleneck,” which he and two collaborators first described in purely theoretical terms in 1999. The idea is that a network rids noisy input data of extraneous details, as if by squeezing the information through a bottleneck, retaining only the features most relevant to general concepts. New experiments by Tishby’s team reveal how this squeezing happens during deep learning, at least in the cases they studied. As Alex Alemi, a scientist at Google, put it, the bottleneck could serve “not only as a theoretical tool for understanding why neural networks work, but also as a tool for constructing new objectives and architectures of networks.”
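The 1999 formulation can be written compactly. A compressed representation $T$ of the input $X$ should retain as little information about $X$ as possible while preserving as much information as possible about the relevant variable $Y$ (here $I(\cdot;\cdot)$ denotes mutual information, and the notation follows the standard information-bottleneck literature rather than this article):

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)
```

The first term penalizes keeping details of the input (the squeeze through the bottleneck), the second rewards keeping what predicts $Y$ (the general concept), and the trade-off parameter $\beta$ sets how tight the bottleneck is.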