Reducing neural networks' power consumption could help make the systems portable

In recent years, the best-performing artificial-intelligence systems — in areas such as autonomous driving, speech recognition, computer vision, and automatic translation — have come courtesy of software systems known as neural networks.

But neural networks take up a lot of memory and consume a lot of power, so they usually run on servers in the cloud, which receive data from desktop or mobile devices and then send back their analyses.

Last year, MIT associate professor of electrical engineering and computer science Vivienne Sze and colleagues unveiled a new, energy-efficient computer chip optimized for neural networks, which could enable powerful artificial-intelligence systems to run locally on mobile devices.

Now, Sze and her colleagues have approached the same problem from the opposite direction, with a battery of techniques for designing more energy-efficient neural networks. First, they developed an analytic method that can determine how much power a neural network will consume when run on a particular type of hardware. Then they used the method to evaluate new techniques for paring down neural networks so that they’ll run more efficiently on handheld devices.

The researchers describe the work in a paper they’re presenting next week at the Computer Vision and Pattern Recognition Conference. In the paper, they report that the methods offered as much as a 73 percent reduction in power consumption over the standard implementation of neural networks, and as much as a 43 percent reduction over the best previous method for paring the networks down.

Energy evaluator

Loosely based on the anatomy of the brain, neural networks consist of thousands or even millions of simple but densely interconnected information-processing nodes, usually organized into layers. Different types of networks vary according to their number of layers, the number of connections between the nodes, and the number of nodes in each layer.

The connections between nodes have “weights” associated with them, which determine how much a given node’s output will contribute to the next node’s computation. During training, in which the network is presented with examples of the computation it’s learning to perform, those weights are continually readjusted, until the output of the network’s last layer consistently corresponds with the result of the computation.
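
To make the role of weights concrete, here is a minimal sketch in plain Python. It is an illustration only, not the researchers' code: the function names, the choice of a ReLU nonlinearity, and the simple squared-error update rule are all assumptions made for this example.

```python
def node_output(inputs, weights, bias=0.0):
    """One node: weight each incoming value, sum, then apply a ReLU.

    The ReLU (clamp negatives to zero) is an assumed nonlinearity.
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, total)


def layer_output(inputs, weight_rows):
    """One layer: every node sees the same inputs but has its own weights."""
    return [node_output(inputs, row) for row in weight_rows]


def train_step(inputs, weights, target, learning_rate=0.1):
    """One weight readjustment for a single linear node.

    Assumed rule: a gradient step on squared error, which nudges each
    weight in proportion to the error and to the input it multiplies.
    """
    prediction = sum(x * w for x, w in zip(inputs, weights))
    error = prediction - target
    return [w - learning_rate * error * x for w, x in zip(weights, inputs)]


# Repeated readjustment moves the node's output toward the target.
weights = [0.0, 0.0]
for _ in range(20):
    weights = train_step([1.0, 2.0], weights, target=1.0)
```

A larger weight means the corresponding input contributes more to the node's sum, and a weight of zero means that connection contributes nothing.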

“The first thing we did was develop an energy-modeling tool that accounts for data movement, transactions, and data flow,” Sze says. “If you give it a network architecture and the value of its weights, it will tell you how much energy this neural network will take. One of the questions that people had is ‘Is it more energy efficient to have a shallow network and more weights or a deeper network with fewer weights?’ This tool gives us better intuition as to where the energy is going, so that an algorithm designer could have a better understanding and use this as feedback. The second thing we did is that, now that we know where the energy is actually going, we started to use this model to drive our design of energy-efficient neural networks.”
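
The sketch below shows the general shape such an estimate could take. It is a toy model built on stated assumptions, not the tool Sze describes: the two cost constants are made-up relative values, the function names are hypothetical, and the model covers only fully connected layers in which zero-valued weights are skipped entirely.

```python
# Hypothetical relative costs, chosen only for illustration.
# They are not measurements from any real chip.
ENERGY_PER_MAC = 1.0              # one multiply-accumulate
ENERGY_PER_MEMORY_ACCESS = 200.0  # one value moved to or from memory


def estimate_layer_energy(weight_rows):
    """Estimate the energy of one fully connected layer.

    Assumes each nonzero weight costs one computation and one memory
    fetch, and each input and output value is moved once.
    """
    nonzero = sum(1 for row in weight_rows for w in row if w != 0.0)
    n_inputs = len(weight_rows[0])
    n_outputs = len(weight_rows)
    compute = nonzero * ENERGY_PER_MAC
    movement = (nonzero + n_inputs + n_outputs) * ENERGY_PER_MEMORY_ACCESS
    return {"compute": compute, "data_movement": movement}


def estimate_network_energy(layers):
    """Sum the per-layer estimates and report where the energy goes."""
    totals = {"compute": 0.0, "data_movement": 0.0}
    for weight_rows in layers:
        for part, energy in estimate_layer_energy(weight_rows).items():
            totals[part] += energy
    totals["total"] = totals["compute"] + totals["data_movement"]
    return totals


# Same architecture, different weight values: the second has half of
# its weights pared down to zero.
dense = [[[1.0, 2.0], [3.0, 4.0]]]
pruned = [[[1.0, 0.0], [0.0, 4.0]]]
dense_energy = estimate_network_energy(dense)
pruned_energy = estimate_network_energy(pruned)
```

Under these assumed costs, data movement dominates the total, so the breakdown by category is more informative than a count of computations alone. That is the kind of feedback an algorithm designer could use when weighing a shallow network against a deeper one.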