IBM is back with an updated neuromorphic processor chip that it hopes will enable it to more efficiently scale up to powerful AI hardware systems, at least as far as inferencing goes.
The NorthPole chip is claimed to be inspired by the structure of the brain, and in a paper published in Science it is said to be 25 times more energy efficient than a GPU, while also beating one on latency when running inference with the ResNet-50 neural network model.
Part of the reason for the energy efficiency, IBM told us, is that all of the memory needed for processing is on the chip. This eliminates the need to shuffle data around, which Big Blue has previously identified as a source of both energy inefficiency and delay.
IBM also recently detailed a mixed-signal analog chip for AI inferencing, claimed to be able to match GPUs while consuming considerably less power.
NorthPole is said to be a development of a previous IBM neural chip, TrueNorth, which was announced nearly a decade ago. The new device was presented at last month’s Hot Chips conference by IBM Fellow and Chief Scientist Dr. Dharmendra S. Modha, who was behind both chips.
The latest brain-inspired silicon is made up of 256 cores, each a vector-matrix multiplication engine capable of 2,048 operations per cycle at 8-bit precision. These share 192MB of memory, plus a 32MB framebuffer for I/O tensors. The chip is claimed to be roughly 4,000 times faster than TrueNorth.
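To illustrate the kind of operation each core performs, here is a minimal NumPy sketch of an 8-bit vector-matrix multiply with a wider accumulator. The dimensions and names are purely illustrative, not NorthPole's actual tile sizes or microarchitecture; accumulating int8 products in int32 is simply standard practice in low-precision inference hardware.

```python
import numpy as np

# Illustrative only: an 8-bit vector-matrix multiply of the sort each
# NorthPole core is described as performing. Shapes are arbitrary.
rng = np.random.default_rng(0)
x = rng.integers(-128, 128, size=64, dtype=np.int8)        # input activations
W = rng.integers(-128, 128, size=(64, 32), dtype=np.int8)  # weights

# Accumulate in int32 so the low-precision products cannot overflow --
# a common convention in int8 inference accelerators.
acc = x.astype(np.int32) @ W.astype(np.int32)
print(acc.shape)  # (32,)
```

In real hardware the multiply-accumulate happens in fixed-function datapaths rather than a software loop, but the numerics are the same: narrow operands, wide accumulation.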
The paper describes the chip as “a low-precision, massively parallel, densely interconnected, energy-efficient, and spatial computing architecture with a co-optimized, high-utilization programming model.”
According to Modha, it blurs the boundary between compute and memory. It appears as an active memory from outside the chip, he said, which should make NorthPole easier to integrate into other systems.
However, as IBM concedes, NorthPole is limited by the amount of data it can fit into that on-chip memory space. The workaround for larger neural networks is to break them down into smaller sub-networks that will fit, and to connect these sub-networks together across multiple NorthPole chips.
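The partitioning idea can be sketched as a simple bin-packing problem: group consecutive layers into chunks that each fit one chip's memory, then chain the chips so one's output feeds the next. The 192MB budget matches the figure quoted for NorthPole; everything else here (the layer sizes, the greedy split) is a hypothetical illustration, not IBM's actual tooling.

```python
MEM_BUDGET = 192 * 1024 * 1024  # bytes of on-chip memory per chip (per IBM's figure)

def partition(layer_sizes, budget=MEM_BUDGET):
    """Greedily group consecutive layers into chunks that each fit the budget."""
    chunks, current, used = [], [], 0
    for size in layer_sizes:
        if used + size > budget and current:
            chunks.append(current)   # this chip is full; start a new one
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        chunks.append(current)
    return chunks

# Example: made-up per-layer weight footprints, given in MB for readability.
layers_mb = [80, 120, 60, 150, 90]
chunks = partition([s * 1024 * 1024 for s in layers_mb])
print(len(chunks))  # number of chips this toy network would need
```

A production compiler would also have to account for activation buffers and inter-chip bandwidth, but the memory-capacity constraint is the core of it.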
“We can’t run GPT-4 on this, but we could serve many of the models enterprises need,” Modha explained. “And, of course, NorthPole is only for inferencing,” he added, meaning that the models need to be trained elsewhere on a different system, most likely using GPUs.
Modha told Nature that the cores in NorthPole are wired together in a network "inspired by the white-matter connections between parts of the human cerebral cortex", and claimed this architecture is part of what enables it to beat existing AI architectures.
NorthPole is, of course, a research prototype, fabricated using a 12nm production process. If the design were implemented using a more up-to-date manufacturing process, its efficiency would be 25 times better than that of current designs, Nature reported, meaning there is plenty of scope for IBM to improve its performance further.
IBM said the tests for NorthPole focused primarily on uses related to computer vision, partly because funding for the project came from the US Department of Defense.
However, it claims that many edge applications that require massive amounts of data processing in real time would be suitable for NorthPole, such as enabling autonomous vehicles to react to unexpected situations, something that is a real challenge with current technology.
But any products based on NorthPole seem unlikely to come to market soon. As The Register noted last year, neuromorphic chips have been around for years, and each new development is hailed as a breakthrough, yet somehow fails to translate into a commercial product. ®