Edge-Inference Architectures Proliferate (Part 1)

via SemiconductorEngineering

What makes one AI system better than another depends on many factors, some of which aren’t entirely clear.

The past year has seen a wave of announcements of new machine-learning (ML) architectures for edge inference. Unburdened by the need to support training, but tasked with delivering low latency, these devices take widely varied approaches to ML inference.

“Architecture is changing both in the computer architecture and the actual network architectures and topologies,” said Suhas Mitra, product marketing director for Tensilica AI products at Cadence.

Those changes reflect both the limits of scaling existing platforms and the explosion in data that needs to be processed, stored, and retrieved. “General-purpose architectures have thrived and are very successful,” said Avi Baum, co-founder and CTO of Hailo. “But they’ve reached a limit.”

The new offerings span a wide range of structures, technologies, and optimization goals. All must be gentle on power, but some target wired devices while others target battery-powered devices, which leads to different power/performance targets. No single architecture is expected to solve every problem, and the industry is in a phase of proliferation, not consolidation. It will be a while before the dust settles on the preferred architectures.

ML networks

To make sense of the wide range of offerings, it helps to keep some fundamental distinctions in mind. At least three concepts must be distinguished: an ML network, an ML model, and a hardware platform intended for implementing ML models.

An ML network is a physical arrangement of layers and nodes…