Categories: Blog, Industry News
Edge-Inference Architectures Proliferate (Part 1)

via SemiconductorEngineering

What makes one AI system better than another depends on a lot of different factors, including some that aren’t entirely clear.

The last year has seen a vast array of announcements of new machine-learning (ML) architectures for edge inference. Unburdened by the need to support training, but tasked with low latency, the devices exhibit extremely varied approaches to ML inference.

“Architecture is changing both in the computer architecture and the actual network architectures and topologies,” said Suhas Mitra, product marketing director for Tensilica AI products at Cadence.

Those changes are a reflection of both the limitations of scaling existing platforms and the explosion in data that needs to be processed, stored and retrieved. “General-purpose architectures have thrived and are very successful,” said Avi Baum, co-founder and CTO of Hailo. “But they’ve reached a limit.”

The new offerings exhibit a wide range of structure, technology, and optimization goals. All must be gentle on power, but some target wired devices while others target battery-powered devices, giving different power/performance targets. While no single architecture is expected to solve every problem, the industry is in a phase of proliferation, not consolidation. It will be a while before the dust settles on the preferred architectures.

ML networks

To make sense of the wide range of offerings, it’s important to bear in mind some fundamental distinctions in the ML world. There are at least three concepts that must be distinguished — an ML network, an ML model, and a hardware platform intended for implementing ML models.

An ML network is a physical arrangement of layers and nodes…
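The three-way distinction above can be sketched in code. This is a minimal illustration, not any real framework's API — every name here (`LayerSpec`, `Model`, `Platform`) is invented for the example: a network is topology only, a model adds trained weights to that topology, and a platform is the hardware that runs the model.

```python
from dataclasses import dataclass

@dataclass
class LayerSpec:
    kind: str   # e.g. "conv", "dense"
    width: int  # number of nodes/filters in the layer

# An ML *network* is the arrangement of layers and nodes: topology only,
# with no learned parameters attached. Here it is just an ordered list.
Network = list

# An ML *model* is a network plus the trained weights for each layer.
@dataclass
class Model:
    network: Network
    weights: dict  # layer index -> trained parameter values

# A hardware *platform* implements ML models. The same model may be
# deployed to many platforms, and one platform may run many models.
class Platform:
    def __init__(self, name: str):
        self.name = name

    def deploy(self, model: Model) -> str:
        return f"{len(model.network)}-layer model deployed to {self.name}"

net = [LayerSpec("conv", 32), LayerSpec("dense", 10)]
model = Model(network=net, weights={0: None, 1: None})
print(Platform("edge-NPU").deploy(model))  # 2-layer model deployed to edge-NPU
```

Keeping the three concepts separate matters when comparing the new edge-inference offerings: two chips may run the same model yet differ enormously in how their hardware maps the network onto compute and memory.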

