TENNs: It’s about Time! Unlocking the Power of Efficient Processing for Sequential Data (Video and Time Series Data)

 

Artificial Intelligence (AI) has made remarkable progress since the advent of Artificial Neural Networks (ANNs) over 50 years ago. However, as AI workflows increasingly rely on spatiotemporal data, the limitations of traditional approaches such as Convolutional Neural Networks (CNNs) have become evident. CNNs excel at processing spatial information in images but struggle with effectively utilizing temporal information. In this blog post, we explore a groundbreaking solution called Temporal Event-based Neural Networks (TENNs) developed by BrainChip, which efficiently combines spatial and temporal convolutions to process sequential data like never before.

The Challenge of Spatiotemporal Data

While CNNs have been the backbone of image classification for the past decade, they fall short when it comes to effectively encoding and processing spatiotemporal information. The two prevalent approaches to tackle this challenge are incorporating an internal state with temporal dynamics (as seen in recurrent neural networks) or computing a temporal convolution of a kernel over past inputs. However, each approach has its limitations, prompting the need for a more efficient solution.

Recurrent Neural Networks (RNNs) maintain an internal state with temporal dynamics, allowing them to capture sequential dependencies. However, RNNs come with their own challenges, such as vanishing gradients and difficulty in parallelization. The temporal-convolution approach instead slides a kernel over a window of past inputs. While it can capture temporal correlations, it often requires extensive memory to buffer those past inputs and can be power-hungry at inference time on Edge devices.
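To make the contrast concrete, here is a minimal, hypothetical sketch in PyTorch (an illustration only, not BrainChip's implementation) of the two approaches: a recurrent cell that folds each input into a hidden state, and a causal temporal convolution that slides a kernel over buffered past inputs.

```python
# Hypothetical sketch of the two standard approaches to temporal context;
# illustration only, not BrainChip's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

channels, kernel_size = 16, 5
x = torch.randn(1, 100, channels)            # (batch, time, channels)

# Approach 1: recurrent internal state. Each step folds the new input into
# a hidden state, so inference is cheap but computation is sequential.
rnn_cell = nn.RNNCell(input_size=channels, hidden_size=channels)
h = torch.zeros(1, channels)
for t in range(x.shape[1]):
    h = rnn_cell(x[:, t], h)                 # h summarizes everything seen so far

# Approach 2: temporal convolution. A kernel slides over the last
# `kernel_size` inputs; training parallelizes across time, but those past
# inputs must be buffered at inference.
temporal_conv = nn.Conv1d(channels, channels, kernel_size)
x_cl = x.transpose(1, 2)                     # (batch, channels, time)
y = temporal_conv(F.pad(x_cl, (kernel_size - 1, 0)))  # causal left-padding
```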

The Emergence of Transformers

Transformer models, which learn contextual relationships in sequential data, were initially considered a promising alternative to RNNs. Their ability to capture long-range dependencies and to parallelize computation made them an efficient choice for many applications in the cloud, yet they are typically too power-hungry and data-hungry for the Edge. However, recent developments have shown that a new class of RNNs surpasses transformer networks on certain tasks performed at the Edge.

Introducing Temporal Event-based Neural Networks (TENNs)

BrainChip, with its expertise in Event-Based Processing and Spiking Neural Networks, has developed a novel solution to address the limitations of existing approaches. The 2nd Generation Akida platform includes support for Temporal Event-based Neural Networks (TENNs), which combine spatial and temporal convolutions to efficiently process sequential spatiotemporal data.

TENNs leverage the strengths of both CNNs and RNNs by incorporating temporal and spatial convolution layers throughout the network. Unlike traditional CNNs that focus solely on spatial dimensions, TENNs integrate the spatial and temporal characteristics of the data, enabling them to learn spatial and temporal correlations effectively. This distinguishes TENNs from state-space models that primarily deal with time series data lacking spatial components.
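As a rough illustration of that idea, the sketch below factors a video convolution into a temporal stage followed by a spatial stage. This is a generic (2+1)D-style decomposition written for illustration; the exact TENN layer design is described in BrainChip's white paper.

```python
# Hypothetical sketch of alternating temporal and spatial convolutions on
# video data; a generic decomposition, not the exact TENN layer design.
import torch
import torch.nn as nn

class SpatioTemporalBlock(nn.Module):
    def __init__(self, in_ch, out_ch, t_kernel=5, s_kernel=3):
        super().__init__()
        # Temporal convolution: mixes information across frames only.
        self.temporal = nn.Conv3d(in_ch, out_ch,
                                  kernel_size=(t_kernel, 1, 1),
                                  padding=(t_kernel - 1, 0, 0))
        # Spatial convolution: mixes information within each frame only.
        self.spatial = nn.Conv3d(out_ch, out_ch,
                                 kernel_size=(1, s_kernel, s_kernel),
                                 padding=(0, s_kernel // 2, s_kernel // 2))
        self.act = nn.ReLU()

    def forward(self, x):                    # x: (batch, ch, time, H, W)
        # Trim the trailing time steps so the temporal stage stays causal.
        y = self.temporal(x)[:, :, : x.shape[2]]
        return self.act(self.spatial(self.act(y)))

video = torch.randn(1, 3, 16, 64, 64)        # 16 frames of 64x64 RGB
block = SpatioTemporalBlock(3, 32)
print(block(video).shape)                     # torch.Size([1, 32, 16, 64, 64])
```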

Flexibility and Configurability

TENNs are spatiotemporal networks that can be configured to operate either in temporal-convolution mode or in recurrent mode. This flexibility allows researchers and practitioners to adapt the network to the specific requirements of their applications. Furthermore, TENNs train efficiently on parallel hardware such as cloud GPUs and TPUs, just as convolutional networks do, while retaining the compactness of RNNs for inference at the Edge.
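Here is a hypothetical sketch of what that dual mode means in practice, using a plain PyTorch Conv1d as a stand-in for a TENN temporal kernel: the same weights can be evaluated in parallel over a whole clip for training, or step by step against a small FIFO buffer for streaming inference, producing identical outputs.

```python
# Hypothetical sketch of the dual-mode idea with a plain Conv1d as a
# stand-in for a TENN temporal kernel; not BrainChip's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

channels, k = 8, 5
conv = nn.Conv1d(channels, channels, k)
x = torch.randn(1, channels, 50)

# Mode 1: parallel temporal convolution over the whole (causally padded)
# clip, as used during training.
y_parallel = conv(F.pad(x, (k - 1, 0)))

# Mode 2: recurrent/streaming evaluation, one time step at a time, keeping
# only the k most recent inputs in a FIFO buffer.
buffer = torch.zeros(1, channels, k)
outs = []
with torch.no_grad():
    for t in range(x.shape[2]):
        buffer = torch.cat([buffer[:, :, 1:], x[:, :, t:t + 1]], dim=2)
        outs.append(conv(buffer))
y_streaming = torch.cat(outs, dim=2)

print(torch.allclose(y_parallel, y_streaming, atol=1e-6))  # True
```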

State-of-the-Art Performance and Versatility

TENNs have demonstrated state-of-the-art performance across various domains of sequential data, as highlighted in BrainChip’s recent white paper, “Temporal Event-based Neural Networks: A New Approach to Temporal Processing.” Notable achievements include Raw Audio Speech Classification on the 10-Class Speech Classification SC10 dataset, Vital Signs Prediction on the BIDMC dataset, 2D Object Detection on the KITTI Vision Benchmark Suite (frame-based camera video), and 2D Object Detection on the Event-Based Prophesee 1 Megapixel Automotive Detection Dataset.

The versatility of TENNs makes them suitable for a wide range of applications on sequential data, including speech recognition, patient-monitoring medical equipment, and streaming video object detection and tracking. Furthermore, TENNs deliver superior performance with a fraction of the compute and significantly fewer parameters than other network architectures. This efficiency makes them an elegant solution for highly accurate models that support video and time series data at the Edge.

Enabling Intelligent Edge Solutions with TENNs on BrainChip's 2nd Generation Akida

The groundbreaking capabilities of TENNs have far-reaching implications, particularly for intelligent Edge solutions. With their efficient training on parallel hardware like GPUs and TPUs, TENNs leverage the computational advantages of convolutional networks. Moreover, their compactness during inference enables efficient deployment at the Edge on BrainChip’s 2nd Generation Akida IP, catering to real-time and resource-constrained applications.

Conclusion

Temporal Event-based Neural Networks (TENNs) represent a significant advancement in processing sequential spatiotemporal data. By combining spatial and temporal convolutions, TENNs excel at capturing and exploiting the relationships within sequential data. With their impressive performance, low computational requirements, and wide-ranging applications, TENNs, running on BrainChip’s 2nd Generation Akida IP, are poised to unlock new possibilities in intelligent Edge solutions and empower various industries with highly accurate models for sequential data analysis.

Read BrainChip's white paper, "Temporal Event-based Neural Networks: A New Approach to Temporal Processing," for more details.