IN3: A Framework for In-Network Computation of Neural Networks in the Programmable Data Plane
IEEE Communications Magazine (IF 11.2), Pub Date: 2024-04-08, DOI: 10.1109/mcom.001.2300587
Xiaoquan Zhang, Lin Cui, Fung Po Tso, Wenzhi Li, Weijia Jia

Neural networks have been widely used in networking applications due to their high accuracy and generalization. However, the traditional approach of collecting network features from switches and transmitting them to the controller introduces high traffic overhead and extra communication latency. In-network computing (INC) mitigates this issue by running computing tasks directly on the data path using programmable data planes (PDP). However, it is challenging to embed more sophisticated computing tasks, such as neural networks, in the network due to the limited computation and storage resources of PDP. To address this challenge, we propose IN3, a framework that enables complete neural network inference in PDP. IN3 uses model compression techniques to reduce the memory and computational requirements of a given neural network. Additionally, a purposely designed data plane pipeline for per-flow feature computation and inference is proposed. We implemented a testbed prototype (based on the Intel Tofino ASIC), and experimental results demonstrate that IN3 effectively reduces memory usage while significantly decreasing inference time. IN3 demonstrates the feasibility of implementing neural networks in PDP, and we identify potential future research directions in this area.
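The abstract notes that model compression is what makes neural network inference fit within PDP resource limits, since switch ASICs such as Tofino support only integer arithmetic and have tight memory budgets. The following is a minimal illustrative sketch (not the IN3 implementation; the weight values and layer size are made up for illustration) of one common compression step: quantizing a small fully-connected layer's float weights to 8-bit integers so that inference reduces to integer multiply-accumulate and `max`, operations a data plane can express.

```python
# Illustrative sketch of weight quantization for integer-only inference,
# in the spirit of compressing a model for a programmable data plane.
# The layer and weights below are hypothetical examples.

def quantize(weights, bits=8):
    """Map float weights to signed integers with one shared scale factor."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit signed
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / qmax if max_abs else 1.0
    q = [[round(w / scale) for w in row] for row in weights]
    return q, scale

def int_layer(x_int, q_weights):
    """Integer-only matrix-vector product with ReLU (max), as a PDP could compute."""
    out = []
    for row in q_weights:
        acc = sum(w * x for w, x in zip(row, x_int))
        out.append(max(acc, 0))  # ReLU expressed as max, feasible in hardware
    return out

# A 2x2 toy layer: quantize, then run integer inference on integer features.
weights = [[0.8, -0.3], [0.1, 0.5]]
q, scale = quantize(weights)
print(q)                       # 8-bit integer weights
print(int_layer([10, 20], q))  # integer activations (rescale by `scale` offline)
```

The single shared scale factor keeps the data-plane arithmetic purely integer; any rescaling back to real-valued outputs can happen off the switch, e.g. at the controller, which is one reason per-tensor quantization is a natural fit for this setting.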

Updated: 2024-04-08