ADS-CNN: Adaptive Dataflow Scheduling for lightweight CNN accelerator on FPGAs
Future Generation Computer Systems (IF 7.5), Pub Date: 2024-04-20, DOI: 10.1016/j.future.2024.04.038
Yi Wan , Xianzhong Xie , Junfan Chen , Kunpeng Xie , Dezhi Yi , Ye Lu , Keke Gai

Lightweight convolutional neural networks (CNNs) enable lower inference latency and data traffic, facilitating deployment on resource-constrained edge devices such as field-programmable gate arrays (FPGAs). However, CNN inference requires access to off-chip synchronous dynamic random-access memory (SDRAM), which significantly degrades inference speed and system power efficiency. In this paper, we propose an adaptive dataflow scheduling method for lightweight CNN accelerators on FPGAs, named ADS-CNN. The key idea of ADS-CNN is to utilize on-chip resources efficiently and reduce the amount of SDRAM access. To achieve reuse of logic resources, we design a time-division multiplexing calculation engine integrated into ADS-CNN. We implement a configurable module for the convolution controller that adapts to the data reuse of different convolution layers, thus reducing off-chip accesses. Furthermore, we exploit on-chip memory blocks as buffers based on the configuration of the different layers in lightweight CNNs. On the resource-constrained Intel Cyclone V SoC 5CSEBA6 FPGA platform, we evaluated six common lightweight CNN models to demonstrate the performance advantages of ADS-CNN. The evaluation results indicate that, compared with accelerators that use a traditional tiling-strategy dataflow, ADS-CNN achieves up to a 1.29× speedup while compressing the overall dataflow scale by 23.7%.
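The trade-off the abstract describes — choosing a per-layer dataflow configuration so that the working set fits in on-chip buffers while minimizing SDRAM traffic — can be illustrated with a minimal sketch. This is not the paper's implementation; the cost model, function names, and parameters below are simplifying assumptions for illustration only.

```python
# Illustrative sketch (not the authors' implementation): estimate off-chip
# (SDRAM) traffic for one convolution layer under a given output-tile size,
# then pick, per layer, the tile whose working set fits the on-chip buffer
# with minimum traffic. The cost model here is a deliberate simplification.

def sdram_traffic(H, W, C_in, C_out, K, tile):
    """Words moved off-chip for one layer when the HxW output plane is
    processed in tile x tile blocks; weights are re-read once per tile."""
    tiles = -(-H // tile) * -(-W // tile)           # ceil-division tile count
    inputs = tiles * (tile + K - 1) ** 2 * C_in     # halo-padded input reads
    weights = tiles * K * K * C_in * C_out          # weights re-fetched per tile
    outputs = H * W * C_out                         # each output written once
    return inputs + weights + outputs

def best_tile(H, W, C_in, C_out, K, buf_words):
    """Adaptive per-layer choice: among tiles whose working set
    (input patch + one output tile) fits the buffer, take the one
    with the lowest estimated SDRAM traffic."""
    feasible = []
    for t in range(4, min(H, W) + 1, 4):
        working_set = (t + K - 1) ** 2 * C_in + t * t * C_out
        if working_set <= buf_words:
            feasible.append((sdram_traffic(H, W, C_in, C_out, K, t), t))
    return min(feasible)[1] if feasible else None
```

A larger tile amortizes repeated weight fetches across more outputs, but the on-chip buffer capacity bounds how large the tile can grow — which is why a single fixed tiling is suboptimal across layers with different shapes, and a per-layer (adaptive) choice pays off.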

Updated: 2024-04-20