ADS-CNN: Adaptive Dataflow Scheduling for lightweight CNN accelerator on FPGAs
Future Generation Computer Systems (IF 7.5), Pub Date: 2024-04-20, DOI: 10.1016/j.future.2024.04.038
Yi Wan , Xianzhong Xie , Junfan Chen , Kunpeng Xie , Dezhi Yi , Ye Lu , Keke Gai

Lightweight convolutional neural networks (CNNs) enable lower inference latency and data traffic, facilitating deployment on resource-constrained edge devices such as field-programmable gate arrays (FPGAs). However, CNN inference requires access to off-chip synchronous dynamic random-access memory (SDRAM), which significantly degrades inference speed and system power efficiency. In this paper, we propose an adaptive dataflow scheduling method for lightweight CNN accelerators on FPGAs, named ADS-CNN. The key idea of ADS-CNN is to utilize on-chip resources efficiently and reduce the amount of SDRAM access. To achieve reuse of logic resources, we design a time-division multiplexing calculation engine integrated into ADS-CNN. We implement a configurable module for the convolution controller that adapts to the data reuse of different convolution layers, thus reducing off-chip accesses. Furthermore, we exploit on-chip memory blocks as buffers based on the configuration of the different layers in lightweight CNNs. On the resource-constrained Intel Cyclone V SoC 5CSEBA6 FPGA platform, we evaluated six common lightweight CNN models to demonstrate the performance advantages of ADS-CNN. The evaluation results indicate that, compared with accelerators that use a traditional tiling-strategy dataflow, ADS-CNN achieves up to a 1.29× speedup while compressing the overall dataflow scale by 23.7%.
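The trade-off the abstract describes — choosing a per-layer dataflow configuration so that the working set fits in on-chip buffers while minimizing SDRAM traffic — can be illustrated with a minimal sketch. This is not the paper's implementation; the cost model, function names, and parameters below are simplifying assumptions for illustration only.

```python
# Illustrative sketch (not the authors' implementation): estimate off-chip
# (SDRAM) traffic for one convolution layer under a given output-tile size,
# then pick, per layer, the tile whose working set fits the on-chip buffer
# with minimum traffic. The cost model here is a deliberate simplification.

def sdram_traffic(H, W, C_in, C_out, K, tile):
    """Words moved off-chip for one layer when the HxW output plane is
    processed in tile x tile blocks; weights are re-read once per tile."""
    tiles = -(-H // tile) * -(-W // tile)           # ceil-division tile count
    inputs = tiles * (tile + K - 1) ** 2 * C_in     # halo-padded input reads
    weights = tiles * K * K * C_in * C_out          # weights re-fetched per tile
    outputs = H * W * C_out                         # each output written once
    return inputs + weights + outputs

def best_tile(H, W, C_in, C_out, K, buf_words):
    """Adaptive per-layer choice: among tiles whose working set
    (input patch + one output tile) fits the buffer, take the one
    with the lowest estimated SDRAM traffic."""
    feasible = []
    for t in range(4, min(H, W) + 1, 4):
        working_set = (t + K - 1) ** 2 * C_in + t * t * C_out
        if working_set <= buf_words:
            feasible.append((sdram_traffic(H, W, C_in, C_out, K, t), t))
    return min(feasible)[1] if feasible else None
```

A larger tile amortizes repeated weight fetches across more outputs, but the on-chip buffer capacity bounds how large the tile can grow — which is why a single fixed tiling is suboptimal across layers with different shapes, and a per-layer (adaptive) choice pays off.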

Updated: 2024-04-20