JOURNAL OF HEFEI UNIVERSITY OF TECHNOLOGY (NATURAL SCIENCE)
Vol.46 No.12
Dec. 2023

DOI:10.3969/j.issn.1003-5060.2023.12.011

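Design of high-efficiency convolutional neural network accelerator and implementation of FPGA based on Winograd algorithm

WANG Shuaishuai, CHEN Qiang, GUO Jianbo, XIAO Hao

(School of Microelectronics, Hefei University of Technology, Hefei 230601, China)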

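Abstract

To accelerate the inference of convolutional neural networks (CNN), this paper uses the Winograd algorithm to design a high-efficiency CNN accelerator based on a field programmable gate array (FPGA). To resolve the mismatch between the bit width of the data after the Winograd transform and the bit width of the digital signal processing (DSP) blocks, a partial-product cutting method is proposed that makes full use of each DSP to produce multiple outputs in a single cycle. To reduce on-chip memory occupancy, a dataflow that reuses the input feature map is designed to handle on-chip and off-chip data exchange. The accelerator is deployed on an XCKU060 board, where its throughput and per-DSP computing efficiency reach $ 2.358 \times 10^{12} $ OP/s and $ 1.15 \times 10^{9} $ OP/s, respectively. Experimental results show that the proposed method effectively improves the efficiency of the computing units in a CNN accelerator.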

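Keywords

convolutional neural networks (CNN); Winograd algorithm; field programmable gate array (FPGA); processing element; parallel architecture

CLC number: TP183; TN791    Document code: A    Article ID: 1003-5060(2023)12-1659-07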
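As background for the Winograd algorithm named in the abstract, the following is a minimal sketch of the textbook one-dimensional case F(2,3), which computes two outputs of a 3-tap filter with 4 multiplications instead of 6. It is not the accelerator design described in this paper; the function names `direct_f23` and `winograd_f23` are hypothetical and chosen only for illustration. The input transform uses additions and subtractions, so each transformed value can need one more bit than the raw input, which is the kind of bit-width growth the abstract refers to.

```python
def direct_f23(d, g):
    """Reference result: two outputs of a 3-tap filter over four inputs.

    Uses 6 multiplications.
    """
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs with 4 multiplications."""
    # Filter transform; can be precomputed once per filter.
    u = [
        g[0],
        (g[0] + g[1] + g[2]) / 2,
        (g[0] - g[1] + g[2]) / 2,
        g[2],
    ]
    # Input transform: additions/subtractions only (values may grow by one bit).
    v = [
        d[0] - d[2],
        d[1] + d[2],
        d[2] - d[1],
        d[1] - d[3],
    ]
    # Element-wise products: the only 4 multiplications.
    m = [v[i] * u[i] for i in range(4)]
    # Output transform: additions/subtractions only.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


if __name__ == "__main__":
    d = [1, 2, 3, 4]   # four input samples
    g = [1, 2, 3]      # three filter taps
    print(direct_f23(d, g))
    print(winograd_f23(d, g))
```

The two-dimensional variants used for CNN layers nest this transform along rows and columns; the sketch stays one-dimensional to keep the arithmetic easy to check by hand.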



Received: 2022-12-30

Revised: 2023-03-16

Foundation item: National Natural Science Foundation of China (61974039)