Journal of Hefei University of Technology (Natural Science)

Design of an energy-efficient, low-latency BNN hardware accelerator

Journal information

Journal of Hefei University of Technology (Natural Science), December 2024, Vol. 47, No. 12: 1655-1661

DOI: 10.3969/j.issn.1003-5060.2024.12.011

Authors

ZHOU Peipei, DU Gaoming, LI Zhenmin, WANG Xiaolei

(School of Microelectronics, Hefei University of Technology, Hefei 230601, China)

Abstract and Keywords

Abstract: Hardware designs for binary neural networks (BNNs) face two problems: a large number of zero values inflates the amount of computation, and repeated operations on the same weight data and the same feature map data increase the computing cycles and computing power. To address these problems, this paper proposes an all-zero skipping method and a precomputed result caching method, which effectively reduce the network's computation, computing cycles and computing power. A BNN hardware accelerator, a handwritten digit recognition system, is then designed on a field programmable gate array (FPGA). The experimental results show that, with the proposed all-zero skipping and precomputed result caching methods applied, the accelerator reaches an average energy efficiency of 1.81 TOPS/W at a frequency of 100 MHz, which is 1.27-4.34 times higher than that of other BNN accelerators.

Keywords: binary neural network (BNN); weight sharing; repeated operation; field programmable gate array (FPGA); hardware accelerator
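
The abstract names the two methods but does not describe their mechanisms, so the following is only a minimal software sketch of the general ideas, not the authors' hardware design. It assumes, purely for illustration, that weights are +1/-1 (stored as bits 1/0) and activations are 0/1, so an all-zero activation word contributes exactly 0 and can be skipped, and that results for a previously seen (weight, activation) pair can be served from a cache. The names `binary_dot`, `SketchPE` and `run_layer` are hypothetical.

```python
def popcount(v: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(v).count("1")


def binary_dot(w: int, x: int) -> int:
    """Dot product of +1/-1 weights (bit 1 = +1, bit 0 = -1) with 0/1 activations.

    Only positions where x has a 1 contribute: +1 where w is also 1, -1 otherwise.
    """
    return 2 * popcount(w & x) - popcount(x)


class SketchPE:
    """Hypothetical processing element: all-zero skipping plus a result cache."""

    def __init__(self):
        self.cache = {}       # (weight word, activation word) -> result
        self.computed = 0     # dot products actually evaluated
        self.skipped = 0      # all-zero activation words skipped
        self.cache_hits = 0   # results reused from the cache

    def dot(self, w: int, x: int) -> int:
        # All-zero skipping (assumed form): a zero word always yields 0.
        if x == 0:
            self.skipped += 1
            return 0
        # Precomputed result caching (assumed form): reuse repeated pairs.
        key = (w, x)
        if key in self.cache:
            self.cache_hits += 1
            return self.cache[key]
        self.computed += 1
        result = binary_dot(w, x)
        self.cache[key] = result
        return result


def run_layer(weights, patches):
    """Apply every weight word to every activation word through one SketchPE."""
    pe = SketchPE()
    outputs = [[pe.dot(w, x) for x in patches] for w in weights]
    return outputs, pe


if __name__ == "__main__":
    weights = [0b1110, 0b0001]
    patches = [0b0000, 0b1100, 0b1100, 0b0111]
    outputs, pe = run_layer(weights, patches)
    print(outputs, pe.computed, pe.skipped, pe.cache_hits)
```

In this toy run, eight nominal dot products reduce to four evaluated ones; the saving in a real accelerator would depend on the data statistics and on the cost of the zero-detection and cache lookup logic, which this sketch does not model.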

Funding

National Key Research and Development Program of China (2018YFB2202604); University Synergy Innovation Program of Anhui Province (GXXT-2019-030)
