Journal of Hefei University of Technology (Natural Science)

Hardware accelerator for deep reinforcement learning based autonomous driving decision algorithm

Journal information

Journal of Hefei University of Technology (Natural Science), September 2024, Vol. 47, No. 9: 1159-1169

DOI: 10.3969/j.issn.1003-5060.2024.09.002

Authors

RAN Jingnan, NI Wei, CHEN Shiyu

(School of Microelectronics, Hefei University of Technology, Hefei 230601, China)

Abstract and Keywords

Abstract: To meet the low-power, low-latency and high-accuracy requirements of autonomous driving decision computation, this paper presents a hardware accelerator for a deep reinforcement learning based autonomous driving decision algorithm that supports mixed-precision operation. The multiply-and-accumulate (MAC) unit is built by reconfiguring multiple arithmetic units, so it can compute in several precision modes; this improves the accelerator's flexibility and reduces the cost of deploying quantized models. Multi-level dataflow optimization increases data reuse and improves the accelerator's energy efficiency. Tested on the stochastic latent actor-critic (SLAC) autonomous driving decision algorithm, the accelerator achieves an effective throughput of 18.3 GOPS, which is 10.7 times that of the CPU and 3.3 times that of the GPU, and an energy efficiency of 2.197 GOPS/W, which is 104 times that of the CPU and 28 times that of the GPU. In addition, a most significant bit data coding (MSB-DC) method is proposed to compute feature maps with intra-layer mixed precision. Experiments show that this method effectively reduces the error introduced by quantization at a small latency cost.

Keywords: deep reinforcement learning; autonomous driving; mixed precision; neural network quantization; hardware acceleration
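
The abstract quotes throughput and energy-efficiency figures together with ratios against a CPU and a GPU. The short Python sketch below backs out what those numbers imply: the accelerator's power draw and the baseline throughput, efficiency and power. It assumes that the throughput and efficiency figures describe the same measurement, so that power is throughput divided by efficiency; the abstract does not state this. The function and constant names are illustrative only, and because the published ratios are rounded, the derived values are approximate.

```python
# Figures quoted in the abstract.
ACCEL_GOPS = 18.3          # effective throughput of the accelerator, GOPS
ACCEL_GOPS_PER_W = 2.197   # energy efficiency of the accelerator, GOPS/W

# Ratios quoted in the abstract (accelerator relative to each baseline).
THROUGHPUT_RATIO = {"CPU": 10.7, "GPU": 3.3}
EFFICIENCY_RATIO = {"CPU": 104.0, "GPU": 28.0}


def accelerator_power_w():
    """Implied accelerator power in watts.

    Assumption (not stated in the abstract): both figures come from the
    same run, so GOPS / (GOPS/W) gives watts.
    """
    return ACCEL_GOPS / ACCEL_GOPS_PER_W


def implied_baselines():
    """Back out each baseline's throughput, efficiency and power."""
    rows = {}
    for name in THROUGHPUT_RATIO:
        gops = ACCEL_GOPS / THROUGHPUT_RATIO[name]
        gops_per_w = ACCEL_GOPS_PER_W / EFFICIENCY_RATIO[name]
        rows[name] = {
            "gops": gops,
            "gops_per_w": gops_per_w,
            "watts": gops / gops_per_w,  # same assumption as above
        }
    return rows


if __name__ == "__main__":
    print(f"accelerator: {accelerator_power_w():.2f} W (implied)")
    for name, row in implied_baselines().items():
        print(
            f"{name}: {row['gops']:.2f} GOPS, "
            f"{row['gops_per_w']:.4f} GOPS/W, "
            f"{row['watts']:.1f} W (implied)"
        )
```

Under that assumption the accelerator draws roughly 8.3 W. The efficiency gain exceeds the throughput gain for both baselines, which implies that the accelerator is both faster and lower-power than the CPU and GPU it was compared against.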

Funding

Supported by the National Key Research and Development Program of China (2018YFB2202604)
