FPGA-accelerated Low-power MobileNetV2 Network Identification System

2023, Vol. 31, No. 5, pp. 221-227

Affiliation: College of Electrical Engineering and Automation, Fuzhou University



Abstract:

In recent years, convolutional neural networks (CNNs) have been widely used in fields such as image recognition, speech recognition and translation, and autonomous driving because of their excellent performance. However, traditional CNNs have many parameters and a heavy computational load, and they suffer from slow inference and high power consumption when deployed on CPUs and GPUs. To address these problems, quantization-aware training (QAT) is used to compress the total size of the network parameters to 1/4 of the original network while preserving image classification accuracy. All network weights are stored in the FPGA's on-chip resources, which removes the off-chip memory bandwidth bottleneck and reduces the power consumed by off-chip memory accesses. A cooperative pipeline structure is proposed within each layer of MobileNetV2 and between adjacent pointwise convolution layers, which greatly improves the real-time performance of the network. A memory and data-access optimization strategy is also proposed: the storage layout and read order of the data are adjusted according to the degree of parallelism, which further saves on-chip BRAM. Finally, a lightweight, high-performance, low-power MobileNetV2 recognition system is implemented on a Xilinx Virtex-7 VC707 development board. At a 200 MHz clock it reaches a throughput of 170.06 GOP/s with a power consumption of only 6.13 W, giving an energy efficiency of 27.74 GOP/s/W, which is 92 times that of the CPU and 25 times that of the GPU, a clear advantage over other implementations.
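Two figures in the abstract can be checked with simple arithmetic: the 1/4 parameter footprint and the 27.74 GOP/s/W energy efficiency. The sketch below is illustrative only and is not the authors' code. It assumes the 1/4 ratio comes from storing weights as 8-bit integers instead of 32-bit floats, which the abstract does not state explicitly. It uses plain symmetric per-tensor quantization rather than actual quantization-aware training, and the function names are hypothetical.

```python
import struct


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale.

    Assumption: 8-bit weights. This is a stand-in for the storage format only,
    not a reproduction of the QAT procedure described in the abstract.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


def packed_sizes(weights, codes):
    """Return (float32 bytes, int8 bytes) for the same set of weights."""
    n = len(weights)
    fp32_bytes = len(struct.pack(f"<{n}f", *weights))
    int8_bytes = len(struct.pack(f"<{n}b", *codes))
    return fp32_bytes, int8_bytes


def energy_efficiency(throughput_gops, power_w):
    """Energy efficiency in GOP/s/W: throughput divided by power."""
    return throughput_gops / power_w


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # made-up example values
    codes, scale = quantize_int8(weights)
    fp32_bytes, int8_bytes = packed_sizes(weights, codes)
    print("storage ratio:", int8_bytes / fp32_bytes)
    # Throughput and power figures as reported in the abstract.
    print("GOP/s/W:", round(energy_efficiency(170.06, 6.13), 2))
```

Under the 8-bit assumption the storage ratio is exactly 1/4, and dividing the reported throughput by the reported power reproduces the stated efficiency to two decimal places.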


Cite this article

Sun Xiaojian, Lin Ruiquan, Fang Ziqing, Ma Chi. FPGA-accelerated Low-power MobileNetV2 Network Identification System [J]. Computer Measurement & Control, 2023, 31(5): 221-227.


History

Received: 2022-10-13
Last revised: 2022-12-26
Accepted: 2022-11-03
Published online: 2023-05-19
