Title: LCP: a Layer Clusters Paralleling mapping method for accelerating Inception and Residual networks on FPGA
Authors:
Xinhan Lin, Shouyi Yin, Fengbin Tu, Leibo Liu, Xiangyu Li, Shaojun Wei
Conference: DAC 2018
Location: San Francisco, CA, USA
Issue Date: 24-28 June 2018
Pages: pp. 1-6
Abstract:
Deep convolutional neural networks (DCNNs) have been widely used in various AI applications. Inception and Residual are two promising structures adopted in many important modern DCNN models, including AlphaGo Zero's model. These structures allow the depth and width of the network to be increased considerably to improve accuracy, without increasing the computational budget or the difficulty of convergence. Various FPGA-based accelerators for DCNNs have been proposed because the platform offers high performance, good power efficiency, and fast development cycles. However, previous FPGA mapping methods cannot fully adapt to the different data localities among layers and other characteristics of Inception and Residual structures, which leads to an under-utilization of FPGA resources. We propose LCP, a Layer Clusters Paralleling mapping method that classifies the layers into clusters based on their differences in parameters and data localities, and then accelerates the clusters in different partitions of the FPGA. We evaluate our mapping method by implementing Inception/Residual modules from GoogLeNet [8] and ResNet-50 [4] on a Xilinx VC709 (Virtex 690T) FPGA. The results show that the proposed method fully utilizes resources and achieves up to 4.03× the performance of the baseline and 2.00× the performance of state-of-the-art methods.
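
The abstract does not give the clustering criterion or the partitioning rule, so the following is only an illustrative sketch of the general idea (group layers by a data-locality proxy, then split compute resources among the groups), not the authors' algorithm. The `ConvLayer` class, the weight-to-output ratio used as the locality proxy, the `threshold` value, and the proportional allocation rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of one convolution layer."""
    name: str
    in_ch: int
    out_ch: int
    kernel: int
    out_h: int
    out_w: int

    @property
    def weights(self) -> int:
        # Number of weight parameters in the layer.
        return self.in_ch * self.out_ch * self.kernel ** 2

    @property
    def outputs(self) -> int:
        # Number of output activations produced by the layer.
        return self.out_ch * self.out_h * self.out_w

    @property
    def macs(self) -> int:
        # Multiply-accumulate operations needed for one inference.
        return self.weights * self.out_h * self.out_w


def cluster_layers(layers, threshold=1.0):
    """Split layers into two clusters by weight-to-output ratio.

    The ratio is an assumed stand-in for "data locality"; the paper's
    actual criterion is not stated in the abstract.
    """
    clusters = {"weight_heavy": [], "activation_heavy": []}
    for layer in layers:
        ratio = layer.weights / layer.outputs
        key = "weight_heavy" if ratio >= threshold else "activation_heavy"
        clusters[key].append(layer)
    return clusters


def allocate_units(clusters, total_units):
    """Give each non-empty cluster compute units in proportion to its MACs.

    Every non-empty cluster gets at least one unit (assumed rule).
    """
    work = {k: sum(l.macs for l in v) for k, v in clusters.items() if v}
    total = sum(work.values())
    return {k: max(1, total_units * w // total) for k, w in work.items()}


# Two made-up layers with shapes typical of Inception/ResNet-style modules.
layers = [
    ConvLayer("conv1x1", in_ch=192, out_ch=64, kernel=1, out_h=28, out_w=28),
    ConvLayer("conv3x3", in_ch=512, out_ch=512, kernel=3, out_h=7, out_w=7),
]
clusters = cluster_layers(layers)
shares = allocate_units(clusters, total_units=100)
```

In this toy setup the 1×1 layer on a large feature map lands in the activation-heavy cluster and the 3×3 layer on a small feature map lands in the weight-heavy cluster, and each cluster receives a share of the (hypothetical) 100 compute units according to its workload.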