VersaTile Convolutional Neural Network Mapping on FPGAs A. Muñío-Gracia, J. Fernández-Berni, R. Carmona-Galán and A. Rodríguez-Vázquez Conference · IEEE International Symposium on Circuits and Systems ISCAS 2020
Convolutional Neural Networks (ConvNets) are directed acyclic graphs with node transitions determined by a set of configuration parameters. In this paper, we describe a dynamically configurable hardware architecture that adjusts its data allocation strategy according to the characteristics of each ConvNet layer. The proposed flexible scheduling solution makes the accelerator design portable across scenarios with different availability of computation and memory resources. For instance, FPGA block-RAM resources can be balanced to optimize data distribution and minimize off-chip memory accesses. We explore the selection of tailored scheduling policies that translate into efficient on-chip data reuse and hence lower energy consumption. The system adapts its behavior autonomously, with no need for platform reconfiguration or user supervision. Experimental results are presented and compared with state-of-the-art accelerators.
PhD Forum: Impact of CNNs Pooling Layer Implementation on FPGAs Accelerator Design A. Muñío-Gracia, J. Fernández-Berni, R. Carmona-Galán and Á. Rodríguez-Vázquez Conference · International Conference on Distributed Smart Cameras ICDSC 2019
Convolutional Neural Networks (CNNs) have demonstrated their competence in extracting information from data, especially in the field of computer vision. Their computational complexity calls for hardware acceleration. The challenge in designing hardware accelerators for CNNs is to provide sustained throughput with low power consumption, which is why FPGAs have captured the community's attention. In CNNs, pooling layers are introduced to reduce the spatial dimensions of the model. This work explores the influence of modifying the pooling layers of some state-of-the-art CNNs, namely AlexNet and SqueezeNet. The objective is to optimize the utilization of hardware resources without a negative impact on inference accuracy.
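As a rough illustration of why the pooling configuration matters for hardware cost, the sketch below computes the spatial output size of a pooling layer and the number of comparisons a max-pooling unit performs per output. The function names are hypothetical, and the 2x2 window is only an example of a possible modification, not necessarily the one studied in the paper.

```python
def pooled_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a pooling layer, using floor rounding."""
    return (size + 2 * padding - kernel) // stride + 1


def comparisons_per_output(kernel: int) -> int:
    """Comparisons needed to find the maximum of a kernel x kernel window."""
    return kernel * kernel - 1


# A 55x55 feature map, the size of the first convolutional output in AlexNet.
overlapping = pooled_size(55, kernel=3, stride=2)      # AlexNet-style 3x3, stride 2
non_overlapping = pooled_size(55, kernel=2, stride=2)  # hypothetical 2x2, stride 2

print(overlapping, comparisons_per_output(3))      # 27 8
print(non_overlapping, comparisons_per_output(2))  # 27 3
```

In this example both windows yield a 27x27 output, so the downstream layer shapes are unchanged while the smaller window needs fewer comparisons per output. Whether accuracy is preserved has to be checked experimentally.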
On the Balanced Allocation of Convolutional Neural Network Models on FPGAs A. Muñío-Gracia, J. Fernández-Berni, R. Carmona-Galán and A. Rodríguez-Vázquez Conference · Workshop on the Architecture of Smart Cameras WASC 2019
Deep Learning (DL) algorithms have demonstrated their competence in accurately extracting information from data, especially in the field of computer vision. DL has emerged as an end-to-end approach based on learned multi-level scene representations. A number of open-source frameworks have been created to describe convolutional neural network (CNN) models, a class of the deep neural networks (DNNs) that support DL. Their computational complexity calls for hardware acceleration. The challenge in designing hardware accelerators for CNNs is to provide sustained throughput with low power consumption. To test our architectural proposals, we employ FPGAs: they are reconfigurable, efficient, and have adjustable precision. FPGAs permit architectural exploration with shorter development time and lower cost than ASICs. This work introduces a scalable, framework-agnostic architecture whose behavior self-adapts to the selected CNN configuration. A design space analysis is performed for some state-of-the-art CNNs, namely VGG-16, Tiny DarkNet, and SqueezeNet. The objective is a balanced allocation of resources. To this end, the tiling parameterization is optimized according to decisive performance criteria such as the number of memory accesses, the data movement policy, and throughput.
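To make the idea of optimizing the tiling parameterization concrete, here is a minimal sketch of an exhaustive tile search that minimizes off-chip accesses under an on-chip buffer limit. The cost model is a simplified assumption and is not the model used in the paper: output tiles stay on chip while input-channel tiles are accumulated, and every step reloads its input and weight tiles. The layer shape, capacity, and all names (`tile_cost`, `best_tile`) are hypothetical.

```python
from itertools import product
from math import ceil


def tile_cost(layer, tile):
    """Return (buffer_words, off_chip_accesses) for one tiling choice.

    layer = (M, N, R, C, K, S): output channels, input channels,
            output rows, output columns, kernel size, stride.
    tile  = (Tm, Tn, Tr, Tc): tile sizes along M, N, R and C.
    """
    M, N, R, C, K, S = layer
    Tm, Tn, Tr, Tc = tile
    # Words held on chip for one tile of inputs, weights and outputs.
    in_tile = Tn * ((Tr - 1) * S + K) * ((Tc - 1) * S + K)
    w_tile = Tm * Tn * K * K
    out_tile = Tm * Tr * Tc
    buffer_words = in_tile + w_tile + out_tile
    # Assumed schedule: each output tile is written once, after all
    # input-channel tiles have been accumulated into it.
    out_tiles = ceil(M / Tm) * ceil(R / Tr) * ceil(C / Tc)
    steps = out_tiles * ceil(N / Tn)
    accesses = steps * (in_tile + w_tile) + out_tiles * out_tile
    return buffer_words, accesses


def best_tile(layer, capacity_words, candidates=(1, 2, 4, 8, 16, 32, 64)):
    """Search candidate tiles; return (tile, accesses, buffer_words) or None."""
    M, N, R, C, _, _ = layer
    best = None
    for tile in product(candidates, repeat=4):
        Tm, Tn, Tr, Tc = tile
        if Tm > M or Tn > N or Tr > R or Tc > C:
            continue  # tile larger than the layer dimension
        buffer_words, accesses = tile_cost(layer, tile)
        if buffer_words > capacity_words:
            continue  # does not fit in the assumed on-chip budget
        if best is None or accesses < best[1]:
            best = (tile, accesses, buffer_words)
    return best


# Hypothetical 3x3 layer: 64 -> 64 channels, 56x56 output, stride 1.
layer = (64, 64, 56, 56, 3, 1)
print(best_tile(layer, capacity_words=32 * 1024))
```

Running the same search per layer and per memory budget gives a first-order view of how the preferred tiling shifts with the layer shape. A real design space analysis would also account for throughput and the data movement policy.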