Abstract
The demand for running neural networks (NNs) in embedded environments has increased significantly in recent years, driven by the success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. Achieving high accuracy on resource-restricted devices, however, is still considered challenging, mainly because of the vast number of design parameters that need to be balanced. While quantizing CNN parameters reduces power and area, it can also cause unexpected changes in the balance between communication and computation. This change is hard to evaluate, and the resulting imbalance may leave either memory bandwidth or computational resources underutilized, thereby reducing performance. This paper introduces a hardware performance analysis framework for identifying bottlenecks in the early stages of CNN hardware design. We demonstrate how the proposed method can help in evaluating different architecture alternatives for resource-restricted CNN accelerators (e.g., as part of real-time embedded systems) early in the design process and thus prevent design mistakes.
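To make the communication/computation balance concrete, the sketch below estimates whether a single convolutional layer is limited by memory bandwidth or by compute, and how that changes with bit width. It is an illustrative roofline-style calculation, not the framework described in the paper: the `ConvLayer` class, the function names, the layer shape, and the peak-throughput and bandwidth figures are all hypothetical, and the traffic model assumes every weight and activation crosses the memory interface exactly once.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Shape of one convolutional layer (hypothetical example values)."""
    in_channels: int
    out_channels: int
    kernel: int       # side length of a square kernel
    out_height: int
    out_width: int


def operational_intensity(layer: ConvLayer, weight_bits: int, act_bits: int) -> float:
    """Multiply-accumulate operations (MACs) per byte of memory traffic.

    Assumption: stride 1 with "same" padding, so the input feature map has
    the same spatial size as the output, and all data is moved exactly once.
    """
    pixels = layer.out_height * layer.out_width
    weights = layer.in_channels * layer.out_channels * layer.kernel ** 2
    macs = weights * pixels
    weight_bytes = weights * weight_bits / 8
    in_bytes = layer.in_channels * pixels * act_bits / 8
    out_bytes = layer.out_channels * pixels * act_bits / 8
    return macs / (weight_bytes + in_bytes + out_bytes)


def attainable_macs_per_s(intensity: float, peak_macs_per_s: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_macs_per_s, intensity * bandwidth_bytes_per_s)


def bottleneck(intensity: float, peak_macs_per_s: float,
               bandwidth_bytes_per_s: float) -> str:
    """Name the resource that limits performance for this layer."""
    if intensity * bandwidth_bytes_per_s < peak_macs_per_s:
        return "memory"
    return "compute"


# Hypothetical layer and accelerator figures, chosen only for illustration.
layer = ConvLayer(in_channels=64, out_channels=64, kernel=3,
                  out_height=56, out_width=56)
PEAK = 4e11        # MACs per second (assumed)
BANDWIDTH = 1e9    # bytes per second (assumed)

for bits in (8, 4):
    oi = operational_intensity(layer, weight_bits=bits, act_bits=bits)
    print(bits, "bits:", bottleneck(oi, PEAK, BANDWIDTH),
          attainable_macs_per_s(oi, PEAK, BANDWIDTH))
```

With these made-up numbers, halving the bit width doubles the operational intensity and moves the layer from memory-bound to compute-bound. In a real design, quantization would typically also change the peak throughput achievable in a given area, which this sketch holds fixed.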
Field | Value |
---|---|
Original language | English |
Article number | 717 |
Pages (from-to) | 1-20 |
Number of pages | 20 |
Journal | Sustainability (Switzerland) |
Volume | 13 |
Issue number | 2 |
State | Published - 2 Jan 2021 |
Externally published | Yes |
Keywords
- Accelerators
- CNN architecture
- Neural networks
- Quantization
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Geography, Planning and Development
- Renewable Energy, Sustainability and the Environment
- Building and Construction
- Environmental Science (miscellaneous)
- Energy Engineering and Power Technology
- Hardware and Architecture
- Computer Networks and Communications
- Management, Monitoring, Policy and Law