Abstract
Network pruning aims to reduce the inference cost of large models so that neural architectures can run on end devices such as mobile phones. We present NEON, a novel iterative pruning approach based on deep reinforcement learning (DRL). Whereas most reinforcement learning-based pruning solutions analyze only the single network they aim to prune, we train a DRL agent on a large set of randomly generated architectures, making our solution more generic and less prone to overfitting. To avoid the long training times typically required to fit DRL models for each new dataset, we train NEON offline on multiple datasets and then apply it to additional datasets without further training. This setup makes NEON more efficient than other DRL-based pruning methods. Additionally, we propose a novel reward function that lets users explicitly define their pruning/performance trade-off preferences. Our evaluation, conducted on a set of 28 diverse datasets, shows that the proposed method significantly outperforms recent top-performing solutions in the pruning of fully-connected networks. Specifically, our top configuration reduces the average size of the pruned architecture by a factor of 24.59, compared to 13.26 for the leading baseline, while also improving accuracy by 0.5%.
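To illustrate the kind of trade-off-aware reward the abstract describes, the sketch below shows one plausible form: a weighted combination of a compression term and an accuracy-change term, with a user-chosen coefficient. The function name, the `alpha` coefficient, and the log-compression term are assumptions for illustration only; the paper's actual reward definition is not reproduced here.

```python
import math

def pruning_reward(compression_ratio: float,
                   pruned_accuracy: float,
                   baseline_accuracy: float,
                   alpha: float = 0.5) -> float:
    """Hypothetical trade-off-aware pruning reward (illustrative only).

    alpha in [0, 1] encodes the user's preference: higher alpha
    favors compression, lower alpha favors preserving accuracy.
    NOTE: this is NOT the reward function from the NEON paper,
    just a sketch of the general idea described in the abstract.
    """
    # Reward grows with how much the network shrank; the log keeps
    # very large compression ratios from dominating the signal.
    compression_term = math.log(max(compression_ratio, 1.0))
    # Reward (or penalize) the accuracy change relative to the unpruned model.
    accuracy_term = pruned_accuracy - baseline_accuracy
    return alpha * compression_term + (1.0 - alpha) * accuracy_term

# Example: a 24.59x size reduction with a +0.5% accuracy change,
# mirroring the headline numbers reported in the abstract.
print(pruning_reward(compression_ratio=24.59,
                     pruned_accuracy=0.905,
                     baseline_accuracy=0.900,
                     alpha=0.5))
```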
| Original language | English |
| --- | --- |
| Pages (from-to) | 381-400 |
| Number of pages | 20 |
| Journal | Information Sciences |
| Volume | 610 |
| DOIs | |
| State | Published - 1 Sep 2022 |
Keywords
- Deep reinforcement learning
- Pruning
ASJC Scopus subject areas
- Theoretical Computer Science
- Software
- Control and Systems Engineering
- Computer Science Applications
- Information Systems and Management
- Artificial Intelligence