Energy-Efficient VLSI Architecture for Lightweight CNN Inference on Edge Devices
Keywords:
Energy-efficient, VLSI architecture, Lightweight CNN, Edge AI, Fixed-point MAC units, Quantization-aware mapping.

Abstract
This paper presents a low-power VLSI architecture for executing lightweight Convolutional Neural Network (CNN) inference on resource-constrained edge AI devices. The design combines fixed-point multiply-accumulate (MAC) units, clock gating, on-chip SRAM buffers, loop tiling, and quantization-aware mapping to reduce dynamic power consumption and minimize accesses to external memory. The architecture is described in Verilog/SystemVerilog and implemented on both FPGA (Artix-7, Cyclone V) and ASIC (65 nm CMOS) targets. Experiments on CIFAR-10 and MNIST show a 40% reduction in power and a 30% increase in throughput with less than 1.5% accuracy loss. The design is intended for real-time monitoring and health applications and forms a basis for future neuromorphic and reconfigurable AI accelerators.
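For illustration, a minimal SystemVerilog sketch of one such fixed-point MAC unit with a gated accumulator clock follows. The bit widths, port names, and latch-based gating style are assumptions made for exposition, not the paper's actual RTL; an ASIC flow would normally instantiate a library integrated-clock-gating (ICG) cell instead of the hand-written latch.

    // Hypothetical fixed-point MAC with clock gating (sketch, not the paper's design).
    module mac_q8 #(
        parameter int IN_W  = 8,    // quantized activation/weight width (assumed)
        parameter int ACC_W = 24    // accumulator width with headroom (assumed)
    ) (
        input  logic                    clk,
        input  logic                    rst_n,
        input  logic                    en,    // valid operand pair this cycle
        input  logic                    clr,   // clear accumulator at start of a dot product
        input  logic signed [IN_W-1:0]  act,   // fixed-point activation
        input  logic signed [IN_W-1:0]  wgt,   // fixed-point weight
        output logic signed [ACC_W-1:0] acc
    );
        // Gate the accumulator clock when no valid operands arrive, so the
        // register draws no clock-tree dynamic power on idle cycles.
        logic en_latch, gated_clk;

        // Latch-based clock gate; an ASIC flow would use a library ICG cell.
        always_latch
            if (!clk) en_latch = en | clr;
        assign gated_clk = clk & en_latch;

        // Multiply-accumulate on the gated clock.
        always_ff @(posedge gated_clk or negedge rst_n) begin
            if (!rst_n)
                acc <= '0;
            else if (clr)
                acc <= '0;
            else
                acc <= acc + (act * wgt);   // operands sign-extend to ACC_W bits
        end
    endmodule

An array of such units, fed from on-chip SRAM buffers by a tiled loop schedule, is the kind of datapath the abstract summarizes.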
