Hi everyone, I had a hard time finding any Neural Architecture Search implementation in MXNet, so I built one (Single Path One-Shot NAS MXNet) in my spare time. Below is some abstract info. If you find this topic interesting, your input and comments would be really appreciated!
This repository contains a Single Path One-Shot NAS implementation on MXNet (Gluon). It can finish the whole training and searching pipeline on ImageNet within
60 GPU hours (on 4 V100 GPUs, including supernet training, supernet searching and the searched best subnet training) in an exploration space of about
32^20 choices. Using this implementation, a new state-of-the-art NAS-searched model has been found that outperforms other NAS models such as
FBNet, MnasNet, DARTS, NASNet, PNASNet and the original Single Path One-Shot by a good margin in all of FLOPs, parameter count and top-1/5 accuracies. Measured by Google's MicroNet Challenge Σ Normalized Scores, before any quantization it also outperforms popular handcrafted efficient models such as
MobileNet V1/V2/V3 and ShuffleNet V1/V2.
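For context, the Σ Normalized Score used above sums a model's parameter count and multiply-add count, each normalized by a reference model's. The sketch below is a minimal illustration; the reference values (MobileNetV2-1.4's roughly 6.9M parameters and 1170M mult-adds for the ImageNet track) and the example model's numbers are assumptions for illustration, not figures from this repo:

```python
def sigma_normalized_score(params_m, madds_m,
                           ref_params_m=6.9, ref_madds_m=1170.0):
    """Sum of parameter and multiply-add counts, each normalized by a
    reference model (MicroNet-style scoring; lower is better).
    Reference values here are assumed, not taken from this repo."""
    return params_m / ref_params_m + madds_m / ref_madds_m

# Hypothetical 3.4M-param, 328M-mult-add model:
print(round(sigma_normalized_score(3.4, 328.0), 3))
```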
| NAS Model | FLOPs | # of Params | Top-1 | Top-5 | Σ Normalized Scores | Scripts | Logs |
| --- | --- | --- | --- | --- | --- | --- | --- |

| Model | FLOPs | # of Params | Top-1 | Top-5 | Σ Normalized Scores | Scripts | Logs |
| --- | --- | --- | --- | --- | --- | --- | --- |
Comparison to the official PyTorch release
Single Path One-Shot NAS provides an elegant idea for effortlessly searching optimized subnet structures, under different model size/latency constraints, with one-time supernet training followed by multiple low-cost searching procedures. The flexibility and efficiency of this approach can benefit many practical scenarios where a neural network model needs to be deployed across platforms. With its aid, manually tuning structures to meet different hardware constraints can be avoided. Unfortunately, the authors haven't released the full supernet training and searching parts yet. This repo fills that gap.
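The searching procedure mentioned above is an evolutionary (genetic) search over discrete block choices. Below is a minimal, self-contained sketch of that loop; the fitness function is a hypothetical stand-in, since the real pipeline evaluates each candidate with shared supernet weights after recalibrating BN statistics:

```python
import random

NUM_BLOCKS, NUM_CHOICES = 20, 4  # a block-choice space of 4^20 candidates

def mock_fitness(cand):
    # Stand-in for supernet validation accuracy. The real pipeline scores
    # each candidate with the trained supernet; this toy objective just
    # prefers choice index 2 everywhere, for demonstration only.
    return -sum((c - 2) ** 2 for c in cand)

def evolve(pop_size=50, generations=20, mutate_p=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(NUM_CHOICES) for _ in range(NUM_BLOCKS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=mock_fitness, reverse=True)
        parents = pop[:pop_size // 2]           # keep the top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, NUM_BLOCKS)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(NUM_CHOICES) if rng.random() < mutate_p
                     else c for c in child]     # per-gene mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=mock_fitness)

best = evolve()
print(len(best), mock_fitness(best))
```

Because the supernet is trained only once, each such search run (under a new FLOPs or latency budget) only pays for candidate evaluations, which is what makes the multi-platform deployment story cheap.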
| Supported Functionality | Official PyTorch Release | This Repo |
| --- | --- | --- |
| Supernet Training - With Block Choices | √ | √ |
| Supernet Training - With Channel Choices | × | √ |
| Supernet Training - With FLOP/Param Constraints | × | √ |
| Supernet Training - With Strolling Evolution Constraints | - | √ |
| General FLOPs & Parameters Counting Tool | √ | √ |
| Fast Counting Tool with Pre-calculated Lookup Table | × | √ |
| BN Stat Update for Val Acc | × | √ |
| BN Stat Update for Supernet Searching | × | √ |
| Genetic Search - On Block Choices | √ | √ |
| Genetic Search - On Channel Choices | × | √ |
| Genetic Search - Jointly | × | √ |
| Efficient Last Conv Block | - | √ |
| Op to Op Profiling Tool | - | √ |
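The "fast counting tool with pre-calculated lookup table" row refers to profiling each (block, choice) pair once, so that checking a candidate against a FLOPs budget during search costs only a table sum instead of a full graph traversal. A minimal sketch with hypothetical, made-up per-block values:

```python
# Hypothetical pre-calculated lookup table: FLOPs (in millions) keyed by
# (block index, block choice). In practice these values are profiled once
# from the supernet, making each constraint check O(num_blocks).
LOOKUP = {(i, c): 10.0 + 2.5 * c for i in range(20) for c in range(4)}
STEM_AND_HEAD_MFLOPS = 50.0  # fixed layers outside the choice blocks

def candidate_flops(choices):
    """Total FLOPs (millions) for one candidate's list of block choices."""
    return STEM_AND_HEAD_MFLOPS + sum(LOOKUP[(i, c)]
                                      for i, c in enumerate(choices))

def within_budget(choices, max_mflops=330.0):
    """Cheap FLOPs-constraint check used to filter search candidates."""
    return candidate_flops(choices) <= max_mflops

print(candidate_flops([0] * 20))  # 50 + 20 * 10.0 = 250.0
print(within_budget([3] * 20))    # 50 + 20 * 17.5 = 400.0 -> False
```

The same table-sum idea applies to parameter counts, which is how FLOP/Param constraints can be enforced during both supernet training and genetic search.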