
Slimming DenseNet #3

Open
haithanhp opened this issue Dec 23, 2017 · 7 comments

@haithanhp

Hi @liuzhuang13,

Thank you for the great work. I see that you leverage the scaling factors of batch normalization to prune the incoming and outgoing weights of conv layers. However, in DenseNet, after a basic block (1x1 + 3x3) the previous features are concatenated with the current ones, so the dimension of the scaling factors does not match that of the previous convolutional layer for pruning. How can you prune weights in this case?

By the way, when sparsity training of DenseNet finishes with lambda = 1e-5, I notice that many scaling factors are not small enough for pruning. Does this affect the performance of the compressed network?

Thanks,
Hai

@liuzhuang13
Owner

Thanks for your interest. We prune channels according to the BN scaling factors: after sparsity training we set the small factors (and the corresponding biases) to 0, and then we see which channels we can prune without affecting the network. This is applied to all network structures. In DenseNet the dimension of the scaling factors actually does match the dimension of the convolution, because of the "pre-activation" structure.
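
For concreteness, here is a minimal sketch of this thresholding step, assuming a plain PyTorch model whose prunable layers are nn.BatchNorm2d; the function name, the use of a global percentile to define "small", and the default 40% ratio are illustrative assumptions, not the released training script:

import torch
import torch.nn as nn

def zero_small_bn_factors(model, prune_ratio=0.4):
    # Collect the absolute value of every BN scaling factor in the network.
    gammas = torch.cat([m.weight.data.abs().view(-1)
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    # A global threshold: prune_ratio of all factors fall below it.
    sorted_gammas, _ = torch.sort(gammas)
    threshold = sorted_gammas[int(gammas.numel() * prune_ratio)]
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            mask = m.weight.data.abs().gt(threshold).float()
            m.weight.data.mul_(mask)  # zero out the small scaling factors
            m.bias.data.mul_(mask)    # and the corresponding biases
    return threshold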

The lambda parameter needs tuning for different datasets and hyperparameters (e.g. learning rate), so you may need to check the final performance to pick it.

@haithanhp
Author

haithanhp commented Dec 28, 2017

Thanks for your answer. Here is an example from one part of DenseNet-40 (k=12):

module.features.init_conv.weight : torch.Size([24, 3, 3, 3])
module.features.denseblock_1.dense_basicblock_1.conv_33.norm.weight : torch.Size([24])
module.features.denseblock_1.dense_basicblock_1.conv_33.norm.bias : torch.Size([24])
module.features.denseblock_1.dense_basicblock_1.conv_33.norm.running_mean : torch.Size([24])
module.features.denseblock_1.dense_basicblock_1.conv_33.norm.running_var : torch.Size([24])
module.features.denseblock_1.dense_basicblock_1.conv_33.conv.weight : torch.Size([12, 24, 3, 3])
module.features.denseblock_1.dense_basicblock_2.conv_33.norm.weight : torch.Size([36])
module.features.denseblock_1.dense_basicblock_2.conv_33.norm.bias : torch.Size([36])
module.features.denseblock_1.dense_basicblock_2.conv_33.norm.running_mean : torch.Size([36])
module.features.denseblock_1.dense_basicblock_2.conv_33.norm.running_var : torch.Size([36])
module.features.denseblock_1.dense_basicblock_2.conv_33.conv.weight : torch.Size([12, 36, 3, 3])
module.features.denseblock_1.dense_basicblock_3.conv_33.norm.weight : torch.Size([48])
module.features.denseblock_1.dense_basicblock_3.conv_33.norm.bias : torch.Size([48])
module.features.denseblock_1.dense_basicblock_3.conv_33.norm.running_mean : torch.Size([48])
module.features.denseblock_1.dense_basicblock_3.conv_33.norm.running_var : torch.Size([48])
module.features.denseblock_1.dense_basicblock_3.conv_33.conv.weight : torch.Size([12, 48, 3, 3])
module.features.denseblock_1.dense_basicblock_4.conv_33.norm.weight : torch.Size([60])
module.features.denseblock_1.dense_basicblock_4.conv_33.norm.bias : torch.Size([60])
module.features.denseblock_1.dense_basicblock_4.conv_33.norm.running_mean : torch.Size([60])
module.features.denseblock_1.dense_basicblock_4.conv_33.norm.running_var : torch.Size([60])
module.features.denseblock_1.dense_basicblock_4.conv_33.conv.weight : torch.Size([12, 60, 3, 3])

[N, C, K, K]: [#filters, #channels, kernel_size, kernel_size]

"norm.weight" here is the scaling factor in batch normalization. For me, each norm.weight layer I try to prune 40% #channels of batch normalization coresponding to #filters of previous conv.weight and #channels of latter conv.weight. How can you prune incoming and outgoing in this case? Please correct me if I make mistakes in pruning.

By the way, when the parameters of a layer are pruned, how does it affect the performance of the network? Is there any way to track how the performance changes?

Thanks.

@liuzhuang13
Owner

  1. In this basic DenseNet you can only prune the outgoing weights. For example, if you set 10 of the 36 weights and biases in
    module.features.denseblock_1.dense_basicblock_2.conv_33.norm.weight : torch.Size([36])
    module.features.denseblock_1.dense_basicblock_2.conv_33.norm.bias : torch.Size([36])
    to zero, you can prune away the corresponding weights (along the second dimension) in
    module.features.denseblock_1.dense_basicblock_2.conv_33.conv.weight : torch.Size([12, 36, 3, 3])
    (a sketch of this step follows after the list).

  2. Maybe you could visualize the scaling factors as in Fig. 4 of the paper, or monitor the performance on a validation set. In my experience it is not very hard to pick the value.
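
Concretely, a minimal sketch of the step in item 1, assuming plain PyTorch modules (the function name prune_outgoing_conv is illustrative, not part of the repository):

import torch

def prune_outgoing_conv(bn, conv):
    # Channels whose scaling factor was not zeroed out are kept.
    keep = torch.nonzero(bn.weight.data.abs().gt(0)).squeeze(1)
    # Drop the matching input channels (second dimension) of the conv weight,
    # e.g. torch.Size([12, 36, 3, 3]) -> torch.Size([12, 26, 3, 3]) when 10
    # of the 36 scaling factors were set to zero.
    return conv.weight.data.index_select(1, keep)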

@haithanhp
Author

  1. When the second dimension of conv.weight is pruned to 26 (pruning away 10 channels), the input activation still has 36 channels, so the dimensions won't match. How can you perform the convolution in this case?

  2. Thank you. I also tried visualizing the values with a lasso lambda of 1e-5 and 1e-4, and there are many values near zero.

@liuzhuang13
Owner

  1. I wrote a channel selection layer and placed it before the batch normalization layer. This layer selects channels using the indices of the selected channels as its parameter. In my implementation it is very slow to run, maybe because of the memory copy involved. I'm not sure whether there is a solution for fast channel selection.
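
For illustration, a minimal sketch of such a channel selection layer (the code released later in this thread contains its own implementation; the class name and the buffer-based bookkeeping here are only assumptions):

import torch
import torch.nn as nn

class ChannelSelection(nn.Module):
    def __init__(self, num_channels):
        super(ChannelSelection, self).__init__()
        # 1 = keep the channel, 0 = drop it; updated when the network is pruned.
        self.register_buffer('indexes', torch.ones(num_channels))

    def forward(self, x):
        keep = torch.nonzero(self.indexes).squeeze(1)
        if keep.numel() == self.indexes.numel():
            return x                    # nothing pruned yet, pass through
        return x.index_select(1, keep)  # copy out only the selected channels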

@haithanhp
Author

Yes, I see. Also, have you released the code for the DenseNet and ResNet experiments? I need to reproduce all your experiments for evaluation. Thanks.

@liuzhuang13
Owner

In case you're still interested, we've released our PyTorch implementation here: https://github.com/Eric-mingjie/network-slimming. It supports ResNet and DenseNet.
