
Add only works for two layers of the same size #50

Open

AndreJFBico opened this issue Jul 28, 2017 · 23 comments

@AndreJFBico

AndreJFBico commented Jul 28, 2017

Hi, I'm working on setting up a fast style transfer network in code instead of importing it from a .pb file, so I can start improving it.

However, I'm having some issues trying to set up a residual layer. Here's the network:

        styleNet.start
            ->> Convolution(convSize: ConvSize(outputChannels: 32, kernelSize: 9, stride: 1), neuronType: .none, id: "Variable_0")
            ->> InstanceNorm(shiftModifier: "1", scaleModifier: "2", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), neuronType: .none, id: "Variable_3")
            ->> InstanceNorm(shiftModifier: "4", scaleModifier: "5", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 2), neuronType: .none, id: "Variable_6")
            ->> InstanceNorm(shiftModifier: "7", scaleModifier: "8", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> ResidualLayer(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), layers:
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, id: "Variable_9")
                    ->> InstanceNorm(shiftModifier: "10", scaleModifier: "11", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, id: "Variable_12")
                    ->> InstanceNorm(shiftModifier: "13", scaleModifier: "14", id: "Variable_"))
           

When initializing, the following error appears: assertion failed: Add works for two layers of the same size: file /Users/Andre/Downloads/Bender-fix-issue-38/Sources/Layers/Add.swift, line 23

I also printed the layers in the network:


"PRINTING LAYERS"
": Bender.Start"
": Bender.Convolution"
": Bender.InstanceNorm"
": Bender.Neuron"
": Bender.Convolution"
": Bender.InstanceNorm"
": Bender.Neuron"
": Bender.Convolution"
": Bender.InstanceNorm"
": Bender.Neuron"
": Bender.Dummy"
": Bender.Identity"
": Bender.Add"
assertion failed: Add works for two layers of the same size: file /Users/Andre/Downloads/Bender-fix-issue-38/Sources/Layers/Add.swift, line 23

If you're wondering what kind of network I'm trying to emulate, it's this one:
https://github.com/lengstrom/fast-style-transfer/blob/master/src/transform.py

def net(image):
    conv1 = _conv_layer(image, 32, 9, 1)
    conv2 = _conv_layer(conv1, 64, 3, 2)
    conv3 = _conv_layer(conv2, 128, 3, 2)
    resid1 = _residual_block(conv3, 3)
    resid2 = _residual_block(resid1, 3)
    resid3 = _residual_block(resid2, 3)
    resid4 = _residual_block(resid3, 3)
    resid5 = _residual_block(resid4, 3)
    conv_t1 = _conv_tranpose_layer(resid5, 64, 3, 2)
    conv_t2 = _conv_tranpose_layer(conv_t1, 32, 3, 2)
    conv_t3 = _conv_layer(conv_t2, 3, 9, 1, relu=False)
    preds = tf.nn.tanh(conv_t3) * 150 + 255./2
    return preds
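
As a sanity check on the sizes involved (my own sketch, assuming SAME padding and a hypothetical 256-pixel input): with SAME padding the spatial output of a convolution is ceil(in / stride), so the stride-1 convolutions inside each residual block preserve the spatial size, and both inputs of the residual Add should have identical sizes:

```python
import math

def same_out(size, stride):
    # Spatial output size of a SAME-padded convolution: ceil(size / stride).
    return math.ceil(size / stride)

h = 256                 # hypothetical input height; width behaves the same
h = same_out(h, 1)      # conv1, stride 1 -> 256
h = same_out(h, 2)      # conv2, stride 2 -> 128
h = same_out(h, 2)      # conv3, stride 2 -> 64
# Every convolution inside a residual block is stride 1 with SAME padding,
# so the branch keeps the spatial size of its input ...
branch = same_out(same_out(h, 1), 1)
# ... and the two inputs of the residual Add match.
assert branch == h
```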

Note: I have changed the InstanceNorm initializer to allow specific shift/scale modifiers; it's a temporary way to set up the weight ids.

@bryant1410
Member

The layers do sanity checks on the sizes. In this case it's the Add, which comes from ResidualLayer.

So it seems that what comes before the residual layer and what it yields have different sizes. I'm trying to figure out why, as your setup looks correct.

@bryant1410
Member

A few tips while I tackle it 😄

You can create your own blocks, as in the file you cite, by extending CompositeLayer and ResidualLayer.

Convolution should not use bias here, as the example you provide uses TF convolutions, which have no bias.

I'm noticing there's a useless convSize in ResidualLayer. Gonna remove it.
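
As an aside on the bias tip (my own illustration, not from Bender's code): a convolution bias right before InstanceNorm would be redundant anyway, because the per-channel mean subtraction cancels any constant offset. A minimal numpy sketch:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # Normalize each channel over its spatial dimensions (layout HWC).
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
conv_out = rng.standard_normal((8, 8, 4))  # stand-in for a conv output
bias = rng.standard_normal(4)              # a per-channel bias

# The per-channel mean absorbs the constant bias, so both results match:
assert np.allclose(instance_norm(conv_out), instance_norm(conv_out + bias))
```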

@AndreJFBico
Author

AndreJFBico commented Jul 28, 2017

The outputSize variable is passed down from the start layer when initialize is called, through each layer's incoming[0].outputSize.

However, for the Add itself, incoming[1] is nil.

@bryant1410
Member

So Add doesn't have 2 inputs?

@AndreJFBico
Author

AndreJFBico commented Jul 28, 2017

The Add has 2 inputs; the output size of the second input is nil, while the output size of the first is correct.

@dernster
Contributor

@AndreJFBico Do you have a repo to take a look?

@AndreJFBico
Author

AndreJFBico commented Aug 1, 2017

@dernster Yeah, I can provide it; I'll update the main post when it uploads.

In the meantime, this is the network I'm running right now:

        // We have 48 layers and 48 weight variables in total
        styleNet.start
            ->> Convolution(convSize: ConvSize(outputChannels: 32, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_0")
            ->> InstanceNorm(shiftModifier: "1", scaleModifier: "2", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_3")
            ->> InstanceNorm(shiftModifier: "4", scaleModifier: "5", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_6")
            ->> InstanceNorm(shiftModifier: "7", scaleModifier: "8", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> [Identity(), (
                 Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_9")
                    ->> InstanceNorm(shiftModifier: "10", scaleModifier: "11", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_12")
                    ->> InstanceNorm(shiftModifier: "13", scaleModifier: "14", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_15")
                    ->> InstanceNorm(shiftModifier: "16", scaleModifier: "17", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_18")
                    ->> InstanceNorm(shiftModifier: "19", scaleModifier: "20", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_21")
                    ->> InstanceNorm( shiftModifier: "22", scaleModifier: "23", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_24")
                    ->> InstanceNorm(shiftModifier: "25", scaleModifier: "26", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_27")
                    ->> InstanceNorm(shiftModifier: "28", scaleModifier: "29", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_30")
                    ->> InstanceNorm(shiftModifier: "31", scaleModifier: "32", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_33")
                    ->> InstanceNorm(shiftModifier: "34", scaleModifier: "35", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_36")
                    ->> InstanceNorm(shiftModifier: "37", scaleModifier: "38", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> ConvTranspose(size: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), id: "Variable_39")
            ->> InstanceNorm(shiftModifier: "40", scaleModifier: "41", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> ConvTranspose(size: ConvSize(outputChannels: 32, kernelSize: 3, stride: 2), id: "Variable_42")
            ->> InstanceNorm(shiftModifier: "43", scaleModifier: "44", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 3, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_45")
            ->> InstanceNorm( shiftModifier: "46", scaleModifier: "47", id: "Variable_")
            ->> Neuron(type: .tanh)
            ->> ImageLinearTransform()

It runs, although the end result still has some strange artifacts and it crashes on input resolutions higher than 256 (but those are other issues). The way I circumvented the Add issue was to add an Identity layer after the InstanceNorm layer; that way the outputSize is passed correctly to the Add layer.

@bryant1410
Copy link
Member

If the output is not ok, maybe you can dump different layers' outputs and compare them to the Python implementation to see if they match, in order to find where the error is.

But have you tried saving the protobuf from the Python code with benderthon and loading it with Bender?

@AndreJFBico
Author

AndreJFBico commented Aug 1, 2017

Good suggestion. As for importing a protobuf: yes, I have done that and it works properly. It still crashes with an input size of 1024 for some reason, but for images smaller than 512 it works ok.

The reason I'm trying to define the network in code is so that I can understand it better and also work with it.

@dernster
Contributor

@AndreJFBico Hi! We couldn't reproduce the original issue; could you please provide a repo with it? Additionally, an example of a larger input size causing a crash would be helpful.

@dernster added the bug label Aug 10, 2017
@backnotprop

backnotprop commented Sep 19, 2017

@AndreJFBico did you figure this out?

@dernster I recreated this by generating a model from fast-style-transfer, using benderthon to get the .pb, and running the code taken from the example in my own project with the new .pb. The only .pb that works is g_and_w2, the one shipped with the Bender style example.

I also get the same error when I make the swap in the example project.

@bryant1410
Member

Yeah, now we were able to reproduce the error, but we still don't know what the cause is. We don't have an ETA for this currently.

If you want, you can go ahead and try to find it. We'll check it out when we have time (I hope it's in the following weeks).

@AndreJFBico
Author

AndreJFBico commented Sep 19, 2017

I'm sorry I never got that repo ready; I started changing the core of Bender itself and no longer had things as they were originally.

Well, I sort of figured it out; however, I couldn't make it work with the .pb file straight off the bat. And since I wanted to experiment with the code itself, I converted lengstrom's fast style transfer network to Bender's.

        styleNet.start
            ->> Convolution(convSize: ConvSize(outputChannels: 32, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_0")
            ->> InstanceNorm(shiftModifier: "1", scaleModifier: "2", id: "Variable_")
            ->> Neuron( type: .elu)
            ->> Convolution(convSize: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_3")
            ->> InstanceNorm(shiftModifier: "4", scaleModifier: "5", id: "Variable_")
            ->> Neuron( type: .elu)
            ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_6")
            ->> InstanceNorm(shiftModifier: "7", scaleModifier: "8", id: "Variable_")
            ->> Neuron( type: .elu)
            ->> [Identity(), (
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_9")
                    ->> InstanceNorm(shiftModifier: "10", scaleModifier: "11", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_12")
                    ->> InstanceNorm(shiftModifier: "13", scaleModifier: "14", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_15")
                    ->> InstanceNorm(shiftModifier: "16", scaleModifier: "17", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_18")
                    ->> InstanceNorm(shiftModifier: "19", scaleModifier: "20", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_21")
                    ->> InstanceNorm( shiftModifier: "22", scaleModifier: "23", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_24")
                    ->> InstanceNorm(shiftModifier: "25", scaleModifier: "26", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_27")
                    ->> InstanceNorm(shiftModifier: "28", scaleModifier: "29", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_30")
                    ->> InstanceNorm(shiftModifier: "31", scaleModifier: "32", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_33")
                    ->> InstanceNorm(shiftModifier: "34", scaleModifier: "35", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_36")
                    ->> InstanceNorm(shiftModifier: "37", scaleModifier: "38", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> ConvTranspose(size: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), id: "Variable_39")
            ->> InstanceNorm(shiftModifier: "40", scaleModifier: "41", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> ConvTranspose(size: ConvSize(outputChannels: 32, kernelSize: 3, stride: 2), id: "Variable_42")
            ->> InstanceNorm(shiftModifier: "43", scaleModifier: "44", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 3, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_45")
            ->> InstanceNorm( shiftModifier: "46", scaleModifier: "47", id: "Variable_")
            ->> Neuron(type: .tanh)
            ->> ImageLinearTransform()

It's a one-to-one conversion of the following file: https://github.com/lengstrom/fast-style-transfer/blob/master/src/transform.py

I also used benderthon to export the layer weights individually; I had to alter its conversion script, as only some weights require transposing.

I also changed how the weights are looked up a bit, but the principle should be the same.

About the "Add works for two layers of the same size" issue: I figured out that I had to add an Identity layer after the InstanceNorm layer that comes before an Add layer; that way the issue disappeared.

I tried looking into how the layers are set up but found nothing problematic; somehow the second incoming node of the Add layer loses its convSize attribute, and I never found out why.

If this gets fixed somehow, I bet your conversion should be simple and painless.

Cheers.

@bryant1410 bryant1410 changed the title Add works for two layers of the same size Add only works for two layers of the same size Sep 20, 2017
@backnotprop

Does Bender by chance have notes on how they generated their pb files?

@bryant1410
Member

The mnist sample came from the benderthon sample. The style transfer one probably came directly from the repo we were talking about; I don't know why it isn't working now. But we will take a look at this error as soon as we can.

@bryant1410
Member

@mdramos I updated benderthon so the sample uses a simpler way to generate the protobuf. Maybe take a look at it.

@backnotprop

Ok, thanks! Yeah, I actually got benderthon working just fine yesterday.

@PIRANAVARUBAN

PIRANAVARUBAN commented May 21, 2018

I am using the lengstrom project. While using the protobuf file I faced:

"Fatal error: Index out of range"

    var kernelWidth: Int {
        return Int(dim[1].size)
    }

What is the issue?

I attached my .pb model:

test.pb.zip

@PIRANAVARUBAN

@mdramos what changes did you make to get it working?

@mats-claassen
Member

The issue with your graph seems to be the ExpandDims at the beginning. I am looking into it.

@mats-claassen
Member

There are two issues with your graph. The first one is fixed with #111 and has to do with the ExpandDims.

The second one is that Mul and Add with a scalar are not supported yet. From what I saw, they are only used at the end to scale the final result. What you can do there is cut the graph after the Tanh node (when freezing) and then add a postprocessing layer to do the scaling, like this:

Neuron(type: ActivationNeuronType.custom(neuron: MPSCNNNeuronLinear(device: Device.shared, a: 2.0, b: -1)), id: "scale_neuron")

where a and b are the scale and offset.

If there are any other questions please open a new issue.

@PIRANAVARUBAN

Working:

    preds = tf.add(tf.nn.tanh(conv_t3)*150, 255./2, name="preds")

How did you calculate a and b as 2 and -1? There is a variation in the output image between Python and iOS. What values do we need to put for a and b?

@mats-claassen
Member

2 and -1 are just an example.

The MPSCNNNeuronLinear documentation says it calculates a*x + b, so yours would be something like a = 150 and b = 255/2.
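
To spell out that mapping (a sketch, with the values taken from transform.py): the Python postprocessing is tanh(x) * 150 + 255/2, and the linear neuron applies a*x + b to its input, here the tanh output, so a = 150 and b = 255/2 = 127.5:

```python
import math

def tf_postprocess(x):
    # preds = tf.nn.tanh(conv_t3) * 150 + 255./2 in transform.py
    return math.tanh(x) * 150 + 255.0 / 2

def linear_neuron(x, a, b):
    # What MPSCNNNeuronLinear computes: a*x + b
    return a * x + b

a, b = 150.0, 255.0 / 2
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(tf_postprocess(x) - linear_neuron(math.tanh(x), a, b)) < 1e-9
```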
