Simple questions about the dimensions of convolution and max pool layer



Say I want to build a convolutional model using gluon, and I have a sentence. I obtain the word embedding for each word in the sentence using w2v and stack them together. Assume the size of this input is (60, 400), which means I have 60 word embeddings, each of length 400.

I want to use gluon.nn.Conv2D on the input first, and then apply a max pooling layer to the result of the convolution. For the filters, I will use filter sizes of 2, 3, and 4, with 2 filters for each size.

Now, what should the kernel size of each filter be? I want to get the numbers correct first so I can debug my code. In my code, I set the kernel sizes to (2, 400), (3, 400), and (4, 400), the channels to the number of filters, which is 2, and the strides to (1, 400).
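To sanity-check these numbers, here is a minimal sketch of the arithmetic in plain Python. It makes no gluon calls. `conv_out` is a hypothetical helper of mine that applies the standard formula floor((in + 2*pad - kernel) / stride) + 1 per dimension, and I am assuming no padding and no dilation.

```python
# Hypothetical helper, not part of gluon: output length of a convolution
# along one dimension, assuming no dilation.
def conv_out(size, kernel, stride=1, pad=0):
    return (size + 2 * pad - kernel) // stride + 1

height, width = 60, 400   # 60 words, embedding length 400
num_filters = 2           # what I pass as `channels`

conv_shapes = {}
for k in (2, 3, 4):
    out_h = conv_out(height, k, stride=1)         # slide one word at a time
    out_w = conv_out(width, width, stride=width)  # kernel spans the full embedding
    # Per-sample output shape, laid out as (channels, height, width)
    conv_shapes[k] = (num_filters, out_h, out_w)
    print(k, conv_shapes[k])
```

This only checks the arithmetic. It assumes the input is already in the 4-D (batch, channels, height, width) layout that Conv2D uses by default, for example (1, 1, 60, 400) for a single sentence, rather than the raw (60, 400) array.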

Assuming that everything above is correct (please correct me if it is not), now I want a max pool layer. Since I want to do the max pooling on the result of the convolution layer, the pool_size should be, in my understanding, (59, 1), (58, 1), and (57, 1) for the three filter sizes respectively, since there should be (60 - kernel_size) + 1 outputs along the sentence dimension.
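The same kind of check for the pooling step, again as a plain-Python sketch rather than actual gluon code. `pool_out` is a hypothetical helper, and I am assuming no padding and that the pooling stride defaults to the pool size.

```python
# Hypothetical helper, not a gluon call: output length of a pooling window
# along one dimension, assuming no padding.
def pool_out(size, pool, stride=None):
    stride = pool if stride is None else stride  # assumed default: stride == pool size
    return (size - pool) // stride + 1

sentence_len = 60
num_filters = 2

pooled = {}
for k in (2, 3, 4):
    conv_h = sentence_len - k + 1   # height after the convolution: 59, 58, 57
    pool_size = (conv_h, 1)         # pool over the whole height
    pooled[k] = (pool_out(conv_h, pool_size[0]), pool_out(1, pool_size[1]))
    print(k, pool_size, pooled[k])

# One value per filter survives pooling, so concatenating all three branches
# would give 3 filter sizes * 2 filters = 6 features per sentence.
total_features = sum(num_filters * h * w for h, w in pooled.values())
print(total_features)
```

If this is right, each branch collapses to a single value per filter, so the shapes themselves look consistent on paper.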

Is this correct? I know it’s not, because my code won’t run. Any advice would be much appreciated!