TensorFlow Reshape error with custom pooling/unpooling layer


























I am trying to implement a smaller-scale version of SegNet, described in this paper (https://arxiv.org/pdf/1511.00561.pdf), tailored towards edge detection.



Dataset:
I am using the BSDS500 boundary dataset. I cropped and rotated the images so that each one is 320x480x3 instead of 321x481x3.



Input shapes (200 training images and 100 validation images):



x_train: (200, 320, 480, 3)
x_val: (100, 320, 480, 3)
y_train: (200, 153600)
y_val: (100, 153600)
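
To make the shapes above concrete, here is a minimal sketch of how one 321x481 boundary map could end up as a 153600-element label vector. The helper names (`to_landscape`, `crop`, `flatten`), the rotation direction, and the choice to drop the last row and column are all assumptions for illustration; the actual preprocessing is not shown in this post.

```python
def to_landscape(img):
    """Rotate a portrait image (a list of rows) by 90 degrees so it is landscape."""
    height, width = len(img), len(img[0])
    if height > width:
        # Reversing the rows and transposing rotates clockwise by 90 degrees.
        img = [list(col) for col in zip(*img[::-1])]
    return img


def crop(img, height=320, width=480):
    """Keep the top-left height x width region (assumed cropping rule)."""
    return [row[:width] for row in img[:height]]


def flatten(img):
    """Flatten a 2-D map row by row into a single list."""
    return [value for row in img for value in row]


# Hypothetical portrait boundary map: 481 rows by 321 columns.
portrait = [[0] * 321 for _ in range(481)]
label = crop(to_landscape(portrait))
flat = flatten(label)
print(len(label), len(label[0]), len(flat))  # 320 480 153600
```

The flattened length is 320 * 480 = 153600, which matches the second dimension of y_train and y_val.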


Framework:
I am using Keras with the TensorFlow backend.



These are the functions I am using for the custom pooling and unpooling layers:



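import tensorflow as tf


def pool_argmax2D(x, pool_size=(2,2), strides=(2,2)):
    padding = 'SAME'
    # Build the 4-D (NHWC) window and stride specs from the 2-tuples.
    # ksize is computed from the original pool_size tuple; overwriting
    # pool_size with a 4-element list first would turn ksize into [1, 1, 2, 1].
    ksize = [1, pool_size[0], pool_size[1], 1]
    strides = [1, strides[0], strides[1], 1]
    output, argmax = tf.nn.max_pool_with_argmax(
        x,
        ksize = ksize,
        strides = strides,
        padding = padding
    )

    # argmax holds the flattened index of each maximum and has the same shape as output.
    return [output, argmax]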


def unpool2D(pool, argmax, ksize=(2,2)):
    with tf.variable_scope("unpool"):
        input_shape = tf.shape(pool)
        output_shape = [input_shape[0],
                        input_shape[1] * ksize[0],
                        input_shape[2] * ksize[1],
                        input_shape[3]]

        flat_input_size = tf.cumprod(input_shape)[-1]
        flat_output_shape = tf.cast([output_shape[0],
                                     output_shape[1] * output_shape[2] * output_shape[3]], tf.int64)

        pool_ = tf.reshape(pool, [flat_input_size])
        batch_range = tf.reshape(tf.range(tf.cast(output_shape[0], tf.int64), dtype=tf.int64),
                                 shape=[input_shape[0], 1, 1, 1])

        b = tf.ones_like(argmax) * batch_range
        b = tf.reshape(b, [flat_input_size, 1])

        ind_ = tf.reshape(argmax, [flat_input_size, 1]) % flat_output_shape[1]
        ind_ = tf.concat([b, ind_], 1)
        ret = tf.scatter_nd(ind_, pool_, shape=flat_output_shape)
        ret = tf.reshape(ret, output_shape)
        return ret
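
For reference, the idea behind the two functions can be sketched in plain Python on a single image stored as a flat list in height, width, channel order. This is not the TensorFlow code above, just an illustration: the names `pool_with_argmax` and `unpool` are hypothetical, the window is fixed at 2x2 with stride 2, and height and width are assumed to be even. The flat index of position [y, x, c] is taken to be (y * width + x) * channels + c, which is the per-image part of the index that `tf.nn.max_pool_with_argmax` documents.

```python
def pool_with_argmax(img, height, width, channels):
    """2x2, stride-2 max pooling over a flat HWC list.

    Returns the pooled values and, for each one, the flat index it came from.
    """
    pooled, argmax = [], []
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            for c in range(channels):
                # Flat indices of the four window positions for channel c.
                window = [((y + dy) * width + (x + dx)) * channels + c
                          for dy in (0, 1) for dx in (0, 1)]
                best = max(window, key=lambda i: img[i])
                pooled.append(img[best])
                argmax.append(best)
    return pooled, argmax


def unpool(pooled, argmax, out_size):
    """Scatter each pooled value back to the index recorded in argmax."""
    if len(pooled) != len(argmax):
        raise ValueError("pooled has %d values but argmax has %d"
                         % (len(pooled), len(argmax)))
    out = [0] * out_size
    for value, index in zip(pooled, argmax):
        out[index] = value
    return out


# A 4x4 single-channel image, written row by row.
img = [1, 2, 0, 0,
       3, 4, 0, 5,
       0, 0, 7, 0,
       6, 0, 0, 8]
pooled, argmax = pool_with_argmax(img, 4, 4, 1)
print(pooled)                      # [4, 5, 6, 8]
print(argmax)                      # [5, 7, 12, 15]
print(unpool(pooled, argmax, 16))  # [0, 0, 0, 0, 0, 4, 0, 5, 0, 0, 0, 0, 6, 0, 0, 8]
```

The length check in `unpool` makes one requirement explicit: the tensor being unpooled and the argmax mask need the same number of elements, channels included.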


This is the code for the model:



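# Assumed imports (standalone Keras with the TensorFlow backend)
from keras.layers import Input, Conv2D, BatchNormalization, Activation, Lambda, Flatten
from keras.models import Model

batch_size = 4
kernel = 3
pool_size = (2,2)
img_shape = (320,480,3)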


inputs = Input(shape=img_shape, name='main_input')

conv_1 = Conv2D(32, (kernel, kernel), padding="same")(inputs)
conv_1 = BatchNormalization()(conv_1)
conv_1 = Activation("relu")(conv_1)
conv_2 = Conv2D(32, (kernel, kernel), padding="same")(conv_1)
conv_2 = BatchNormalization()(conv_2)
conv_2 = Activation("relu")(conv_2)

pool_1, mask_1 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_2)

conv_3 = Conv2D(64, (kernel, kernel), padding="same")(pool_1)
conv_3 = BatchNormalization()(conv_3)
conv_3 = Activation("relu")(conv_3)
conv_4 = Conv2D(64, (kernel, kernel), padding="same")(conv_3)
conv_4 = BatchNormalization()(conv_4)
conv_4 = Activation("relu")(conv_4)

pool_2, mask_2 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_4)

conv_5 = Conv2D(64, (kernel, kernel), padding="same")(pool_2)
conv_5 = BatchNormalization()(conv_5)
conv_5 = Activation("relu")(conv_5)

unpool_1 = Lambda(unpool2D, output_shape = (160,240,64), arguments={'ksize':pool_size, 'argmax': mask_2})(conv_5)

conv_6 = Conv2D(64, (kernel, kernel), padding="same")(unpool_1)
conv_6 = BatchNormalization()(conv_6)
conv_6 = Activation("relu")(conv_6)
conv_7 = Conv2D(64, (kernel, kernel), padding="same")(conv_6)
conv_7 = BatchNormalization()(conv_7)
conv_7 = Activation("relu")(conv_7)

unpool_2 = Lambda(unpool2D, output_shape = (320,480,64), arguments={'ksize':pool_size, 'argmax': mask_1})(conv_7)

conv_8 = Conv2D(32, (kernel, kernel), padding="same")(unpool_2)
conv_8 = BatchNormalization()(conv_8)
conv_8 = Activation("relu")(conv_8)
conv_9 = Conv2D(32, (kernel, kernel), padding="same")(conv_8)
conv_9 = BatchNormalization()(conv_9)
conv_9 = Activation("relu")(conv_9)

conv_10 = Conv2D(1, (1, 1), padding="same")(conv_9)
conv_10 = BatchNormalization()(conv_10)

flatten_1 = Flatten()(conv_10)

outputs = Activation('softmax')(flatten_1)

model = Model(inputs=inputs, outputs=outputs)


The model compiles properly when I run:



model.compile(optimizer='adam', loss='mean_absolute_error', metrics=['accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
main_input (InputLayer)      (None, 320, 480, 3)       0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 320, 480, 32)      896
_________________________________________________________________
batch_normalization_1 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_1 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 320, 480, 32)      9248
_________________________________________________________________
batch_normalization_2 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_2 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
lambda_1 (Lambda)            [(None, 160, 240, 32), (N 0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 160, 240, 64)      18496
_________________________________________________________________
batch_normalization_3 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_3 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 160, 240, 64)      36928
_________________________________________________________________
batch_normalization_4 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_4 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
lambda_2 (Lambda)            [(None, 80, 120, 64), (No 0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 80, 120, 64)       36928
_________________________________________________________________
batch_normalization_5 (Batch (None, 80, 120, 64)       256
_________________________________________________________________
activation_5 (Activation)    (None, 80, 120, 64)       0
_________________________________________________________________
lambda_3 (Lambda)            (None, 160, 240, 64)      0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 160, 240, 64)      36928
_________________________________________________________________
batch_normalization_6 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_6 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 160, 240, 64)      36928
_________________________________________________________________
batch_normalization_7 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_7 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
lambda_4 (Lambda)            (None, 320, 480, 64)      0
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 320, 480, 32)      18464
_________________________________________________________________
batch_normalization_8 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_8 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 320, 480, 32)      9248
_________________________________________________________________
batch_normalization_9 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_9 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 320, 480, 1)       33
_________________________________________________________________
batch_normalization_10 (Batc (None, 320, 480, 1)       4
_________________________________________________________________
flatten_1 (Flatten)          (None, 153600)            0
_________________________________________________________________
activation_10 (Activation)   (None, 153600)            0
=================================================================
Total params: 205,893
Trainable params: 204,995
Non-trainable params: 898
_________________________________________________________________
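
As a cross-check on the summary, the parameter counts can be reproduced by hand. The sketch below assumes the usual counting rules (each Conv2D filter has kernel * kernel * in_channels weights plus one bias, and each BatchNormalization layer has four values per channel, two of them non-trainable). The helper names are made up for this example, and the channel counts are read off the model code.

```python
def conv2d_params(kernel, in_channels, filters):
    # kernel * kernel * in_channels weights per filter, plus one bias per filter.
    return (kernel * kernel * in_channels + 1) * filters


def batchnorm_params(channels):
    # gamma, beta, moving mean and moving variance: four values per channel.
    return 4 * channels


# (kernel, in_channels, filters) for conv2d_1 .. conv2d_10.
convs = [(3, 3, 32), (3, 32, 32), (3, 32, 64), (3, 64, 64), (3, 64, 64),
         (3, 64, 64), (3, 64, 64), (3, 64, 32), (3, 32, 32), (1, 32, 1)]

conv_total = sum(conv2d_params(*c) for c in convs)
# Every convolution is followed by one BatchNormalization over its filters.
bn_total = sum(batchnorm_params(filters) for _, _, filters in convs)

total = conv_total + bn_total
non_trainable = bn_total // 2  # moving mean and variance are not trained
print(total, total - non_trainable, non_trainable)  # 205893 204995 898
```

The totals agree with the summary, so the layer wiring at least matches what the code describes. Note that conv2d_8 is counted with 64 input channels, the channel count of the lambda_4 output.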


However, when I try to fit the model:



history = model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=3, verbose=2, validation_data=(x_val,y_val))


I encounter this error:



InvalidArgumentError: Input to reshape is a tensor with 4915200 values, but the requested shape has 9830400
[[{{node lambda_4/unpool/Reshape_3}} = Reshape[T=DT_INT64, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](lambda_1/MaxPoolWithArgmax:1, lambda_4/unpool/Reshape_2/shape)]]
[[{{node lambda_4/unpool/strided_slice_6/_515}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1479_lambda_4/unpool/strided_slice_6", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
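
As a rough check, the two counts in the message can be reproduced from shapes in the summary. The node names suggest that the tensor being reshaped is the argmax output of lambda_1 (`lambda_1/MaxPoolWithArgmax:1`) and that the requested size is computed inside lambda_4. The sketch below assumes the argmax output has the same shape as the pooled output of lambda_1, since the summary truncates it. `num_elements` is a helper written for this example.

```python
from functools import reduce
from operator import mul


def num_elements(shape):
    """Number of values in a tensor with the given static shape."""
    return reduce(mul, shape, 1)


batch_size = 4  # from the model.fit call

# Assumption: the argmax output of lambda_1 has the shape of its pooled output.
mask_1_shape = (batch_size, 160, 240, 32)
# Output of activation_7 (conv_7), the tensor passed to lambda_4.
conv_7_shape = (batch_size, 160, 240, 64)

print(num_elements(mask_1_shape))  # 4915200
print(num_elements(conv_7_shape))  # 9830400
```

If this reading is right, the reshape inside lambda_4 is asked to give the 32-channel mask from lambda_1 the flat size of the 64-channel conv_7 output.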


I have looked over the shapes after each layer and they are what I expect. I also tested the pooling/unpooling functions on sample tensors and they produced the expected output. What am I doing wrong here?

I've been pulling my hair out trying to solve this; any help is greatly appreciated!










share|improve this question





























    1














    I am trying to implement a smaller scale version of SegNet described in this paper (https://arxiv.org/pdf/1511.00561.pdf), but I'm trying to tailor it towards detecting edges



    Dataset:
    I am using the BSDS500 boundary dataset, I cropped and rotated the images so their sizes are 320x480x3 instead of 321x481x3



    Input shapes, 200 training images and 100 validation images:



    x_train: (200, 320, 480, 3)
    x_val: (100, 320, 480, 3)
    y_train: (200, 153600)
    y_val: (100, 153600)


    Framework:
    I am using keras with tensorflow backend



    These are the functions I am using for the custom pooling and unpooling layers:



    def pool_argmax2D(x, pool_size=(2,2), strides=(2,2)):
    padding = 'SAME'
    pool_size = [1, pool_size[0], pool_size[1], 1]
    strides = [1, strides[0], strides[1], 1]
    ksize = [1, pool_size[0], pool_size[1], 1]
    output, argmax = tf.nn.max_pool_with_argmax(
    x,
    ksize = ksize,
    strides = strides,
    padding = padding
    )

    return [output, argmax]


    def unpool2D(pool, argmax, ksize=(2,2)):
    with tf.variable_scope("unpool"):
    input_shape = tf.shape(pool)
    output_shape = [input_shape[0],
    input_shape[1] * ksize[0],
    input_shape[2] * ksize[1],
    input_shape[3]]

    flat_input_size = tf.cumprod(input_shape)[-1]
    flat_output_shape = tf.cast([output_shape[0],
    output_shape[1] * output_shape[2] * output_shape[3]], tf.int64)

    pool_ = tf.reshape(pool, [flat_input_size])
    batch_range = tf.reshape(tf.range(tf.cast(output_shape[0], tf.int64), dtype=tf.int64),
    shape=[input_shape[0], 1, 1, 1])

    b = tf.ones_like(argmax) * batch_range
    b = tf.reshape(b, [flat_input_size, 1])

    ind_ = tf.reshape(argmax, [flat_input_size, 1]) % flat_output_shape[1]
    ind_ = tf.concat([b, ind_], 1)
    ret = tf.scatter_nd(ind_, pool_, shape=flat_output_shape)
    ret = tf.reshape(ret, output_shape)
    return ret


    This is the code for the model:



    batch_size = 4
    kernel = 3
    pool_size=(2,2)
    img_shape = (320,480,3)


    inputs = Input(shape=img_shape, name='main_input')

    conv_1 = Conv2D(32, (kernel, kernel), padding="same")(inputs)
    conv_1 = BatchNormalization()(conv_1)
    conv_1 = Activation("relu")(conv_1)
    conv_2 = Conv2D(32, (kernel, kernel), padding="same")(conv_1)
    conv_2 = BatchNormalization()(conv_2)
    conv_2 = Activation("relu")(conv_2)

    pool_1, mask_1 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_2)

    conv_3 = Conv2D(64, (kernel, kernel), padding="same")(pool_1)
    conv_3 = BatchNormalization()(conv_3)
    conv_3 = Activation("relu")(conv_3)
    conv_4 = Conv2D(64, (kernel, kernel), padding="same")(conv_3)
    conv_4 = BatchNormalization()(conv_4)
    conv_4 = Activation("relu")(conv_4)

    pool_2, mask_2 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_4)

    conv_5 = Conv2D(64, (kernel, kernel), padding="same")(pool_2)
    conv_5 = BatchNormalization()(conv_5)
    conv_5 = Activation("relu")(conv_5)

    unpool_1 = Lambda(unpool2D, output_shape = (160,240,64), arguments={'ksize':pool_size, 'argmax': mask_2})(conv_5)

    conv_6 = Conv2D(64, (kernel, kernel), padding="same")(unpool_1)
    conv_6 = BatchNormalization()(conv_6)
    conv_6 = Activation("relu")(conv_6)
    conv_7 = Conv2D(64, (kernel, kernel), padding="same")(conv_6)
    conv_7 = BatchNormalization()(conv_7)
    conv_7 = Activation("relu")(conv_7)

    unpool_2 = Lambda(unpool2D, output_shape = (320,480,64), arguments={'ksize':pool_size, 'argmax': mask_1})(conv_7)

    conv_8 = Conv2D(32, (kernel, kernel), padding="same")(unpool_2)
    conv_8 = BatchNormalization()(conv_8)
    conv_8 = Activation("relu")(conv_8)
    conv_9 = Conv2D(32, (kernel, kernel), padding="same")(conv_8)
    conv_9 = BatchNormalization()(conv_9)
    conv_9 = Activation("relu")(conv_9)

    conv_10 = Conv2D(1, (1, 1), padding="same")(conv_9)
    conv_10 = BatchNormalization()(conv_10)

    flatten_1 = Flatten()(conv_10)

    outputs = Activation('softmax')(flatten_1)

    model = Model(inputs=inputs, outputs=outputs)


    The model compiles properly when I run:



    model.compile(optimizer='adam', loss='mean_absolute_error', metrics=['accuracy'])
    model.summary()

    _________________________________________________________________
    Layer (type) Output Shape Param #
    =================================================================
    main_input (InputLayer) (None, 320, 480, 3) 0
    _________________________________________________________________
    conv2d_1 (Conv2D) (None, 320, 480, 32) 896
    _________________________________________________________________
    batch_normalization_1 (Batch (None, 320, 480, 32) 128
    _________________________________________________________________
    activation_1 (Activation) (None, 320, 480, 32) 0
    _________________________________________________________________
    conv2d_2 (Conv2D) (None, 320, 480, 32) 9248
    _________________________________________________________________
    batch_normalization_2 (Batch (None, 320, 480, 32) 128
    _________________________________________________________________
    activation_2 (Activation) (None, 320, 480, 32) 0
    _________________________________________________________________
    lambda_1 (Lambda) [(None, 160, 240, 32), (N 0
    _________________________________________________________________
    conv2d_3 (Conv2D) (None, 160, 240, 64) 18496
    _________________________________________________________________
    batch_normalization_3 (Batch (None, 160, 240, 64) 256
    _________________________________________________________________
    activation_3 (Activation) (None, 160, 240, 64) 0
    _________________________________________________________________
    conv2d_4 (Conv2D) (None, 160, 240, 64) 36928
    _________________________________________________________________
    batch_normalization_4 (Batch (None, 160, 240, 64) 256
    _________________________________________________________________
    activation_4 (Activation) (None, 160, 240, 64) 0
    _________________________________________________________________
    lambda_2 (Lambda) [(None, 80, 120, 64), (No 0
    _________________________________________________________________
    conv2d_5 (Conv2D) (None, 80, 120, 64) 36928
    _________________________________________________________________
    batch_normalization_5 (Batch (None, 80, 120, 64) 256
    _________________________________________________________________
    activation_5 (Activation) (None, 80, 120, 64) 0
    _________________________________________________________________
    lambda_3 (Lambda) (None, 160, 240, 64) 0
    _________________________________________________________________
    conv2d_6 (Conv2D) (None, 160, 240, 64) 36928
    _________________________________________________________________
    batch_normalization_6 (Batch (None, 160, 240, 64) 256
    _________________________________________________________________
    activation_6 (Activation) (None, 160, 240, 64) 0
    _________________________________________________________________
    conv2d_7 (Conv2D) (None, 160, 240, 64) 36928
    _________________________________________________________________
    batch_normalization_7 (Batch (None, 160, 240, 64) 256
    _________________________________________________________________
    activation_7 (Activation) (None, 160, 240, 64) 0
    _________________________________________________________________
    lambda_4 (Lambda) (None, 320, 480, 64) 0
    _________________________________________________________________
    conv2d_8 (Conv2D) (None, 320, 480, 32) 18464
    _________________________________________________________________
    batch_normalization_8 (Batch (None, 320, 480, 32) 128
    _________________________________________________________________
    activation_8 (Activation) (None, 320, 480, 32) 0
    _________________________________________________________________
    conv2d_9 (Conv2D) (None, 320, 480, 32) 9248
    _________________________________________________________________
    batch_normalization_9 (Batch (None, 320, 480, 32) 128
    _________________________________________________________________
    activation_9 (Activation) (None, 320, 480, 32) 0
    _________________________________________________________________
    conv2d_10 (Conv2D) (None, 320, 480, 1) 33
    _________________________________________________________________
    batch_normalization_10 (Batc (None, 320, 480, 1) 4
    _________________________________________________________________
    flatten_1 (Flatten) (None, 153600) 0
    _________________________________________________________________
    activation_10 (Activation) (None, 153600) 0
    =================================================================
    Total params: 205,893
    Trainable params: 204,995
    Non-trainable params: 898
    _________________________________________________________________


    However when trying to fit the model



    history = model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=3, verbose=2, validation_data=(x_val,y_val))


    I encounter this error:



    InvalidArgumentError: Input to reshape is a tensor with 4915200 values, but the requested shape has 9830400
    [[{{node lambda_4/unpool/Reshape_3}} = Reshape[T=DT_INT64, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](lambda_1/MaxPoolWithArgmax:1, lambda_4/unpool/Reshape_2/shape)]]
    [[{{node lambda_4/unpool/strided_slice_6/_515}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1479_lambda_4/unpool/strided_slice_6", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]


    I have looked over all the shapes after each layers and they are what I expect. I also tested out the pooling/unpooling functions on sample tensors and they produced expected output. What am I doing wrong here?



    I've been pulling my hair out trying to solve this, any help is greatly appreciated!










    share|improve this question



























      1












      1








      1


      1





      I am trying to implement a smaller scale version of SegNet described in this paper (https://arxiv.org/pdf/1511.00561.pdf), but I'm trying to tailor it towards detecting edges



      Dataset:
      I am using the BSDS500 boundary dataset, I cropped and rotated the images so their sizes are 320x480x3 instead of 321x481x3



      Input shapes, 200 training images and 100 validation images:



      x_train: (200, 320, 480, 3)
      x_val: (100, 320, 480, 3)
      y_train: (200, 153600)
      y_val: (100, 153600)


      Framework:
      I am using keras with tensorflow backend



      These are the functions I am using for the custom pooling and unpooling layers:



      def pool_argmax2D(x, pool_size=(2,2), strides=(2,2)):
      padding = 'SAME'
      pool_size = [1, pool_size[0], pool_size[1], 1]
      strides = [1, strides[0], strides[1], 1]
      ksize = [1, pool_size[0], pool_size[1], 1]
      output, argmax = tf.nn.max_pool_with_argmax(
      x,
      ksize = ksize,
      strides = strides,
      padding = padding
      )

      return [output, argmax]


      def unpool2D(pool, argmax, ksize=(2,2)):
      with tf.variable_scope("unpool"):
      input_shape = tf.shape(pool)
      output_shape = [input_shape[0],
      input_shape[1] * ksize[0],
      input_shape[2] * ksize[1],
      input_shape[3]]

      flat_input_size = tf.cumprod(input_shape)[-1]
      flat_output_shape = tf.cast([output_shape[0],
      output_shape[1] * output_shape[2] * output_shape[3]], tf.int64)

      pool_ = tf.reshape(pool, [flat_input_size])
      batch_range = tf.reshape(tf.range(tf.cast(output_shape[0], tf.int64), dtype=tf.int64),
      shape=[input_shape[0], 1, 1, 1])

      b = tf.ones_like(argmax) * batch_range
      b = tf.reshape(b, [flat_input_size, 1])

      ind_ = tf.reshape(argmax, [flat_input_size, 1]) % flat_output_shape[1]
      ind_ = tf.concat([b, ind_], 1)
      ret = tf.scatter_nd(ind_, pool_, shape=flat_output_shape)
      ret = tf.reshape(ret, output_shape)
      return ret


      This is the code for the model:



      batch_size = 4
      kernel = 3
      pool_size=(2,2)
      img_shape = (320,480,3)


      inputs = Input(shape=img_shape, name='main_input')

      conv_1 = Conv2D(32, (kernel, kernel), padding="same")(inputs)
      conv_1 = BatchNormalization()(conv_1)
      conv_1 = Activation("relu")(conv_1)
      conv_2 = Conv2D(32, (kernel, kernel), padding="same")(conv_1)
      conv_2 = BatchNormalization()(conv_2)
      conv_2 = Activation("relu")(conv_2)

      pool_1, mask_1 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_2)

      conv_3 = Conv2D(64, (kernel, kernel), padding="same")(pool_1)
      conv_3 = BatchNormalization()(conv_3)
      conv_3 = Activation("relu")(conv_3)
      conv_4 = Conv2D(64, (kernel, kernel), padding="same")(conv_3)
      conv_4 = BatchNormalization()(conv_4)
      conv_4 = Activation("relu")(conv_4)

      pool_2, mask_2 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_4)

      conv_5 = Conv2D(64, (kernel, kernel), padding="same")(pool_2)
      conv_5 = BatchNormalization()(conv_5)
      conv_5 = Activation("relu")(conv_5)

      unpool_1 = Lambda(unpool2D, output_shape = (160,240,64), arguments={'ksize':pool_size, 'argmax': mask_2})(conv_5)

      conv_6 = Conv2D(64, (kernel, kernel), padding="same")(unpool_1)
      conv_6 = BatchNormalization()(conv_6)
      conv_6 = Activation("relu")(conv_6)
      conv_7 = Conv2D(64, (kernel, kernel), padding="same")(conv_6)
      conv_7 = BatchNormalization()(conv_7)
      conv_7 = Activation("relu")(conv_7)

      unpool_2 = Lambda(unpool2D, output_shape = (320,480,64), arguments={'ksize':pool_size, 'argmax': mask_1})(conv_7)

      conv_8 = Conv2D(32, (kernel, kernel), padding="same")(unpool_2)
      conv_8 = BatchNormalization()(conv_8)
      conv_8 = Activation("relu")(conv_8)
      conv_9 = Conv2D(32, (kernel, kernel), padding="same")(conv_8)
      conv_9 = BatchNormalization()(conv_9)
      conv_9 = Activation("relu")(conv_9)

      conv_10 = Conv2D(1, (1, 1), padding="same")(conv_9)
      conv_10 = BatchNormalization()(conv_10)

      flatten_1 = Flatten()(conv_10)

      outputs = Activation('softmax')(flatten_1)

      model = Model(inputs=inputs, outputs=outputs)


      The model compiles properly when I run:



      model.compile(optimizer='adam', loss='mean_absolute_error', metrics=['accuracy'])
      model.summary()

      _________________________________________________________________
      Layer (type) Output Shape Param #
      =================================================================
      main_input (InputLayer) (None, 320, 480, 3) 0
      _________________________________________________________________
      conv2d_1 (Conv2D) (None, 320, 480, 32) 896
      _________________________________________________________________
      batch_normalization_1 (Batch (None, 320, 480, 32) 128
      _________________________________________________________________
      activation_1 (Activation) (None, 320, 480, 32) 0
      _________________________________________________________________
      conv2d_2 (Conv2D) (None, 320, 480, 32) 9248
      _________________________________________________________________
      batch_normalization_2 (Batch (None, 320, 480, 32) 128
      _________________________________________________________________
      activation_2 (Activation) (None, 320, 480, 32) 0
      _________________________________________________________________
      lambda_1 (Lambda) [(None, 160, 240, 32), (N 0
      _________________________________________________________________
      conv2d_3 (Conv2D) (None, 160, 240, 64) 18496
      _________________________________________________________________
      batch_normalization_3 (Batch (None, 160, 240, 64) 256
      _________________________________________________________________
      activation_3 (Activation) (None, 160, 240, 64) 0
      _________________________________________________________________
      conv2d_4 (Conv2D) (None, 160, 240, 64) 36928
      _________________________________________________________________
      batch_normalization_4 (Batch (None, 160, 240, 64) 256
      _________________________________________________________________
      activation_4 (Activation) (None, 160, 240, 64) 0
      _________________________________________________________________
      lambda_2 (Lambda) [(None, 80, 120, 64), (No 0
      _________________________________________________________________
      conv2d_5 (Conv2D) (None, 80, 120, 64) 36928
      _________________________________________________________________
      batch_normalization_5 (Batch (None, 80, 120, 64) 256
      _________________________________________________________________
      activation_5 (Activation) (None, 80, 120, 64) 0
      _________________________________________________________________
      lambda_3 (Lambda) (None, 160, 240, 64) 0
      _________________________________________________________________
      conv2d_6 (Conv2D) (None, 160, 240, 64) 36928
      _________________________________________________________________
      batch_normalization_6 (Batch (None, 160, 240, 64) 256
      _________________________________________________________________
      activation_6 (Activation) (None, 160, 240, 64) 0
      _________________________________________________________________
      conv2d_7 (Conv2D) (None, 160, 240, 64) 36928
      _________________________________________________________________
      batch_normalization_7 (Batch (None, 160, 240, 64) 256
      _________________________________________________________________
      activation_7 (Activation) (None, 160, 240, 64) 0
      _________________________________________________________________
      lambda_4 (Lambda) (None, 320, 480, 64) 0
      _________________________________________________________________
      conv2d_8 (Conv2D) (None, 320, 480, 32) 18464
      _________________________________________________________________
      batch_normalization_8 (Batch (None, 320, 480, 32) 128
      _________________________________________________________________
      activation_8 (Activation) (None, 320, 480, 32) 0
      _________________________________________________________________
      conv2d_9 (Conv2D) (None, 320, 480, 32) 9248
      _________________________________________________________________
      batch_normalization_9 (Batch (None, 320, 480, 32) 128
      _________________________________________________________________
      activation_9 (Activation) (None, 320, 480, 32) 0
      _________________________________________________________________
      conv2d_10 (Conv2D) (None, 320, 480, 1) 33
      _________________________________________________________________
      batch_normalization_10 (Batc (None, 320, 480, 1) 4
      _________________________________________________________________
      flatten_1 (Flatten) (None, 153600) 0
      _________________________________________________________________
      activation_10 (Activation) (None, 153600) 0
      =================================================================
      Total params: 205,893
      Trainable params: 204,995
      Non-trainable params: 898
      _________________________________________________________________


      However when trying to fit the model



      history = model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=3, verbose=2, validation_data=(x_val,y_val))


      I encounter this error:



      InvalidArgumentError: Input to reshape is a tensor with 4915200 values, but the requested shape has 9830400
      [[{{node lambda_4/unpool/Reshape_3}} = Reshape[T=DT_INT64, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](lambda_1/MaxPoolWithArgmax:1, lambda_4/unpool/Reshape_2/shape)]]
      [[{{node lambda_4/unpool/strided_slice_6/_515}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1479_lambda_4/unpool/strided_slice_6", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]


      I have looked over all the shapes after each layers and they are what I expect. I also tested out the pooling/unpooling functions on sample tensors and they produced expected output. What am I doing wrong here?



      I've been pulling my hair out trying to solve this, any help is greatly appreciated!










      share|improve this question















      python tensorflow machine-learning keras computer-vision






      edited Nov 20 '18 at 13:01









      desertnaut

      16.4k63566




      asked Nov 20 '18 at 12:14









      jasonacp

      64




          1 Answer
          Found the problem: mask_1 has 32 channels (it comes from pooling conv_2), while unpool_2 tries to reshape it to match its 64-channel input (conv_7). I rearranged the layers so that the depths line up.
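
          The numbers in the error message are consistent with this explanation. As a quick sanity check, the two element counts can be reproduced using only values from the question (batch_size = 4, a 160x240 feature map after the first pooling, 32 filters in conv_2, 64 filters in conv_7). Note that flat_size is a hypothetical helper introduced only for this illustration:

```python
# Values taken from the question: batch_size = 4, and the first pooling
# step halves 320x480 to 160x240.
batch_size = 4
pooled_hw = (160, 240)


def flat_size(batch, hw, channels):
    """Hypothetical helper: element count of a (batch, h, w, channels) tensor."""
    return batch * hw[0] * hw[1] * channels


# mask_1 is the argmax output of lambda_1, which pools conv_2 (32 filters).
mask_1_values = flat_size(batch_size, pooled_hw, 32)

# unpool_2 receives conv_7 (64 filters); unpool2D derives flat_input_size
# from this tensor and then reshapes the argmax to that size.
conv_7_values = flat_size(batch_size, pooled_hw, 64)

print(mask_1_values)  # 4915200, the "tensor with 4915200 values"
print(conv_7_values)  # 9830400, the "requested shape has 9830400"
```

          One way to make the depths agree, which is an assumption and not necessarily the arrangement used here, would be to give conv_7 32 filters before unpool_2 and change that Lambda's output_shape to (320,480,32).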






                answered Nov 24 '18 at 20:26









                jasonacp

                64



