What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?

Max pooling is an operation that reduces the spatial dimensions of its input. The output is computed by sliding a filter window over the input and taking the maximum value from each input patch the window covers. At each step, the position of the filter window is advanced according to the strides argument. When the filter is applied near the border, some of its elements may fall outside the input. To compute values for those border regions, the input can be extended by padding it with zeros. Alternatively, the border regions can be discarded, in which case no padding is required.

tf.nn.max_pool supports two types of padding, 'VALID' and 'SAME'. With 'VALID' padding, tf.nn.max_pool returns only the outputs that can be computed without any padding, so it may discard border elements of the input. With 'SAME' padding, the filter is applied to all input elements, and the border elements are handled with zero padding. With 'SAME', the output has the same size as the input when the stride is 1 and is smaller for larger strides.

Formulas for computing the output size and the amount of padding for the 'VALID' and 'SAME' options are given on the TensorFlow website. With the 'VALID' option there is no zero padding. The output dimensions are computed as:

    out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
    out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
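
The 'VALID' formulas can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not TensorFlow code; the function name valid_output_size and its flat argument list are made up for illustration.

```python
import math

def valid_output_size(in_height, in_width, filter_height, filter_width,
                      stride_h, stride_w):
    """Output (height, width) for 'VALID' padding, per the formulas above.

    Hypothetical helper: stride_h and stride_w stand in for strides[1]
    and strides[2] of the TensorFlow call.
    """
    out_height = math.ceil((in_height - filter_height + 1) / stride_h)
    out_width = math.ceil((in_width - filter_width + 1) / stride_w)
    return out_height, out_width

# The toy case used later: 4x3 input, 2x2 filter, stride 2 in both directions.
print(valid_output_size(4, 3, 2, 2, 2, 2))  # (2, 1)
```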

For the 'SAME' option, the output dimensions and padding amounts are computed as:

    out_height       = ceil(float(in_height) / float(strides[1]))
    out_width        = ceil(float(in_width) / float(strides[2]))
    pad_along_height = max((out_height - 1) * strides[1] + filter_height - in_height, 0)
    pad_along_width  = max((out_width - 1) * strides[2] + filter_width - in_width, 0)
    pad_top          = pad_along_height // 2
    pad_bottom       = pad_along_height - pad_top
    pad_left         = pad_along_width // 2
    pad_right        = pad_along_width - pad_left
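
Since height and width follow the same rule, the 'SAME' formulas can be sketched for a single dimension and applied twice. The helper name same_padding is hypothetical, and the code only mirrors the arithmetic above rather than calling TensorFlow.

```python
import math

def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) along one dimension for 'SAME'.

    pad_before corresponds to pad_top / pad_left and pad_after to
    pad_bottom / pad_right in the formulas above.
    """
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before  # the extra cell, if any, goes last
    return out_size, pad_before, pad_after

print(same_padding(4, 2, 2))  # height of the toy input: (2, 0, 0)
print(same_padding(3, 2, 2))  # width of the toy input:  (2, 0, 1)
```

When the total padding is odd, the integer division puts the extra row or column at the bottom or on the right.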

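Padding is applied by adding rows and columns at the top, bottom, left and right of the input matrix according to the above formulas. When the total padding along a dimension is odd, the extra row or column goes to the bottom or the right. The figures treat padding values as zero. In TensorFlow's max pooling the padded cells are, to my knowledge, simply excluded from the maximum, so the two views differ only when all real inputs under a padded window are negative. Figures 1 and 2 show max pooling with the 'VALID' and 'SAME' padding options using a toy example.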

For a 2D input of size 4x3 with a 2D filter of size 2x2, strides [2, 2] and 'VALID' padding, tf.nn.max_pool returns an output of size 2x1. The output dimensions follow from the formulas above: out_height = ceil((4 - 2 + 1) / 2) = 2 and out_width = ceil((3 - 2 + 1) / 2) = 1. There is no padding with the 'VALID' option. Max pooling starts by placing the 2x2 filter over the input at (0,0) and selecting the maximum input value from the overlapping region. Moving the filter in the X direction by 2 (the stride in the X dimension) is not possible, because the last column of the filter would fall outside the image. The operation therefore continues by moving the filter in the Y direction by 2 (the stride in the Y dimension). From this new position, no further move in either the X or the Y direction is possible, so max pooling finishes with an output of size 2x1 (figure 1).

Figure 1: Running max pooling with the 'VALID' option over a 2D input of size 4x3 with a 2D filter of size 2x2 and strides [2, 2] produces an output of size 2x1. The input is not padded with the 'VALID' option. The input and output matrices are shown on the left and the right side of the animation respectively. Each step of the max pooling operation is highlighted in yellow.
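
The walk-through above can be reproduced with a small pure-Python sketch. The function max_pool_valid is a made-up helper, not the TensorFlow implementation, and the input values are hypothetical because the numbers shown in the animation are not reproduced here.

```python
def max_pool_valid(image, filter_h, filter_w, stride_h, stride_w):
    """Max pooling without padding: only windows fully inside the input count."""
    in_h, in_w = len(image), len(image[0])
    output = []
    # The range bounds stop the window before it would leave the input.
    for top in range(0, in_h - filter_h + 1, stride_h):
        row = []
        for left in range(0, in_w - filter_w + 1, stride_w):
            window = [image[top + i][left + j]
                      for i in range(filter_h) for j in range(filter_w)]
            row.append(max(window))
        output.append(row)
    return output

# Hypothetical 4x3 input (4 rows, 3 columns).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]
print(max_pool_valid(image, 2, 2, 2, 2))  # [[5], [11]]
```

The third column (3, 6, 9, 12) never enters any window, which is the "discarded border" behaviour of 'VALID'.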

For the same input, filter and strides but with the 'SAME' padding option, tf.nn.max_pool returns an output of size 2x2. The output and padding dimensions are computed using the formulas above: out_height = ceil(4 / 2) = 2 and out_width = ceil(3 / 2) = 2, with pad_along_height = 0 and pad_along_width = 1. Because pad_left is 0 and pad_right is 1, a single column of zero padding values is added on the right. Max pooling then proceeds as explained above and produces an output of size 2x2 (figure 2).

Figure 2: Running max pooling with the 'SAME' option over a 2D input of size 4x3 with a 2D filter of size 2x2 and strides [2, 2] produces an output of size 2x2. The input and output matrices are shown on the left and the right side of the animation respectively. Each step of the max pooling operation is highlighted in yellow.
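
The 'SAME' case can be sketched the same way in plain Python. The helper max_pool_same and the input values are hypothetical. The pad_value default of negative infinity is an assumption that mimics padded cells being ignored by the maximum; pass pad_value=0 to follow the zero-padding description literally. For non-negative inputs both choices give the same result.

```python
import math

def max_pool_same(image, filter_h, filter_w, stride_h, stride_w,
                  pad_value=float("-inf")):
    """Max pooling with 'SAME'-style padding, using the formulas given earlier."""
    in_h, in_w = len(image), len(image[0])
    out_h = math.ceil(in_h / stride_h)
    out_w = math.ceil(in_w / stride_w)
    pad_h = max((out_h - 1) * stride_h + filter_h - in_h, 0)
    pad_w = max((out_w - 1) * stride_w + filter_w - in_w, 0)
    pad_top, pad_left = pad_h // 2, pad_w // 2

    def cell(r, c):
        # (r, c) is a position in the padded frame; map it back to the input.
        r, c = r - pad_top, c - pad_left
        if 0 <= r < in_h and 0 <= c < in_w:
            return image[r][c]
        return pad_value  # position lies in the padding

    return [
        [max(cell(oy * stride_h + i, ox * stride_w + j)
             for i in range(filter_h) for j in range(filter_w))
         for ox in range(out_w)]
        for oy in range(out_h)
    ]

# Hypothetical 4x3 input (4 rows, 3 columns).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]
print(max_pool_same(image, 2, 2, 2, 2))  # [[5, 6], [11, 12]]
```

Here the right-hand windows cover the third input column plus one padded column, so that column now contributes to the output instead of being discarded.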