Here, num_filters=C and groups=C mean that this is a channel-wise transposed
convolution. The filter shape will be (C, 1, K, K), where K is filter_size.
This initializer sets the same (K, K) interpolation kernel for every channel
of the filter. The resulting shape of the output feature map
will be (B, C, factor * H, factor * W). Note that the learning rate and the
weight decay are set to 0 so that the coefficient values of the bilinear
interpolation stay unchanged during training.
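
The filter layout described above can be illustrated with a minimal NumPy sketch. It assumes the commonly used separable bilinear kernel formula and is not taken from the initializer's own source; the helper name `bilinear_filter` and its parameters are hypothetical.

```python
import math

import numpy as np


def bilinear_filter(channels, kernel_size):
    """Build a (C, 1, K, K) filter with one bilinear kernel per channel.

    Hypothetical helper for illustration only; the formula is the
    commonly used bilinear upsampling kernel (an assumption here).
    """
    f = math.ceil(kernel_size / 2.0)
    # Center of the kernel in normalized coordinates.
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    pos = np.arange(kernel_size, dtype=np.float32)
    # 1-D triangular weights; the 2-D kernel is their outer product.
    w1d = 1.0 - np.abs(pos / f - c)
    k2d = np.outer(w1d, w1d).astype(np.float32)
    # Repeat the same (K, K) kernel for every channel: (C, 1, K, K).
    return np.tile(k2d, (channels, 1, 1, 1))


weight = bilinear_filter(channels=3, kernel_size=4)
print(weight.shape)
print(weight[0, 0])
```

Every channel receives an identical kernel, so each channel of the input is interpolated independently when the filter is used with groups=C.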