In an earlier post, http://liipetti.net/erratic/2016/11/25/imaginary-landscapes-using-pix2pix/, I experimented with pix2pix, a versatile new package that uses a conditional GAN architecture to train models for various image transforms. In that post I also ventured beyond transforms that keep the image content spatially aligned, namely by generating what lies outside the image.
I have since continued these experiments. In my current version I use my own version of the generator, which takes into account that the spatial structure of the image will change: the input image is scaled down and moved into the middle, leaving the surrounding area to be filled in by the process. Some noise is also added.
function defineG_encoder_decoder(input_nc, output_nc, ngf)
    -- input is (nc) x 512 x 512
    e1 = - nn.SpatialConvolution(input_nc, ngf, 4, 4, 2, 2, 1, 1)
    -- skip branch: downsampled, zero-padded back to full size, with noise added
    e1b = e1 - nn.SpatialMaxPooling(2, 2) - nn.SpatialZeroPadding(64, 64, 64, 64) - nn.WhiteNoise(0, 0.1)
    -- encoder: each stage halves the spatial resolution
    e2 = e1 - nn.NoiseMask(ngf, 512, 0, 0.1) - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)
    e3 = e2 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 2, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)
    e4 = e3 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 4, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)
    e5 = e4 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)
    e6 = e5 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)
    e7 = e6 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)
    e8 = e7 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)
    -- decoder: each stage doubles the spatial resolution
    d1 = e8 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)
    d2 = d1 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)
    d3 = d2 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)
    d4 = d3 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)
    d5 = d4 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)
    d6 = d5 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 4, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)
    d7_ = d6 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 2, ngf, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf)
    -- join the noisy skip branch back in before the final upsampling
    d7 = {d7_, e1b} - nn.JoinTable(2)
    d8 = d7 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 2, output_nc, 4, 4, 2, 2, 1, 1)
    o1 = d8 - nn.Tanh()
    netG = nn.gModule({e1}, {o1})
    return netG
end
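The generator itself is Torch/Lua, but the input-side preparation described above (shrink the source image, place it in the middle of a same-sized canvas, leave the border empty, add some noise) can be sketched roughly in Python/NumPy. The function name, the scale factor of 2 and the noise level here are my own illustrative assumptions, not taken from the actual training pipeline.

```python
import numpy as np

def make_conditioning_image(img, scale=2, noise_std=0.1):
    """Sketch: scale the image down and place it in the middle of a
    same-sized canvas, leaving a border for the generator to fill in.
    Mild Gaussian noise is added, loosely mirroring nn.WhiteNoise."""
    h, w, c = img.shape
    small_h, small_w = h // scale, w // scale
    # Naive box downsampling: average over scale x scale blocks.
    small = img[:small_h * scale, :small_w * scale].reshape(
        small_h, scale, small_w, scale, c).mean(axis=(1, 3))
    canvas = np.zeros_like(img, dtype=float)
    top, left = (h - small_h) // 2, (w - small_w) // 2
    canvas[top:top + small_h, left:left + small_w] = small
    canvas += np.random.normal(0.0, noise_std, canvas.shape)
    return canvas
```

With a 512 x 512 input and scale 2, this leaves a 128-pixel empty band on every side, matching the zero-padding of 64 applied after the stride-2 convolution in the skip branch of the generator.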
This is far from perfect, just a snapshot of work in progress. Nor am I too concerned with how well the model performs on entirely new images. It is enough to see how the generator manages to complete images used during training, under guidance from the discriminator.
Here’s a typical result. It does not look spectacular, but the image has been extended reasonably. The borders of the original image are clearly visible, as the model makes no attempt to ensure local continuity across the seam.
Here, the areas on the left and the right have been filled with an interestingly artistic rendering of a building not unlike the one in the middle.
Here, too, the buildings in the middle extend outwards, retaining something of the same character while acquiring an artistic, far-from-realistic quality. It is noteworthy that this artistic quality is obtained without using any style images.
Occasionally, the resulting image can feel like a complete image in itself, as in the image above. Most often, however, that is not at all the case. The image below might be a great painting were it not for the middle part, which looks like a square hole cut in the canvas. Still, the painting around it derives its colors and even shapes from the photo in the middle. Again, besides inventing the missing landscape and building, the process has produced very interesting stylistic effects.
Referring to my previous post, Neural networks, style transfer and artistic process, these are clearly experiments. The results are interesting, but they are not yet part of any artistic process of mine. It is probably fair to say that these experiments aim to provide me with new tools and materials for further artistic creation.