Bhishan Bhandari: Deep learning for style transfer – Understanding baselines

The purpose of this writeup is to demystify the style transfer method in the famous paper Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization. It is an old paper, but a foundational one for style transfer. The writeup also covers how you can train a model using this method with a custom dataset.

Style transfer is an image synthesis problem: given an input content image Ix and an input style image Iy, the goal is to produce an output image Is that preserves the content of Ix while adopting the style of Iy.

A. The network

The network diagram for the method shows that the input comes in pairs, i.e. a content image and a style image. Both images are passed through the encoder separately.

B. Encoder

The encoder consists of the first few layers of a pretrained VGG-19 network (up to relu4_1 in the paper). The feature-space representations of the content image and the style image are then passed to the novel Adaptive Instance Normalization (AdaIN) layer.

C. Adaptive Instance Normalization

AdaIN is an extension of Instance Normalization that simply aligns the channel-wise mean and variance of the content features to match those of the style features. Its advantage over other normalizations such as batch normalization, instance normalization, and conditional instance normalization is that it has no learnable affine parameters; the scale and shift are computed from the style input itself, which makes it adaptive to arbitrary styles. Now this is awesome. It simply scales the normalized content features by the standard deviation of the style features and then shifts them by the mean of the style features.
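
The operation described above can be written as AdaIN(x, y) = σ(y) · (x − μ(x)) / σ(x) + μ(y), where the statistics are taken per channel over the spatial dimensions. Below is a minimal NumPy sketch of that formula, not the code from the paper or the linked notebook. The function name `adain`, the (C, H, W) layout, the `eps` stabilizer, and the random stand-in feature maps are all assumptions made for illustration.

```python
import numpy as np


def adain(content_feat, style_feat, eps=1e-5):
    """Align per-channel mean/std of content features to the style features.

    Both inputs have shape (C, H, W); their spatial sizes may differ.
    eps is an assumed small constant that avoids division by zero.
    """
    # Per-channel statistics over the spatial axes.
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = np.sqrt(content_feat.var(axis=(1, 2), keepdims=True) + eps)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = np.sqrt(style_feat.var(axis=(1, 2), keepdims=True) + eps)

    # Normalize the content features, then scale and shift with style statistics.
    normalized = (content_feat - c_mean) / c_std
    return normalized * s_std + s_mean


# Random arrays standing in for encoder feature maps (hypothetical data).
rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(3, 8, 8))
style = rng.normal(5.0, 2.0, size=(3, 6, 6))
out = adain(content, style)
```

The output keeps the spatial layout of the content features but carries the channel-wise mean and standard deviation of the style features. Nothing in the function is learned, which is why any style image can be used at inference time.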

D. Decoder

Since Adaptive Instance Normalization is performed in feature space, a decoder is needed to map the result back to image space. The decoder network is roughly a mirror of the encoder, and its output is an image that preserves the content while carrying the style of the style image.

E. Important network decisions

The decoder does not use any form of normalization, since it has to produce images in vastly different styles; batch or instance normalization would pull its outputs toward a single style.

Another design choice is that the network uses reflection padding. Reflection padding avoids border artifacts and the introduction of unwanted content at the image edges. It also preserves spatial continuity, as opposed to zero padding.
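
To make the difference between the two padding modes concrete, here is a small sketch using NumPy's `np.pad` on a single row of pixel values. It only illustrates the padding behaviour and is not the network code; the array `row` and the pad width of 2 are made up for the example.

```python
import numpy as np

# One row of pixel values standing in for an image border (hypothetical data).
row = np.array([1, 2, 3, 4])

# Zero padding introduces values that are not in the image at all.
zero_padded = np.pad(row, 2, mode="constant", constant_values=0)

# Reflection padding mirrors the pixels next to the border instead.
reflect_padded = np.pad(row, 2, mode="reflect")

print(zero_padded)     # [0 0 1 2 3 4 0 0]
print(reflect_padded)  # [3 2 1 2 3 4 3 2]
```

With zero padding the border jumps abruptly to 0, which a convolution can pick up as an edge. With reflection padding the padded values continue the local image content, so there is no artificial boundary.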

F. Network Loss

A pre-trained VGG-19 is used to compute the loss that trains the decoder. The total loss is the content loss plus the style loss weighted by λ.

Content Loss

The content loss is the Euclidean distance between the target features and the features of the output image. The content target is the output of Adaptive Instance Normalization, not the raw features of the content image.
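
A minimal sketch of this loss, assuming the feature maps are NumPy arrays of identical shape. It uses the mean squared difference, a common stand-in for the Euclidean distance when training, so treat it as an illustration rather than the exact loss from the paper. The name `content_loss` and the toy inputs are hypothetical.

```python
import numpy as np


def content_loss(output_feat, target_feat):
    """Mean squared difference between output features and the AdaIN target."""
    return float(np.mean((output_feat - target_feat) ** 2))


# Toy features: the "output" is the target shifted by 0.5 everywhere.
rng = np.random.default_rng(0)
target = rng.normal(size=(3, 8, 8))   # stands in for the AdaIN output
output = target + 0.5                 # stands in for encoder(decoder(target))
loss_c = content_loss(output, target)
```

The loss is zero only when re-encoding the generated image reproduces the AdaIN output exactly, which is what pushes the decoder to invert the encoder.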

Style Loss

Since only the mean and standard deviation of the style features are transferred, only those statistics matter for the style loss. Here the features of the style image are used as the target. The loss is the Euclidean distance between the channel-wise means plus the Euclidean distance between the channel-wise standard deviations.
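
The style loss and the λ-weighted total can be sketched the same way. The code below is an assumption-laden illustration: the function names, the use of squared differences, the list of per-layer feature maps (the paper computes the statistics at several encoder layers), and the value of the weight are all chosen for the example and are not taken from the linked notebook.

```python
import numpy as np


def channel_stats(feat):
    """Per-channel mean and standard deviation of a (C, H, W) feature map."""
    return feat.mean(axis=(1, 2)), feat.std(axis=(1, 2))


def style_loss(output_feats, style_feats):
    """Sum over layers of squared differences between channel-wise means and stds."""
    total = 0.0
    for out_f, sty_f in zip(output_feats, style_feats):
        out_mean, out_std = channel_stats(out_f)
        sty_mean, sty_std = channel_stats(sty_f)
        total += np.mean((out_mean - sty_mean) ** 2)
        total += np.mean((out_std - sty_std) ** 2)
    return float(total)


def total_loss(l_content, l_style, style_weight):
    """Content loss plus the style loss weighted by lambda."""
    return l_content + style_weight * l_style


# Two toy "layers" of style features; the output features are shifted by 1.0,
# which changes every channel mean by 1 and leaves the stds unchanged.
rng = np.random.default_rng(0)
style_feats = [rng.normal(size=(4, 8, 8)), rng.normal(size=(8, 4, 4))]
output_feats = [f + 1.0 for f in style_feats]

loss_s = style_loss(output_feats, style_feats)
loss = total_loss(0.25, loss_s, style_weight=10.0)  # weight is a hypothetical value
```

Because the loss only compares means and standard deviations, two feature maps with very different spatial content can still have zero style loss, which matches the idea that style lives in those statistics.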

G. Training with a custom dataset

Training the network with a custom dataset is very easy, since all we require is a flat folder of images. Two datasets are required, i.e. one of content images and another of style images. Following is an implementation of a dataset class for a flat image folder.

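from pathlib import Path

from PIL import Image
from torch.utils import data


class FlatFolderDataset(data.Dataset):
    """Loads every file in a single flat directory as an RGB image."""

    def __init__(self, root, transform):
        super().__init__()
        self.root = root
        # Collect every entry directly under root (no class subfolders).
        self.paths = list(Path(self.root).glob('*'))
        self.transform = transform

    def __getitem__(self, index):
        path = self.paths[index]
        # Force three channels so grayscale and RGBA files share one shape.
        img = Image.open(str(path)).convert('RGB')
        img = self.transform(img)
        return img

    def __len__(self):
        return len(self.paths)

    def name(self):
        return 'FlatFolderDataset'

H. Links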

https://openaccess.thecvf.com/content_ICCV_2017/papers/Huang_Arbitrary_Style_Transfer_ICCV_2017_paper.pdf

https://colab.research.google.com/drive/1HiS92cRnBkJQ2dWS55iNMa-5Gx6o60oR?usp=sharing&fbclid=IwAR0dT0RINPSD_koHa7sBE1EJowvz3cRBOTb0CZyV97hSne_ppvEPNmOYNqQ#scrollTo=a4ht3WMBOpGG&line=2&uniqifier=1
