Posts tagged GAN

“The video, called “Alternative Face v1.1”, is the work of Mario Klingemann, a German artist. It plays audio from an NBC interview with Ms Conway through the mouth of Ms Hardy’s digital ghost. The video is wobbly and pixelated; a competent visual-effects shop could do much better. But Mr Klingemann did not fiddle with editing software to make it. Instead, he took only a few days to create the clip on a desktop computer using a generative adversarial network (GAN), a type of machine-learning algorithm. His computer spat it out automatically after being force fed old music videos of Ms Hardy. It is a recording of something that never happened.”

This post summarizes a set of connected tricks and methods I explored with the help of my co-authors. Following the previous post, about the stability properties of GANs, the overall aim was to improve our ability to train generative models stably and accurately, though we went through many variations and experiments with different methods along the way. I’ll try to explain why I think these things worked, but we’re still exploring that ourselves.

The basic problem is that generative neural network models seem either to be stable but fail to properly capture higher-order correlations in the data distribution (which manifests as blurriness in the image domain), or to be very unstable to train because they must learn both the distribution and the loss function at the same time, leading to issues like non-stationarity and positive feedback loops. The way GANs capture higher-order correlations is to say: if there’s any statistic that distinguishes generated examples from real ones, the discriminator will exploit it. That is, they try to make samples individually indistinguishable from real examples, rather than merely matching aggregate statistics. The cost of that is the instability arising from not having a joint loss function – the discriminator can make a move that disproportionately harms the generator, and vice versa.
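To make the "no joint loss" point concrete, here is a minimal toy sketch (my own illustrative example, not code from the post): a one-parameter generator G(z) = mu + z tries to match real 1-D data drawn from N(3, 0.5), while a logistic discriminator D(x) = sigmoid(w*x + b) tries to tell them apart. Note that each player ascends its *own* objective – there is no single function both updates descend – which is exactly where the instability comes from. All names and hyperparameters here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative assumption): real data ~ N(3, 0.5);
# generator G(z) = mu + z has one parameter mu;
# discriminator D(x) = sigmoid(w*x + b) is a logistic classifier.
mu, w, b = 0.0, 0.1, 0.0
lr = 0.05

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for step in range(2000):
    z = rng.normal(0.0, 0.5, size=64)
    x_real = rng.normal(3.0, 0.5, size=64)
    x_fake = mu + z

    # --- discriminator update: its OWN objective, not a joint one ---
    # gradient ascent on  E[log D(real)] + E[log(1 - D(fake))]
    d_real = sigmoid(w * x_real + b)
    d_fake = sigmoid(w * x_fake + b)
    grad_w = np.mean((1 - d_real) * x_real) + np.mean(-d_fake * x_fake)
    grad_b = np.mean(1 - d_real) + np.mean(-d_fake)
    w += lr * grad_w
    b += lr * grad_b

    # --- generator update: a DIFFERENT objective (non-saturating form) ---
    # gradient ascent on  E[log D(G(z))]; only mu moves, via x_fake
    d_fake = sigmoid(w * x_fake + b)
    grad_mu = np.mean((1 - d_fake) * w)
    mu += lr * grad_mu

print(f"learned mu = {mu:.2f} (real mean is 3.0)")
```

The alternation is the key structural feature: when the distributions match, the best the discriminator can do is output 0.5 everywhere, so the generator's gradient vanishes – but nothing in this scheme guarantees a smooth approach to that equilibrium, since each update can move the other player's effective loss surface out from under it.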