Creating Fake Pokémon Images with Machine Learning

When you can’t catch ’em all, you generate ’em all

Code AI Blogs

Published in CodeAI

6 min read · Jul 27, 2021

Introduction
Data Source
Setup
Data Preprocessing
Training the Model
Results
Conclusion
References

Introduction

What is a Pokémon GO fan to do in the midst of a pandemic, when leaving the house to catch Pokémon is a no-go? Generate their own Pokémon, of course!

In this article, I’ll be going over how to create a fake Pokémon image generator using a generative adversarial network, or GAN.

GANs are generative models that create new data based on given training data. They do this by pitting two models against each other: a generator and a discriminator. The generator’s goal is to create new data that looks real, while the discriminator’s goal is to distinguish between the real data and the generated data.

The GAN in this project is a deep convolutional generative adversarial network (DCGAN), which is a class of convolutional neural networks that follow a set of architecture guidelines. You can read more about DCGANs here.

I based my process on this deep convolutional generative adversarial network tutorial by TensorFlow:

Deep Convolutional Generative Adversarial Network | TensorFlow Core

This tutorial demonstrates how to generate images of handwritten digits using a Deep Convolutional Generative…

www.tensorflow.org

Data Source

I’ll be using this dataset from Kaggle:

Pokemon Image Dataset

Pokemon image dataset

www.kaggle.com

It contains one image each for all 809 Pokémon from generation 1 to 7. It also contains a CSV file with the primary and secondary type for each, but I won’t be using it for this project.

This is the same dataset I used for a past article on Pokémon classification, which you can check out for some exploratory data analysis and a more detailed convolutional neural network guide:

Classifying Pokémon Images with Machine Learning

A convolutional neural network (CNN) walkthrough with code

medium.com

Setup

First things first, I pip install and import all the Python libraries I’ll need for this project:

# To generate GIFs
%pip install imageio
%pip install git+https://github.com/tensorflow/docs

import numpy as np
import zipfile
import os
from PIL import Image
import matplotlib.pyplot as plt
from matplotlib import gridspec
import tensorflow as tf
from tensorflow.keras import layers
import time
from IPython import display

# To generate GIFs
import glob
import imageio

Now I create a list of the file paths to every Pokémon image. If you’re using Google Colab, I recommend uploading the dataset as a zip file and extracting it within the notebook.

train_dir = "/content/images/images/"
fnames = os.listdir(train_dir)
filepaths = [train_dir + fname for fname in fnames]

Data Preprocessing

With everything set up, let’s prep our dataset for our machine learning model!

First, I’ll set up a few constants:

IMG_SIZE = 120
BATCH_SIZE = 8
BUFFER_SIZE = 809       # number of pokemon in our dataset

And then we’ll define a few functions for preprocessing.

In our dataset, images differ in colour format and background colour. We need consistent colour formats for our machine learning model, and we want the same background colour for all Pokémon to prevent our generated images from having a mix of background colours. To solve this, modify_background removes the transparency in the images by converting them from RGBA to RGB and adds a black background. preprocess calls modify_background and rescales the RGB values to be between -1 and 1.
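Here’s a minimal sketch of what these two functions might look like, using PIL and NumPy. The function names come from the description above, but the bodies are my assumption of one reasonable way to do it (pasting the image onto a black canvas with its alpha channel as the mask), not necessarily the exact code used for this project:

```python
import numpy as np
from PIL import Image


def modify_background(im):
    """Return an RGB copy of `im` with transparent areas filled in black."""
    rgba = im.convert("RGBA")  # gives every input (PNG, JPG, palette) an alpha channel
    background = Image.new("RGB", rgba.size, (0, 0, 0))
    # The alpha channel acts as the paste mask, so transparent pixels stay black.
    background.paste(rgba, mask=rgba.split()[3])
    return background


def preprocess(im):
    """Apply the black background, then rescale pixels from [0, 255] to [-1, 1]."""
    rgb = modify_background(im)
    pixels = np.asarray(rgb, dtype=np.float32)
    return (pixels - 127.5) / 127.5
```

Rescaling to the range -1 to 1 pairs naturally with a generator whose last layer uses a tanh activation, since tanh outputs fall in that same range.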

To see these functions in action, we’ll test them on an Abomasnow image. Here we have the original image on top and the modified image below:

Note: in Google Colab, im.show() from the PIL library does not work. Instead, use the name of the PIL image instance (i.e. im instead of im.show()).

Finally, we’ll create and configure our training dataset:
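As a rough sketch of this step, assuming the preprocessed images have already been stacked into a single array of shape (N, 120, 120, 3), the dataset can be shuffled and batched with tf.data. The helper name make_dataset is hypothetical; BATCH_SIZE and BUFFER_SIZE are the constants defined earlier:

```python
import numpy as np
import tensorflow as tf

BATCH_SIZE = 8
BUFFER_SIZE = 809  # number of pokemon in our dataset


def make_dataset(images, buffer_size=BUFFER_SIZE, batch_size=BATCH_SIZE):
    """Shuffle and batch an array of shape (N, H, W, 3) with values in [-1, 1]."""
    images = np.asarray(images, dtype=np.float32)
    dataset = tf.data.Dataset.from_tensor_slices(images)
    # A shuffle buffer at least as large as the dataset gives a full shuffle each epoch.
    return dataset.shuffle(buffer_size).batch(batch_size)
```

With the file paths from the setup step, this could be called as train_dataset = make_dataset(np.stack([preprocess(Image.open(p)) for p in filepaths])), provided every image has the same dimensions.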

Training the Model

We’re now ready to build and train our model!

We’ll start by creating our generator model, a sequential model with 3 transposed convolutional layers. When training and testing the generator, noise gets passed in. Through training, the generator learns how to create new data that resembles the training data from this noise.
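To make this concrete, here’s a sketch of a generator with 3 transposed convolutional layers that turns a 100-dimensional noise vector into a 120×120 RGB image. It follows the layout of the TensorFlow DCGAN tutorial; the starting 15×15 feature map, the filter counts, and the kernel sizes are my assumptions rather than the exact values used for this project:

```python
import tensorflow as tf
from tensorflow.keras import layers

NOISE_DIM = 100  # length of the input noise vector


def make_generator_model(noise_dim=NOISE_DIM):
    """Map a noise vector to a 120x120x3 image with values in [-1, 1]."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(noise_dim,)),
        layers.Dense(15 * 15 * 256, use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((15, 15, 256)),
        # Each stride-2 transposed convolution doubles the width and height.
        layers.Conv2DTranspose(128, (5, 5), strides=(2, 2), padding="same", use_bias=False),  # 30x30
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding="same", use_bias=False),   # 60x60
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # tanh keeps the output in [-1, 1], matching the rescaled training images.
        layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding="same", use_bias=False,
                               activation="tanh"),                                            # 120x120
    ])
```

Calling make_generator_model()(tf.random.normal([1, NOISE_DIM]), training=False) on an untrained model returns a single image of pure noise, which is a quick way to check the output shape.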

Next, I create the discriminator. It is a sequential model with 2 convolutional layers. The discriminator takes in images, and through training, it learns to distinguish between real and fake data. For some data augmentation, I also included a layer that randomly flips the training images horizontally. Roughly 800 images is very few for training a GAN, so this effectively increases the size of the dataset and adds some variability.
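A discriminator along these lines could be sketched as follows. The filter counts, dropout rate, and kernel sizes are assumptions borrowed from the TensorFlow DCGAN tutorial. The flip layer is written as layers.RandomFlip, which is where it lives in TensorFlow 2.6 and later; in earlier versions it sat under layers.experimental.preprocessing:

```python
import tensorflow as tf
from tensorflow.keras import layers


def make_discriminator_model(img_size=120):
    """Map an image to a single logit: positive leans real, negative leans fake."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(img_size, img_size, 3)),
        # Augmentation: only active when the model is called with training=True.
        layers.RandomFlip("horizontal"),
        layers.Conv2D(64, (5, 5), strides=(2, 2), padding="same"),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Conv2D(128, (5, 5), strides=(2, 2), padding="same"),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Flatten(),
        layers.Dense(1),  # raw logit, no sigmoid
    ])
```

Because the final layer returns a raw logit, the loss function paired with it needs to be told to expect logits rather than probabilities.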

We also need to define loss functions and optimizers for both the generator and the discriminator:
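The sketch below shows the standard setup from the TensorFlow DCGAN tutorial, which I’m assuming carries over here: binary cross-entropy on the discriminator’s logits, and one Adam optimizer per model. The learning rate of 1e-4 is the tutorial’s value, not a number confirmed for this project:

```python
import tensorflow as tf

# from_logits=True because the discriminator outputs raw scores, not probabilities.
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)


def discriminator_loss(real_output, fake_output):
    """Penalise the discriminator for calling real images fake and fake images real."""
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss


def generator_loss(fake_output):
    """Reward the generator when the discriminator labels its images as real."""
    return cross_entropy(tf.ones_like(fake_output), fake_output)


generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
```

The two models need separate optimizers because they are trained against opposing objectives, each updating only its own weights.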

Now we’ll define the training loop! As training takes a long time (even with a GPU hardware accelerator), we’ll also create save checkpoints so we don’t have to start from scratch if something goes wrong in the middle.
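Here’s a simplified sketch of such a loop. To keep it self-contained, the models and optimizers are passed in as arguments, whereas the train function used in this article takes only the dataset and the number of epochs. The save_every interval and the argument names are hypothetical:

```python
import tensorflow as tf

EPOCHS = 500
NOISE_DIM = 100
NUM_EXAMPLES_TO_GENERATE = 16

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)


def train_step(images, generator, discriminator, gen_opt, disc_opt, noise_dim=NOISE_DIM):
    """Run one adversarial update on a batch of real images and return both losses."""
    noise = tf.random.normal([tf.shape(images)[0], noise_dim])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(images, training=True)
        fake_output = discriminator(fake_images, training=True)
        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
        disc_loss = (cross_entropy(tf.ones_like(real_output), real_output)
                     + cross_entropy(tf.zeros_like(fake_output), fake_output))
    # Each model is updated only with the gradients of its own loss.
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    gen_opt.apply_gradients(zip(gen_grads, generator.trainable_variables))
    disc_opt.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gen_loss, disc_loss


def train(dataset, epochs, generator, discriminator, gen_opt, disc_opt,
          checkpoint=None, checkpoint_prefix=None, save_every=15):
    """Loop over the dataset for `epochs` epochs, saving a checkpoint periodically."""
    for epoch in range(epochs):
        for image_batch in dataset:
            train_step(image_batch, generator, discriminator, gen_opt, disc_opt)
        if checkpoint is not None and (epoch + 1) % save_every == 0:
            checkpoint.save(file_prefix=checkpoint_prefix)
```

The checkpoint argument is expected to be a tf.train.Checkpoint that tracks both models and both optimizers, so that training can resume from tf.train.latest_checkpoint(checkpoint_dir) after an interruption. The sketch leaves out the step that renders the 16 example images after each epoch.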

In the previous code cell, we also defined a few constants. I’ve set the number of epochs to 500, the dimension of the noise vector to 100, and the number of examples to generate to 16. It’s hard to be certain when a GAN is done training, so I decided to stop when few major changes were happening and the training seemed to stabilize, which I found to be around 500 epochs from my trial runs.

Finally, I train the model. If you’re using Google Colab, I recommend using the GPU hardware accelerator to speed up the training process.

train(train_dataset, EPOCHS)
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))

If you want to create a GIF of the training process, you can include the following code cell after training:
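A sketch of that step, assuming the training loop saved one PNG of generated samples per epoch. The make_gif helper and the file name pattern in the usage note are hypothetical; the imageio calls mirror the ones in the TensorFlow DCGAN tutorial:

```python
import glob

import imageio


def make_gif(frame_pattern, out_path):
    """Stitch the PNG frames matching `frame_pattern` into one animated GIF."""
    # Sorting keeps the frames in epoch order when the names are zero-padded.
    filenames = sorted(glob.glob(frame_pattern))
    with imageio.get_writer(out_path, mode="I") as writer:
        for filename in filenames:
            writer.append_data(imageio.imread(filename))
    return len(filenames)
```

For example, make_gif("image_at_epoch_*.png", "dcgan.gif") would combine frames named like image_at_epoch_0001.png into a single animation.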

Results

Here’s the mesmerizing training process:

And after 500 epochs, here are the 16 generated fake Pokémon:

You wouldn’t fool anyone with these generated images, but they do seem to have the potential to look like real Pokémon. The blue image in the second column of the third row resembles water and ice type Pokémon, while the image in the fourth column of the third row seems to have red eyes and the potential to be a new ghost type.

Unfortunately (or fortunately in some cases), some of the finer details, like eyes, that are present in the latter half of the training process vanish by the end. You can experiment with the number of training epochs to find the perfect balance between detail and stability!

Conclusion

We’ve successfully trained a machine learning model to generate fake Pokémon images!

But what we’re creating now is far from game-ready. To potentially improve model performance, we could:

Increase our training sample size by using a larger dataset and more data augmentation
Remove the checkerboard artifacts in our generated images by using alternatives to transposed convolutional layers as shown in this guide

Deconvolution and Checkerboard Artifacts

When we look very closely at images generated by neural networks, we often see a strange checkerboard pattern of…

distill.pub
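One alternative that guide suggests is resize-convolution: upsample the feature map with nearest-neighbour interpolation, then apply an ordinary convolution. Here’s a sketch of a block that could stand in for one stride-2 transposed convolutional layer. The upsample_block name, the filter count, and the kernel size are hypothetical, and I haven’t tested this on the Pokémon model:

```python
import tensorflow as tf
from tensorflow.keras import layers


def upsample_block(filters, kernel_size=5):
    """Nearest-neighbour upsampling followed by an ordinary convolution."""
    return tf.keras.Sequential([
        # Doubling the resolution first means every output pixel is covered by
        # the same number of kernel positions, which avoids uneven overlap.
        layers.UpSampling2D(size=(2, 2), interpolation="nearest"),
        layers.Conv2D(filters, kernel_size, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
    ])
```

Like a stride-2 transposed convolution, the block doubles the width and height of its input, so it can be swapped in without changing the rest of the generator’s shapes.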

Here are some resources for more tips and tricks:

Keep Calm and train a GAN. Pitfalls and Tips on training Generative Adversarial Networks

Generative Adversarial Networks (GANs) are among the hottest topics in Deep Learning currently. There has been a…

medium.com

ganhacks/README.md at master · soumith/ganhacks

github.com

References

In addition to the ones linked throughout this article, I wouldn’t have been able to complete this project without the help of these awesome examples and tutorials:

[1] Google Developers | Generative Adversarial Networks

[2] TensorFlow | Data Augmentation

[3] Kaggle | Generating Pokémon with Generative Adversial Networks (GAN) by Paul Gavrikov

[4] Jovian | PokeGAN: Generating Fake Pokemon with a Generative Adversarial Network by Justin Kleiber

[5] Machine Learning Mastery | How to Get Started With Generative Adversarial Networks by Jason Brownlee

[6] Towards Data Science | Generative Adversarial Network (GAN) for Dummies — A Step By Step Tutorial by Michel Kana