Pixelate any image with Akvelon’s new PixelArt filter
Hello, readers! My name is Irina and I’m a data scientist at Akvelon. Recently, Akvelon added a new filter to AI ArtDive, a platform that transforms photos and videos into works of art using neural networks. The new filter, called PixelArt, can pixelate your photos and videos using machine learning.
In this article, I will tell you about the machine learning components of this project.
Let’s dive into the pixel art universe!
First of all, let’s find out what pixel art is, how it is used in 2021, and why it should be considered art.
Historically, pixel art is one of the most popular aesthetics in video games. It strives to recreate the look and feel of old Nintendo and Arcade titles. In the ’90s, pixel art was the only option for most console games. Now, pixel art style is popular again!
Pixel art is a unique art style, an appreciation of big visible pixels that make up the founding elements of the complete image.
There are several reasons why pixel art is so popular these days:
- Aesthetics. Pixel art looks awesome! The beauty often lies within its simplicity, with fantastic arrangements of big colored pixel blocks. To achieve this effect, graphic artists set up limitations and rules so the images will be in line with the expected, retro result.
How to transform images
The task definition
Now that we have defined what a pixel art style is, we can move on to the practical side of this article. Our task is to create a model that will take a usual picture, photo, meme, whatever you want, and convert it to pixel art style.
Let’s give a simple example to better understand what exactly we have to do. Imagine that the first picture is a photo of your favorite pet and the second one is Van Gogh’s painting “The Starry Night”. What you want to get is a new image of your pet, but in Van Gogh’s style.
In other words, you need to extract the content from your pet photo and extract the style from the second image. After that, you can generate a new image by combining the content from your photo with the style from the second image. Such tasks are called Image-to-image translation.
Image-to-image translation is a class of vision and graphics tasks where the goal is to learn the mapping between an input image and an output image.
If we use machine learning models or neural networks to solve Image-to-image translation tasks, then this approach is called “Neural Style Transfer”.
For example, when you want to transform horses into zebras, or pears into light bulbs using machine learning — you become a magician with the “Style Transfer” wand!
Since we have to generate a new picture in a certain way, we will use GAN (Generative Adversarial Network). The GAN architecture consists of a generator model for outputting new plausible synthetic images, and a discriminator model that classifies images as real (from the dataset) or fake (generated). The discriminator model is updated directly, whereas the generator model is updated via the discriminator model. As such, the two models are trained simultaneously in an adversarial process where the generator seeks to better fool the discriminator and the discriminator seeks to better identify the counterfeit images.
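This adversarial update can be sketched in a few lines of PyTorch. The toy shapes and layers below are purely illustrative, not the PixelArt model:

```python
import torch
import torch.nn as nn

# Toy 1-D GAN step: G maps noise to "samples", D scores samples as real/fake.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))   # generator
D = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))   # discriminator
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(32, 4)     # a batch "from the dataset"
noise = torch.randn(32, 8)

# Discriminator update: label real samples 1, generated samples 0.
fake = G(noise).detach()      # detach so only D's weights are updated here
loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator update: try to make D label generated samples as real.
fake = G(noise)
loss_g = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

The opposing targets in the two updates are exactly the adversarial process described above: D learns to separate real from fake, while G learns to erase that separation.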
As you may have already guessed, there are several approaches to do what we want:
1. Pix2Pix model. The Pix2Pix model is a type of conditional GAN where the creation of the output image depends on the input, in this case, the original image. The original image and the target image are input to the discriminator, and it must determine if the target is a plausible transformation of the original image. This approach requires a paired dataset, i.e., each image in the original style corresponds to exactly the same image in the target style.
2. CycleGAN model. CycleGAN is a class of models that involve the automatic training of image-to-image translation models without paired examples. It transforms images from one domain to another and vice versa. The main difference from the Pix2Pix approach is that a paired dataset isn’t necessary. The models are trained in an unsupervised manner using a collection of images from the source and target domains that do not need to be related in any way.
Collecting paired data is a difficult task because sometimes that data may not exist. This is why we decided to use the CycleGAN style transfer model.
About the data used
If you are still learning the ropes in data science, this information is just for you. But if you have some experience in data science, you already know that we have two ways to find data:
- Finding an existing dataset in open sources
- Collecting/parsing data for your task by yourself
The pixelization task is not trivial, and at the time of writing no suitable datasets existed. Therefore, we opted for the second way.
Looking ahead, I will say that we had several approaches to data selection.
In the first approach, we experimented with custom datasets. We used ImageNet and MS COCO for domain A and parsed really magnificent images from //hello.eboy.com/eboy/ for domain B.
But almost all of the model results were similar to these:
In the second approach, this dataset was used. It contains many images of cartoon characters on a white background, used for domain A, and many images of Pixel Art cartoon characters on a white background, used for domain B.
You would be absolutely right to exclaim: “This dataset contains only cartoons, so how could this model work on images of real people or photographs?!”
Keep calm and just see the results…
Yep, it works very well! But how can our model transform people into pixel art if we trained it only on cartoons?
I suppose it’s all about the generalization of the model. In a nutshell: during the learning process, when the model analyzes the data, it tries to identify patterns at different levels and remember them. And after being trained on a training set, a model can digest new data and make accurate predictions. Sometimes a model can find patterns that people cannot identify. Therefore, the choice and balance of the dataset is very important.
How we use CycleGAN
CycleGAN architecture in more detail
As you already know, a GAN contains a generator and a discriminator. Above, we also said that CycleGAN transforms images from one domain to another and vice versa. For this reason, the CycleGAN architecture contains two generators and two discriminators.
The first generator gets an image from domain A (any image in our case) and tries to generate an image from domain B (Pixel Art version of the image). The first discriminator gets generated images and real Pixel Art images from the training dataset and tries to distinguish generated images from real images. The generator and discriminator have opposite goals and opposite loss functions so they train simultaneously from one epoch to another.
The second generator and discriminator work in the opposite direction: the generator gets an image from domain B and tries to produce a “domain A” version of the input image, and the discriminator tries to distinguish real “domain A” images from generated ones.
The most interesting feature of CycleGAN is that it takes an image from domain A, generates the “domain B” version, and then tries to reproduce the “domain A” version from that generated “domain B” image. Ideally, the initial image and the image after a full cycle through both generators should be identical, so CycleGAN also uses two more loss functions to compare images after the cycle pass (A to B to A, and B to A to B).
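The cycle-consistency idea can be sketched in a few lines of PyTorch. The tiny convolutions below are stand-ins for the two real generators:

```python
import torch
import torch.nn as nn

# Stubs standing in for the two CycleGAN generators (A->B and B->A).
G_AB = nn.Conv2d(3, 3, kernel_size=3, padding=1)
G_BA = nn.Conv2d(3, 3, kernel_size=3, padding=1)
l1 = nn.L1Loss()

real_a = torch.rand(1, 3, 64, 64)   # image from domain A
real_b = torch.rand(1, 3, 64, 64)   # image from domain B

fake_b = G_AB(real_a)               # A -> B
rec_a = G_BA(fake_b)                # B -> A: should reproduce real_a
fake_a = G_BA(real_b)               # B -> A
rec_b = G_AB(fake_a)                # A -> B: should reproduce real_b

# The two extra loss terms compare each image with its reconstruction
# after the full cycle through both generators.
cycle_loss = l1(rec_a, real_a) + l1(rec_b, real_b)
```

Minimizing `cycle_loss` alongside the adversarial losses is what keeps the content of the image intact while only the style changes.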
Implementation of the model
For implementation, we decided to use a transfer learning approach because it’s convenient, fast, and gives excellent learning results. We used PyTorch as a neural network framework, the PyCharm IDE for training the model with a GPU, and Jupyter and Google Colab notebooks for loading the checkpoint and showing results.
We were inspired as soon as we saw this repository that includes Pix2Pix, CycleGAN, CUT, and FastCUT pretrained models for image style transformation. We took these models as a baseline and trained each of them on our custom cartoons dataset. Please explore the repo to learn more about the dataset format and the different models, because we will use all of this below.
If you already have a checkpoint and do not want to train the model, you can skip to the Inference section.
If you want to use these models with a custom dataset like we did, just:
1. Clone the repo and install the requirements.
2. Download a pretrained model (e.g. horse2zebra) with the following script: bash ./scripts/download_cyclegan_model.sh <pretrained model name>
3. Add your custom dataset to the ./datasets directory.
4. Move on to training the model.
5. To view training results and loss plots, run python -m visdom.server and click the URL http://localhost:8097.
6. Test the model using the code below. The test results will be saved to an HTML file here: ./results/<selected model name>/test_latest/index.html.
You can find more scripts in the repo.
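The training and testing commands look roughly like this; the `pixelart` dataset folder and experiment name are our placeholders, so adjust them to your setup (run from the root of the cloned repo):

```shell
# Train CycleGAN on the custom dataset in ./datasets/pixelart
# (no-op here if the repo files are absent):
[ -f train.py ] && python train.py --dataroot ./datasets/pixelart --name pixelart_cyclegan --model cycle_gan || true

# Test the trained model; an HTML gallery of results is written to
# ./results/pixelart_cyclegan/test_latest/index.html:
[ -f test.py ] && python test.py --dataroot ./datasets/pixelart --name pixelart_cyclegan --model cycle_gan || true
```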
Excellent! Now that you have a weights file, you can assume that you have already received good results.
And now we can move on to the inference part!
We should initialize our model architecture and load our weights file, which is also called the checkpoint. Although the network included both generators and discriminators during training, at inference time we take only the part that generates the image: the generator. You can see all model architectures in the repository, as they are too large to include in the article. Note that the code below is in notebook format.
Let’s take a look at the code!
Firstly, we need to download the pretrained state dict of the CycleGAN Generator model for image pixelation. You can do it in any way that is convenient for you. In our case, the code looks like this:
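A minimal sketch of such a download step; the URL below is a placeholder, not the real checkpoint location:

```python
import urllib.request
from pathlib import Path

def download_checkpoint(url: str, dest: str) -> str:
    """Download a file to dest unless it already exists on disk."""
    path = Path(dest)
    if not path.exists():
        urllib.request.urlretrieve(url, path)
    return str(path)

# Hypothetical URL -- replace with wherever your checkpoint is hosted:
# download_checkpoint("https://example.com/pixelart_net_G.pth", "latest_net_G.pth")
```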
Now we import the required libraries:
After that, we should define the classes of the model. For this we will use ResNet, which stands for Residual Network. If you don’t know about residual neural networks, you can explore them in more detail here. This is a really interesting approach that solves the vanishing gradient problem.
Back to our model: ResnetBlock is a part of ResnetGenerator that is used to get Pixel Art versions of input images.
Now we can define the main model.
ResnetGenerator is the neural network that takes any image and transforms it into a Pixel Art image.
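Since the full classes are too long for the article, here is a simplified sketch of the two; padding options, dropout, and bias handling from the repo are trimmed for brevity:

```python
import torch
import torch.nn as nn

class ResnetBlock(nn.Module):
    """Conv block with a skip connection: out = x + conv(x)."""
    def __init__(self, dim, norm_layer):
        super().__init__()
        self.conv_block = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3),
            norm_layer(dim),
            nn.ReLU(True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3),
            norm_layer(dim),
        )

    def forward(self, x):
        # The residual connection lets gradients flow past the conv stack,
        # easing the vanishing-gradient problem mentioned above.
        return x + self.conv_block(x)


class ResnetGenerator(nn.Module):
    """Downsample -> N residual blocks -> upsample, ending in tanh."""
    def __init__(self, input_nc=3, output_nc=3, ngf=64, n_blocks=6,
                 norm_layer=nn.InstanceNorm2d):
        super().__init__()
        model = [nn.ReflectionPad2d(3),
                 nn.Conv2d(input_nc, ngf, kernel_size=7),
                 norm_layer(ngf), nn.ReLU(True)]
        # Two downsampling steps: halve resolution, double channels
        for i in range(2):
            mult = 2 ** i
            model += [nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3,
                                stride=2, padding=1),
                      norm_layer(ngf * mult * 2), nn.ReLU(True)]
        # Residual blocks at the bottleneck resolution
        model += [ResnetBlock(ngf * 4, norm_layer) for _ in range(n_blocks)]
        # Two upsampling steps back to the input resolution
        for i in range(2):
            mult = 2 ** (2 - i)
            model += [nn.ConvTranspose2d(ngf * mult, ngf * mult // 2,
                                         kernel_size=3, stride=2, padding=1,
                                         output_padding=1),
                      norm_layer(ngf * mult // 2), nn.ReLU(True)]
        model += [nn.ReflectionPad2d(3),
                  nn.Conv2d(ngf, output_nc, kernel_size=7),
                  nn.Tanh()]   # output pixel values in [-1, 1]
        self.model = nn.Sequential(*model)

    def forward(self, x):
        return self.model(x)
```

The final tanh means the generator emits values in [-1, 1], which is why the input image is normalized to the same range later on.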
Init model and load state dict
So, it’s time to create a ResnetGenerator object with ‘instance’ norm layers and initialize it with random weights.
We will load the downloaded state dict and pass it to the network, replacing the random weights with the pretrained values.
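A generic sketch of that loading step; the tiny model and file name below are stand-ins for the ResnetGenerator and the downloaded checkpoint:

```python
import torch
import torch.nn as nn

# Stand-in for the generator; in the notebook this is the ResnetGenerator.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(8, 3, 3, padding=1))

# Create a checkpoint on disk so the example stays self-contained;
# normally this file is the one you downloaded earlier.
torch.save(net.state_dict(), "checkpoint.pth")

# Load on CPU regardless of where the model was trained, copy the
# pretrained values into the network, and switch to inference mode.
state_dict = torch.load("checkpoint.pth", map_location="cpu")
net.load_state_dict(state_dict)
net.eval()
```

`map_location="cpu"` matters when the checkpoint was saved on a GPU machine but inference runs on CPU, as in a typical Colab-to-laptop workflow.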
Cool, now we can move on to the last but not least part about applying the model and getting results.
Load and transform image
Let’s download the image that will be pixelated.
When working with images, we should always apply transform operations before passing the images to the neural network. The set of transformations depends on your task, but in this case we will use a simple, standard transform: turn the image into a torch tensor and normalize it.
Just load the downloaded image into Python code and apply the transformations.
Process image and save output to file
Finally we can pass the transformed image to the neural network.
After that, we need one last function that transforms the output of the ResnetGenerator network (a PyTorch tensor) into a NumPy array, which can then be saved to disk as an image.
Just save and display the pixelated image.
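A sketch of that conversion, mirroring the repo’s tensor2im helper; the random tanh tensor stands in for the generator output:

```python
import numpy as np
import torch

def tensor2im(tensor: torch.Tensor) -> np.ndarray:
    """Convert a 1x3xHxW tensor in [-1, 1] to an HxWx3 uint8 array."""
    array = tensor.detach().cpu().float().numpy()[0]             # drop batch dim
    array = (np.transpose(array, (1, 2, 0)) + 1) / 2.0 * 255.0   # CHW->HWC, [-1,1]->[0,255]
    return array.clip(0, 255).astype(np.uint8)

fake = torch.tanh(torch.randn(1, 3, 64, 64))   # stand-in generator output
image = tensor2im(fake)
# Image.fromarray(image).save("pixelated.png")  # save to disk with PIL
```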
Looks great! But in this case we can only visually assess how well the image matches the pixel art style. Of course, there can be disagreement about the final color choices or how closely the result matches the pixel art style; that part is up to you.
So, in this tutorial, you should have learned:
- What pixel art style is
- Generic Pixel Art rules
- The image-to-image translation and style transfer task
- Existing approaches to style transfer
- What the choice of dataset affects
- How to define and train a CycleGAN model on a custom dataset
Thank you very much for your attention!
Irina Nikolaeva is a Data Scientist/Software Engineer at Akvelon.