PixelArt by Akvelon: Developing the Machine Learning Model

Akvelon has added a new filter to AI ArtDive, a platform that transforms photos and videos into works of art using neural networks. The new filter, PixelArt, pixelates your photos and videos using machine learning. The latest version of the PixelArt machine learning model is now available on AI ArtDive.

Introduction to Pixel Style

Historically, pixel art has been one of the most popular aesthetics in video games. It strives to recreate the look and feel of old Nintendo and arcade titles. In the ’90s, pixel art was the only option on most consoles. And now the pixel style is popular again!

Pixel art is a unique art style: an appreciation of the big, visible pixels that make up the fundamental elements of the complete image. Its beauty often lies in its simplicity, with fantastic arrangements of big colored pixel blocks.

Today, pixel art exists as a popular graphics style in which graphic artists set limitations and rules so that the images match the expected retro result. The fact that this kind of art had never been done before was a strong motivator to keep working. A new artistic expression emerged: the art of ‘pixeling’.

Some pixel-style pictures are surprising in their level of detail and in the ideas behind them. Creating and refining such pictures usually takes an artist a lot of time, because the limits on resolution and color palette have to be kept in mind throughout. That is why the idea of building an AI that can apply the pixel style to any photo or picture seemed so promising!

PixelArt Machine Learning Project Overview

In this section, we will provide an overview of Akvelon’s PixelArt project and discuss the development of a separate PixelArt Deep Learning model that can transform any image into its ‘pixeled’ version.


In the machine learning community, this task is called “style transfer”. After comparing many different approaches, we decided to use the CycleGAN style transfer model. It transforms images from one domain to another and vice versa. For example, it can transform summer landscapes into winter landscapes, oranges in any photo into apples, horses into zebras, and so on. A key advantage of this model is that it does not need a “paired” dataset: it trains on any images from the two styles, with no one-to-one correspondence between them.
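To make the "unpaired" idea concrete, here is a minimal sketch in plain Python. The class name `UnpairedSampler` and the file names are hypothetical and are not taken from the PixelArt code base; the sketch only illustrates that each training step can draw one image from each domain independently.

```python
import random


class UnpairedSampler:
    """Draws one item from each domain independently (no matched pairs)."""

    def __init__(self, domain_a, domain_b, seed=None):
        if not domain_a or not domain_b:
            raise ValueError("both domains need at least one image")
        self.domain_a = list(domain_a)
        self.domain_b = list(domain_b)
        self.rng = random.Random(seed)

    def __len__(self):
        # One pass covers the larger of the two domains.
        return max(len(self.domain_a), len(self.domain_b))

    def __getitem__(self, index):
        # Domain A is walked in order, wrapping around if it is the shorter one.
        a = self.domain_a[index % len(self.domain_a)]
        # Domain B is drawn at random, so (a, b) is never a deliberate pair.
        b = self.rng.choice(self.domain_b)
        return a, b


# Hypothetical file names standing in for the two image collections.
photos = ["photo_0.png", "photo_1.png", "photo_2.png"]
pixel_art = ["pixel_0.png", "pixel_1.png"]

sampler = UnpairedSampler(photos, pixel_art, seed=0)
batch = [sampler[i] for i in range(len(sampler))]
```

Because the two lists are sampled independently, they can have different sizes and different content, which is what makes it possible to train on any collection of ordinary images plus any collection of pixel art.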

CycleGAN is a type of GAN (generative adversarial network). It contains two generators and two discriminators.

The first generator gets an image from domain A (any image in our case) and tries to generate an image from domain B (the Pixel Art version of the image). The first discriminator gets generated images and real Pixel Art images from the training dataset and tries to distinguish the generated images from the real ones. The generator and discriminator have opposite goals and opposite loss functions, so they are trained simultaneously from one epoch to the next.

The second generator and discriminator work in the opposite direction: the generator gets an image from domain B and tries to produce a “domain A” version of it, and the discriminator tries to distinguish real “domain A” images from generated ones.
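The opposing goals of a generator and its discriminator can be sketched in PyTorch as follows. This is only an illustration for one direction (A to B): the tiny networks are hypothetical stand-ins, and the least-squares form of the adversarial loss is an assumption borrowed from the original CycleGAN formulation, since the text does not say which loss variant was used for PixelArt.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in networks (hypothetical; real CycleGAN networks are much deeper).
generator_ab = nn.Sequential(nn.Conv2d(3, 3, kernel_size=3, padding=1), nn.Tanh())
discriminator_b = nn.Sequential(nn.Conv2d(3, 1, kernel_size=4, stride=2, padding=1))

# Assumption: least-squares adversarial loss (mean squared error against 0/1 targets).
mse = nn.MSELoss()

# Random tensors in [-1, 1] standing in for image batches from each domain.
real_a = torch.rand(2, 3, 32, 32) * 2 - 1  # ordinary images
real_b = torch.rand(2, 3, 32, 32) * 2 - 1  # real Pixel Art images

fake_b = generator_ab(real_a)

# Discriminator goal: score real images as 1 and generated images as 0.
# detach() keeps this loss from sending gradients into the generator.
pred_real = discriminator_b(real_b)
pred_fake = discriminator_b(fake_b.detach())
loss_d = 0.5 * (
    mse(pred_real, torch.ones_like(pred_real))
    + mse(pred_fake, torch.zeros_like(pred_fake))
)

# Generator goal (the opposite): make the discriminator score its output as 1.
pred_fake_for_g = discriminator_b(fake_b)
loss_g = mse(pred_fake_for_g, torch.ones_like(pred_fake_for_g))
```

The second generator and discriminator would use the same two losses with the roles of the domains swapped.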

The most interesting feature of CycleGAN is the cycle itself: it takes an image from domain A, generates the “domain B” version, and then tries to reproduce the original “domain A” image from that generated version. Ideally, the initial image and the image obtained after a full cycle through the two generators should be the same, so CycleGAN uses two more loss functions to compare images before and after the cycle pass (one for the A to B direction and one for the B to A direction).
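The cycle pass described above can be sketched in PyTorch like this. The networks are hypothetical single-layer stand-ins, and both the L1 distance and the weight of 10 are assumptions taken from the original CycleGAN formulation rather than confirmed settings of the PixelArt model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical single-layer generators; real ones are far deeper.
generator_ab = nn.Sequential(nn.Conv2d(3, 3, kernel_size=3, padding=1), nn.Tanh())
generator_ba = nn.Sequential(nn.Conv2d(3, 3, kernel_size=3, padding=1), nn.Tanh())

l1 = nn.L1Loss()
lambda_cycle = 10.0  # assumed weight of the cycle terms

# Random tensors in [-1, 1] standing in for image batches from each domain.
real_a = torch.rand(2, 3, 32, 32) * 2 - 1
real_b = torch.rand(2, 3, 32, 32) * 2 - 1

# A -> B -> A: the reconstruction should match the original domain A image.
reconstructed_a = generator_ba(generator_ab(real_a))
# B -> A -> B: the reconstruction should match the original domain B image.
reconstructed_b = generator_ab(generator_ba(real_b))

loss_cycle_a = l1(reconstructed_a, real_a)
loss_cycle_b = l1(reconstructed_b, real_b)
loss_cycle = lambda_cycle * (loss_cycle_a + loss_cycle_b)
```

In a full training loop this term would be added to the two generators' adversarial losses, so that a generated image has to both look like its target domain and keep enough of the input's content to be translated back.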

Core design: 

For our case, we used this dataset. It contains many images of cartoon characters on a white background that are used for domain A and many images of Pixel Art cartoon characters on a white background that are used for domain B.

We also experimented with custom datasets (ImageNet and MS COCO for domain A and parsed from // images for domain B), but the results empirically looked less natural, so we chose the dataset with cartoon characters.

We also experimented with the CUT, FastCUT, and SinCUT models. They are modified versions of CycleGAN that are faster to train and more lightweight, but CycleGAN gave the best results.

We used PyTorch as the ML framework, Google Colab as the environment for training models, and Git as the version control system. You can see how our model pixelated and changed colors in these real images:

Developing PixelArt

We collected (scraped) a unique Pixel Art dataset. We then assembled a custom dataset, with MS COCO and ImageNet images for domain A and the parsed images for domain B, and used the cartoon dataset with cartoon characters on a white background as a second dataset.

Next, we analyzed the available approaches and existing architectures: Pix2Pix, CycleGAN, DCGAN, and a cascaded network that consists of three subnetworks: GridNet, PixelNet, and DepixelNet.

We developed and trained several models for pixel style transfer (CycleGAN, CUT, FastCUT, SinCUT). We then tested these models and, based on empirical results, selected CycleGAN as the best for our case. We also tested the CycleGAN model on different datasets and, again based on empirical results, selected the dataset with cartoon characters on a white background. To develop PixelArt, we used several technologies, including Python 3, PyTorch, and NumPy.

Development Results

As a result of PixelArt’s development, we were able to focus more attention on Akvelon’s machine learning capabilities. Our developers were excited to work on such an interesting and innovative idea. We were also able to leverage and improve our expertise with several technologies, including PyTorch, Python, and NumPy.

We also ended up with our own model, which can be applied in various fields:

    • image style transfer
    • video style transfer
    • style transfer for indie game development/design
    • transformation of pixel images into high-resolution images