
Neural artist: Make your own Prisma with deep learning using Python


Introduction:

Hello world! After spending a long period with electronics & robotics, I am now working with neural networks & deep learning. We all know that, unlike electronics, the human brain doesn't work with binary logic. The human brain doesn't think like a pre-programmed electronic system; instead, its thinking is dynamic and possesses fuzziness. In simple words, fuzziness can be defined as a state of "maybe", neither fully high nor fully low. Simulating this behaviour of the human brain on a computer gives the computer the ability to learn by itself. The well-known computing model for simulating the human brain is called a "neural network".


Structure of neurons in your brain


Similar to the human brain, an artificial neural network creates a mathematical model of thousands of neurons. These neurons are connected to each other and can communicate. I hope you already know the basic architecture of biological neurons, the architecture of an artificial neuron (perceptron) & various mathematical terms like weights, activation functions etc.
After that, you need to learn the fundamentals of deep learning & the details of convolutional neural networks. Long story short, if you don't know anything about deep learning & CNNs, then Google is your friend. Read about them & then come back so that you can understand what I am trying to explain further :)

  • What is neural art?

I know you're thinking about it. What exactly is neural art? Let me explain with an example.

The Great Wave off Kanagawa by Katsushika Hokusai
The above image is the painting called "The Great Wave off Kanagawa". It was painted by Katsushika Hokusai in the early 1830s and it is one of the most famous paintings ever made in Japan (you can see Mount Fuji behind the waves). Now, observe the above painting carefully. Try to recognise the patterns of the brush strokes and the curves of the waves, as well as the colour and texture combinations.
..
Now let's consider another image.

Puppy

Now imagine that Katsushika Hokusai is watching this dog and wants to make a painting of this ecstatic puppy. You have already observed the artist's style from his painting of the wave. So can you imagine how this image could be converted into a painting in the style of Hokusai?
...
Just close your eyes and try to generate the painting in your mind!
...
Done? Congrats, you just made a neural art! Does it look something like this one?

Neural art: Hokusai's version of the little puppy
Well done! But did you notice what actually happened inside your brain when you tried to visualise this neural art? Most likely you didn't. So let's mathematically model this brain activity so that we can perform neural art on computers. This will enable us to convert any image into a painting in the style of a famous artist.

  • What's going on in your brain?


Neural networks use their weights to predict an output. So for any input there are two outputs to consider. The first is the target, or expected, output & the other is the predicted output. The difference between the expected & predicted output is called the error, or loss. We need to calculate the loss at every step and keep minimising it. Remember: the lower the loss, the better the painting effect.
Here, the total loss has three main components:



1. Content loss: This is the difference between the output image and the input base image. Content loss ensures that your output painting doesn't deviate too much from your input image.


2. Style loss: This is the difference between the styles of the style image & the output image. It is calculated by computing the Gram matrix of each image's feature maps and then taking the difference between the two. The Gram matrix is formed from the inner products of the feature maps with each other, which acts like a covariance of the image with itself. Style loss maintains the style of the painting in the output image.


3. Total variation loss: This is the difference between each pixel of the resulting image and its neighbouring pixels. Penalising it keeps the output image visually coherent.

Optimization: This is the process of minimizing the loss by changing the input parameters. The optimization function tells us how much each parameter must change to reach the minimum loss. Most of you will know about gradient descent or the Adam optimizer. Here we are using the L-BFGS algorithm (a limited-memory variant of BFGS) because it is well suited to this application. Click here for the in-depth mathematical equations of BFGS.

Code it..!

First, you need to set up Python & Keras (with the Theano backend) on your system. You can easily install the keras module using pip once you have installed Python.

Make sure your keras.json & .theanorc files look like this:
(Set device=cpu if you don't have a GPU, but remember that the process will then take much longer, possibly a few hours, to complete on a CPU instead of a GPU.)
(I am using the Theano backend for Keras. You can use TensorFlow if you want, but you may need to modify the source code accordingly.)
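For reference, here is roughly what the two config files contain. This is only an illustrative example: the exact keys depend on your Keras & Theano versions (Keras 1.x used "image_dim_ordering": "th", while Keras 2 renamed it to "image_data_format": "channels_first").

keras.json (in ~/.keras/):

{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}

.theanorc (in your home directory):

[global]
floatX = float32
device = gpu

[nvcc]
fastmath = True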


Then create a new file and save it with a .py extension. Let's import all the required libraries and declare the global variables.
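Something along these lines should work (a minimal sketch following the classic Keras neural style transfer example; the file paths and the weight values are placeholders, and tv_wt is a name I am assuming for the total variation weight, so adjust everything to your setup):

import time
import numpy as np
from keras.preprocessing.image import load_img, img_to_array
from keras.applications import vgg16
from keras import backend as K
from scipy.optimize import fmin_l_bfgs_b

# Placeholder paths: point these at your own content & style images
base_image_path = 'puppy.jpg'
style_image_path = 'wave.jpg'

EPOCHS = 5          # number of optimization iterations
img_nrows = 300     # height of the output painting in pixels
img_ncols = 300     # width of the output painting in pixels

content_wt = 0.025  # weight of the content loss (example value)
style_wt = 1.0      # weight of the style loss (example value)
tv_wt = 1.0         # weight of the total variation loss (name assumed)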


EPOCHS defines the number of iterations. In each iteration we calculate the loss & try to optimize it, so we get better paintings at the output as we increase the number of iterations. (Warning: this program is processor intensive; it took around 70 seconds per iteration on my system.)


img_nrows & img_ncols define the size of the output matrix, i.e. your painted image will have dimensions of 300x300 pixels. [#pro-tip: reduce the dimensions for faster processing]


Run the same code with different style_wt & content_wt values to understand their significance.


Now, let's write some helper functions to pre-process our images. Pre-processing is necessary because our neural network requires input in a particular format; it is similar to the basic visual pre-processing that happens in your brain. After converting the image into a painting, we also need to reconstruct the network's output back into image format, and for that purpose we use a deprocessing function.
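A sketch of the two helpers, assuming the Theano dim ordering (channels first) configured above; the mean-pixel values are the standard ImageNet BGR means that vgg16.preprocess_input subtracts:

def preprocess_image(image_path):
    # Load, resize and convert to a (1, 3, rows, cols) float tensor
    img = load_img(image_path, target_size=(img_nrows, img_ncols))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg16.preprocess_input(img)  # RGB -> BGR, subtract ImageNet means
    return img

def deprocess_image(x):
    # Undo preprocess_image so the tensor can be saved as a normal image
    x = x.reshape((3, img_nrows, img_ncols))
    x = x.transpose((1, 2, 0))
    x[:, :, 0] += 103.939  # add the ImageNet mean pixel back (B, G, R)
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    x = x[:, :, ::-1]      # BGR -> RGB
    x = np.clip(x, 0, 255).astype('uint8')
    return x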


Now let's read both images, i.e. the style image (wave art) & the content image (puppy), and store them in Keras backend variables for further processing. Then create a placeholder to hold the generated image. A placeholder is simply a variable whose value will be assigned later; it is similar to declaring a variable or object as None initially and filling in the data later.
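A sketch of that step (again assuming channels-first ordering). The three tensors are concatenated so that a single forward pass through the network processes all of them at once:

base_image = K.variable(preprocess_image(base_image_path))
style_image = K.variable(preprocess_image(style_image_path))

# Placeholder for the generated painting; its pixels are what we optimize
combination_image = K.placeholder((1, 3, img_nrows, img_ncols))

# Stack content, style & generated images into a single input tensor
input_tensor = K.concatenate([base_image, style_image, combination_image], axis=0)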


Now we will create the neural network. For our purpose we are using a special type of convolutional neural network (CNN) called the VGG network. There are several variants of VGG networks, out of which we will use the standard model called VGG-16. Keras provides a pre-implemented VGG-16 class along with pre-trained weights. We are going to use these pre-trained weights because training VGG-16 from scratch is extremely processor intensive and would take many hours. The weights were trained on the ImageNet dataset and are available as open source; they will be downloaded automatically into the Keras directory when you run this program for the first time.
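Loading the pre-trained network is a one-liner with keras.applications (include_top=False drops the classification layers, which we don't need for extracting style & content features):

from keras.applications.vgg16 import VGG16

model = VGG16(input_tensor=input_tensor, weights='imagenet', include_top=False)

# Map each layer name to its symbolic output, for use in the loss functions
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])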



Here are the functions to calculate losses.
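A sketch of the three losses described above, following the classic Keras example (channels-first tensors):

def gram_matrix(x):
    # Inner products between flattened feature maps: a covariance-like
    # summary of which features fire together, i.e. the "style"
    features = K.batch_flatten(x)
    return K.dot(features, K.transpose(features))

def style_loss(style, combination):
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_nrows * img_ncols
    return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))

def content_loss(base, combination):
    return K.sum(K.square(combination - base))

def total_variation_loss(x):
    # Difference between each pixel and its right/bottom neighbours,
    # which keeps the generated image visually coherent
    a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, 1:, :img_ncols - 1])
    b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, :img_nrows - 1, 1:])
    return K.sum(K.pow(a + b, 1.25))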




Now we will set the style & content attributes. After that, we will compute the gradients & define the final output.
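A sketch of how the pieces combine. The layer names come from Keras' VGG-16 implementation; indices 0, 1 and 2 select the base, style & generated image respectively from the concatenated input tensor:

loss = K.variable(0.)

# Content loss on a single deep layer
layer_features = outputs_dict['block5_conv2']
loss = loss + content_wt * content_loss(layer_features[0, :, :, :],
                                        layer_features[2, :, :, :])

# Style loss averaged over one layer from each convolutional block
feature_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1',
                  'block4_conv1', 'block5_conv1']
for layer_name in feature_layers:
    layer_features = outputs_dict[layer_name]
    loss = loss + (style_wt / len(feature_layers)) * \
        style_loss(layer_features[1, :, :, :], layer_features[2, :, :, :])

loss = loss + tv_wt * total_variation_loss(combination_image)

# Gradients of the total loss w.r.t. the generated image
grads = K.gradients(loss, combination_image)
outputs = [loss]
outputs += grads if isinstance(grads, (list, tuple)) else [grads]
f_outputs = K.function([combination_image], outputs)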




Now we will calculate the loss & gradients, and evaluate them through an Evaluator class. What happens here is that the L-BFGS routine calls the loss function from the Evaluator class; this function calculates the loss & the gradients with respect to the generated image.
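A sketch of the evaluator. fmin_l_bfgs_b asks for the loss and the gradients through two separate callbacks, so the class caches both from a single network pass instead of computing everything twice:

def eval_loss_and_grads(x):
    x = x.reshape((1, 3, img_nrows, img_ncols))
    outs = f_outputs([x])
    loss_value = outs[0]
    grad_values = np.array(outs[1:]).flatten().astype('float64')
    return loss_value, grad_values

class Evaluator(object):
    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        # One pass computes both loss and gradients; cache the gradients
        self.loss_value, self.grad_values = eval_loss_and_grads(x)
        return self.loss_value

    def grads(self, x):
        # Return the gradients cached by the matching loss() call
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()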




Finally, we will create a for loop which optimizes the calculated loss using the L-BFGS algorithm in each epoch.
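A sketch of the loop. scipy.misc.imsave was the usual way to write images in SciPy versions of that era; on newer SciPy use imageio.imwrite instead:

from scipy.misc import imsave  # on newer SciPy: from imageio import imwrite

# Start the optimization from the content image
x = preprocess_image(base_image_path)

for i in range(EPOCHS):
    print('Epoch', i)
    start_time = time.time()
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
                                     fprime=evaluator.grads, maxfun=20)
    print('Current loss value:', min_val)
    img = deprocess_image(x.copy())
    imsave('painting_at_epoch_%d.png' % i, img)
    print('Epoch %d completed in %ds' % (i, time.time() - start_time))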




I have iterated the wave style on the dog image 5 times. I am attaching a GIF image below for result comparison. You can observe that the loss reduces after every epoch and the correlation between the input & output images increases.

So now you can try various artworks as the styling image to apply different painting effects to your own images. Just like your very own Prisma!!

Base image & 5 iterations

What's next?

So far, we have applied neural art to a single image. Similarly, you can try to artify videos too. All you have to do is feed each & every frame to the CNN & reconstruct the video from the processed frames. But it will take a huge amount of processing time, so if any of you is working with a powerful GPU like a GTX 1080 or a K-series card, try this video styling and let me know about it in the comments!

Looking for complete code? Click here

