The code is on GitHub – https://github.com/webdim0/dnn_basic_imgrec.
The programming language for machine learning today is of course Python, but since I don't know this great language yet, I looked for good solutions written in JS to get started.
Probably the most common library here is TensorflowJs. It is popular, and it also gives us a fairly low-level API for working with the parts of a neural network, which is good for trying things out and learning the basics.
About the project. As you can tell from the title, I took a classic task: recognizing a single digit from 0 to 9 drawn in the browser. For image processing it is now more common to use convolutional neural networks, which are a little more complex, but to get started with pretty simple images I will use a classic DNN model.
Final version of the project looks like this:
There are two pages. The first is “Home”, where we have a canvas to draw a digit and a “Predict” button, which shows the recognition result, the probabilities for each of the 10 digits, and an option to correct the label and save the image with its valid label to the dataset. After every new image is saved, the DNN is automatically retrained.
The second page is “Dataset”, where we can see a sorted list of the images in our dataset, showing both the original images and the simplified, processed ones prepared for the DNN. Here we can add or delete images and retrain the DNN after these changes.
Well, that’s all, pretty simple. One of the great features of TensorflowJs is that it can run in the browser on the client side, but in this case I decided to do it the classic way: all DNN logic lives on the server side, with NodeJs providing an API to the client.
While building the client side of the app I also wanted to try the Svelte framework. After doing some basic things with it I generally like it very much; it seems nice and concise to me. What I didn’t like was the routing: when I was looking for a router component, something like the official Vue one, I used svelte-routing as the most commonly used option I could find, and I wasn’t impressed. I think I should try SvelteKit next time as a template for a new app with things like that, but for the first time I wanted to keep it as simple and as native as possible.
You can find the code in the “client” folder here – https://github.com/webdim0/test_nuxt_express_gallery/tree/main/server
The server part of the application is built with NodeJs, Express, and the Node version of TensorflowJs.
Code is in the “server” folder – https://github.com/webdim0/test_nuxt_express_gallery/tree/main/server.
This part is more interesting so I will make some comments on its code.
If you are completely new to deep neural networks, you should first get some basic theoretical knowledge.
I think these 5-minute videos are good ones:
And one of the best and best-known, but deeper and more complicated, is – But what is a neural network? | Chapter 1, Deep learning
To start working with TensorflowJs there are also good tutorials on the website – https://www.tensorflow.org/js/tutorials but in general, if you understand the basics of the theory, this project’s code isn’t more complicated than the tutorial code.
The server side provides an API that can store and delete images and labels, prepare images for DNN processing, train the model on this data, and predict the label for a new image body sent to it.
There are two interesting files where all the logic is done: app\services\imageOperator.js and app\services\neuralNetwork.js.
The first object has a method, prepareImageTensor, where the input image (the body of a 130×130 pixel transparent PNG file) is scaled down to 12×12. This is done first to reduce the amount of calculation: a 130×130 matrix has 16,900 points while a 12×12 one has only 144, and that difference can be critical. The size of the small image was chosen empirically. On one hand it is good to make it as small as possible, but on the other the shape of the object still has to be preserved well enough for good recognition.
Then the image is gray-scaled, so the RGB + transparency channels (4 numbers per pixel) are averaged into one channel (1 number per pixel). This is good, first, because we don’t care about the color, only the shape, and second, because it is again simpler to deal with 1 number than with 4.
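To make the averaging step concrete, here is a minimal sketch in plain JavaScript. It is not the code from imageOperator.js: the function name toGrayscale and the flat RGBA array layout are assumptions made for illustration only.

```javascript
// Hypothetical helper, not from the project repository.
// Collapses a flat RGBA array (4 numbers per pixel) into one number per pixel
// by averaging the four channels.
function toGrayscale(rgba) {
  if (rgba.length % 4 !== 0) {
    throw new Error('Expected 4 values (R, G, B, A) per pixel');
  }
  const gray = new Array(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    const offset = i * 4;
    gray[i] =
      (rgba[offset] + rgba[offset + 1] + rgba[offset + 2] + rgba[offset + 3]) / 4;
  }
  return gray;
}

// Two pixels: one fully transparent, one opaque white.
const gray = toGrayscale([0, 0, 0, 0, 255, 255, 255, 255]);
console.log(gray); // [ 0, 255 ]
```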
Then we crop away the empty areas to leave only the frame that contains data, and resize the result again to fit the 12×12 matrix, with the digit in the center and without empty space on the vertical and horizontal borders.
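The cropping part of this step can be sketched as a bounding-box search. This is only an illustration in plain JavaScript, assuming the image is held as a 2D array of numbers where 0 means empty; cropToContent is a hypothetical name, and the resize back to 12×12 is not shown.

```javascript
// Hypothetical helper, not from the project repository.
// Finds the smallest rectangle that contains every non-zero pixel
// and returns only that rectangle.
function cropToContent(pixels) {
  let top = Infinity;
  let left = Infinity;
  let bottom = -1;
  let right = -1;

  pixels.forEach((row, y) => {
    row.forEach((value, x) => {
      if (value > 0) {
        top = Math.min(top, y);
        bottom = Math.max(bottom, y);
        left = Math.min(left, x);
        right = Math.max(right, x);
      }
    });
  });

  if (bottom === -1) return []; // nothing was drawn

  return pixels
    .slice(top, bottom + 1)
    .map((row) => row.slice(left, right + 1));
}

const cropped = cropToContent([
  [0, 0, 0, 0],
  [0, 9, 0, 0],
  [0, 9, 9, 0],
  [0, 0, 0, 0],
]);
console.log(cropped); // [ [ 9, 0 ], [ 9, 9 ] ]
```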
And in the final step of preparation we leave only two values: “visible” and “transparent”. A pixel is considered visible enough when its value is greater than 100 (an empirical choice), and we mark it as 255 (visible); otherwise we mark it as 0 (transparent).
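The thresholding rule is small enough to show in a few lines. Again this is a plain JavaScript sketch, not the project’s code; binarize and THRESHOLD are names made up for the example, and only the cut-off of 100 and the 255/0 output values come from the description above.

```javascript
// Hypothetical helper, not from the project repository.
const THRESHOLD = 100; // the cut-off value mentioned in the text

// Maps every grayscale value to 255 (visible) or 0 (transparent).
function binarize(grayPixels, threshold = THRESHOLD) {
  return grayPixels.map((value) => (value > threshold ? 255 : 0));
}

const binary = binarize([0, 100, 101, 240]);
console.log(binary); // [ 0, 0, 255, 255 ]
```

Note that a value of exactly 100 counts as transparent here, because the rule is “greater than 100”.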
It can look like that:
Proper preparation of the dataset is a vital part of building a DNN: the better and cleaner the data, the simpler the model of the network can be.
And the main part of this project is of course the neuralNetwork object. There are two interesting methods there. The first is “train”: this method creates the model, trains it, and saves the model to a JSON file.
Why do I use this structure for the DNN? Well, I don’t know. The input shape of the data is a 12×12 matrix, the output layer has 10 classes of images (0-9), and softmax is used as the activation function for the classification task. The structure of the hidden layers and the optimization algorithm were chosen empirically. This works OK for me, but you can try other structures and see the results; I think it could be interesting.
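To make the shapes concrete, here is a rough sketch of what one fully connected layer computes when it maps the flattened 12×12 input to 10 output scores. It is plain JavaScript, not TensorflowJs and not the project’s model: the hidden layers are left out, and the weights and bias values are made up purely to show the dimensions.

```javascript
// Hypothetical illustration, not from the project repository.
// One fully connected layer:
//   output[j] = bias[j] + sum over i of input[i] * weights[i][j]
function dense(input, weights, bias) {
  return bias.map((b, j) =>
    input.reduce((sum, x, i) => sum + x * weights[i][j], b)
  );
}

const INPUT_SIZE = 12 * 12; // 144 pixels after flattening the matrix
const NUM_CLASSES = 10;     // digits 0-9

// Made-up values, only to show the shapes involved.
const input = new Array(INPUT_SIZE).fill(1);
const weights = Array.from({ length: INPUT_SIZE }, () =>
  new Array(NUM_CLASSES).fill(0.01)
);
const bias = new Array(NUM_CLASSES).fill(0);

const scores = dense(input, weights, bias);
console.log(scores.length); // 10
```

In a trained network the weights are learned from the dataset, and the 10 scores are then passed through softmax to become probabilities.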
And finally, the most wanted method: “predict”.
Thanks to the softmax activation function, it gives us a list of probabilities for which class the image could be: 87% for “8”, 7% for “9”. Not bad results, I think!
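For readers who want to see how raw scores turn into such a list of probabilities, here is a small softmax sketch in plain JavaScript. TensorflowJs does this inside the model; the functions softmax and mostLikely and the input scores below are made up for illustration.

```javascript
// Hypothetical illustration, not from the project repository.
// Turns arbitrary scores into probabilities that sum to 1.
function softmax(scores) {
  const max = Math.max(...scores); // subtracting the max avoids overflow in exp
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Returns the index (here: the digit) with the highest probability.
function mostLikely(probabilities) {
  return probabilities.indexOf(Math.max(...probabilities));
}

// Ten made-up scores, one per digit 0-9.
const probs = softmax([0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.1, 0.4, 3.0, 0.5]);
console.log(mostLikely(probs)); // 8
```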
I hope it was interesting for you, and maybe you will also try to play with DNNs. Have a nice day and good luck!