You recently started some Machine Learning projects, got some results, but got stuck on how to evaluate them.




    Everything you see all over the internet (and in your code output) talks about these Loss and Accuracy things. How do you interpret them? Which one is the right way to evaluate models? This article will explain that, going straight to the point:

 

Loss

  • The value the network is trying to minimize: it's the error, the difference between the predicted value and the correct one.
    •   Suppose the network outputs 0.8 when it should predict 1.0; the error is then 0.2. The network will adjust its parameters based on that difference to bring the 0.8 as close as possible to 1.0 (see the sketch right after this list).
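    To make that concrete, here is a tiny Python sketch. The 0.8 and 1.0 come from the example above; using mean squared error is just one common choice of loss, picked for illustration:

import numpy as np

# Predicted value and the correct (target) value from the example above
y_pred = np.array([0.8])
y_true = np.array([1.0])

# The raw difference the text refers to: 1.0 - 0.8 = 0.2
error = np.abs(y_true - y_pred)

# One common loss the network could minimize: mean squared error
mse_loss = np.mean((y_true - y_pred) ** 2)

print(f"error: {error[0]:.2f}, MSE loss: {mse_loss:.2f}")  # error: 0.20, MSE loss: 0.04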

 

Accuracy

  • How often the network predicts the right class for a given input.
    • It’s pretty straightforward: if you input 100 images and the network correctly predicts the classes of 86 of them, then the Accuracy is 86% (see the snippet after this list).
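    In code, the same idea looks like this (the labels below are made up just to show the computation):

import numpy as np

# Hypothetical true classes and predicted classes for 10 images (0 = cat, 1 = dog)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 1])

# Accuracy = fraction of predictions that match the true class
accuracy = np.mean(y_pred == y_true)
print(f"Accuracy: {accuracy:.0%}")  # 8 out of 10 correct -> 80%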

 

Time to code!

    The best way to learn is by practicing. Let’s train a model on a classic and simple dataset for a problem called Dogs vs Cats, which you can download from here (845 MB). The dataset consists of images from two categories: one containing only dog images and the other only cat images. The objective is to build a classifier that can tell the two classes apart.

   Use the code below to train the model for a few epochs and plot the training curves on a single graph at the end. The training might take a while to finish, depending on your computer. Here I’ll use the Xception[1] network, training over 15 epochs.
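   The full training script is in the gist linked below. As a rough sketch of what it can look like, here is a minimal Keras version; the directory layout, image size, batch size and training Xception from scratch are my assumptions for illustration, not necessarily what the original gist does:

import matplotlib.pyplot as plt
from tensorflow import keras

IMG_SIZE = (299, 299)   # Xception's default input size
BATCH_SIZE = 32
EPOCHS = 15

# Assumes the downloaded dataset was organized into train/ and validation/
# folders, each containing a 'cats' and a 'dogs' subfolder
train_ds = keras.utils.image_dataset_from_directory(
    "dogs-vs-cats/train", label_mode="binary",
    image_size=IMG_SIZE, batch_size=BATCH_SIZE)
val_ds = keras.utils.image_dataset_from_directory(
    "dogs-vs-cats/validation", label_mode="binary",
    image_size=IMG_SIZE, batch_size=BATCH_SIZE)

# Xception backbone trained from scratch (weights=None) on the two classes
base = keras.applications.Xception(
    include_top=False, weights=None,
    input_shape=IMG_SIZE + (3,), pooling="avg")

inputs = keras.Input(shape=IMG_SIZE + (3,))
x = keras.applications.xception.preprocess_input(inputs)
x = base(x)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

history = model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS)

# Plot training/validation loss and accuracy on a single graph
for metric in ("loss", "val_loss", "accuracy", "val_accuracy"):
    plt.plot(history.history[metric], label=metric)
plt.xlabel("epoch")
plt.legend()
plt.show()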

 

Link to the gist of the code above

 

Training results

 

Run 1

 

Run 2

 

   Taking a closer look, both runs produced the same validation accuracy of 73%. Which one is better, given the Loss/Accuracy explanation above?

   The second run went a little better than the first: its smaller val_loss (Run 1: 0.7324, Run 2: 0.7266) means it produced a network whose predictions are closer to the expected values.

 

   This topic goes much further, and with the basics covered here it’s now easier to dig deeper. Run more tests, try other datasets, change the code parameters, and watch how the training/validation process changes.

 

[1] CHOLLET, François. Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. p. 1251-1258.