PSNR and the Human Visual System




     In the field of image and video processing, quality assessment is an essential task. A widely used method for measuring the quality of reconstructed images and videos is the Peak Signal-to-Noise Ratio (PSNR). However, the metric doesn’t account for the characteristics of the Human Visual System (HVS), which limits its ability to predict the actual perceived quality of an image or video.

     The HVS is a complicated system that doesn’t rely on raw pixel values alone to respond to visual stimuli; it is influenced by factors such as color, contrast, texture, and spatial frequency. To address this gap, PSNR-HVS [1] and variants such as PSNR-HVS-M [2] have been developed.

 

PSNR

     PSNR is calculated by comparing the peak signal power (the square of the maximum possible pixel value) to the mean squared error (MSE) between the original and the compressed/reconstructed image. A higher value indicates a smaller difference between the two images, which means a higher-quality image. The equation is:

PSNR = 10 · log₁₀(R² / MSE)

where R is the maximum possible pixel value (255 for 8-bit images) and MSE is the average of the squared differences between the pixel values of the original image I and the compressed/reconstructed/distorted image K:

MSE = (1 / (m·n)) · Σᵢ Σⱼ (I(i,j) − K(i,j))²

for images of size m×n. The result is usually expressed in decibels (dB).
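
For intuition: with 8-bit images (R = 255), an MSE of 100 gives PSNR = 10 · log₁₀(255² / 100) ≈ 28.1 dB, while an MSE of 10 gives ≈ 38.1 dB.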

     Here’s a code snippet of how the PSNR calculation could be implemented using Python:
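
A minimal sketch using NumPy (the psnr helper name and the max_value parameter are mine; max_value = 255 assumes 8-bit images):

```python
import numpy as np

def psnr(original: np.ndarray, distorted: np.ndarray, max_value: float = 255.0) -> float:
    """Return the PSNR between two same-sized images, in dB."""
    # Work in float64 so integer inputs don't overflow when subtracted.
    diff = original.astype(np.float64) - distorted.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```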

 

 

PSNR-HVS and PSNR-HVS-M

     PSNR-HVS is an extension of PSNR that takes into account the fact that the human eye is more sensitive to some kinds of image or video error than to others. For example, because the eye’s contrast sensitivity drops at high spatial frequencies, errors in the low-frequency (smooth) regions of an image are generally more noticeable than errors in high-frequency (detailed) regions. PSNR-HVS accounts for this by computing the error between the original and compressed signals in the DCT domain, on 8×8 blocks, and weighting it with a Contrast Sensitivity Function (CSF) before applying the PSNR formula.
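
To make the idea concrete, here is a schematic sketch only, not the published algorithm: the real metric uses the specific CSF table from [1], while the toy_csf weights below are invented purely for illustration and simply decay with frequency.

```python
import numpy as np
from scipy.fft import dctn  # 2-D DCT, available in SciPy >= 1.4

def toy_csf(n: int = 8) -> np.ndarray:
    # Hypothetical weights that decay with spatial frequency; the actual
    # PSNR-HVS CSF table from [1] has different, empirically derived values.
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return 1.0 / (1.0 + 0.5 * (i + j))

def csf_weighted_mse(a: np.ndarray, b: np.ndarray, n: int = 8) -> float:
    """Average CSF-weighted squared DCT-coefficient error over n x n blocks.

    Expects two 2-D (grayscale) arrays at least n x n in size.
    """
    w = toy_csf(n)
    a, b = a.astype(np.float64), b.astype(np.float64)
    h, wd = (a.shape[0] // n) * n, (a.shape[1] // n) * n
    errors = []
    for y in range(0, h, n):
        for x in range(0, wd, n):
            da = dctn(a[y:y + n, x:x + n], norm="ortho")
            db = dctn(b[y:y + n, x:x + n], norm="ortho")
            errors.append(np.mean((w * (da - db)) ** 2))
    return float(np.mean(errors))

def psnr_hvs_like(a: np.ndarray, b: np.ndarray, max_value: float = 255.0) -> float:
    # Same PSNR formula as before, but fed with the frequency-weighted MSE.
    return 10.0 * np.log10(max_value ** 2 / csf_weighted_mse(a, b))
```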

   PSNR-HVS-M [2] goes a step further and also models contrast masking: distortion sitting on top of strong textures or edges is harder to see than the same distortion on a smooth background, so those errors are weighted down accordingly (the “M” stands for masking).

 As you can already imagine, calculating these variants is not as straightforward as the original metric, so I’ll be relying on a library to run some comparisons.
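
One option is the third-party psnr_hvsm package on PyPI. The snippet below is a sketch that assumes its psnr_hvs_hvsm function takes two single-channel luma images as floats in [0, 1] (with dimensions that are multiples of 8) and returns both scores; double-check the exact API against the package documentation. The file names are placeholders.

```python
import numpy as np
import imageio.v3 as iio
from psnr_hvsm import psnr_hvs_hvsm  # assumed import path; verify with the docs

def to_luma(img: np.ndarray) -> np.ndarray:
    # The metric operates on a single luminance channel (BT.601 weights here).
    img = img.astype(np.float64) / 255.0
    return img @ np.array([0.299, 0.587, 0.114]) if img.ndim == 3 else img

original = to_luma(iio.imread("original.png"))      # placeholder file names
distorted = to_luma(iio.imread("compressed.png"))

hvs, hvs_m = psnr_hvs_hvsm(original, distorted)
print(f"PSNR-HVS = {hvs:.2f} dB | PSNR-HVS-M = {hvs_m:.2f} dB")
```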

 

Examples

 


Figure 1 – Original Image | 90% JPEG Compression: PSNR= 31.31 dB, PSNR-HVS= 25.92 dB, PSNR-HVS-M= 26.87 dB

 


Figure 2 – Original Image | Blur (K=3): PSNR= 33.18 dB, PSNR-HVS= 27.72 dB, PSNR-HVS-M= 28.56 dB | Blur (K=9): PSNR= 29.64 dB, PSNR-HVS= 24.01 dB, PSNR-HVS-M= 24.49 dB

 


Figure 3 – Original Image | 50% JPEG Compression: PSNR= 36.65 dB, PSNR-HVS= 33.52 dB, PSNR-HVS-M= 37.23 dB | 90% JPEG Compression: PSNR= 29.52 dB, PSNR-HVS= 24.55 dB, PSNR-HVS-M= 25.83 dB

 


Figure 4 – Original Image | Image with Masked Noise: PSNR= 26.18 dB, PSNR-HVS= 34.43 dB, PSNR-HVS-M= 51.67 dB (Image from [2])

     The HVS variants are more sensitive to certain spatial frequencies and contrasts, and as a result they tend to penalize distorted images more severely than the original PSNR, as the values in Figures 1 and 2 show.

      Regarding Figure 3, the 50% JPEG compression did not cause a noticeable loss of quality, which is why PSNR-HVS-M produces a higher value there than the other two metrics.

    Additionally, the masking model in PSNR-HVS-M pays off on images containing masked (“invisible”) noise: in Figure 4 it scores the distorted image far higher than plain PSNR does, matching how hard the noise is to actually see.

 

Conclusion

     PSNR and its HVS variants are useful ways to measure image quality, and I like using PSNR-HVS-M in my machine-learning losses. But even though PSNR-HVS takes human vision into account, it isn’t always accurate at predicting what we’ll actually perceive, so it’s best to complement it with other evaluation methods, both objective and subjective.

 

 

Have fun with your projects! 
If this post helped you, please consider buying me a coffee 🙂

References