Welcome to another OpenCV tutorial. In this tutorial, we’ll be covering thresholding for image and video analysis. The idea of thresholding is to further simplify visual data for analysis. First, you may convert to grayscale, but then you have to consider that an 8-bit grayscale image still has 256 possible values per pixel (0 to 255). What thresholding can do, at the most basic level, is convert everything to white or black based on a threshold value. Let’s say we want the threshold to be 125 (out of 255): everything that was 125 and under would be converted to 0, or black, and everything above 125 would be converted to 255, or white. If you convert to grayscale first, as you normally will, the result is purely black and white. If you do not convert to grayscale, you will still get a thresholded picture, but each color channel is thresholded on its own, so there will be color.
The main idea of image thresholding is to set a cutoff for grayscale pixel values. The same threshold value is applied to every pixel. If the pixel value is at or below the threshold, it is set to 0; otherwise, it is set to a maximum value. The function cv.threshold applies the thresholding. The first argument is the source image, which should be a grayscale image. The second argument is the threshold value used to classify the pixel values. The third argument is the maximum value assigned to pixels that exceed the threshold. OpenCV provides different types of thresholding, selected by the fourth parameter of the function. Basic thresholding as described above uses the type cv.THRESH_BINARY.
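To make the rule concrete, here is a minimal NumPy sketch of what a binary threshold does to each pixel. The helper name `binary_threshold` is made up for this illustration; it mimics the rule described above and is not OpenCV's own implementation.

```python
import numpy as np

def binary_threshold(src, thresh, maxval=255):
    """NumPy sketch of the binary rule: maxval where src > thresh, else 0."""
    src = np.asarray(src, dtype=np.uint8)
    return np.where(src > thresh, maxval, 0).astype(np.uint8)

# One row of grayscale pixels around the example threshold of 125.
row = np.array([0, 124, 125, 126, 255], dtype=np.uint8)
result = binary_threshold(row, 125)
print(result)
```

Note that the pixel sitting exactly at 125 ends up black: only values strictly above the threshold are set to the maximum.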
While that sounds good enough, it often isn’t. We will be covering multiple examples and different types of thresholding here to illustrate this. We will use the following image as our example image, but feel free to use one of your own:
This short blurb from a book makes for a great example of why one might threshold. The background has really no white at all; everything is dim, but also everything is varying. Some parts are light enough to be easily read, while others are quite dark and require quite a bit of focus to make out. To start, let’s try just a simple threshold:
retval, threshold = cv2.threshold(img, 10, 255, cv2.THRESH_BINARY)
A binary threshold is a simple “either or” threshold, where the pixels are either 255 or 0. In many cases, this would be white or black, but we have left our image in color for now, so the result may still be colored. The first parameter here is the image. The next parameter is the threshold; we are choosing 10. The next is the maximum value, which we’re choosing as 255. Finally, we have the type of threshold, which we’ve chosen as THRESH_BINARY. Normally, a threshold of 10 would be a somewhat poor choice. We are choosing 10 because this is a low-light picture, so we choose a low number. Normally something around 125-150 would probably work best.
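To see why a low threshold suits a dim picture, here is a small sketch with made-up pixel values (ink near 0, paper somewhere between 20 and 60). The numbers are assumptions chosen for illustration, not measurements from the book page.

```python
import numpy as np

# Hypothetical values from a dim page: ink is around 0-8, paper around 20-60.
dim_pixels = np.array([2, 8, 20, 35, 60], dtype=np.uint8)

def binary_threshold(src, thresh, maxval=255):
    # Same rule as a binary threshold: maxval above the cutoff, 0 otherwise.
    return np.where(src > thresh, maxval, 0).astype(np.uint8)

mid_cut = binary_threshold(dim_pixels, 125)  # every pixel is below 125
low_cut = binary_threshold(dim_pixels, 10)   # ink stays black, paper turns white
```

With the cutoff at 125 the whole row goes black, because nothing in a dim image gets that bright. With the cutoff at 10, the ink and the paper land on opposite sides.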
import cv2
import numpy as np

# Load the page in color (OpenCV reads images as BGR)
img = cv2.imread('bookpage.jpg')

# Binary threshold at 10, applied to each color channel
retval, threshold = cv2.threshold(img, 10, 255, cv2.THRESH_BINARY)

cv2.imshow('original', img)
cv2.imshow('threshold', threshold)
cv2.waitKey(0)
cv2.destroyAllWindows()
The image now is slightly better for reading, but still a bit of a mess. Visually, it is better, but using a program to analyze this will still be quite hard. Let’s see if we can simplify it further.
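Part of the mess comes from thresholding a color image: each channel is cut independently, so a pixel can come out as a saturated color instead of black or white. The sketch below uses three made-up BGR pixels to compare that with converting to gray first. The gray conversion here uses the usual luma weights (0.299 R, 0.587 G, 0.114 B) and is written in NumPy for illustration only.

```python
import numpy as np

# Three hypothetical BGR pixels from a dim color photo (values are made up).
pixels = np.array([[5, 40, 8],     # only the green channel exceeds 10
                   [30, 25, 20],   # all three channels exceed 10
                   [3, 6, 9]],     # no channel exceeds 10
                  dtype=np.uint8)

# Thresholding a color image applies the rule to each channel separately.
per_channel = np.where(pixels > 10, 255, 0).astype(np.uint8)

# Grayscale first: weighted sum of B, G, R, then one threshold per pixel.
weights = np.array([0.114, 0.587, 0.299])  # B, G, R order
gray = np.rint(pixels @ weights).astype(np.uint8)
gray_thresholded = np.where(gray > 10, 255, 0).astype(np.uint8)
```

The first pixel becomes pure green under per-channel thresholding, but plain white once it has been collapsed to a single gray value.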
First, let’s grayscale the image, and then do a threshold:
import cv2
import numpy as np

img = cv2.imread('bookpage.jpg')

# Collapse the three color channels into one before thresholding
grayscaled = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
retval, threshold = cv2.threshold(grayscaled, 10, 255, cv2.THRESH_BINARY)

cv2.imshow('original', img)
cv2.imshow('threshold', threshold)
cv2.waitKey(0)
cv2.destroyAllWindows()
More simple, yep, but we’re still missing out on a lot of context here. Next up, we can try adaptive thresholding, which will attempt to vary the threshold, and hopefully account for the curving pages.
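import cv2
import numpy as np

img = cv2.imread('bookpage.jpg')
grayscaled = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Gaussian-weighted local threshold: 115x115 neighbourhood, constant C = 1
th = cv2.adaptiveThreshold(grayscaled, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv2.THRESH_BINARY, 115, 1)

cv2.imshow('original', img)
cv2.imshow('Adaptive threshold', th)
cv2.waitKey(0)
cv2.destroyAllWindows()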
In the previous section, we used one global value as a threshold. But this might not be good in all cases, e.g. if an image has different lighting conditions in different areas. In that case, adaptive thresholding can help. Here, the algorithm determines the threshold for a pixel based on a small region around it. So we get different thresholds for different regions of the same image which gives better results for images with varying illumination.
In addition to the parameters described above, the method cv.adaptiveThreshold takes three input parameters:
The adaptiveMethod decides how the threshold value is calculated:
- cv.ADAPTIVE_THRESH_MEAN_C: The threshold value is the mean of the neighbourhood area minus the constant C.
- cv.ADAPTIVE_THRESH_GAUSSIAN_C: The threshold value is a Gaussian-weighted sum of the neighbourhood values minus the constant C.
The blockSize determines the size of the neighbourhood area and C is a constant that is subtracted from the mean or weighted sum of the neighbourhood pixels.
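The mean-based variant can be sketched in a few lines of NumPy. This is an illustration of the idea rather than OpenCV's actual code: the function name `adaptive_mean_threshold` is made up, and the border handling (repeating edge pixels) is an assumption of this sketch. The Gaussian variant follows the same pattern, with a weighted average in place of the plain mean.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_mean_threshold(gray, block_size=11, c=2, maxval=255):
    """Sketch of mean-based adaptive thresholding (not OpenCV's exact code).

    Each pixel is compared against the mean of its block_size x block_size
    neighbourhood minus c. Borders repeat the edge pixels (an assumption).
    """
    if block_size % 2 != 1 or block_size < 3:
        raise ValueError("block_size must be an odd number >= 3")
    gray = np.asarray(gray, dtype=np.float64)
    pad = block_size // 2
    padded = np.pad(gray, pad, mode="edge")
    # One block_size x block_size window per pixel, averaged over the window.
    windows = sliding_window_view(padded, (block_size, block_size))
    local_threshold = windows.mean(axis=(-2, -1)) - c
    return np.where(gray > local_threshold, maxval, 0).astype(np.uint8)

# Synthetic page: brightness ramps from 60 (left) to 200 (right), with a
# vertical "ink" stroke 40 levels darker than the paper near each end.
page = np.tile(np.linspace(60, 200, 40), (20, 1))
page[:, 5] -= 40    # stroke in the dark region
page[:, 34] -= 40   # stroke in the bright region

global_result = np.where(page > 127, 255, 0).astype(np.uint8)
adaptive_result = adaptive_mean_threshold(page, block_size=11, c=2)
```

On this synthetic page, the global cutoff at 127 turns the whole dark side black and loses the stroke on the bright side, while the local rule keeps both strokes black on a white background.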
The code below compares global thresholding and adaptive thresholding for an image with varying illumination:
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt

# Load as grayscale and smooth out noise before thresholding
img = cv.imread('sudoku.png', cv.IMREAD_GRAYSCALE)
img = cv.medianBlur(img, 5)

ret, th1 = cv.threshold(img, 127, 255, cv.THRESH_BINARY)
th2 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_MEAN_C,
                           cv.THRESH_BINARY, 11, 2)
th3 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv.THRESH_BINARY, 11, 2)

titles = ['Original Image', 'Global Thresholding (v = 127)',
          'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]

# Show the four results in a 2x2 grid with axis ticks hidden
for i in range(4):
    plt.subplot(2, 2, i + 1)
    plt.imshow(images[i], 'gray')
    plt.title(titles[i])
    plt.xticks([])
    plt.yticks([])
plt.show()
Thresholding in OpenCV assigns each pixel a new value according to how it compares with a given threshold. If the pixel value is at or below the threshold, it is set to 0; otherwise, it is set to a maximum value (generally 255). Thresholding is a very popular segmentation technique, used for separating an object considered as foreground from its background. The threshold splits the intensity range into two regions: values at or below it and values above it.
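As a small illustration of thresholding as segmentation, the sketch below treats the thresholded result as a foreground mask on a made-up 4x4 patch. The pixel values and the cutoff of 127 are assumptions picked for the example.

```python
import numpy as np

# Hypothetical 4x4 grayscale patch: a bright 2x2 object on a dark background.
patch = np.array([[10,  12,  11,  9],
                  [13, 200, 210, 10],
                  [12, 205, 198, 11],
                  [ 9,  10,  12, 13]], dtype=np.uint8)

# Binary threshold expressed as a boolean mask: True marks foreground pixels.
foreground = patch > 127

object_pixels = patch[foreground]                 # intensities of the object only
background_only = np.where(foreground, 0, patch)  # object blanked out
```

Once the mask exists, the object and the background can be measured or processed separately, which is what makes thresholding useful as a first segmentation step.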
In computer vision, thresholding is usually applied to grayscale images.