Object Detection for Dummies Part 1: Gradient Vector, HOG, and SS
Jan 28, 2021. The raw blog URL.
Image Gradient Vector
- Derivative
- Scalar
- Directional Derivative
- Scalar
- Gradient
- Vector
import numpy as np import scipy.signal as sig data = np.array([[0, 105, 0], [40, 255, 90], [0, 55, 0]]) G_x = sig.convolve2d(data, np.array([[-1, 0, 1]]), mode='valid') G_y = sig.convolve2d(data, np.array([[-1], [0], [1]]), mode='valid')
- Vector
- Common Image Processing Kernels(Useful URL)
- Prewitt operator
- Sobel operator
Histogram of Oriented Gradients (HOG)
Useful URL in zhihu.
How HOG works
- Preprocess the image, including resizing and color normalization.
- Gamma Correction
\begin{equation} f(x) = x^{\gamma} \end{equation}
# Gamma Correction
import cv2
import numpy as np
img = cv2.imread('gamma.jpg', 0)
img2 = np.power(img/float(np.max(img)), 1.5)
- Compute the gradient vector of every pixel, as well as its magnitude and direction
\begin{equation} g = \sqrt{g_x^2+g_y^2} \ \theta = \arctan \frac{g_y}{g_x} \end{equation}
import cv2
import numpy as np
# Read image
img = cv2.imread('runner.jpg')
img = np.float32(img) / 255.0 # 归一化
# x,y gradient
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=1)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=1)
# gradient
mag, angle = cv2.cartToPolar(gx, gy, angleInDegrees=True)
Divide the image into many 8x8 pixel cells. In each cell, the magnitude values of these 64 cells are binned and cumulatively added into 9 buckets of unsigned direction (no sign, so 0-180 degree rather than 0-360 degree; this is a practical choice based on empirical experiments).
Then we slide a 2x2 cells (thus 16x16 pixels) block across the image. In each block region, 4 histograms of 4 cells are concatenated into one-dimensional vector of 36 values and then normalized to have an unit weight. The final HOG feature vector is the concatenation of all the block vectors. It can be fed into a classifier like SVM for learning object recognition tasks.
# HOG
from skimage import feature, exposure
import cv2
image = cv2.imread('/home/zxd/Pictures/Selection_018.jpg')
fd, hog_image = feature.hog(image, orientations=9, pixels_per_cell=(16, 16),
cells_per_block=(2, 2), visualize=True)
# Rescale histogram for better display
hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 10))
cv2.imshow('img', image)
cv2.imshow('hog', hog_image_rescaled)
cv2.waitKey(0)==ord('q')
Image Segmentation (Felzenszwalb’s Algorithm)
Graph Construction
There are two approaches to constructing a graph out of an image.
Grid Graph: Each pixel is only connected with surrounding neighbours (8 other cells in total). The edge weight is the absolute difference between the intensity values of the pixels.
Nearest Neighbor Graph: Each pixel is a point in the feature space (x, y, r, g, b), in which (x, y) is the pixel location and (r, g, b) is the color values in RGB. The weight is the Euclidean distance between two pixels’ feature vectors.