Object Detection for Dummies Part 1: Gradient Vector, HOG, and SS

09 Feb 2022 in Study Blog on Deep Learning

Jan 28, 2021. The raw blog URL.

Image Gradient Vector

- Derivative: scalar
- Directional derivative: scalar
- Gradient: vector
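
Written out for an image treated as a function f(x, y), the gradient collects the two partial derivatives into a vector, and the directional derivative along a unit vector u (notation assumed here, not from the original notes) is the projection of the gradient onto u. On a pixel grid the partial derivatives are commonly approximated by central differences; a [-1, 0, 1] filter computes the same difference up to sign and the factor 1/2.

\begin{equation} \nabla f(x, y) = \begin{bmatrix} \dfrac{\partial f}{\partial x} \\ \dfrac{\partial f}{\partial y} \end{bmatrix} \approx \frac{1}{2} \begin{bmatrix} f(x+1, y) - f(x-1, y) \\ f(x, y+1) - f(x, y-1) \end{bmatrix}, \qquad D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} \end{equation}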

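import numpy as np
import scipy.signal as sig
# 3x3 patch of pixel intensities
data = np.array([[0, 105, 0], [40, 255, 90], [0, 55, 0]])
# convolve2d flips the kernel, so G_x is the left neighbour minus the right one
# and G_y is the upper neighbour minus the lower one; sig.correlate2d gives the
# opposite sign (right minus left, lower minus upper).
G_x = sig.convolve2d(data, np.array([[-1, 0, 1]]), mode='valid')      # shape (3, 1)
G_y = sig.convolve2d(data, np.array([[-1], [0], [1]]), mode='valid')  # shape (1, 3)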

Common Image Processing Kernels (Useful URL)
- Prewitt operator
- Sobel operator
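
As a rough illustration of the two operators listed above, the sketch below applies the standard 3x3 Prewitt and Sobel kernels to the centre of the same 3x3 patch used earlier. The helper `response` is a hypothetical name introduced here; it computes a plain cross-correlation (no kernel flip), so a positive value means intensity increases to the right or downwards.

```python
import numpy as np

# 3x3 patch of pixel intensities (same values as the earlier example).
patch = np.array([[0, 105, 0],
                  [40, 255, 90],
                  [0, 55, 0]], dtype=float)

# Horizontal-gradient kernels; the vertical ones are their transposes.
prewitt_x = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], dtype=float)
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def response(patch, kernel):
    """Cross-correlation response at the patch centre (no kernel flip)."""
    return float(np.sum(patch * kernel))

for name, kx in [("Prewitt", prewitt_x), ("Sobel", sobel_x)]:
    gx = response(patch, kx)      # right column minus left column
    gy = response(patch, kx.T)    # bottom row minus top row
    print(f"{name}: gx={gx}, gy={gy}")
```

For this patch the Prewitt response is (50, -50) and the Sobel response is (100, -100): Sobel weights the centre row and column twice, which is why its values are doubled here.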

Histogram of Oriented Gradients (HOG)

Useful URL in zhihu.

How HOG works

Preprocess the image, including resizing and color normalization.
- Gamma Correction

\begin{equation} f(x) = x^{\gamma} \end{equation}

# Gamma Correction
import cv2
import numpy as np
img = cv2.imread('gamma.jpg', 0)  # 0 = read as grayscale
# Scale to [0, 1], then apply f(x) = x^gamma with gamma = 1.5
img2 = np.power(img/float(np.max(img)), 1.5)
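
To make the effect of the exponent concrete, here is a small sketch on a handful of made-up intensities already scaled to [0, 1]. The values and variable names are illustrative assumptions; the conversion back to 8-bit is one common way to prepare the result for saving or display.

```python
import numpy as np

# Hypothetical pixel intensities already scaled to [0, 1].
x = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

brighter = np.power(x, 0.5)   # gamma < 1 lifts dark and mid tones
darker = np.power(x, 1.5)     # gamma > 1 pushes mid tones down

# Convert back to 8-bit for saving or display.
darker_u8 = np.clip(darker * 255.0, 0, 255).round().astype(np.uint8)
print(brighter, darker, darker_u8)
```

The end points 0 and 1 are unchanged by any gamma; only the tones in between move.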

Compute the gradient vector of every pixel, as well as its magnitude and direction

\begin{equation} \begin{aligned} g &= \sqrt{g_x^2 + g_y^2} \\ \theta &= \arctan \frac{g_y}{g_x} \end{aligned} \end{equation}

import cv2
import numpy as np

# Read image
img = cv2.imread('runner.jpg')
img = np.float32(img) / 255.0  # normalize to [0, 1]

# x,y gradient
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=1)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=1)

# gradient
mag, angle = cv2.cartToPolar(gx, gy, angleInDegrees=True)
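
The same magnitude and direction can be computed with NumPy alone, which may help when checking the formulas by hand. This is a minimal sketch on made-up gradient values; it uses `np.arctan2` in place of a literal arctan(g_y / g_x) so that g_x = 0 does not divide by zero and the quadrant is resolved.

```python
import numpy as np

# Hypothetical gradient components for three pixels.
gx = np.array([50.0, -50.0, 0.0])
gy = np.array([50.0, 50.0, -30.0])

magnitude = np.hypot(gx, gy)                     # sqrt(gx^2 + gy^2)
angle = np.degrees(np.arctan2(gy, gx)) % 360.0   # signed direction in [0, 360)
unsigned = angle % 180.0                         # fold opposite directions together

print(magnitude, angle, unsigned)
```

The third pixel points straight up in image coordinates (270 degrees signed), which folds to 90 degrees once the sign is discarded.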

Divide the image into 8x8-pixel cells. In each cell, the gradient magnitudes of the 64 pixels are binned and accumulated into 9 buckets of unsigned direction (0-180 degrees rather than 0-360 degrees; this is a practical choice based on empirical experiments).
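
The binning step for a single cell can be sketched as follows. This is a simplified version under a stated assumption: each pixel votes for exactly one 20-degree bucket, whereas common HOG implementations split a pixel's magnitude between the two nearest buckets. The function name `cell_histogram` and the random test data are hypothetical.

```python
import numpy as np

def cell_histogram(magnitude, angle_deg, n_bins=9):
    """Magnitude-weighted histogram of unsigned directions for one cell."""
    bin_width = 180.0 / n_bins                      # 20 degrees per bucket
    unsigned = np.mod(angle_deg, 180.0)             # fold into [0, 180)
    bins = np.floor(unsigned / bin_width).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())  # accumulate magnitudes
    return hist

# One hypothetical 8x8 cell of magnitudes and signed angles.
rng = np.random.default_rng(0)
mag = rng.random((8, 8))
ang = rng.uniform(0.0, 360.0, size=(8, 8))
hist = cell_histogram(mag, ang)
print(hist)
```

Because every pixel contributes its full magnitude to one bucket, the histogram entries sum to the total magnitude in the cell.
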
Then slide a block of 2x2 cells (16x16 pixels) across the image. In each block, the 4 cell histograms are concatenated into a one-dimensional vector of 36 values, which is then normalized to unit length. The final HOG feature vector is the concatenation of all the block vectors. It can be fed into a classifier such as an SVM to learn object recognition tasks.
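
The block step can be sketched on top of precomputed cell histograms. Two details below are assumptions rather than something stated in the text: the block moves one cell at a time, and each block is L2-normalized with a small `eps` for numerical safety. The function name and the random input are hypothetical.

```python
import numpy as np

def hog_from_cell_histograms(cell_hists, eps=1e-6):
    """Concatenate L2-normalized 2x2-cell blocks, sliding one cell at a time."""
    n_rows, n_cols, n_bins = cell_hists.shape
    blocks = []
    for r in range(n_rows - 1):
        for c in range(n_cols - 1):
            block = cell_hists[r:r + 2, c:c + 2].ravel()   # 4 cells x 9 bins = 36 values
            blocks.append(block / np.sqrt(np.sum(block ** 2) + eps ** 2))
    return np.concatenate(blocks)

# Hypothetical input: a 64x128-pixel window gives 16 rows x 8 columns of 8x8 cells.
rng = np.random.default_rng(0)
cell_hists = rng.random((16, 8, 9))
features = hog_from_cell_histograms(cell_hists)
print(features.shape)   # (3780,) = 15 * 7 blocks * 36 values
```

Overlapping blocks mean each interior cell appears in four block vectors, each time normalized against a different neighbourhood.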

# HOG
from skimage import feature, exposure
import cv2

# Read as grayscale so feature.hog receives a 2-D array
image = cv2.imread('/home/zxd/Pictures/Selection_018.jpg', cv2.IMREAD_GRAYSCALE)
# Note: 16x16-pixel cells here, rather than the 8x8 cells described above
fd, hog_image = feature.hog(image, orientations=9, pixels_per_cell=(16, 16),
                    cells_per_block=(2, 2), visualize=True)

# Rescale histogram for better display
hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 10))

cv2.imshow('img', image)
cv2.imshow('hog', hog_image_rescaled)
cv2.waitKey(0)  # wait for any key press
cv2.destroyAllWindows()

Image Segmentation (Felzenszwalb’s Algorithm)

Graph Construction

There are two approaches to constructing a graph out of an image.

Grid Graph: Each pixel is connected only to its surrounding neighbours (8 pixels in total). The edge weight is the absolute difference between the intensity values of the two pixels.
Nearest Neighbor Graph: Each pixel is a point in the feature space (x, y, r, g, b), in which (x, y) is the pixel location and (r, g, b) are the color values in RGB. The edge weight is the Euclidean distance between the two pixels’ feature vectors.
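
Both constructions can be sketched in a few lines of NumPy. The function names, the row-major node numbering, and the tiny test image are assumptions made for illustration. For the nearest neighbor graph only the distance function is shown; choosing which neighbours to connect in feature space is left out.

```python
import numpy as np

def grid_graph_edges(gray):
    """Return (weight, node_a, node_b) for every 8-connected pixel pair, each pair once."""
    h, w = gray.shape
    edges = []
    # Four offsets are enough to visit each undirected edge exactly once.
    for dy, dx in [(0, 1), (1, -1), (1, 0), (1, 1)]:
        for y in range(h):
            for x in range(w):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    weight = abs(float(gray[y, x]) - float(gray[ny, nx]))
                    edges.append((weight, y * w + x, ny * w + nx))
    return edges

def feature_distance(image_rgb, p, q):
    """Euclidean distance between two pixels in (x, y, r, g, b) space."""
    (y1, x1), (y2, x2) = p, q
    f1 = np.array([x1, y1, *image_rgb[y1, x1]], dtype=float)
    f2 = np.array([x2, y2, *image_rgb[y2, x2]], dtype=float)
    return float(np.linalg.norm(f1 - f2))

gray = np.array([[0, 105, 0], [40, 255, 90], [0, 55, 0]])
edges = grid_graph_edges(gray)
print(len(edges), min(edges), max(edges))
```

Storing the weight first in each tuple makes it easy to sort the edges by weight later on.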

Jiawei Lu
