Skimage: How Do You Know the Parameters of HOG
Histogram of Oriented Gradients, or HOG for short, are descriptors mainly used in computer vision and machine learning for object detection. However, we can also use HOG descriptors for quantifying and representing both shape and texture.
HOG features were first introduced by Dalal and Triggs in their CVPR 2005 paper, Histograms of Oriented Gradients for Human Detection. In their work, Dalal and Triggs proposed HOG and a 5-stage descriptor to classify humans in still images.
The 5 stages include:
- Normalizing the image prior to description.
- Computing gradients in both the x and y directions.
- Obtaining weighted votes in spatial and orientation cells.
- Contrast normalizing overlapping spatial cells.
- Collecting all Histograms of Oriented Gradients to form the final feature vector.
The most important parameters for the HOG descriptor are orientations, pixels_per_cell, and cells_per_block. These three parameters (along with the size of the input image) effectively control the dimensionality of the resulting feature vector. We'll be reviewing these parameters and their implications later in this article.
In most real-world applications, HOG is used in conjunction with a Linear SVM to perform object detection. The reason HOG is utilized so heavily is that local object appearance and shape can be characterized using the distribution of local intensity gradients. In fact, these are the exact same image gradients that we learned about in the Gradients lesson, only now we are going to take these image gradients and turn them into a robust and powerful image descriptor.
We'll be discussing the steps necessary to combine both HOG and a Linear SVM into an object classifier later in this course. But for now just understand that HOG is mainly used as a descriptor for object detection and that these descriptors can later be fed into a machine learning classifier.
HOG is implemented in both OpenCV and scikit-image. The OpenCV implementation is less flexible than the scikit-image implementation, and thus we will primarily use the scikit-image implementation throughout the rest of this course.
Objectives:
In this lesson, we will be discussing the Histogram of Oriented Gradients image descriptor in detail.
What are HOG descriptors used to describe?
HOG descriptors are mainly used to describe the structural shape and appearance of an object in an image, making them excellent descriptors for object classification. However, since HOG captures local intensity gradients and edge directions, it also makes for a good texture descriptor.
The HOG descriptor returns a real-valued feature vector. The dimensionality of this feature vector depends on the values chosen for the orientations, pixels_per_cell, and cells_per_block parameters mentioned above.
How do HOG descriptors work?
The cornerstone of the HOG descriptor algorithm is that the appearance of an object can be modeled by the distribution of intensity gradients within rectangular regions of an image:
Implementing this descriptor requires dividing the image into small connected regions called cells, and then for each cell, computing a histogram of oriented gradients for the pixels within each cell. We can then accumulate these histograms across multiple cells to form our feature vector.
Dalal and Triggs also demonstrated that we can perform block normalization to improve performance. To perform block normalization we take groups of overlapping cells, concatenate their histograms, calculate a normalizing value, and then contrast normalize the histogram. By normalizing over multiple, overlapping blocks, the resulting descriptor is more robust to changes in illumination and shadowing. Furthermore, performing this type of normalization implies that each of the cells will be represented in the final feature vector multiple times but normalized by a slightly different set of neighboring cells.
Now, let's review each of the steps for computing the HOG descriptor.
Step 1: Normalizing the image prior to description.
This normalization step is entirely optional, but in some cases this step can improve performance of the HOG descriptor. There are three main normalization methods that we can consider:
- Gamma/power law normalization: In this case, we take the $\log(p)$ of each pixel $p$ in the input image. However, as Dalal and Triggs demonstrated, this approach is perhaps an "over-correction" and hurts performance.
- Square-root normalization: Here, we take the $\sqrt{p}$ of each pixel $p$ in the input image. By definition, square-root normalization compresses the input pixel intensities far less than gamma normalization. And again, as Dalal and Triggs demonstrated, square-root normalization actually increases accuracy rather than hurts it.
- Variance normalization: A slightly less used form of normalization is variance normalization. Here, we compute both the mean $\mu$ and standard deviation $\sigma$ of the input image. All pixels are mean centered by subtracting the mean from the pixel intensity, and then normalized through dividing by the standard deviation: $p' = (p - \mu) / \sigma$. Dalal and Triggs do not report accuracy on variance normalization; however, it is a form of normalization that I like to perform and thought it was worth including.
In most cases, it's best to start with either no normalization or square-root normalization. Variance normalization is also worth consideration, but in most cases it will perform in a similar manner to square-root normalization (at least in my experience).
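To make the three options concrete, here is a minimal NumPy sketch of each one. The function names are hypothetical, and two details are assumptions of this sketch rather than part of the lesson: the log variant uses log(1 + p) so that zero-valued pixels do not produce negative infinity, and the variance variant adds a small epsilon to avoid dividing by zero on a constant image.

```python
import numpy as np

def log_normalize(image):
    # gamma/power law style compression; log1p computes log(1 + p),
    # an assumption made here so that p = 0 stays finite
    return np.log1p(image.astype("float64"))

def sqrt_normalize(image):
    # square-root compression of each pixel intensity
    return np.sqrt(image.astype("float64"))

def variance_normalize(image, eps=1e-8):
    # mean center the pixels, then divide by the standard deviation
    pixels = image.astype("float64")
    return (pixels - pixels.mean()) / (pixels.std() + eps)

# a tiny 2 x 2 "image" used purely for illustration
image = np.array([[0, 64], [128, 255]], dtype="uint8")
normalized = variance_normalize(image)
print(normalized.mean(), normalized.std())
```

After variance normalization the pixel values have (approximately) zero mean and unit standard deviation, which is what the formula above describes.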
Step 2: Gradient computation
The first actual step in the HOG descriptor is to compute the image gradient in both the x and y direction. These calculations should seem familiar, as we have already reviewed them in the Gradients lesson.
We'll apply a convolution operation to obtain the gradient images:
$G_x = I \star D_x$ and $G_y = I \star D_y$
where $I$ is the input image, $D_x$ is our filter in the x-direction, and $D_y$ is our filter in the y-direction.
As a matter of completeness, here is an example of computing both the x and y gradient of an input image:
Now that we have our gradient images, we can compute the final gradient magnitude representation of the image:
$|G| = \sqrt{G_x^2 + G_y^2}$
Which we can see below:
Finally, the orientation of the gradient for each pixel in the input image can then be computed by:
$\theta = \operatorname{arctan2}(G_y, G_x)$
Given both $|G|$ and $\theta$, we can now compute a histogram of oriented gradients, where the bin of the histogram is based on $\theta$ and the contribution or weight added to a given bin of the histogram is based on $|G|$.
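The gradient step can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the scikit-image implementation: the function name compute_gradients is hypothetical, the filters are simple centered differences ([-1, 0, 1]), border pixels are left at zero, and the orientation is folded into the unsigned range [0, 180) degrees.

```python
import numpy as np

def compute_gradients(image):
    image = image.astype("float64")
    gx = np.zeros_like(image)
    gy = np.zeros_like(image)

    # centered differences in x and y; border pixels are left at zero
    gx[:, 1:-1] = image[:, 2:] - image[:, :-2]
    gy[1:-1, :] = image[2:, :] - image[:-2, :]

    # gradient magnitude and unsigned orientation in degrees
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0
    return magnitude, orientation

# a horizontal ramp: intensity increases from left to right
image = np.tile(np.arange(5, dtype="float64"), (5, 1))
magnitude, orientation = compute_gradients(image)
print(magnitude[2, 2], orientation[2, 2])
```

For the ramp image, every interior pixel has a purely horizontal gradient, so the orientation is 0 degrees and the magnitude is the difference between the right and left neighbors.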
Step 3: Weighted votes in each cell
Now that we have our gradient magnitude and orientation representations, we need to divide our image up into cells and blocks.
A "cell" is a rectangular region defined by the number of pixels that belong in each cell. For example, if we had a 128 x 128 image and defined our pixels_per_cell as 4 x 4, we would thus have 32 x 32 = 1024 cells:
If we defined our pixels_per_cell as 32 x 32, we would have 4 x 4 = 16 total cells:
And if we defined pixels_per_cell to be 128 x 128, we would only have 1 total cell:
Obviously, this is quite an exaggerated example; we would likely never be interested in a 1 x 1 cell representation. Instead, this demonstrates how we can divide an image into cells based on the number of pixels per cell.
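The cell counts above are simple integer division. Here is a small sketch that reproduces them; count_cells is a hypothetical helper, and it assumes that any leftover pixels that do not fill a whole cell are dropped.

```python
def count_cells(image_size, pixels_per_cell):
    # image_size is (height, width); pixels_per_cell is (cell_h, cell_w)
    (height, width) = image_size
    (cell_h, cell_w) = pixels_per_cell

    # partial cells at the border are discarded (floor division)
    return (height // cell_h, width // cell_w)

for ppc in [(4, 4), (32, 32), (128, 128)]:
    (rows, cols) = count_cells((128, 128), ppc)
    print(ppc, "->", rows, "x", cols, "=", rows * cols, "cells")
```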
Now, for each of the cells in the image, we need to construct a histogram of oriented gradients using our gradient magnitude $|G|$ and orientation $\theta$ mentioned above.
But before we construct this histogram, we need to define our number of orientations. The number of orientations controls the number of bins in the resulting histogram. The gradient angle is either within the range $[0, 180]$ (unsigned) or $[0, 360]$ (signed). In general, it's preferable to use unsigned gradients in the range $[0, 180]$ with orientations somewhere in the range $[9, 12]$. But depending on your application, using signed gradients over unsigned gradients can improve accuracy.
Finally, each pixel contributes a weighted vote to the histogram; the weight of the vote is simply the gradient magnitude at the given pixel.
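Here is a rough sketch of the weighted vote for a single cell. It is a simplification: each pixel votes only for the one bin its angle falls into, whereas real implementations may interpolate votes between neighboring bins. The function name cell_histogram and the toy values are made up for illustration, and unsigned angles in degrees are assumed.

```python
import numpy as np

def cell_histogram(magnitude, orientation, orientations=9):
    # width of each bin in degrees, assuming unsigned gradients in [0, 180)
    bin_width = 180.0 / orientations

    # map each angle to a bin index; an angle of exactly 180 wraps to bin 0
    bins = (orientation // bin_width).astype(int) % orientations

    # every pixel adds its gradient magnitude to the bin of its angle
    return np.bincount(bins.ravel(), weights=magnitude.ravel(),
        minlength=orientations)

# a toy 2 x 2 cell: magnitudes and angles (in degrees) for each pixel
magnitude = np.array([[1.0, 2.0], [3.0, 4.0]])
orientation = np.array([[0.0, 10.0], [45.0, 170.0]])
hist = cell_histogram(magnitude, orientation)
print(hist)
```

With 9 bins each bin spans 20 degrees, so the two pixels at 0 and 10 degrees vote into the same bin, while the pixels at 45 and 170 degrees land in separate bins.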
Let's make this more clear by taking a look at our example image divided up into 16 x 16 pixel cells:
And then for each of these cells, we are going to compute a histogram of oriented gradients using 9 orientations (or bins) per histogram:
Here is a more revealing animation where we can visually see a different histogram computed for each of the cells:
At this point, we could collect and concatenate each of these histograms to form our final feature vector. However, it's beneficial to apply block normalization, which we'll review in the next section.
Step 4: Contrast normalization over blocks
To account for changes in illumination and contrast, we can normalize the gradient values locally. This requires grouping the "cells" together into larger, connecting "blocks". It is common for these blocks to overlap, meaning that each cell contributes to the final feature vector more than once.
Again, the blocks are rectangular; however, our units are no longer pixels, they are the cells! Dalal and Triggs report that using either 2 x 2 or 3 x 3 cells_per_block obtains reasonable accuracy in most cases.
Here is an example where we have taken an input region of an image, computed a gradient histogram for each cell, and then locally grouped the cells into overlapping blocks:
For each of the cells in the current block we concatenate their corresponding gradient histograms, followed by either L1 or L2 normalizing the entire concatenated feature vector. Again, performing this type of normalization implies that each of the cells will be represented in the final feature vector multiple times but normalized by a different value. While this multi-representation is redundant and wasteful of space, it actually increases performance of the descriptor.
Finally, after all blocks are normalized, we take the resulting histograms, concatenate them, and treat them as our final feature vector.
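The block step can be sketched as a sliding window over the grid of cell histograms. This is only an illustration under a few assumptions: normalize_blocks is a hypothetical name, blocks slide one cell at a time, and plain L2 normalization is used (for L1 we would divide by the sum of absolute values instead). The random cell histograms simply stand in for real ones.

```python
import numpy as np

def normalize_blocks(cell_hists, cells_per_block=(2, 2), eps=1e-5):
    # cell_hists has shape (cell rows, cell columns, orientations)
    (n_rows, n_cols, _) = cell_hists.shape
    (b_rows, b_cols) = cells_per_block
    blocks = []

    # slide the block over the cell grid one cell at a time, so
    # neighboring blocks overlap and share cells
    for r in range(n_rows - b_rows + 1):
        for c in range(n_cols - b_cols + 1):
            # concatenate the histograms of the cells in this block
            block = cell_hists[r:r + b_rows, c:c + b_cols].ravel()

            # L2 normalize the concatenated block vector
            blocks.append(block / np.sqrt(np.sum(block ** 2) + eps ** 2))

    # the final feature vector is all normalized blocks, concatenated
    return np.concatenate(blocks)

# stand-in data: a 4 x 4 grid of cells with 9 orientation bins each
cell_hists = np.random.default_rng(0).random((4, 4, 9))
features = normalize_blocks(cell_hists)
print(features.shape)
```

A 4 x 4 grid of cells yields 3 x 3 overlapping 2 x 2 blocks, and each block contributes 2 x 2 x 9 = 36 values, which is why interior cells appear in the feature vector several times.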
Where are HOG descriptors implemented?
HOG descriptors are implemented within the OpenCV and scikit-image libraries. However, the OpenCV implementation is not very flexible and is primarily geared towards the Dalal and Triggs implementation. The scikit-image implementation is far more flexible, and thus we will primarily use the scikit-image implementation throughout this course.
How do I use HOG descriptors?
Here is an example of how to compute HOG descriptors using scikit-image:
from skimage import feature

# logo is assumed to be a grayscale image that was loaded earlier
H = feature.hog(logo, orientations=9, pixels_per_cell=(8, 8),
    cells_per_block=(2, 2), transform_sqrt=True, block_norm="L1")

We can also visualize the resulting HOG image:

from skimage import exposure
from skimage import feature
import cv2

# compute the HOG feature vector along with a visualization image
(H, hogImage) = feature.hog(logo, orientations=9, pixels_per_cell=(8, 8),
    cells_per_block=(2, 2), transform_sqrt=True, block_norm="L1",
    visualize=True)

# rescale the visualization to the range [0, 255] and display it
hogImage = exposure.rescale_intensity(hogImage, out_range=(0, 255))
hogImage = hogImage.astype("uint8")
cv2.imshow("HOG Image", hogImage)

(2017-11-28) Update for skimage: In scikit-image==0.12, the normalise parameter has been updated to transform_sqrt. The transform_sqrt parameter performs the exact same operation, only with a different name. If you're using an older version of scikit-image (again, before the v0.12 release), then you'll want to change transform_sqrt to normalise. In scikit-image==0.15 the default value of block_norm="L1" has been deprecated and changed to block_norm="L2-Hys". Therefore, for this lesson we'll explicitly specify block_norm="L1". Doing this will avoid it switching to "L2-Hys" with version updates without us knowing (and yielding incorrect car logo identification results). You can read about L1 and L2 norms here.
(2019-01-06) Update for skimage: The visualise parameter is deprecated and changed to visualize.
Identifying car logos using HOG descriptors
Later in the PyImageSearch Gurus course, we'll learn how to automatically detect and recognize license plates in images.
But what if we could also identify the make of car based on its logo?
Now that would be pretty cool.
In the remainder of this lesson, I'll demonstrate how we can use the Histogram of Oriented Gradients descriptor to characterize the logos of car brands. Just like in the Haralick texture lesson, we'll be leveraging a bit of machine learning to aid us in the classification (which is a pretty common practice when it comes to identifying non-trivial shapes and objects).
Again, I won't be performing a deep dive on the (very small amount of) machine learning we'll be using in this lesson; we have the entire Image Classification module for that. If by the end of this lesson there are a couple of lines of code that feel a little bit like "black box magic" to you, that's okay and to be expected. Remember, the point of these image descriptor lessons is providing a quick demonstration of how you can use them in your own applications. The rest of the modules in the PyImageSearch Gurus course will help fill in any gaps.
But before we dive deep into this project, let's look at our dataset.
Dataset
Our car logo dataset consists of five brands of vehicles: Audi, Ford, Honda, Subaru, and Volkswagen.
For each of these brands, I gathered 5 training images from Google. These images are the example images we'll use to teach our machine learning algorithm what each of the car logos looks like. Our training dataset can be seen below:
After gathering the images from Google, I then went outside and took a stroll around the local parking lot, snapping 7 photos of car logos. These logos will serve as our test set that we can use to evaluate the performance of our classifier. The 7 testing images are displayed below:
Our goal
The goal of this project is to:
- Extract HOG features from our training set to characterize and quantify each car logo.
- Train a machine learning classifier to distinguish between each car logo.
- Apply a classifier to recognize new, unseen car logos.
Recognizing car logos
Alright, enough talk. Let's start coding up this example. Open up a new file, name it recognize_car_logos.py, and let's get coding:

# import the necessary packages
from sklearn.neighbors import KNeighborsClassifier
from skimage import exposure
from skimage import feature
from imutils import paths
import argparse
import imutils
import cv2

# construct the argument parser and parse command line arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--training", required=True,
    help="Path to the logos training dataset")
ap.add_argument("-t", "--test", required=True,
    help="Path to the test dataset")
args = vars(ap.parse_args())

# initialize the data matrix and labels
print("[INFO] extracting features...")
data = []
labels = []

This code should look fairly similar to our code from the Haralick texture example. Parsing our command line arguments, we can see that we need two switches. The first is --training, which is the path to where the example car logos reside on disk. The second switch is --test, the path to our directory of testing images we'll use to evaluate our car logo classifier.
We'll also initialize data and labels, two lists that will hold the HOG features and car brand name for each image in our training set, respectively.
Let's go ahead and extract HOG features from our training set:
# loop over the image paths in the training set
for imagePath in paths.list_images(args["training"]):
    # extract the make of the car
    make = imagePath.split("/")[-2]

    # load the image, convert it to grayscale, and detect edges
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edged = imutils.auto_canny(gray)

    # find contours in the edge map, keeping only the largest one which
    # is presumed to be the car logo
    cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
        cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    c = max(cnts, key=cv2.contourArea)

    # extract the logo of the car and resize it to a canonical width
    # and height
    (x, y, w, h) = cv2.boundingRect(c)
    logo = gray[y:y + h, x:x + w]
    logo = cv2.resize(logo, (200, 100))

    # extract Histogram of Oriented Gradients from the logo
    H = feature.hog(logo, orientations=9, pixels_per_cell=(10, 10),
        cells_per_block=(2, 2), transform_sqrt=True, block_norm="L1")

    # update the data and labels
    data.append(H)
    labels.append(make)

On Line 22, we start looping over each of the image paths in the training directory. An example image path looks like this: car_logos/audi/audi_01.png
Using this image path, we are able to extract the make of the car on Line 24 by splitting the path and extracting the second sub-directory name, or in this case audi.
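Splitting on "/" works for paths like the example, but it assumes forward-slash separators. As a small alternative sketch (not what the lesson's script does), the standard library's pathlib can pull out the parent directory name directly. The helper name make_from_path is hypothetical, and PurePosixPath is used only so the example behaves the same on any operating system; in a real script, pathlib.Path would follow the separators of the machine it runs on.

```python
from pathlib import PurePosixPath

def make_from_path(image_path):
    # the make of the car is the name of the directory holding the image
    return PurePosixPath(image_path).parent.name

print(make_from_path("car_logos/audi/audi_01.png"))  # audi
```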
From there, we'll perform a bit of pre-processing and prepare the car logo to be described using the Histogram of Oriented Gradients descriptor. All we need to do is load the image from disk, convert it to grayscale, and then use our handy auto_canny function to detect edges in the brand logo:
Notice how in each case we are able to find the outline of the car logo.
Anytime we find an outline, you can be sure that the next step is (almost always) to find the contour of the outline. In fact, that is exactly what Lines 33-36 do: extract the largest contour in the edge map, assumed to be the outline of the car logo.
Lines 40 and 41 then take the largest contour region, compute the bounding box, and extract the ROI.
Be sure to pay attention to Line 42, because it's extremely important. As I mentioned earlier in this lesson, having various widths and heights for your image can lead to HOG feature vectors of different sizes. In nearly all situations this is not the intended behavior that you want!
Think of it this way: let's assume that I extracted a HOG feature vector of size 1,024-d from Image A. And then I extracted a HOG feature vector (using the exact same HOG parameters) from Image B, which had different dimensions (i.e. width and height) than Image A, leaving me with a feature vector of size 512-d.
How would I go about comparing these two feature vectors?
The short answer is that you can't.
Remember, our extracted feature vectors are supposed to characterize and represent the visual contents of an image. And if our feature vectors are not the same dimensionality, then they cannot be compared for similarity. And if we cannot compare our feature vectors for similarity, we are unable to compare our two images at all!
Because of this, when extracting HOG features from a dataset of images, you'll want to define a canonical, known size that each image will be resized to. In many cases, this means that you'll be throwing away the aspect ratio of the image. Usually, destroying the aspect ratio of an image should be avoided, but in this case we are happy to do it, because it ensures (1) that each image in our dataset is described in a consistent manner, and (2) each feature vector is of the same dimensionality. We'll be discussing this point much more when we reach the Custom Object Detector module.
Anyway, now that our logo is resized to a known, pre-defined 200 x 100 pixels, we can then apply the HOG descriptor using orientations=9, pixels_per_cell=(10, 10), cells_per_block=(2, 2), and square-root normalization. These parameters were obtained by experimentation and examining the accuracy of the classifier; you should expect to do this as well whenever you use the HOG descriptor. Running experiments and tuning the HOG parameters based on the results is a critical component in obtaining an accurate classifier.
Finally, given the HOG feature vector, we then update our data matrix and labels list with the feature vector and car make, respectively.
Given our data and labels we can now train our classifier:

# "train" the nearest neighbors classifier
print("[INFO] training classifier...")
model = KNeighborsClassifier(n_neighbors=1)
model.fit(data, labels)
print("[INFO] evaluating...")

To recognize and distinguish the difference between our five car brands, we are going to use scikit-learn's KNeighborsClassifier.
The k-nearest neighbor classifier is a type of "lazy learning" algorithm where nothing is actually "learned". Instead, the k-Nearest Neighbor (k-NN) training phase simply accepts a set of feature vectors and labels and stores them. That's it! Then, when it is time to classify a new feature vector, it accepts the feature vector, computes the distance to all stored feature vectors (usually using the Euclidean distance, but any distance metric or similarity metric can be used), sorts them by distance, and returns the top k "neighbors" to the input feature vector. From there, each of the k neighbors votes as to what it thinks the label of the classification is. You can read more about the k-NN algorithm in this lesson.
In our case, we are simply passing the HOG feature vectors and labels to our k-NN algorithm and asking it to report back what is the closest logo to our query features using k=1 neighbors.
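To take some of the mystery out of k=1, here is a minimal NumPy sketch of what a single-nearest-neighbor prediction boils down to. It is not how scikit-learn implements KNeighborsClassifier internally; predict_nearest is a hypothetical helper, and the 2-d vectors are toy stand-ins for real HOG feature vectors.

```python
import numpy as np

def predict_nearest(train_vectors, train_labels, query):
    # Euclidean distance from the query to every stored feature vector
    distances = np.linalg.norm(train_vectors - query, axis=1)

    # with k=1 the prediction is the label of the closest stored vector
    return train_labels[int(np.argmin(distances))]

# toy "training set": three stored vectors and their labels
train_vectors = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
train_labels = ["audi", "ford", "honda"]

prediction = predict_nearest(train_vectors, train_labels,
    np.array([0.9, 1.2]))
print(prediction)
```

The query lies closest to the second stored vector, so its label is returned. Notice that "training" was nothing more than keeping the vectors and labels around.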
Let's see how we can use our k-NN classifier to recognize various car logos:

# loop over the test dataset
for (i, imagePath) in enumerate(paths.list_images(args["test"])):
    # load the test image, convert it to grayscale, and resize it to
    # the canonical size
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    logo = cv2.resize(gray, (200, 100))

    # extract Histogram of Oriented Gradients from the test image and
    # predict the make of the car
    (H, hogImage) = feature.hog(logo, orientations=9,
        pixels_per_cell=(10, 10), cells_per_block=(2, 2),
        transform_sqrt=True, block_norm="L1", visualize=True)
    pred = model.predict(H.reshape(1, -1))[0]

    # visualize the HOG image
    hogImage = exposure.rescale_intensity(hogImage, out_range=(0, 255))
    hogImage = hogImage.astype("uint8")
    cv2.imshow("HOG Image #{}".format(i + 1), hogImage)

    # draw the prediction on the test image and display it
    cv2.putText(image, pred.title(), (10, 35), cv2.FONT_HERSHEY_SIMPLEX,
        1.0, (0, 255, 0), 3)
    cv2.imshow("Test Image #{}".format(i + 1), image)
    cv2.waitKey(0)

Line 59 starts looping over the images in our testing set. For each of these images, we'll load it from disk; convert it to grayscale; resize it to a known, fixed size; and then extract HOG feature vectors from it in the exact same manner as we did in the training phase (Lines 62-69).
Line 70 then makes a call to our k-NN classifier, passing in our HOG feature vector for the current testing image and asking the classifier what it thinks the logo is.
We can also visualize our Histogram of Oriented Gradients image on Lines 73-75. This is especially useful when debugging your HOG parameters to ensure the contents of our image are being adequately quantified. Below are some examples of the HOG image for our testing car logos:
Notice how the HOG image is always 200 x 100 pixels, which are the dimensions of our resized testing image. We can also see how the pixels_per_cell and orientations parameters come into play here, as well as the dominant orientation of each cell, where the size of the cell is defined by pixels_per_cell. The more pixels in pixels_per_cell, the more coarse our representation is. And similarly, smaller values of pixels_per_cell will yield more fine-grained representations. Visualizing a HOG image is an excellent way to "see" what your HOG descriptor and parameter set is describing.
Finally, we take the result of the classification, draw it on our test image, and display it to our screen on Lines 78-81.
To give our car logo classifier a try, simply open up your terminal and execute the following command:
$ python recognize_car_logos.py --training car_logos --test test_images
And below you'll see our output classification:
In each case, we were able to correctly classify the brand of car using HOG features!
Of course, this approach only worked because we had a tight cropping of the car logo. If we had described the entire image of a car, it is very unlikely that we would have been able to correctly classify the brand. But again, that's something we can resolve when we get to the Custom Object Detector module, specifically sliding windows and image pyramids.
In the meantime, this example was still able to demonstrate how to use the Histogram of Oriented Gradients descriptor and the k-NN classifier to recognize the logos of cars. The key takeaway here is that if you can consistently detect and extract the ROI of your image dataset, the HOG descriptor should definitely be on your list of image descriptors to apply, as it's very powerful and able to obtain good results, especially when applied in conjunction with machine learning.
Suggestions when using HOG descriptors:
HOG descriptors are very powerful; however, it can be tedious to choose the correct parameters for the number of orientations, pixels_per_cell, and cells_per_block, especially when you start working with object classification.
As a starting point, I tend to use orientations=9, pixels_per_cell=(4, 4), and cells_per_block=(2, 2), then go from there. It's unlikely that your first set of parameters will yield the best performance; however, it's important to start somewhere and obtain a baseline. Results can be improved via parameter tuning.
It's also important to resize your image to a reasonable size. If your input region is 32 x 32 pixels, then the resulting dimensionality would be 1,764-d. But if your input region is 128 x 128 pixels and you again used the above parameters, your feature vector would be 34,596-d! By using large image regions and not paying attention to your HOG parameters, you can end up with extremely large feature vectors.
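These dimensionalities can be worked out before extracting a single feature. The sketch below assumes blocks slide one cell at a time and that partial cells at the image border are dropped, which is consistent with the numbers quoted above; hog_feature_length is a hypothetical helper, not a scikit-image function.

```python
def hog_feature_length(image_size, orientations, pixels_per_cell,
    cells_per_block):
    # image_size is (height, width) in pixels
    (height, width) = image_size

    # number of whole cells along each axis
    cells_r = height // pixels_per_cell[0]
    cells_c = width // pixels_per_cell[1]

    # number of block positions when sliding one cell at a time
    blocks_r = cells_r - cells_per_block[0] + 1
    blocks_c = cells_c - cells_per_block[1] + 1

    # each block holds one histogram per cell it covers
    per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return blocks_r * blocks_c * per_block

print(hog_feature_length((32, 32), 9, (4, 4), (2, 2)))      # 1764
print(hog_feature_length((128, 128), 9, (4, 4), (2, 2)))    # 34596
print(hog_feature_length((100, 200), 9, (10, 10), (2, 2)))  # 6156
```

The last call corresponds to the 200 x 100 (width x height) logos and the parameters used in the car logo example, and it shows how quickly the vector grows as the image gets larger or the cells get smaller.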
We'll be utilizing HOG descriptors later in this course for object classification, so if you're a little confused on how to properly tune the parameters, don't worry. This won't be the last time you see these descriptors!
HOG Pros and Cons
Pros:
- Very powerful descriptor.
- Excellent at representing local appearance.
- Extremely useful for representing structural objects that do not demonstrate substantial variation in form (e.g. buildings, people walking the street, bicycles leaning against a wall).
- Very accurate for object classification.
Cons:
- Can result in very large feature vectors, leading to large storage costs and computationally expensive feature vector comparisons.
- Often non-trivial to tune the orientations, pixels_per_cell, and cells_per_block parameters.
- Not the slowest descriptor to compute, but also nowhere near the fastest.
- If the object to be described exhibits substantial structural variation (i.e. the rotation/orientation of the object is consistently different), then the standard vanilla implementation of HOG will not perform well.
Source: https://customers.pyimagesearch.com/lesson-sample-histogram-of-oriented-gradients-and-car-logo-recognition/