The goals / steps of this project are the following:

Overview of Files

My project includes the following files:

Feature Extraction and Training of Classifier

Feature extraction from the training images

The code for training the classifier is defined in the function train(). This function is invoked with a list of file names of vehicle and non-vehicle images and several parameters that configure the feature extraction algorithms. Here is an example of each of the vehicle and non-vehicle classes:

[images: one vehicle and one non-vehicle training example]

To determine whether an image shows a vehicle, the following feature extraction techniques are used:

Histograms of Color

The impact of each parameter combination on the test accuracy is calculated in the test test_01_train_histogram():

Number of features Color space Number of bins Test accuracy
24 RGB 8 0.842
48 RGB 16 0.884
96 RGB 32 0.902
192 RGB 64 0.922
24 HSV 8 0.906
48 HSV 16 0.946
96 HSV 32 0.947
192 HSV 64 0.952
24 LUV 8 0.868
48 LUV 16 0.889
96 LUV 32 0.910
192 LUV 64 0.932
24 HLS 8 0.888
48 HLS 16 0.921
96 HLS 32 0.948
192 HLS 64 0.961
24 YUV 8 0.865
48 YUV 16 0.889
96 YUV 32 0.911
192 YUV 64 0.930
24 YCrCb 8 0.873
48 YCrCb 16 0.883
96 YCrCb 32 0.923
192 YCrCb 64 0.936
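
The "Number of features" column follows directly from the number of bins: one histogram per channel, concatenated. The following is a minimal sketch of such a feature extractor, not the project's actual code; the function name color_hist() and the random test image are assumptions made for illustration, and the color space conversion is left out.

```python
import numpy as np

def color_hist(img, nbins=32, bins_range=(0, 256)):
    """Concatenate the per-channel histograms of an H x W x 3 image."""
    channel_hists = [
        np.histogram(img[:, :, ch], bins=nbins, range=bins_range)[0]
        for ch in range(img.shape[2])
    ]
    return np.concatenate(channel_hists)

# Hypothetical 64x64 input; a real run would load a training image
# and convert it to the chosen color space first.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

features = color_hist(image, nbins=8)
print(features.shape)  # 3 channels * 8 bins
```

With 8, 16, 32 and 64 bins this yields 24, 48, 96 and 192 features, which matches the table.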

Spatial Binning

The impact of each parameter combination on the test accuracy is calculated in the test test_02_train_spatial():

Number of features Color space Spatial size Test accuracy
192 RGB 8 0.902
768 RGB 16 0.905
3072 RGB 32 0.909
12288 RGB 64 0.905
192 HSV 8 0.873
768 HSV 16 0.899
3072 HSV 32 0.872
12288 HSV 64 0.877
192 LUV 8 0.898
768 LUV 16 0.926
3072 LUV 32 0.901
12288 LUV 64 0.904
192 HLS 8 0.871
768 HLS 16 0.895
3072 HLS 32 0.870
12288 HLS 64 0.877
192 YUV 8 0.901
768 YUV 16 0.919
3072 YUV 32 0.901
12288 YUV 64 0.896
192 YCrCb 8 0.899
768 YCrCb 16 0.924
3072 YCrCb 32 0.896
12288 YCrCb 64 0.901

The best result in the table (0.926) is obtained with the LUV color space and a spatial size of 16.
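
Spatial binning shrinks the image to a small square and uses the raw pixel values as features, so the feature count is 3 * size * size. The sketch below illustrates this with NumPy only. It averages pixel blocks as a stand-in for an image resize, which is an assumption and may differ from how the project downsamples; the function name bin_spatial() is also hypothetical.

```python
import numpy as np

def bin_spatial(img, size=16):
    """Downsample an H x W x C image to size x size and flatten it."""
    h, w, c = img.shape
    if h % size or w % size:
        raise ValueError("image sides must be multiples of size")
    fh, fw = h // size, w // size
    # Average each fh x fw block of pixels (stand-in for an image resize).
    small = img.reshape(size, fh, size, fw, c).mean(axis=(1, 3))
    return small.ravel()

# Hypothetical uniform 64x64 image, used only to show the feature counts.
image = np.full((64, 64, 3), 128, dtype=np.uint8)
for size in (8, 16, 32, 64):
    print(size, bin_spatial(image, size).shape[0])
```

For a 64x64 input, the sizes 8, 16, 32 and 64 give 192, 768, 3072 and 12288 features, as in the table.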

Gradient Features

Histograms of Oriented Gradients (HOG) were calculated in order to extract features that represent the shape of the vehicle.

I explored different color spaces and different skimage.feature.hog() parameters (orientations, pixels_per_cell, and cells_per_block).
Here are examples using the YUV color space and HOG parameters of orientations=13, pixels_per_cell=(16, 16) and cells_per_block=(2, 2):

Examples of vehicles: [5 images]

Examples of non-vehicles: [5 images]

A linear Support Vector Machine classifier was trained with feature vectors created with different skimage.feature.hog() parameters in order to find out which parameters work best on the given training data. The following table is calculated in test_03_train_hog():

len_features color_space orient pix_per_cell cell_per_block hog_channel accuracy
2940 HSV 5 8 2 ALL 0.962
6000 HSV 5 8 4 ALL 0.965
540 HSV 5 16 2 ALL 0.971
240 HSV 5 16 4 ALL 0.969
5292 HSV 9 8 2 ALL 0.955
10800 HSV 9 8 4 ALL 0.950
972 HSV 9 16 2 ALL 0.973
432 HSV 9 16 4 ALL 0.969
7644 HSV 13 8 2 ALL 0.942
15600 HSV 13 8 4 ALL 0.945
1404 HSV 13 16 2 ALL 0.968
624 HSV 13 16 4 ALL 0.968
2940 LUV 5 8 2 ALL 0.967
6000 LUV 5 8 4 ALL 0.968
540 LUV 5 16 2 ALL 0.976
240 LUV 5 16 4 ALL 0.975
5292 LUV 9 8 2 ALL 0.965
10800 LUV 9 8 4 ALL 0.966
972 LUV 9 16 2 ALL 0.976
432 LUV 9 16 4 ALL 0.977
7644 LUV 13 8 2 ALL 0.969
15600 LUV 13 8 4 ALL 0.969
1404 LUV 13 16 2 ALL 0.983
624 LUV 13 16 4 ALL 0.976
2940 HLS 5 8 2 ALL 0.957
6000 HLS 5 8 4 ALL 0.966
540 HLS 5 16 2 ALL 0.972
240 HLS 5 16 4 ALL 0.962
5292 HLS 9 8 2 ALL 0.949
10800 HLS 9 8 4 ALL 0.952
972 HLS 9 16 2 ALL 0.972
432 HLS 9 16 4 ALL 0.968
7644 HLS 13 8 2 ALL 0.944
15600 HLS 13 8 4 ALL 0.949
1404 HLS 13 16 2 ALL 0.967
624 HLS 13 16 4 ALL 0.969
2940 YUV 5 8 2 ALL 0.965
6000 YUV 5 8 4 ALL 0.969
540 YUV 5 16 2 ALL 0.976
240 YUV 5 16 4 ALL 0.978
5292 YUV 9 8 2 ALL 0.967
10800 YUV 9 8 4 ALL 0.975
972 YUV 9 16 2 ALL 0.980
432 YUV 9 16 4 ALL 0.978
7644 YUV 13 8 2 ALL 0.980
15600 YUV 13 8 4 ALL 0.970
1404 YUV 13 16 2 ALL 0.981
624 YUV 13 16 4 ALL 0.973
2940 YCrCb 5 8 2 ALL 0.964
6000 YCrCb 5 8 4 ALL 0.969
540 YCrCb 5 16 2 ALL 0.976
240 YCrCb 5 16 4 ALL 0.980
5292 YCrCb 9 8 2 ALL 0.967
10800 YCrCb 9 8 4 ALL 0.967
972 YCrCb 9 16 2 ALL 0.976
432 YCrCb 9 16 4 ALL 0.976
7644 YCrCb 13 8 2 ALL 0.970
15600 YCrCb 13 8 4 ALL 0.974
1404 YCrCb 13 16 2 ALL 0.977
624 YCrCb 13 16 4 ALL 0.979

The best result in the table (0.983) is obtained with the LUV color space, 13 orientations, 16 pixels per cell, and 2 cells per block.
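
The feature lengths in the table can be reproduced with a little arithmetic. The sketch below assumes 64x64 training images and blocks that slide one cell at a time, which is how skimage.feature.hog() arranges its blocks; the helper name hog_feature_length() is hypothetical.

```python
def hog_feature_length(orient, pix_per_cell, cell_per_block,
                       channels=3, image_size=64):
    """Length of the HOG feature vector for a square image."""
    cells = image_size // pix_per_cell      # cells along one side
    blocks = cells - cell_per_block + 1     # block positions along one side
    per_block = cell_per_block ** 2 * orient
    return channels * blocks * blocks * per_block

# (orient, pix_per_cell, cell_per_block) combinations taken from the table
for combo in [(9, 8, 2), (13, 16, 2), (5, 16, 4), (13, 8, 4)]:
    print(combo, hog_feature_length(*combo))
```

This also explains why a larger cells_per_block can shorten the vector: with 16 pixels per cell there are only 4 cells per side, so a block of 4 cells fits in exactly one position.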

Combination of Feature Extraction

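The following table shows the test accuracy for combinations of the three feature types, using the YUV color space, 13 orientations, 16 pixels per cell, 2 cells per block, all HOG channels, a spatial size of 16, and 64 histogram bins.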

len_features spatial_feat hist_feat hog_feat accuracy
2364 True True True 0.991
960 True True False 0.961
2172 True False True 0.986
768 True False False 0.924
1596 False True True 0.990
192 False True False 0.933
1404 False False True 0.983

The best result (0.991) is obtained when all three feature types are combined.

Training of the Classifier

I trained a linear SVM on a data set of 8792 vehicle images and 8968 non-vehicle images.

The training of the classifier is implemented in the function train():

  1. For each image, the feature vector is calculated by the method extract_features().
  2. sklearn.preprocessing.StandardScaler() is used to normalize the feature vectors.
  3. sklearn.model_selection.train_test_split() creates shuffled training and test data and labels.
  4. The trained classifier is saved via pickle so that it can be reused later.
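
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the project's train() function: the name train_sketch(), the 20 % test split, the synthetic feature vectors and the layout of the pickled dictionary are all hypothetical. The sketch keeps the order of the steps as listed, so the scaler is fitted before the split.

```python
import pickle

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def train_sketch(car_features, notcar_features, params, seed=0):
    """Scale, split, fit a linear SVM and pickle it with its parameters."""
    X = np.vstack([car_features, notcar_features]).astype(np.float64)
    y = np.hstack([np.ones(len(car_features)), np.zeros(len(notcar_features))])

    scaler = StandardScaler().fit(X)           # step 2: normalize
    X_scaled = scaler.transform(X)
    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled, y, test_size=0.2, random_state=seed)  # step 3: shuffle and split

    clf = LinearSVC()
    clf.fit(X_train, y_train)
    accuracy = clf.score(X_test, y_test)

    # Step 4: store the feature extraction parameters next to the classifier
    # so that a later run cannot pair it with mismatched features.
    blob = pickle.dumps({"clf": clf, "scaler": scaler, "params": params})
    return blob, accuracy

# Synthetic stand-ins for the output of extract_features().
rng = np.random.default_rng(0)
cars = rng.normal(loc=2.0, size=(200, 10))
notcars = rng.normal(loc=-2.0, size=(200, 10))
params = {"color_space": "YUV", "orient": 13,
          "pix_per_cell": 16, "cell_per_block": 2}

blob, accuracy = train_sketch(cars, notcars, params)
restored = pickle.loads(blob)
print(accuracy, restored["params"])
```

Fitting the scaler on the training split only would avoid leaking test statistics into the normalization; the sketch does not do this because it follows the listed order.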

Sliding Window Search

The trained classifier is applied to sliding windows of different sizes. Small windows are used in the parts of the image where vehicles are expected to appear small. Bigger windows are used in the lower part of the image, where vehicles are expected to appear big.
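
A minimal sketch of how such windows could be generated is shown below. The function name slide_window(), the 50 % overlap, the 1280x720 frame size and the three search regions are assumptions chosen for illustration; they are not the values used in the project.

```python
def slide_window(x_start_stop, y_start_stop, window_size, overlap=0.5):
    """Return square windows as ((x1, y1), (x2, y2)) inside the given region."""
    step = max(1, int(window_size * (1 - overlap)))
    windows = []
    for y in range(y_start_stop[0], y_start_stop[1] - window_size + 1, step):
        for x in range(x_start_stop[0], x_start_stop[1] - window_size + 1, step):
            windows.append(((x, y), (x + window_size, y + window_size)))
    return windows

# Hypothetical plan: small windows near the horizon, large ones further down.
search_plan = [
    (64, (400, 500)),
    (96, (400, 560)),
    (128, (400, 656)),
]

all_windows = []
for size, y_range in search_plan:
    all_windows += slide_window((0, 1280), y_range, size)
print(len(all_windows))
```

Restricting each window size to a horizontal band keeps the number of classifier calls per frame manageable.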

The pipeline for the detection of cars is implemented in the function process(). It consists of the following steps:

  1. Search for vehicles using the aforementioned sliding window approach (see find_cars()).
  2. If the classifier detects a car in the window, then the window is added to the set of hot boxes.
  3. These boxes are then combined in a heatmap that shows how many boxes overlap at each point in the image.
  4. I then used scipy.ndimage.measurements.label() to identify individual blobs in the heatmap.
  5. I then assumed each blob corresponded to a vehicle. I constructed bounding boxes to cover the area of each blob detected.
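
Steps 2 to 5 can be sketched as follows. This is an illustration, not the project's process() function: the name boxes_to_bboxes(), the threshold parameter and the sample boxes are assumptions. It calls scipy.ndimage.label(), the name under which current SciPy releases expose the labeling function.

```python
import numpy as np
from scipy.ndimage import label

def boxes_to_bboxes(hot_boxes, image_shape, threshold=1):
    """Merge overlapping hot boxes into one bounding box per heatmap blob."""
    heatmap = np.zeros(image_shape[:2], dtype=np.int32)
    for (x1, y1), (x2, y2) in hot_boxes:
        heatmap[y1:y2, x1:x2] += 1           # count overlapping boxes per pixel

    heatmap[heatmap < threshold] = 0         # assumed: drop weakly supported pixels
    labels, n_blobs = label(heatmap)         # connected regions get ids 1..n_blobs

    bboxes = []
    for blob_id in range(1, n_blobs + 1):
        ys, xs = np.nonzero(labels == blob_id)
        bboxes.append(((int(xs.min()), int(ys.min())),
                       (int(xs.max()), int(ys.max()))))
    return bboxes

# Two overlapping detections of one car and a single detection of another.
hot_boxes = [((10, 10), (50, 50)), ((30, 30), (70, 70)), ((100, 20), (140, 60))]
bboxes = boxes_to_bboxes(hot_boxes, (100, 200))
print(bboxes)
```

Raising the threshold keeps only regions covered by several windows, which trades missed detections against false positives.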

Ultimately I searched on three scales using YUV 3-channel HOG features plus spatially binned color and histograms of color in the feature vector, which provided a nice result. Here are some example images of the results and the intermediate calculations:

[6 images: detection results and intermediate calculations]

Video Implementation

Here’s a link to my video result

To reduce false positives, I also took the vehicle detections from previous frames into account (see draw_labeled_bboxes_with_history()).
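
One common way to use previous frames is to sum the heatmaps of the last few frames and keep only the pixels that were detected repeatedly. The sketch below shows that idea. It is an assumption about the general technique, not a description of how draw_labeled_bboxes_with_history() works; the class name HeatmapHistory, the window of 5 frames and the threshold of 3 are hypothetical.

```python
from collections import deque

import numpy as np

class HeatmapHistory:
    """Accumulate heatmaps over recent frames and suppress one-off detections."""

    def __init__(self, n_frames=5, threshold=3):
        self.heatmaps = deque(maxlen=n_frames)  # the oldest frame drops out automatically
        self.threshold = threshold

    def update(self, heatmap):
        self.heatmaps.append(heatmap)
        total = np.sum(np.stack(list(self.heatmaps)), axis=0)
        total[total < self.threshold] = 0
        return total

# A detection at (1, 1) in every frame and a flicker at (3, 3) in one frame.
persistent = np.zeros((4, 4), dtype=int)
persistent[1, 1] = 1
flicker = persistent.copy()
flicker[3, 3] = 1

history = HeatmapHistory(n_frames=5, threshold=3)
for frame_heatmap in [persistent, flicker, persistent]:
    combined = history.update(frame_heatmap)
print(combined)
```

The combined heatmap can then be labeled in the same way as a single-frame heatmap.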

Here’s an example showing the results and the intermediate work products for a sequence of frames in the project video:

[8 images]

Here the resulting bounding boxes are drawn onto the last frame in the series: [image]


While implementing this project, I frequently ran into the situation where I tried to reuse a classifier that had been trained with different feature extraction strategies and parameters. By storing the parameters in the pickle file, I was able to avoid this problem.

The results of the vehicle detections improved significantly by improving the following aspects of the code:

Since the classifier was trained on a quite small training set, the classifier might not work as well in other situations. We could improve the training set by:

For improved classification results we might be able to find a better feature extraction mechanism or we could use a CNN such as YOLO (“You only look once”). Additionally, we could try to improve the parameters of the classifier using sklearn.model_selection.GridSearchCV.
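
A parameter search with sklearn.model_selection.GridSearchCV could look roughly like this. The sketch assumes that the regularization parameter C of a LinearSVC is the parameter to tune; the candidate values, the 3-fold cross-validation and the synthetic data are placeholders.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic two-class data standing in for scaled feature vectors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (100, 5)),
               rng.normal(-1.0, 1.0, (100, 5))])
y = np.hstack([np.ones(100), np.zeros(100)])

# Try each candidate C with 3-fold cross-validation and keep the best one.
search = GridSearchCV(LinearSVC(), param_grid={"C": [0.01, 0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

After fitting, search.best_estimator_ holds a classifier retrained on all of the data with the selected parameters.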

Pro Tips