literature review for traffic sign recognition

Sensors (Basel)

Improved Traffic Sign Detection and Recognition Algorithm for Intelligent Vehicles

Jingwei Cao

1 State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130022, China; caojw18@mails.jlu.edu.cn (J.C.); songchx@126.com (C.S.); pengsilun@126.com (S.P.); xiaofengjl@jlu.edu.cn (F.X.)

2 College of Automotive Engineering, Jilin University, Changchun 130022, China

Chuanxue Song

Shixin Song

3 School of Mechanical and Aerospace Engineering, Jilin University, Changchun 130022, China

Traffic sign detection and recognition are crucial in the development of intelligent vehicles. An improved traffic sign detection and recognition algorithm for intelligent vehicles is proposed to address problems such as how easily affected traditional traffic sign detection is by the environment, and poor real-time performance of deep learning-based methodologies for traffic sign recognition. Firstly, the HSV color space is used for spatial threshold segmentation, and traffic signs are effectively detected based on the shape features. Secondly, the model is considerably improved on the basis of the classical LeNet-5 convolutional neural network model by using Gabor kernel as the initial convolutional kernel, adding the batch normalization processing after the pooling layer and selecting Adam method as the optimizer algorithm. Finally, the traffic sign classification and recognition experiments are conducted based on the German Traffic Sign Recognition Benchmark. The favorable prediction and accurate recognition of traffic signs are achieved through the continuous training and testing of the network model. Experimental results show that the accurate recognition rate of traffic signs reaches 99.75%, and the average processing time per frame is 5.4 ms. Compared with other algorithms, the proposed algorithm has remarkable accuracy and real-time performance, strong generalization ability and high training efficiency. The accurate recognition rate and average processing time are markedly improved. This improvement is of considerable importance to reduce the accident rate and enhance the road traffic safety situation, providing a strong technical guarantee for the steady development of intelligent vehicle driving assistance.

1. Introduction

With the rapid development of the economy and technology in modern society, automobiles have become an indispensable means of transportation in people's daily travel. Although the popularity of automobiles has brought considerable convenience, it has also caused numerous traffic safety problems that cannot be ignored, such as traffic congestion and frequent road accidents. Traffic safety issues are largely caused by subjective factors related to the driver, such as inattention, improper driving operation and non-compliance with traffic rules, and smart cars have become an effective means of eliminating these human factors [ 1 , 2 , 3 , 4 , 5 ]. Self-driving technology can assist with, or even independently complete, the driving operation, which is of remarkable importance in freeing the driver and considerably reducing the incidence of accidents [ 6 , 7 ]. Traffic sign detection and recognition are crucial in the development of intelligent vehicles because they directly affect the implementation of driving behaviors. Smart cars use a vehicle-mounted camera to obtain real and effective road traffic information; they can also recognize and understand traffic signs in real time in actual road scenes to provide correct command output and good motion control for intelligent vehicles, which can remarkably improve the efficiency and safety of automatic driving [ 8 , 9 , 10 ]. Therefore, an in-depth study of traffic sign detection and recognition is necessary.

The traffic sign recognition process generally includes two stages: traffic sign detection and recognition. However, under everyday natural conditions, changes in light, complex backgrounds and the aging of signs make it difficult to identify traffic signs accurately. With the rapid increase in computing speed, many experts and scholars have focused on the traffic sign recognition process, which is mainly divided into traffic sign detection and recognition technologies [ 11 , 12 , 13 , 14 ].

Traffic sign detection technology is mainly based on inherent information, such as the color, shape and texture features of traffic signs, and accurately extracts traffic sign candidate areas from actual road scenes. Wang et al. [ 15 ] proposed a red bitmap method to detect traffic signs. Firstly, color segmentation of the detected images is performed, and then shape detection of the region of interest (ROI) based on edge information is conducted. This method achieved good detection results but was only applicable to red circular traffic signs, which limited its use. Hechri et al. [ 16 ] used the template matching method to match the traffic signs. By setting a sliding window of the same size as the traffic signs, the useless non-sign parts of the current road scenes were removed. However, some signs had different shapes and sizes, and the road traffic environment was complex and changeable; thus, the real-time performance of this method was poor. Lillo-Castellano et al. [ 17 ] adopted a color segmentation method that combines the HSI and LAB color spaces to enable detection of black, white and colorful traffic signs. Xiao et al. [ 18 ] proposed a traffic sign detection method combining HOG features and Boolean convolutional neural networks. This method can eliminate falsely detected candidate regions and achieve good detection results by connecting cascaded classifiers. Guan et al. [ 19 ] proposed a method for detecting traffic signs from mobile LiDAR point clouds and digital images. Traffic signs were detected from mobile LiDAR point clouds based on valid road information and traffic-sign size, and segmented by digital image projection; the resulting images can be classified automatically after normalization.

Traffic sign recognition technology is mainly used to analyze and classify the detected traffic signs and accurately obtain their actual meaning. Sun et al. [ 20 ] proposed a traffic sign classification method based on the extreme learning machine (ELM), which is a supervised learning algorithm related to the feedforward neural network. It has only one hidden layer; therefore, the parameters were few and the training time was short. The algorithm classified traffic signs according to the calculation results by selecting a certain proportion of features and obtained high recognition accuracy. Qian et al. [ 21 ] trained a regional deep convolutional neural network (CNN) on traffic sign data and tested it on a collected Chinese traffic sign dataset, achieving a high recognition rate. He et al. [ 22 ] proposed the ResNet network based on the concept of residuals. By continuously learning the residuals, the network performance was considerably raised, and the recognition accuracy was further improved. Yuan et al. [ 23 ] combined the Adaboost algorithm and the support vector machine (SVM) for traffic sign recognition. The candidate recognition images were screened by Adaboost and then classified by the SVM. The recognition accuracy was high, but the detection time was long. Kumar et al. [ 24 ] proposed a traffic sign detection method based on the capsule network. The multi-parameter deep learning network structure can realize automatic feature extraction and showed good robustness and stability. The experimental results showed that the method had a conspicuous detection effect. Yuan et al. [ 25 ] proposed an end-to-end traffic sign detection method. The multi-feature fusion network structure extracts effective features from images of different sizes, and a vertical spatial sequence attention module is then established to obtain background information around the detected image; the method also showed prominent detection performance in complex road traffic environments.

The research results show that many methods have improved the recognition rate of traffic signs, but each algorithm still has its own advantages and disadvantages and is limited by various conditions. In traffic sign detection, disturbances such as bad weather, changes in lighting conditions and fading of signage lead to an evident decline in detection accuracy and poor environmental adaptability [ 26 , 27 , 28 ]. Moreover, recognition algorithms based on deep learning have a high recognition rate but suffer from problems such as high algorithmic complexity and long processing time. These algorithms also place high requirements on system hardware, and the structures of the training models are complicated, which limits their use [ 29 , 30 , 31 , 32 ]. Therefore, further improvement of the traffic sign detection and recognition algorithm is urgent.

In this study, an improved traffic sign detection and recognition algorithm for intelligent vehicles is proposed. Firstly, the HSV color space is used for spatial threshold segmentation, and traffic signs are effectively detected based on the shape features. Secondly, the model is considerably improved on the basis of the classical LeNet-5 convolutional neural network model by using Gabor kernel as the initial convolutional kernel, adding the batch normalization (BN) processing after the pooling layer and selecting the Adam method as the optimizer algorithm. Finally, the traffic sign classification and recognition experiments are conducted based on the German Traffic Sign Recognition Benchmark (GTSRB). The favorable prediction and accurate recognition of traffic signs are achieved through the continuous training and testing of the network model. According to the analysis of experimental results and performance comparison with other algorithms, the comprehensive performance of the algorithm is evaluated.

The rest of this paper is organized as follows: In Section 2 , the HSV color space is used for spatial threshold segmentation, and traffic signs are effectively detected based on the shape features. In Section 3 , the classic LeNet-5 CNN model is further improved. In Section 4 , the experiments on traffic sign classification and recognition based on the GTSRB are conducted and analyzed, and the performance of the algorithms is compared. In Section 5 , conclusions and suggestions for possible future work are outlined.

2. Traffic Sign Detection

The road traffic images are captured by vehicle-mounted cameras installed on the smart cars, and traffic sign detection aims to extract the traffic sign regions of interest from the current road traffic images. However, the quality of the acquired images is uneven under different external conditions, so the signs must be detected on the basis of their inherent characteristics, such as color and shape. This section consists of two parts: traffic sign segmentation based on the color space and traffic sign detection based on shape features.

2.1. Traffic Sign Segmentation Based on the HSV Color Space

Color is an important feature of traffic signs, and traffic signs can be quickly located by color segmentation. Compared with the RGB and HSI color spaces, the HSV color space offers faster detection, is less affected by illumination, and segments colors more effectively. Figure 1 shows the HSV color space converted from the RGB color space. It represents the points of the RGB color space in an inverted cone, where H is the hue, S is the saturation and V is the value.

Figure 1. The HSV color space.

H indicates the color change of the image. The position of the spectral color is represented by the angle, and different color values correspond to different angles. Red, green and blue are 120° apart, that is, 0°, 120° and 240°, respectively. S denotes the proportion of the current color purity to the maximum purity with the maximum value of 1 and the minimum value of 0. V represents the brightness change of the image. The maximum value is 1 in white and the minimum value is 0 in black. In the HSV color space, given that V is a fixed value set and H and S are highly unrelated, the HSV color space has good illumination adaptability when the illumination conditions change, and its computational complexity is small, which are conducive to the color space threshold segmentation.

The conversion of an RGB to an HSV image is shown in Figure 2 .

Figure 2. Converting the RGB image to the HSV image.

Color space threshold segmentation is required after conversion to the HSV color space. Figure 3 shows the color space threshold segmentation step diagram.

Figure 3. The color space threshold segmentation step diagram.

Common traffic signs mainly include red, yellow and blue colors. In order to meet the target requirements of real-time color segmentation, it is necessary to determine the corresponding threshold range. Through multiple test experiments, the three-channel threshold segmentation ranges of three colors are obtained on the premise of ensuring good segmentation effects, as shown in Table 1 .

Table 1. HSV color space threshold segmentation ranges.

In the process of threshold segmentation, the pixels within the set threshold range are set to white, otherwise they are set to black, and the image is completely binarized. Since the traffic sign in the original picture is red, the obtained threshold coarse segmentation image only displays red. Figure 4 presents the threshold rough segmentation image.

Figure 4. The threshold rough segmentation image.
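The conversion and binarization steps can be sketched with the Python standard library alone. The red range below is a hypothetical placeholder chosen for illustration, not the values of Table 1, and the helper names (`rgb_to_hsv_degrees`, `in_range`, `binarize`) are assumptions of this sketch rather than the authors' implementation.

```python
import colorsys

# Hypothetical red range for illustration only (NOT the thresholds of Table 1).
# H is in degrees [0, 360); S and V are in [0, 1]. Red wraps around 0 degrees.
RED_RANGE = {"h_low": 340.0, "h_high": 20.0, "s_min": 0.4, "v_min": 0.2}


def rgb_to_hsv_degrees(r, g, b):
    """Convert an 8-bit RGB pixel to (H in degrees, S, V)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


def in_range(h, s, v, rng):
    """Return True if the HSV pixel lies inside the range (hue may wrap)."""
    if rng["h_low"] <= rng["h_high"]:
        hue_ok = rng["h_low"] <= h <= rng["h_high"]
    else:
        hue_ok = h >= rng["h_low"] or h <= rng["h_high"]
    return hue_ok and s >= rng["s_min"] and v >= rng["v_min"]


def binarize(image, rng):
    """Map an image (rows of (r, g, b) tuples) to a white/black mask (255/0)."""
    return [
        [255 if in_range(*rgb_to_hsv_degrees(*px), rng) else 0 for px in row]
        for row in image
    ]


# Tiny 2 x 2 test image: red, blue, white, dark red.
image = [
    [(200, 20, 30), (30, 40, 200)],
    [(250, 250, 250), (180, 10, 10)],
]
mask = binarize(image, RED_RANGE)
```

Only the two red pixels end up white in `mask`; the white pixel is rejected by the saturation bound, which is why a saturation threshold is needed in addition to the hue range.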

2.2. Traffic Sign Detection Based on the Shape Features

In the actual road scenes, traffic signs do not exist independently. Colorful clothes of pedestrians and colored billboards are likely to be consistent with the color of traffic signs, thereby resulting in some useless interference to the binary image with threshold coarse segmentation. Therefore, filtering these interferences is necessary to achieve effective detection of the ROI. Figure 5 illustrates the morphological processing for binary images.

Figure 5. The morphological processing for binary images.

Firstly, the binary image is processed by image erosion and dilation. Some isolated useless pixels often exist on the edge of the image, and these pixels can be effectively removed by erosion. Dilation, in turn, enlarges the area of the ROI. Combining the two operations filters out subtle interference and makes the shape characteristics of traffic signs more prominent.
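A minimal sketch of this step, assuming a 3 × 3 square structuring element (the paper does not state the element used) and a mask stored as nested lists of 0 and 1:

```python
def _window(mask, r, c):
    """Values in the 3x3 window around (r, c); outside pixels count as 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[i][j] if 0 <= i < rows and 0 <= j < cols else 0
        for i in range(r - 1, r + 2)
        for j in range(c - 1, c + 2)
    ]


def erode(mask):
    """A pixel survives only if its whole 3x3 neighborhood is foreground."""
    return [
        [1 if all(_window(mask, r, c)) else 0 for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def dilate(mask):
    """A pixel becomes foreground if any pixel in its 3x3 neighborhood is."""
    return [
        [1 if any(_window(mask, r, c)) else 0 for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


# A 3x3 blob (the candidate sign) plus one isolated noise pixel on the border.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
opened = dilate(erode(mask))  # erosion removes the noise, dilation restores the blob
```

In this example the isolated pixel disappears while the blob keeps its original extent.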

The filling process is then conducted. The traffic signs may be discolored, damaged and blocked by some obstacles in the actual road scenes, and the ROI cannot be completely displayed. The filling process can help complete and visualize the contours of traffic signs.

Finally, the effective detection of traffic signs is realized. Some large irregular interference areas still exist in the segmented image after the filling process and thus need to be filtered. Contour filtering is conducted by contour analysis of the connected areas. A connected area is a set of adjacent pixels in the image that share the same value. The perimeter and area of the contours of all connected areas are calculated and then compared with those of a standard circular sign. The contours that meet the requirements are retained; otherwise, they are discarded. This method is equally applicable to the detection of triangular, rectangular and other traffic sign shapes. The remaining part of the segmented image after contour filtering corresponds to the detected traffic sign.
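One common way to compare a contour with a circle is the circularity ratio 4πA/P², which equals 1 for a perfect circle and is smaller for other shapes. The sketch below uses that ratio as an assumption; the paper does not give its exact criterion, and the threshold `MIN_CIRCULARITY` is a hypothetical value.

```python
import math

MIN_CIRCULARITY = 0.85  # hypothetical acceptance threshold, not from the paper


def circularity(area, perimeter):
    """Return 4*pi*A / P^2, which is 1.0 for an ideal circle."""
    return 4.0 * math.pi * area / (perimeter ** 2)


def keep_circular(regions, threshold=MIN_CIRCULARITY):
    """Keep the names of regions whose circularity reaches the threshold."""
    return [
        name
        for name, (area, perimeter) in regions.items()
        if circularity(area, perimeter) >= threshold
    ]


# (area, perimeter) pairs for three idealized connected areas.
regions = {
    "circle_r10": (math.pi * 10 ** 2, 2 * math.pi * 10),
    "square_10": (10 * 10, 4 * 10),
    "strip_2x30": (2 * 30, 2 * (2 + 30)),
}
circle_score = circularity(*regions["circle_r10"])
square_score = circularity(*regions["square_10"])
kept = keep_circular(regions)
```

For triangular or rectangular signs, the same idea applies with a different reference ratio for the target shape.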

3. Improved LeNet-5 CNN Model

Traffic sign recognition relies on existing dataset resources and uses an effective classification algorithm to recognize the detected traffic signs and feed the results back to smart cars accurately and in real time. A CNN extracts features directly from the input detection image and outputs the classification results via the trained classifier based on image features, which gives it good image recognition performance. Furthermore, a CNN does not need features to be extracted manually. The sensory cognitive process of the human brain can be well simulated via forward learning and feedback mechanisms, thereby gradually improving the ability of traffic sign classification and recognition [ 33 , 34 ]. In this section, the shortcomings of the classical LeNet-5 network model are analyzed, and the model is considerably improved to further exploit the advantages of CNNs in image recognition.

3.1. Deficiency Analysis of Classical LeNet-5 Network Model

Professor Yann LeCun proposed the LeNet-5 network model in 1998, mainly for digit recognition. The LeNet-5 network model consists of seven layers, including two convolutional layers, two pooling layers, two fully-connected layers and one output layer. The input image size is 32 × 32, and the output is a 10-dimensional classification vector, which can identify the digits 0 to 9 [ 35 , 36 ].

The classic LeNet-5 network model has good classification and recognition effects for a single target. However, in the traffic signs recognition training, it is difficult to ensure a high enough accurate recognition rate, the training network cannot converge, and the recognition efficiency of the network decreases dramatically.

Analysis and summary of the root causes of these problems show the following:

(1) The interference background in a traffic sign training image is much more complicated than that in a single digit image. The original convolutional kernel does not perform well in feature extraction. Consequently, the extracted features cannot be properly used by the subsequent classifier for accurate classification.
(2) Many kinds of traffic sign training images exist, and the dataset is large. Gradient dispersion easily occurs during network training, and the generalization ability is markedly reduced.
(3) The size of the ROI in the input traffic sign training image varies, and the effective features obtained by the current network model are insufficient to meet the target requirements of accurate traffic sign recognition.
(4) The learning rate and the number of iterations of the training network are not adjusted accordingly, and the relevant parts are not rationally optimized, thereby resulting in over-fitting during training.

3.2. Improved LeNet-5 Network Model

3.2.1. Image Preprocessing

The ROI in the traffic sign training image is not 100% in the center of the image, and some edge background information is included around the traffic sign. With the change of illumination conditions, these useless interference areas will increase the influence on traffic sign recognition, thereby undoubtedly raising the computational complexity of the training network and the misrecognition rate of traffic signs. Therefore, image preprocessing is necessary.

Image preprocessing mainly includes the following three stages:

(1) Edge cropping. Edge cropping is a particularly important step in image preprocessing. Some background parts at the edge are not related to traffic signs, and these parts can account for approximately 10% of the entire image. The bounding box coordinates are used for proportional cropping to obtain the ROI. The removal of the interference region helps to reduce redundant information and speed up the network training.
(2) Image enhancement. The recognition results for the same type of traffic sign differ significantly under different illumination conditions. Therefore, reducing or removing the noise interference caused by light changes via image enhancement is necessary. A direct grey-scale conversion method adjusts the grey values of the original image with a transformation function, which brings out the details of the ROI and suppresses the interference area. Thus, this method effectively improves the image quality and reduces the computational load of the training network.
(3) Size normalization. The same type of traffic signs may have different sizes. The different sizes of training images may have different feature dimensions during the CNN training process, which leads to difficulties in the subsequent classification and recognition. In this paper, the image is uniformly normalized in size, and the normalized image size is 32 × 32.
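The three stages can be sketched as a small pipeline. Several details here are assumptions made for illustration: the luminance weights used for grey conversion, a linear contrast stretch standing in for the unspecified grey-level transformation function, and nearest-neighbor resampling for the 32 × 32 normalization. The function names are hypothetical.

```python
def crop(image, x1, y1, x2, y2):
    """Stage 1: keep only the bounding box (rows y1..y2-1, columns x1..x2-1)."""
    return [row[x1:x2] for row in image[y1:y2]]


def to_grey(image):
    """Convert (r, g, b) pixels to grey values (assumed luminance weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in image]


def stretch(grey):
    """Stage 2: linear contrast stretch to [0, 1] (assumed transformation)."""
    lo = min(min(row) for row in grey)
    hi = max(max(row) for row in grey)
    if hi == lo:
        return [[0.0 for _ in row] for row in grey]
    return [[(v - lo) / (hi - lo) for v in row] for row in grey]


def resize_nearest(grey, size=32):
    """Stage 3: nearest-neighbor resampling to size x size."""
    h, w = len(grey), len(grey[0])
    return [
        [grey[i * h // size][j * w // size] for j in range(size)]
        for i in range(size)
    ]


def preprocess(image, bbox, size=32):
    """Run cropping, enhancement and size normalization in sequence."""
    return resize_nearest(stretch(to_grey(crop(image, *bbox))), size)


# Synthetic 50 x 40 gradient image with a hypothetical bounding box.
image = [[(x * 5, y * 4, 128) for x in range(50)] for y in range(40)]
sample = preprocess(image, bbox=(5, 4, 45, 36))
```

The output always has the fixed 32 × 32 shape expected by the network input layer, whatever the size of the original bounding box.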

3.2.2. Improved LeNet-5 Network Model

The LeNet-5 network model has been considerably improved due to the shortcomings of the classic model in traffic sign recognition. Figure 6 shows the improved LeNet-5 network model structure.

Figure 6. The improved LeNet-5 network model structure.

The improvement of LeNet-5 network model includes the following five aspects.

(1) The Gabor kernel is used as the initial convolutional kernel between the input layer and the first convolutional layer. In the actual road scenes, the change of light, the damage of traffic signs, and the obstruction of obstacles will affect the quality of the training image. Nonetheless, Gabor wavelet can solve such problems commendably. The Gabor wavelet is insensitive to changes in light; therefore, it has good adaptability to light. Furthermore, it has superior scale and direction selection characteristics that are sensitive to the edges of the training image.

The two-dimensional Gabor filter is a band-pass filter whose impulse response function is as follows:

where f is the center frequency of the bandwidth; θ is the spatial direction whose value ranges [ 0 , π ] ; σ x and σ y are the standard deviations in the x and y directions, respectively; f x = f · cos θ and f y = f · sin θ are both frequencies in space.

When σ x = σ y , the Equation (1) can be converted to:

Given that Gabor filters vary in different scales and directions, the mean value of Gabor kernels in different directions at the same scale is taken as the initial convolutional kernel in this paper.
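As a minimal sketch of the averaging step, assume the commonly used isotropic Gabor form $h(x, y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)\exp\left(j 2\pi f (x\cos\theta + y\sin\theta)\right)$ with $\sigma_x = \sigma_y = \sigma$. This form is an assumption consistent with the symbols defined above and is not necessarily identical to Equation (2). The sketch keeps only the real part, and the values of `F`, `SIGMA` and the four directions are hypothetical.

```python
import math

F = 0.25       # hypothetical center frequency
SIGMA = 2.0    # hypothetical standard deviation (sigma_x = sigma_y)
THETAS = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]  # hypothetical directions


def gabor_kernel(size, f, theta, sigma):
    """Real part of the assumed isotropic Gabor function on a size x size grid."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            envelope = math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (
                2 * math.pi * sigma ** 2
            )
            carrier = math.cos(
                2 * math.pi * f * (x * math.cos(theta) + y * math.sin(theta))
            )
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def mean_kernel(kernels):
    """Element-wise mean of kernels that share the same scale."""
    n = len(kernels)
    size = len(kernels[0])
    return [
        [sum(k[i][j] for k in kernels) / n for j in range(size)]
        for i in range(size)
    ]


# 5 x 5 initial kernel: mean over the directions at one scale.
init_kernel = mean_kernel([gabor_kernel(5, F, t, SIGMA) for t in THETAS])
```

In a framework such as TensorFlow, a kernel like `init_kernel` would be passed as the initial value of the first convolutional layer's weights and then updated by training like any other weight.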

(2) After each pooling layer, BN is added for data normalization. In a deep learning network model, as the number of training iterations increases, the gradient of the hidden layers near the output layer grows and their parameters update faster, whereas the hidden layers near the input layer show the opposite behavior and remain in a state of random distribution. This phenomenon is called gradient dispersion, and BN data normalization can effectively solve this problem.

The BN data normalization is as follows:

Input: mini-batch $B = \{x_1, \ldots, x_m\}$.

Output: normalized network response $\{y_i = \mathrm{BN}_{\gamma, \beta}(x_i)\}$.

The mean of the training batch data:

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i \tag{4}$$

The variance of the training batch data:

$$\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2 \tag{5}$$

Normalization:

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}} \tag{6}$$

where $\varepsilon$ is a small positive number used to avoid division by 0.

Scale transformation and offset:

$$y_i = \gamma \hat{x}_i + \beta \tag{7}$$

The learning parameters $\gamma$ and $\beta$ are returned.

BN data normalization brings the data to a mean of 0 and a variance of 1. This benefits the non-linear expression of the model and keeps the output distribution consistent with the real data distribution. BN is suitable not only for deep network models but also works well in shallow network models.
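The four BN steps translate directly into a few lines of code. The sketch below works on a single feature over one mini-batch; the default `eps` is an assumed value, and `gamma` and `beta` are fixed here although they are learned during training.

```python
import math


def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply the BN steps (mean, variance, normalize, scale and shift)."""
    m = len(xs)
    mu = sum(xs) / m                                        # batch mean
    var = sum((x - mu) ** 2 for x in xs) / m                # batch variance
    x_hat = [(x - mu) / math.sqrt(var + eps) for x in xs]   # normalization
    return [gamma * xh + beta for xh in x_hat]              # scale and offset


batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)
out_mean = sum(normalized) / len(normalized)
out_var = sum((y - out_mean) ** 2 for y in normalized) / len(normalized)
```

With `gamma = 1` and `beta = 0`, `out_mean` is essentially 0 and `out_var` is marginally below 1 because of `eps` in the denominator.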

(3) The ReLU function is selected as the activation function. Compared with the traditional Sigmoid and Tanh functions, the ReLU function is simple in calculation but effectively solves the gradient disappearance and explosion problem of the two functions. By making a part of the neuron output to 0, the network can be sparse, which helps reduce computational complexity and accelerate network convergence. Therefore, this function performs well in deep network training.

(4) The Adam method is chosen as the optimizer algorithm. This method is an extended first-order optimization algorithm based on the stochastic gradient descent method, which can dynamically adjust the learning rate of related parameters by using the moment estimation of the gradient. After the offset correction, the Adam method can control each iterative learning rate within a certain range, thereby ensuring a smooth updating of the network parameters.

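The first moment of the gradient is as follows:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t \tag{8}$$

The second moment of the gradient is as follows:

$$v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2 \tag{9}$$

where $\beta_1$ and $\beta_2$ are the attenuation factors, and $g_t$ is the gradient value of the loss function at time $t$.

The first moment deviation estimate of the gradient is as follows:

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t} \tag{10}$$

The second moment deviation estimate of the gradient is as follows:

$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \tag{11}$$

The gradient update formula of the Adam method is as follows:

$$w_{t+1} = w_t - \frac{\eta}{\sqrt{\hat{v}_t} + \varepsilon}\,\hat{m}_t \tag{12}$$

where $\eta$ is the initial learning rate, $w_t$ denotes the network parameters at time $t$, and $\varepsilon$ is a small positive constant that avoids division by 0.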

The Adam method is computationally efficient and requires less memory space. Thus, this method is suitable for solving optimization problems with large-scale data and parameters. The Adam method can effectively solve the problems of learning rate disappearance, slow convergence and large fluctuation of loss function in the optimization process, thereby possessing a good convergence mechanism.
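A single Adam update for one scalar parameter can be sketched as follows. It follows the standard formulation of the method (exponential moving averages of the gradient and squared gradient, bias correction, then the scaled step), with the default values taken from the parameter settings reported for the experiments (β1 = 0.9, β2 = 0.999, η = 0.001, ε = 1 × 10⁻⁸). The quadratic loss is a hypothetical stand-in for the network loss.

```python
import math


def adam_step(w, g, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the updated (w, m, v) after one Adam step at iteration t >= 1."""
    m = beta1 * m + (1 - beta1) * g          # first moment estimate
    v = beta2 * v + (1 - beta2) * g * g      # second moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)             # bias-corrected second moment
    w = w - eta * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


def grad(w):
    """Gradient of the hypothetical loss (w - 3)^2."""
    return 2.0 * (w - 3.0)


# First step from w = 0: the move is close to eta regardless of gradient size.
w1, m1, v1 = adam_step(0.0, grad(0.0), 0.0, 0.0, t=1)

# Two hundred steps: w moves steadily toward the minimum at 3.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 201):
    w, m, v = adam_step(w, grad(w), m, v, t)
```

The first step illustrates the bounded step size mentioned above: after bias correction the ratio of the two moment estimates has magnitude close to 1, so the parameter moves by roughly η.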

(5) The dropout is added to the fully-connected layers. It temporarily discards half of the data flowing through the network by discarding some neurons. Before the new round of data iteration, the original fully connected model is restored, and then some neurons are randomly removed. The dropout can considerably reduce the amount of network computation, help weaken the joint adaptability between neuron nodes, enhance the generalization ability of the training model and play a regularization role to a certain extent to prevent over-fitting problems.
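A minimal sketch of dropout with a rate of 0.5, the value used in the experiments. The rescaling of the surviving activations by 1/(1 − rate), often called inverted dropout, is an assumption of this sketch; the paper does not state which variant it uses.

```python
import random


def dropout(activations, rate=0.5, rng=None):
    """Zero each activation with probability `rate`; rescale the survivors."""
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Seeded generator so the example is repeatable.
rng = random.Random(0)
layer_output = [1.0] * 1000
dropped = dropout(layer_output, rate=0.5, rng=rng)
zero_count = sum(1 for a in dropped if a == 0.0)
```

A fresh random mask is drawn on every call, which corresponds to restoring the full model and removing a new random subset of neurons before each iteration. At test time dropout is switched off and all neurons are used.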

Table 2 lists the parameter settings of the improved LeNet-5 network model.

Table 2. The parameter settings of the improved LeNet-5 network model.

In this paper, the classical LeNet-5 network model is improved in many aspects and multiple levels. Considering the different interference conditions that may occur in the actual road scenes, the improved LeNet-5 network model integrates multiple advantages into one, thereby fostering strengths and avoiding weaknesses and complementing each other. The robustness and stability of the training network are considerably enhanced, and the overall convergence speed is improved, thereby further enhancing the performance levels of traffic sign classification and recognition.

4. Traffic Sign Recognition Experiment

4.1. Experimental Environment

Software environment: Windows 10 64-bit operating system, JetBrains PyCharm 2019.1.1, TensorFlow 1.13.1, Python 3.7.0 64-bit.

Hardware environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz processor, 8.00 GB memory, 2 TB mechanical hard disk.

4.2. Traffic Sign Recognition Experiment

4.2.1. Traffic Sign Dataset

This paper uses the German Traffic Sign Recognition Benchmark (GTSRB), which was presented at the 2011 International Joint Conference on Neural Networks (IJCNN). Its traffic signs are collected from the real road traffic environment in Germany, and it has become a common traffic sign dataset used by experts and scholars in computer vision, self-driving and other fields. The GTSRB comprises 51,839 images, which are divided into training and testing sets. The training and testing sets contain 39,209 and 12,630 images, accounting for approximately 75% and 25% of the whole, respectively. Each image contains only one traffic sign, which is not necessarily located in the center of the image. The image sizes vary; the largest and smallest images are 250 × 250 and 15 × 15 pixels, respectively [ 37 , 38 ].

The traffic sign images in GTSRB are taken from the video captured by the vehicle-mounted camera. As shown in Figure 7 , GTSRB includes 43 classes of traffic signs, and the number of images per class varies. Each class of traffic sign corresponds to a directory, which contains a CSV file annotated with class labels and the images of multiple tracks (each track includes 30 images). In accordance with the different instruction contents, GTSRB can also be divided into six categories: speed limit, danger, mandatory, prohibitory, derestriction and unique traffic signs, as shown in Figure 8 . Images of the same type of traffic sign vary in resolution, illumination conditions, weather conditions, degree of occlusion and tilt, making the dataset more in line with actual road scenes.

Figure 7. The number of 43 classes of traffic signs.

Figure 8. Six categories of traffic signs sample images.

After image preprocessing, an artificial dataset must be generated for GTSRB. Given that the number of different types of traffic signs in GTSRB varies, this condition easily causes the imbalance of sample data. Different types of traffic signs have evident differences during classification and recognition, which affect the generalization of the entire network model. Generating an artificial dataset aims to construct a new artificial sample by randomly sampling from the value space of each attribute feature of the same sample type. Therefore, the number of different kinds of traffic signs is as equal as possible to solve the problem of sample data imbalance. After generating the artificial dataset, the 43 classes of traffic signs are shown in Figure 9 .

Figure 9. The number of 43 classes of traffic signs after generating the artificial dataset.

4.2.2. Traffic Sign Classification and Recognition Experiment

Traffic sign classification and recognition experiment can be divided into two stages, namely, the network training and testing stages. In the network training stage, the training set samples of GTSRB are taken as input. By performing thousands of network iterations, parameters, such as network weights and offsets, are continuously updated on the basis of forward learning and back propagation mechanisms until the loss function is reduced to the minimum, thereby classifying and predicting traffic signs. In the network testing stage, the testing set samples of GTSRB are inputted into the trained network model to test the accurate recognition rate of the training network.

Figure 10 shows the flow chart of the entire traffic sign classification and recognition experiment.

Figure 10. The flow chart of the entire traffic sign classification and recognition experiment.

The basic steps of the network training stage are as follows.

(1) The training set samples are preprocessed, the artificial dataset is generated and the dataset is shuffled.
(2) The Gabor kernel is used as the initial convolutional kernel, the convolutional kernel size is 5 × 5, and the ReLU function is used for activation.
(3) The training set samples are propagated forward through the network model, and a series of parameters are set. The BN is used for data normalization after each pooling layer, and the Adam method is used as the optimizer algorithm. The parameters are set as follows: $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\eta = 0.001$ and $\varepsilon = 1 \times 10^{-8}$. The dropout parameter is set to 0.5 in the fully-connected layers, and the Softmax function is used as the output classifier.
(4) The gradient of the loss function is calculated, and parameters such as network weights and offsets are updated on the basis of the back-propagation mechanism.
(5) The error between the real and the predicted value of the sample is calculated. When the obtained error is lower than the set error or the maximum number of training iterations is reached, training is stopped and step (6) is executed; otherwise, step (1) is repeated for the next network iteration.
(6) The classification test is conducted in the network model. The subordinate categories of traffic signs in the GTSRB are predicted and compared with the real categories. The classification prediction results of traffic signs are counted, and the correct prediction rate is calculated.
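Steps (3) to (5) can be sketched on a toy problem. The example below applies the standard Adam update rule with the hyperparameters listed in step (3) and stops when the loss drops below a set error or a maximum iteration count is reached, as in step (5). The quadratic toy loss, the tolerance, the iteration cap and the function names `adam_step` and `minimize` are assumptions for illustration; this is not the authors' training code.

```python
import math


def adam_step(params, grads, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update over flat lists; returns new (params, m, v)."""
    new_p, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)             # bias correction
        v_hat = v_i / (1 - beta2 ** t)
        new_p.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_p, new_m, new_v


def minimize(grad_fn, loss_fn, w0, tol=1e-6, max_iters=20000):
    """Iterate until the loss is below tol or max_iters is reached."""
    params = list(w0)
    m = [0.0] * len(params)
    v = [0.0] * len(params)
    for t in range(1, max_iters + 1):
        if loss_fn(params) < tol:                  # set error reached
            return params, t - 1
        params, m, v = adam_step(params, grad_fn(params), m, v, t)
    return params, max_iters                       # iteration cap reached


# Toy usage: minimize (w - 1)^2 starting from w = 0.
w, steps = minimize(lambda p: [2 * (p[0] - 1.0)],
                    lambda p: (p[0] - 1.0) ** 2,
                    [0.0])
```

In a real network the gradients would come from back propagation through the convolutional, BN and fully-connected layers rather than from a closed-form expression.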

The basic steps of the network testing stage are as follows.

(1) Several images are randomly selected from the testing set samples, and the images are inputted into the trained network model after preprocessing.
(2) The recognition results are outputted through the network model, showing the meaning of the traffic sign with the highest probability.
(3) The output results are compared with the actual reference meanings, and the statistical recognition results are obtained.
(4) After all the selected sample images have been tested, the accurate recognition rate of traffic signs is calculated.

Figure 11 shows the classification prediction results of some sample images in the network training stage.

Figure 11. The classification prediction results of some sample images in the network training stage.

Figure 12 presents the dynamic change curves of relevant parameters in the network training stage: (a) compares the loss function against the iteration number with and without Gabor kernels, and (b) compares the correct prediction rate against the iteration number with and without BN data normalization.

Figure 12. The dynamic change curves of relevant parameters in the network training stage: (a) loss function versus iteration number with and without Gabor kernels; (b) correct prediction rate versus iteration number with and without BN data normalization.

As shown in Figure 12a, in the improved LeNet-5 network model, the loss function with Gabor kernel initialization falls much faster as the number of network iterations increases than the loss function without Gabor kernel initialization, and drops smoothly toward 0. The reason is that the Gabor filter can extract effective target contour information and remove useless image noise, thereby effectively avoiding over-fitting of the training data, reducing the computational complexity, and further enhancing the robustness and adaptability of the network model. Without the Gabor filter, the training network can easily fall into a local optimal solution, which slows the updating of network parameters such as weights and offsets. Figure 11 and Figure 12b show that a good sample image classification prediction effect is achieved in the network training stage: the correct prediction rate with BN data normalization increases with the iteration number, and its highest value reaches about 99.82%. When BN data normalization is not used, the correct prediction rate fluctuates strongly and its highest value is only about 75%. The reason is that adding BN data normalization not only effectively avoids the gradient dispersion phenomenon but also accelerates the convergence of the training model, makes training more stable, and considerably enhances the generalization ability.
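To make the Gabor initialization more concrete, the following sketch builds a 5 × 5 real-valued Gabor kernel from the commonly used formulation (a Gaussian envelope multiplied by a cosine carrier along a rotated axis). The parameter values (`sigma`, `lambd`, `gamma`, `psi`) and the four-orientation bank are illustrative assumptions; the section above does not state the values used in the experiments.

```python
import math


def gabor_kernel(size=5, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows.

    theta: orientation in radians, lambd: carrier wavelength,
    gamma: spatial aspect ratio, psi: phase offset (all assumed values).
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# A small bank of orientations that could seed the first convolutional layer.
bank = [gabor_kernel(theta=i * math.pi / 4) for i in range(4)]
```

Kernels like these respond to edges at a given orientation, which is consistent with the contour-extraction argument given for the faster loss decrease; the network is still free to adjust them during training.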

In the network testing stage, eight different traffic sign test images are randomly selected from the testing set samples and numbered automatically. Figure 13 shows the auto-numbered traffic sign test images.

Figure 13. The auto-numbered traffic sign test images.

The traffic sign test images are inputted into the trained improved LeNet-5 network model for classification and recognition. For each test image, the traffic signs with the five highest probabilities are outputted, and the sign with the maximum probability is selected as the recognition result and compared with the actual reference meaning. Figure 14 shows the recognition results of traffic sign test images in the network testing stage.

Figure 14. The recognition results of traffic sign test images in the network testing stage.
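The top-five output described above can be sketched as follows, assuming the network produces one raw score (logit) per class before the Softmax layer. The function names and the example scores are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k (class_index, probability) pairs with highest probability."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Hypothetical scores for the 43 GTSRB classes, with class 14 dominant.
scores = [0.0] * 43
scores[14] = 10.0
candidates = top_k(softmax(scores))
predicted_class = candidates[0][0]           # maximum-probability class
```

The first entry of `candidates` is the recognition result; the remaining four are the runner-up classes that would be displayed alongside it.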

Figure 14 shows that the maximum-probability recognition results of the eight traffic sign test images are completely consistent with their true meanings, and all of them are recognized with a probability close to 100%. The recognition results in the network testing stage show that the trained improved LeNet-5 CNN model has excellent classification and recognition ability, strong anti-jamming ability and a high accurate recognition rate for traffic sign datasets with different backgrounds and interferences, thereby reflecting admirable robustness and accuracy.

4.2.3. Statistics and Analysis of Experimental Results

To test the comprehensive recognition performance of the improved LeNet-5 network model for different types of traffic signs, 1000 test images are randomly selected from each of the six coarse categories of traffic signs in the GTSRB for classification and recognition. Table 3 lists the classification and recognition test results of the six categories of traffic signs.

Table 3. The classification and recognition test results of six categories of traffic signs.

In Table 3, TP (True Positive) is the number of test images in which traffic signs are correctly recognized, and FN (False Negative) is the number of test images in which traffic signs are misrecognized or missed. In the traffic sign classification and recognition test experiments, the unique traffic signs perform best owing to their fixed contours and distinctive features: the accurate recognition rate reaches 100.00%, and the average processing time per frame is 4.7 ms. The derestriction traffic signs perform worst because their contours are consistent and their features are similar. Even so, their accurate recognition rate reaches 99.40%, and the average processing time per frame is 6.4 ms. Overall, the average accurate recognition rate of the six categories of traffic signs reaches 99.75%, and the average processing time per frame is 5.4 ms. On this basis, the improved LeNet-5 network model has excellent image recognition performance, and the proposed traffic sign recognition algorithm has good real-time performance and adaptability.
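The accurate recognition rate follows directly from the TP and FN counts. In the small sketch below, the counts for the two categories named in the text are derived from the reported rates under the assumption of 1000 test images per category (994 of 1000 for 99.40%, 1000 of 1000 for 100.00%); they are not copied from Table 3.

```python
def recognition_rate(tp, fn):
    """Accurate recognition rate in percent: TP / (TP + FN) * 100."""
    return 100.0 * tp / (tp + fn)


# Counts derived from the reported rates, assuming 1000 images per category.
unique_rate = recognition_rate(tp=1000, fn=0)
derestriction_rate = recognition_rate(tp=994, fn=6)
```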

Sorting and analyzing the falsely recognized or missed test images shows that the majority of these errors are caused by extremely dark lighting, extremely low resolution, motion blur and excessive tilt. In the future, more complex network models should be built to address these problems, and more abundant datasets should be adopted so that the CNN can accurately recognize additional traffic signs affected by interference factors. In this manner, the inclusiveness and stability of the traffic sign recognition algorithm can be continuously improved.

4.3. Performance Comparison of Recognition Algorithms

To verify its performance, the proposed algorithm is compared with traffic sign recognition algorithms from other literature. Table 4 lists the comparison of algorithm performance statistics based on the GTSRB dataset.

Table 4. The comparison of statistics in algorithm performance based on the GTSRB dataset.

In the performance comparison experiment, the proposed algorithm and the algorithms from other literature were all tested on the GTSRB dataset. In reference [ 39 ], a traffic sign extraction method based on oriented gradient maps and the Karhunen-Loeve transform was adopted, which achieved good test results by reducing the number of attributes and combining them with a multilayer perceptron. Although the average processing time of this algorithm was relatively short, its accurate recognition rate was the lowest among the compared algorithms. Therefore, this algorithm is more likely than the others to cause false or missed recognition in actual road scenes. In reference [ 40 ], iterative nearest neighbors-based linear projection was combined with an iterative nearest-neighbor classifier. Multiple HOG features were used for detection, and sparse representations were adopted for classification, thereby achieving good recognition performance. Compared with reference [ 39 ], the accurate recognition rate was considerably improved, but the average processing time was excessively long, and real-time performance was poor when applied to actual road scenes. In reference [ 41 ], a traffic sign recognition method based on the histogram of oriented gradients was utilized. By combining a Gaussian filter and histogram equalization for image preprocessing, using principal component analysis for dimensionality reduction, and using a kernel extreme learning machine (K-ELM) classifier, good classification accuracy was achieved, and the average processing time was further shortened. In reference [ 42 ], a weighted multi-CNN was trained by a new training method, and good recognition accuracy was obtained. Although the running environment of the algorithm included both GPU and CPU, the average processing time was still relatively long.
Deep learning-based methodologies can still be further improved because of the complex structure of the training model, the large amount of calculation, the long training time and the poor real-time performance. Compared with the aforementioned literature, the proposed algorithm has the best overall performance when using the same dataset. The accurate recognition rate reaches 99.75%, and the average processing time per frame is 5.4 ms. The generalization ability and recognition efficiency of the network model are also remarkably improved. In terms of performance improvement, evident advantages are observed. The fully improved traffic sign recognition accuracy is conducive to considerably enhancing the driving safety of intelligent vehicles in the actual driving environments. Meanwhile, the fully shortened average processing time is conducive to meeting the real-time target requirements of intelligent vehicles in the actual driving environments effectively. Thus, this study contributes to further improving the technical level of intelligent vehicle driving assistance.

5. Conclusions

In this study, an improved traffic sign detection and recognition algorithm is proposed for intelligent vehicles. Firstly, the HSV color space is used for spatial threshold segmentation, and traffic signs are effectively detected based on the shape features. Secondly, the model is considerably improved on the basis of the classical LeNet-5 CNN model by using the Gabor kernel as the initial convolutional kernel, adding BN processing after the pooling layer, and selecting the Adam method as the optimizer algorithm. Finally, the traffic sign classification and recognition experiments are conducted based on the GTSRB. The favorable prediction and accurate recognition of traffic signs are achieved through the continuous training and testing of the network model. The experimental results show that the accurate recognition rate of traffic signs reaches 99.75%, and the average processing time per frame is 5.4 ms. The proposed algorithm has higher accuracy, better real-time performance, stronger generalization ability and higher training efficiency than the other algorithms compared. The accurate recognition rate and average processing time are significantly improved.

From the viewpoint of traffic sign recognition accuracy and processing time, the proposed traffic sign detection and recognition algorithm has remarkable advantages. It is conducive to considerably enhancing the driving safety of intelligent vehicles in actual driving environments and to effectively meeting the real-time target requirements of smart cars. Furthermore, a strong technical guarantee is provided for the steady development of intelligent vehicle driving assistance. In the future, the inclusiveness and anti-error recognition of the traffic sign recognition algorithm can be further optimized and improved to exploit the overall performance of the algorithm.

Author Contributions

J.C. designed the method, performed the experiments and analyzed the results. C.S. provided overall guidance for the study. S.P. reviewed and revised the paper. F.X. offered crucial suggestions about the experiment. S.S. put forward the idea and debugged the model in Python.

This research is supported by the International Cooperation Project of the Ministry of Science and Technology of China (Grant No. 2010DFB83650) and Natural Science Foundation of Jilin Province (Grant No. 201501037JC).

Conflicts of Interest

The authors declare no conflict of interest.

Traffic Sign Recognition: A Survey

Traffic-Sign Recognition Using Deep Learning

Conference paper
First Online: 18 March 2021
Zhongbing Qin & Wei Qi Yan

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1386))

Included in the following conference series:

International Symposium on Geometry and Vision

Traffic-sign recognition (TSR) has been an essential part of driver-assistance systems; it helps drivers avoid a vast number of potential hazards and improves the driving experience. However, TSR is a realistic task that is full of constraints, such as the visual environment, physical damage, and partial occlusions. In order to deal with these constraints, convolutional neural networks (CNN) are used to extract visual features of traffic signs and classify them into the corresponding classes. In this project, we initially created a benchmark (NZ-Traffic-Sign 3K) for traffic-sign recognition in New Zealand. In order to determine which deep learning models are the most suitable for TSR, we chose two kinds of models to conduct deep learning computations: Faster R-CNN and YOLOv5. According to the scores of various metrics, we summarized the pros and cons of the selected models for the TSR task.

Traffic signs
Faster R-CNN
NZ-Traffic-Sign 3K

Author information

Authors and affiliations.

Auckland University of Technology, Auckland, New Zealand

Zhongbing Qin & Wei Qi Yan

Corresponding authors

Correspondence to Zhongbing Qin or Wei Qi Yan .

Editor information

Editors and affiliations.

Minh Nguyen

Auckland Bioengineering House, Auckland, New Zealand

About this paper

Cite this paper.

Qin, Z., Yan, W.Q. (2021). Traffic-Sign Recognition Using Deep Learning. In: Nguyen, M., Yan, W.Q., Ho, H. (eds) Geometry and Vision. ISGV 2021. Communications in Computer and Information Science, vol 1386. Springer, Cham. https://doi.org/10.1007/978-3-030-72073-5_2

DOI : https://doi.org/10.1007/978-3-030-72073-5_2

Published : 18 March 2021

Publisher Name : Springer, Cham

Print ISBN : 978-3-030-72072-8

Online ISBN : 978-3-030-72073-5

International Journal of Engineering Research & Technology (IJERT)

Volume 10, Issue 05 (May 2021)

Traffic sign recognition using machine learning: a review.

Authors : Vaibhavi Golgire
Paper ID : IJERTV10IS050422
Volume & Issue : Volume 10, Issue 05 (May 2021)
Published (First Online): 04-06-2021
ISSN (Online) : 2278-0181
Publisher Name : IJERT

Vaibhavi Golgire

Department of Computer Engineering, Pimpri Chinchwad College of Engineering, Savitribai Phule Pune University, Pune, Maharashtra, India

Abstract:- Traffic signs convey a series of warnings about the route. They keep traffic moving by helping travelers reach their destinations and by giving them advance notice of arrival, exit, and turn points. Road signs are placed in specific positions to ensure the safety of travelers. They also provide guidance on when and where drivers may or may not turn. In this paper, we propose a system for traffic sign detection and recognition, as well as a method for extracting a road sign from a natural complex image, processing it, and alerting the driver through a voice command. It is applied in such a way that it helps drivers make fast decisions. In real-time situations, factors such as shifting weather conditions, changing light directions, and varying light intensity make traffic sign identification challenging. The reliability of the system is influenced by a number of factors such as noise, partial or complete underexposure, partial or complete overexposure, significant variations in color saturation, a wide variety of viewing angles, view depth, and shape/color deformations of traffic signs (due to light intensity). The proposed architecture is divided into three phases. The first is image pre-processing, in which we quantify the dataset's input files, determine the input size for learning purposes, and resize the information for the learning step. In the second phase, the recognition process, a Convolutional Neural Network categorizes the observed sign. The third phase deals with text-to-speech translation, in which the sign detected in the second phase is presented in audio format.

Keywords- Convolutional Neural Network, Machine Learning, Image Preprocessing, Feature Extraction, Segmentation, Data Augmentation, Text-to-Speech Conversion.

INTRODUCTION

According to official statistics, about 400 road accidents occur in India every day. Road signs help to avoid accidents on the road, ensuring the safety of both drivers and pedestrians. Additionally, traffic signs ensure that road users adhere to specific laws, minimizing the likelihood of traffic violations. Route navigation is also made easier by the use of traffic signs. Road signs should be prioritized by all road users, whether they are drivers or pedestrians. We overlook traffic signs for a variety of reasons such as problems with concentration, exhaustion, and sleep deprivation. Other causes that contribute to missing the signs include poor vision, the influence of the external world, and environmental circumstances. It is therefore important to use a system that can recognize traffic signs and advise and warn the driver. Image-based traffic-sign recognition technologies analyze images captured by a car's front-facing camera in real time to recognize signs. They help the driver by giving him or her warnings. The detection and recognition modules are the key components of a vision-based traffic sign recognition system. The detection module locates the sign area in the image/video, while the recognition module recognizes the sign. During the detection process, the sign regions with the highest probability are selected and fed into the recognition system to classify the sign. For traffic sign recognition, various machine learning algorithms such as SVM, KNN, and Random Forest can be used [6]. However, the key disadvantage of these algorithms is that feature extraction must be done separately; a CNN, on the other hand, performs feature extraction on its own [1]. As a result, the proposed system employs a convolutional neural network. Before the recognition stage, an input preprocessing module prepares the image captured by the vehicle camera. After recognition, the driver receives a voice warning message.

RELATED WORK

In any kind of study, the most critical step is the literature review. This step allows us to identify gaps or flaws in existing approaches and to look for ways around their limitations. In this section, we briefly discuss related work on traffic sign detection and recognition. A comparative analysis of the reference articles is shown in Table 1.

Wasif Arman Haque, Samin Arefin, A.S.M. Shihavuddin, and Muhammad Abul Hasan [1] describe a novel lightweight CNN architecture for traffic sign recognition without GPU requirements. The authors focus on the main challenges in detecting traffic signs in real-time scenarios, which include image distortion, speed, motion effects, noise, and faded sign colors. Training only on grayscale images gives average accuracy. The authors therefore propose the DeepThin architecture, which is divided into three modules: input processing, learning, and prediction. The architecture is deep and thin at the same time: thin because a small number of feature maps is used per layer, and deep because four layers are used. Because they use small input images, a small number of feature maps, and large convolution strides, it becomes possible to train without a GPU. The use of overlapping max pooling and sparsely used strided convolution makes training faster and reduces overfitting. Data augmentation is performed in order to achieve robustness. The augmentation operations include random shearing, zooming in and out, and horizontal and vertical shifting of the training images. The German Traffic Sign Recognition Benchmark and the Belgian Traffic Sign Classification dataset are used for experimentation. Hyperparameter tuning is done for the kernel size and the feature maps. During the training phase, the CNN model is trained with the backpropagation learning algorithm, cross-entropy loss, and stochastic gradient descent (SGD) as the optimizer.

Shijin Song, Zhiqiang Que, Junjie Hou, Sen Du, and Yuefeng Song [2] describe an efficient convolutional neural network for small traffic sign detection. In this paper, the researchers focus on the issues of small object detection, propose an efficient convolutional neural network for small traffic sign detection, and compare its accuracy against R-CNN and Faster R-CNN. The CNN model is explained in detail, along with forward propagation, backward propagation, and loss functions. The authors increase the number of convolutional kernels per convolutional layer from the start and use max-pooling layers with a stride of 2 to down-sample the network in the feature extraction phase. To optimize the model further, three strategies are used: convolution factorization, redundant layer cropping, and fully connected transformation. The Tsinghua-Tencent dataset is used for evaluation. The proposed model is not only efficient but also consumes less GPU memory and saves computation cost.

Aashrith Vennelakanti, Smriti Shreya, Resmi Rajendran, Debasis Sarkar, Deepak Muddegowda, and Phanish Hanagal [3] describe traffic sign detection and recognition using a CNN ensemble. The proposed system is divided into two modules, detection and recognition, and it is evaluated on the Belgium dataset and the German Traffic Sign Benchmark. Detection involves capturing images of the traffic sign and locating the object in the image; in the recognition stage, a convolutional neural network ensemble assigns a label to the detected sign. In the first phase, the Hue Saturation Value (HSV) color space is used instead of RGB because the HSV model is more similar to the way the human eye processes images and it covers a wide range of colors. Color-based detection and shape-based detection are then implemented. In color-based detection, the red values of the sign are checked, and if they fall within a particular threshold, that part of the image is examined to see whether a sign is present. The Douglas-Peucker algorithm is then used for shape-based detection. The authors focus on only two shapes, the circle and the triangle. The algorithm determines the area from the number of edges detected in the image, and bounding boxes are used to separate the ROI. The sign inside the bounding box is then validated by applying image thresholding and an inversion filter. In the second phase, the detected sign is classified using a feed-forward CNN with six convolutional layers. Because an ensemble method is used, the aggregated result of three CNNs is the final output. They achieve 98.11% accuracy for triangular traffic signs and 99.18% for circular ones.
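The color-based detection step summarized above can be sketched with Python's standard `colorsys` module. The example marks pixels whose hue lies near red and whose saturation and value are high enough; the threshold values and the helper names `is_red` and `red_mask` are illustrative assumptions and are not the thresholds used in [3].

```python
import colorsys


def is_red(r, g, b, hue_tol=0.05, min_sat=0.5, min_val=0.3):
    """Return True if an 8-bit RGB pixel looks red in HSV space.

    Thresholds are assumed values for illustration only.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Red sits at both ends of the hue circle (h near 0 or near 1).
    near_red = h <= hue_tol or h >= 1.0 - hue_tol
    return near_red and s >= min_sat and v >= min_val


def red_mask(image):
    """Build a boolean mask for an image given as rows of (r, g, b) tuples."""
    return [[is_red(*px) for px in row] for row in image]


# Tiny 1 x 3 example image: a red, a blue and a gray pixel.
mask = red_mask([[(220, 20, 30), (30, 60, 200), (128, 128, 128)]])
```

Connected regions of the mask would then be passed to the shape-based stage, where contours are simplified and checked for circular or triangular outlines.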

Domen Tabernik and Danijel Skočaj [4] describe deep learning for large-scale traffic-sign detection and recognition. In this paper, a convolutional neural network (CNN), Mask R-CNN, is used for traffic sign detection and recognition. The authors use a CNN for full feature extraction rather than the Hough transform, the scale-invariant feature transform, or local binary patterns. To handle real-world variation in traffic sign appearance and distortion, they also implement a data augmentation method. The Swedish traffic-sign dataset (STSD) is used for the evaluation of Faster R-CNN and Mask R-CNN. To have low inter-class and high intra-class variability, they produce a new dataset called the DFG traffic-sign dataset. Modifications are made to Mask R-CNN to improve the overall recall and average precision.

Ivona Matoš, Zdravko Krpić, Krešimir Romić

EXISTING SYSTEM

A considerable amount of work has been put forward in the area of traffic sign detection and recognition. Several authors concentrated on color and shape, two global characteristics of traffic signs, for detection. These features can be used to detect and track a moving object across a series of frames. The color-based approach is helpful when the target to be identified has a distinctive color that differs from the background. To detect an object with a certain shape, object borders, corners, and contours may be used. However, the authors focused only on detection and recognition, ignoring voice output, which is an essential part of a driver warning system. In addition, hyperparameter tuning has received less attention. As a result, the proposed system concentrates on different parameters of the CNN algorithm in order to improve accuracy without requiring additional computing resources.

PROPOSED SOLUTION

In the proposed system, traffic sign detection and recognition are achieved with a CNN algorithm. Before classification, the input is preprocessed in order to remove noise, reduce complexity and improve the precision of the implemented algorithm. Since we cannot write a special algorithm for each condition under which an image is taken, we transform images into a format that can be handled by a general algorithm. At the end, a voice alert message is given to the driver.

Image Preprocessing:

Gray Scale Conversion: To save space or reduce computational complexity, it can be helpful to remove redundant detail from images, for example by converting color images to grayscale. Color is not always needed to identify and interpret the objects in an image, and grayscale may be sufficient for identifying such objects [1][3]. Color images hold more detail than grayscale images, so they can add needless complexity and take up more memory. Color images are represented in three channels, so converting them to grayscale reduces the number of values that need to be processed per pixel from three to one. For traffic signs, gray values are sufficient for recognition.
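As a minimal sketch of the conversion, the snippet below collapses each (R, G, B) triple into a single luminance value. The weights 0.299, 0.587 and 0.114 are the commonly used ITU-R BT.601 luma coefficients; the text does not say which weighting the proposed system uses, so they are an assumption here, and the function name is hypothetical.

```python
def to_grayscale(image):
    """Convert a 2D grid of (r, g, b) tuples to a 2D grid of gray values (0-255)."""
    return [
        # Assumed BT.601 luma weights; one output value replaces three channel values.
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]

# 2x2 color image: red, green, blue, white.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
gray = to_grayscale(rgb)
```

The output grid has the same height and width as the input but stores one value per pixel instead of three.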

Thresholding and Segmentation: Segmentation is the method of partitioning a visual image into different subgroups (of pixels) called Image Objects, which reduces the image's complexity and makes image analysis easier. Thresholding is the method of using an optimal threshold to transform a grayscale input image to a bi-level image [4].
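The thresholding step can be sketched as follows. The text does not name the method used to find the optimal threshold; Otsu's method, which picks the threshold that maximizes the between-class variance of the gray-level histogram, is used here purely as an assumed example. Function names are hypothetical, and the input is assumed to be a grid of integer gray values from 0 to 255.

```python
def otsu_threshold(gray):
    """Return the gray level that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]          # number of pixels with value <= t
        sum_bg += t * hist[t]
        w_fg = total - w_bg      # number of pixels with value > t
        if w_bg == 0 or w_fg == 0:
            continue             # one class is empty, variance undefined
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

def binarize(gray, threshold):
    """Turn a grayscale grid into a bi-level grid: 1 above the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in gray]

gray = [
    [10, 12, 200],
    [11, 198, 202],
    [9, 13, 201],
]
t = otsu_threshold(gray)
binary = binarize(gray, t)
```

In this toy image the dark and bright pixels form two well-separated groups, so the chosen threshold falls between them and the bright pixels become the foreground.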

Traffic sign recognition:

Deep Learning is a subdomain of Machine Learning that includes Convolutional Neural Networks. Deep Learning algorithms are loosely modeled on the way the human brain processes information, but on a much smaller scale. Image classification entails extracting features from an image in order to identify patterns in a dataset. We use a CNN for traffic sign recognition because it is very good at feature extraction [1][2]. A CNN uses filters, which come in a variety of shapes and sizes depending on their intended use. Filters allow us to take advantage of an image's spatial locality by imposing a local connectivity pattern between neurons. Convolution is the process of multiplying two functions pointwise and summing the products to create a new function. Our matrix of image pixels is one function and our filter is the other. Sliding the filter over the image and taking the dot product of the two matrices at each position produces a matrix called an "Activation Map" or "Feature Map". The network is made up of several convolutional layers that extract features from the image. A CNN can be optimized with the help of hyperparameter optimization, which finds the hyperparameters of a given machine learning algorithm that deliver the best performance as measured on a validation set. Hyperparameters must be set before the learning process can begin [1]. The learning rate and the number of units in a dense layer are examples. Our system will consider the dropout rate, learning rate, kernel size and optimizer as hyperparameters.
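A simple way to picture hyperparameter optimization is an exhaustive grid search over the four hyperparameters named above. The sketch below is only illustrative: the candidate values are assumptions, and validation_accuracy is a hypothetical stand-in that returns an arbitrary toy score. A real version would train the CNN with the given configuration and measure accuracy on a held-out validation set.

```python
from itertools import product

# Assumed candidate values, for illustration only.
search_space = {
    "dropout_rate": [0.2, 0.5],
    "learning_rate": [1e-2, 1e-3],
    "kernel_size": [3, 5],
    "optimizer": ["sgd", "adam"],
}

def validation_accuracy(config):
    """Hypothetical placeholder: returns an arbitrary toy score, not a measured result."""
    score = 0.90
    score += 0.03 if config["optimizer"] == "adam" else 0.0
    score += 0.02 if config["learning_rate"] == 1e-3 else 0.0
    score += 0.01 if config["kernel_size"] == 3 else 0.0
    score -= abs(config["dropout_rate"] - 0.5) * 0.05
    return score

def grid_search(space, evaluate):
    """Try every combination in `space` and keep the best-scoring configuration."""
    names = list(space)
    best_config, best_score = None, float("-inf")
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best_config, best_score = grid_search(search_space, validation_accuracy)
```

Grid search is the easiest strategy to reason about, but its cost grows multiplicatively with each added hyperparameter; random search is a common alternative when the grid becomes large.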

Convolutional Neural Network Architecture

Convolution Layer

This layer is the major building block of a CNN. It performs the convolution operation to identify various features in the given image [1]. It scans the entire pixel grid and computes a dot product at each position. A filter, or kernel, represents one of the features that we want to identify in the input image. For example, there may be separate filters for edges, curves, blurring, sharpening, etc. As we go deeper into the network, more complex features can be identified.
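The sliding dot product can be sketched in a few lines. This is a plain-Python illustration with no padding and a stride of 1, not the layer used in the proposed system. As in most CNN libraries, the kernel is applied without flipping (strictly speaking, cross-correlation). The vertical-edge kernel and the toy image are assumptions chosen for illustration.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map

# Dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# Responds strongly where intensity increases from left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = convolve2d(image, vertical_edge)
```

The feature map is zero over the flat dark region and positive where the kernel overlaps the dark-to-bright boundary, which is how a filter signals the presence of its feature.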

Pooling Layer

This layer is used for downsampling the features. It reduces the dimensionality of a large image while retaining important features, which helps reduce the amount of computation and the number of weights. One can choose max pooling or average pooling depending on the requirement. Max pooling takes the maximum value from each window of the feature map, while average pooling takes the average of all values in the window.
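A minimal sketch of both pooling variants, assuming non-overlapping 2x2 windows (a common choice; the text does not specify the window size). The function name and the sample feature map are hypothetical.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Downsample with non-overlapping size x size windows using max or average."""
    combine = max if mode == "max" else (lambda window: sum(window) / len(window))
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(combine(window))
        pooled.append(row)
    return pooled

fm = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 7, 3],
]
max_pooled = pool2d(fm, mode="max")
avg_pooled = pool2d(fm, mode="avg")
```

Both calls turn the 4x4 map into a 2x2 map, so the layers that follow process a quarter of the values.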

Activation Function

This layer introduces non-linear properties to the network. It helps decide which information should be processed further and which should not. The weighted sum of the inputs becomes the input signal to the activation function, which produces one output signal.

This step is crucial because, without an activation function, the output signal would be a simple linear function, which has limited capacity to learn complex patterns. Types of activation function include the sigmoid, tanh, ReLU, identity and binary step functions. The sigmoid function is commonly used in networks trained with backpropagation and has a range of 0 to 1, while tanh has a range of -1 to 1; because tanh is zero-centered, optimization is easier with it. The range of ReLU is 0 to infinity, and it is the most popular activation function.
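The three most common activation functions, and the way a single neuron applies one to its weighted sum, can be sketched as follows. The neuron helper and the sample inputs, weights and bias are hypothetical illustrations.

```python
import math

def sigmoid(x):
    """Squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real input into (-1, 1); zero-centered."""
    return math.tanh(x)

def relu(x):
    """Passes positive inputs unchanged and clips negatives to 0."""
    return max(0.0, x)

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Weighted sum is 0.5 * 1.0 + (-1.0) * 2.0 + 0.5 = -1.0
out_relu = neuron([1.0, 2.0], [0.5, -1.0], 0.5, relu)
out_sigmoid = neuron([1.0, 2.0], [0.5, -1.0], 0.5, sigmoid)
```

With the same weighted sum of -1.0, ReLU outputs exactly 0 while the sigmoid outputs a small positive value, which shows how the choice of activation changes what is passed to the next layer.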

Flattening Layer

The output of the pooling layer is in the form of a 3D feature map, and we need to pass data to the fully connected layer in the form of a 1D feature vector. This layer therefore transforms the feature map into a one-dimensional list; for example, a 3*3 matrix becomes a list of nine values.
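Flattening is a pure reshaping step, as the short sketch below shows. The [channels][rows][columns] layout is an assumption for illustration; frameworks differ in how they order the axes.

```python
def flatten(feature_maps):
    """Flatten a [channels][rows][cols] nested list into one 1D list."""
    return [
        value
        for channel in feature_maps
        for row in channel
        for value in row
    ]

# Two pooled 2x2 feature maps (2 channels x 2 rows x 2 columns).
pooled = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
vector = flatten(pooled)
```

No values are changed or discarded; the eight entries are simply laid out in a single row so that each can be connected to the next layer.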

Fully connected Layer

The actual classification happens in this layer. It takes the end result of the convolution or pooling layers, passed through the flattening layer, and reaches a classification decision. Here every input is connected to every output by weights. It combines the features into higher-level attributes that better predict the classes.
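A fully connected layer can be sketched as one weighted sum per output class. The weights, biases and class names below are made-up illustrative values, not trained parameters, and the softmax step that turns scores into probabilities is an assumption: the text does not say how the final scores are normalized.

```python
import math

def dense(inputs, weights, biases):
    """One output per weight row: every input contributes to every output."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

features = [1.0, 0.0, 2.0]            # flattened feature vector
weights = [
    [0.2, 0.1, 0.4],                  # weights for class 0
    [0.5, 0.3, -0.2],                 # weights for class 1
    [-0.1, 0.2, 0.1],                 # weights for class 2
]
biases = [0.0, 0.1, 0.0]
class_names = ["stop", "yield", "speed limit"]  # hypothetical labels

scores = dense(features, weights, biases)
probabilities = softmax(scores)
predicted = class_names[probabilities.index(max(probabilities))]
```

The class with the highest probability is taken as the recognized sign.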

Output of recognized sign in audio format:

At present, the driver has to read the text written on the classified sign; a speech module makes this more comfortable. A text-to-speech module will alert the driver to the detected sign. In Python, there are many APIs available for converting text to voice. gTTS (Google Text-to-Speech) is one of them. gTTS is a simple tool that converts entered text into audio that can be saved as an mp3 file. It supports several languages, and the audio can be delivered at normal or slow speed.
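A minimal sketch of the voice alert step with gTTS is shown below. The message wording, the function names and the output file name are assumptions for illustration. gTTS is a third-party package that needs network access, so it is imported only inside the function that uses it.

```python
def build_alert_message(sign_label):
    """Turn a predicted class label into a short spoken warning (assumed wording)."""
    return f"Attention: {sign_label} sign ahead."

def save_voice_alert(sign_label, path="alert.mp3", lang="en"):
    """Synthesize the alert with gTTS and save it as an mp3 file.

    Requires the third-party `gtts` package and an internet connection.
    """
    from gtts import gTTS  # lazy import so the rest of the code runs without gtts

    tts = gTTS(text=build_alert_message(sign_label), lang=lang, slow=False)
    tts.save(path)
    return path

message = build_alert_message("Stop")
# save_voice_alert("Stop")  # uncomment where gtts is installed and a network is available
```

Playing the saved file to the driver is left to whatever audio player is available on the target platform.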

CONCLUSION

We presented a literature review on traffic sign identification using machine learning techniques, together with a comparative study and analysis of these techniques. CNN performs well for recognition, and with the aid of hyperparameter tuning, accuracy or recognition rate can be improved. Therefore, in the proposed scheme for a traffic sign warning system for drivers, we use a CNN for traffic sign recognition. Images will be taken with a camera mounted on the car during the image acquisition stage, and after preprocessing, recognition will be done using the CNN algorithm. The system issues a voice alert when a traffic sign is identified. This model can be used in circumstances requiring precise navigation.

VII. REFERENCES

[1] W. Haque, S. Arefin, A. Shihavuddin and M. Hasan, "DeepThin: A novel lightweight CNN architecture for traffic sign recognition without GPU requirements," Expert Systems with Applications, vol. 168, p. 114481, 2021.

[2] S. Song, Z. Que, J. Hou, S. Du and Y. Song, "An efficient convolutional neural network for small traffic sign detection," Journal of Systems Architecture, vol. 97, pp. 269-277, 2019. doi: 10.1016/j.sysarc.2019.01.012.

[3] A. Vennelakanti, S. Shreya, R. Rajendran, D. Sarkar, D. Muddegowda and P. Hanagal, "Traffic Sign Detection and Recognition using a CNN Ensemble," 2019 IEEE International Conference on Consumer Electronics (ICCE), 2019, pp. 1-4.

[4] D. Tabernik and D. Skočaj, "Deep Learning for Large-Scale Traffic-Sign Detection and Recognition," IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 4, pp. 1427-1440, April 2020.

[5] I. Matoš, Z. Krpić and K. Romić, "The Speed Limit Road Signs Recognition Using Hough Transformation and Multi-Class SVM," 2019 International Conference on Systems, Signals and Image Processing (IWSSIP), 2019, pp. 89-94.

[6] D. Xiao and L. Liu, "Super-resolution-based traffic prohibitory sign recognition," 2019.
