Http://210.77.192.38/bugzhao/_all_others/mywork/articles/bpnet/bpnet.pdf
I have made a PDF file. The complete original article can be downloaded from the link above and read as a PDF.
The abstract is as follows:
Summary:
The Back Propagation network (BP network) is a multi-layer feedforward network trained by adjusting its weights. The BP network can be seen as an extension of the multilayer perceptron: information propagates forward while the error signal propagates backward. BP networks are mainly applied to pattern classification, function approximation, data compression, and so on. This article describes a fairly simple three-layer BP network model and applies it to an example: recognition of the digits 0 to 9. The principle of the BP network is given, together with the design process of the network, its concrete implementation, and the improvements made to it. A large amount of experimental data is attached. The program is written in VC 6.0, with a friendly GUI interface.
Keywords:
Artificial neural network, BP network, back-propagation algorithm, digit recognition
Abstract:
Back Propagation Network (BP net for short) is a classical ANN that conveys information forward and propagates errors backward to adjust the weights of the network so as to satisfy the LMS criterion. The BP net is based on the multilayer perceptron network. It is mainly used in the fields of pattern classification, function approximation, and data compression. This paper primarily introduces the principle of a classical 3-layer BP net and gives a practical example: recognition of noisy digits (0 to 9). The design process is described in detail, and an implementation program is also given, developed using Visual C++ 6.0. Plenty of tables and graphs are included in this article where necessary.
Keywords:
ANN, Back Propagation Network, BP, Digit Recognition
1. Introduction: Overview of Neural Networks and the BP Network
Here I briefly summarize the general development of artificial neural networks and, in particular, the principle of the BP network. An artificial neural network is an abstraction of the biological nervous system. In a neural network, the most basic unit is the neuron. A biological neuron consists of three parts: dendrites, a cell body, and an axon. The dendrites form a tree-shaped network of nerve fibers that carry electrical signals into the cell body; the cell body integrates these input signals and applies a threshold; the axon is a single long fiber that carries the output signal of the cell body to other neurons. The arrangement of the neurons and the connection strengths of the synapses establish the function of the neural network. Some neural structures are present at birth, while others are formed during learning. In the process of learning, new connections may be created and others may disappear, but the main change is the strengthening or weakening of synaptic connections.

The figure above abstracts the biological neuron into a mathematical model of signal transfer and feedback. The input signal p of the neuron is summed by an accumulator and then sent through an activation function f, yielding the neuron's output a. The output a can in turn serve as the input of one or more other neurons, so that the neural signal spreads through a network. A neuron can accept multiple inputs, so expressing neurons in vector and matrix form makes it easier to analyze practical problems. Here p denotes an R-dimensional input vector, b is the bias vector, W is the weight matrix of the network, f is the activation (transfer) function, and a is the output vector of the network.

The above describes a single-layer neural network. In practice, multi-layer networks are widely used; the BP network used later is also multi-layer. A multi-layer network generally has at least 3 layers: one input layer, one output layer, and one or more hidden layers. Multi-layer networks can solve many problems that single-layer networks cannot. For example, a multi-layer network can perform nonlinear classification, and it can do so to high precision, as long as it has enough layers and enough neurons. The numbers of neurons in the input and output layers of a multi-layer network are defined by the external description of the problem: if there are 4 external variables as input, then there are 4 input neurons. The choice of the number of hidden neurons will be discussed in detail in the design of the BP network.

As mentioned above, the perceptron is the foundation of neural networks and of the BP network. A perceptron takes inputs of one or more known categories and is trained so that the network classifies all of the input data correctly. Note that the perceptron is a single-layer network. Here, training means repeatedly adjusting the weights of the network according to the perceptron's output until all classifications are correct. In general, however, the classification capability of the perceptron is rather limited.
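In symbols, the single-layer model just described (with the quantities defined above) is commonly written as:

$$\mathbf{a} = f(\mathbf{W}\mathbf{p} + \mathbf{b})$$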
Below is an outline of the most widely used neural network: the BP network. Frank Rosenblatt's perceptron learning rule and Widrow's LMS algorithm were designed for training single-layer networks. In practical applications, single-layer networks are far from sufficient, so multi-layer networks came into being. But multi-layer networks face a problem: how can the LMS algorithm be applied, and how should the weight adjustments be apportioned among the layers? The BP algorithm solves this problem well. In the BP algorithm, information first propagates forward, exactly as in the networks described earlier. The success of the BP algorithm lies in providing a fast method for computing the error of each layer, so that the weights and biases of every layer can be adjusted and the LMS algorithm applied effectively in a multi-layer network. The input to the algorithm is a series of correct, known input/output sample pairs: {p1, t1}, {p2, t2}, {p3, t3}, .... The performance index is the mean squared output error, F(x) = E[e^T e] = E[(t - a)^T (t - a)], and the partial derivatives needed to minimize it are computed with the chain rule of calculus. Here I give the formulas for training the BP network; please see the references for the detailed derivation.
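The formulas themselves appear as figures in the PDF. As a reference sketch only, the standard matrix-form back-propagation equations for an M-layer network look like the following (textbook notation, not copied from the original: s^m is the sensitivity vector of layer m, F'^m(n^m) the diagonal matrix of activation derivatives at the net input n^m, and alpha the learning step).

Forward propagation of information:

$$\mathbf{a}^0 = \mathbf{p}, \qquad \mathbf{a}^{m+1} = f^{m+1}\left(\mathbf{W}^{m+1}\mathbf{a}^m + \mathbf{b}^{m+1}\right), \quad m = 0, 1, \ldots, M-1$$

Backward propagation of the error (sensitivities):

$$\mathbf{s}^M = -2\,\dot{F}^M(\mathbf{n}^M)\,(\mathbf{t} - \mathbf{a}), \qquad \mathbf{s}^m = \dot{F}^m(\mathbf{n}^m)\left(\mathbf{W}^{m+1}\right)^T \mathbf{s}^{m+1}$$

Weight and bias adjustment:

$$\mathbf{W}^m(k+1) = \mathbf{W}^m(k) - \alpha\,\mathbf{s}^m\left(\mathbf{a}^{m-1}\right)^T, \qquad \mathbf{b}^m(k+1) = \mathbf{b}^m(k) - \alpha\,\mathbf{s}^m$$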
First comes the forward propagation of information; then the error is computed backward and the weights and biases are adjusted, as sketched above.

2. Description of the problem and the network design
Here we start from the ideas behind the BP network and design a real, working neural network. An important use of the BP network is pattern recognition. Our task is to design and train a feasible, efficient BP network that recognizes the digits 0 to 9 in the presence of noise. The digits use a pleasing seven-segment (digital tube) style font. First, a coding scheme must be chosen. Row coding or column coding could be used here, but a large number of experiments showed that both coding schemes have poor noise immunity. I finally chose direct dot-matrix 0-1 coding. For example, the digit 0 can be encoded as:

0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 0
In this way it is easy to see that the input of the BP network is 64-dimensional. Noise mixed into the digit data can be viewed as random inversion of the 0/1 bits. For example, adding 7% noise means randomly inverting about 4 of the 0 or 1 bits; which bits are inverted is random. As shown in the figure above, the digit 0 after noise interference is encoded as:

0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 1 1 0 1 1 0 0 1 1 1 1 1 0
In the figure, the underlined italic bits are the noise.
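As a minimal sketch of this noise model (not the article's actual VC 6.0 source; N_INPUT and AddNoise are illustrative names I chose):

    #include <cstdlib>

    const int N_INPUT = 64;                 // 8x8 dot-matrix input dimension

    // Invert nFlips randomly chosen bits of pattern in place.
    // (A position may occasionally be picked twice, which simply flips it back.)
    void AddNoise(double pattern[], int nFlips)
    {
        for (int k = 0; k < nFlips; ++k) {
            int i = rand() % N_INPUT;       // random bit position
            pattern[i] = 1.0 - pattern[i];  // 0 <-> 1
        }
    }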
Next, the number of neurons in the output layer must be determined. At the output layer I encode the digits 0 to 9 in 8421 (BCD) code, so the 10 digits require 4 binary bits. This determines that the output layer has 4 neurons.

Then comes the most critical choice: the number of hidden-layer neurons. In general, for problems like function approximation, the more hidden neurons, the more accurate the approximation. But for the present pattern-classification problem, the number of hidden neurons must be chosen properly: neither too large nor too small. If the hidden layer has too many neurons, then first, the training time grows greatly; and second, although more neurons can describe the decision boundaries in the space more precisely, the fault tolerance of the network declines at the same time, which is extremely unfavorable for our digit recognition task. Of course, with too few neurons the classification requirements cannot be met. After repeated experiments and comparisons, I believe that 8 to 15 hidden neurons is best: with fewer than 8 neurons the network output error is too large, and with more than 15 the recognition rate is too low. Since this is just a simple 10-class discrimination problem, a single hidden layer is used.

Finally, the activation functions of the hidden layer and the output layer must be determined. Following common practice, I use the differentiable log-sigmoid function in the hidden layer,

$$f^1(n) = \frac{1}{1 + e^{-n}}$$

and a linear function in the output layer, which we may simply take as

$$f^2(n) = n$$
At this point, the general prototype of the BP network has been designed. Below, combined with the programming, I describe the concrete implementation of the BP network, including the design of the training samples and the settings of the allowed error and the learning step, and give improvements to the recognition rate and the training time.
3. Programming implementation of the BP network
The development environment I used is:
Windows 2000 with SP4; Visual C++ 6.0 with SP5
Among the various development environments and simulation software available, I gave up the simpler Matlab and did not use the familiar C++ Builder or Visual Basic either. The most important reason is that VC gives precise control over memory, which plays a key role in the heavy computation of the BP algorithm. Designing a BP network is not difficult; what is difficult is designing an efficient, fast, and stable one, and that requires optimizing memory use and streamlining the algorithm as much as possible. Otherwise the BP program will run unbearably slowly or not be stable enough. When writing the program, be sure to release memory that is no longer used as promptly as possible.
To improve efficiency and programming convenience, I use the matrix form of the BP algorithm, i.e. formulas 1, 2, and 3. In this way the forward propagation of information and the backward propagation of error are easily expressed as matrix (and vector) additions and multiplications.
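To make the matrix form concrete, here is a minimal C++ sketch of the three steps for the 64-9-4 network designed above. It is an illustration written for this article's setting, not the actual VC 6.0 source; all identifiers (Forward, Backward, W1, and so on) are names I chose.

    #include <cmath>

    const int R = 64, S1 = 9, S2 = 4;      // input, hidden, output sizes

    double W1[S1][R], b1[S1];              // hidden-layer weights and biases
    double W2[S2][S1], b2[S2];             // output-layer weights and biases

    // Log-sigmoid activation for the hidden layer: f1(n) = 1/(1 + e^-n).
    double logsig(double n) { return 1.0 / (1.0 + exp(-n)); }

    // Formula 1, forward propagation: a1 = f1(W1*p + b1), a2 = W2*a1 + b2.
    void Forward(const double p[R], double a1[S1], double a2[S2])
    {
        for (int i = 0; i < S1; ++i) {
            double n = b1[i];
            for (int j = 0; j < R; ++j) n += W1[i][j] * p[j];
            a1[i] = logsig(n);
        }
        for (int i = 0; i < S2; ++i) {
            double n = b2[i];
            for (int j = 0; j < S1; ++j) n += W2[i][j] * a1[j];
            a2[i] = n;                     // linear output layer, f2(n) = n
        }
    }

    // Formulas 2 and 3, error back-propagation and weight adjustment:
    // sensitivities s2 and s1 are computed from the output error, then every
    // weight and bias is moved one step of size alpha against the gradient.
    void Backward(const double p[R], const double t[S2],
                  const double a1[S1], const double a2[S2], double alpha)
    {
        double s2[S2], s1[S1];
        for (int i = 0; i < S2; ++i)
            s2[i] = -2.0 * (t[i] - a2[i]);             // linear layer: f2' = 1
        for (int j = 0; j < S1; ++j) {
            double sum = 0.0;
            for (int i = 0; i < S2; ++i) sum += W2[i][j] * s2[i];
            s1[j] = a1[j] * (1.0 - a1[j]) * sum;       // logsig': a(1 - a)
        }
        for (int i = 0; i < S2; ++i) {
            for (int j = 0; j < S1; ++j) W2[i][j] -= alpha * s2[i] * a1[j];
            b2[i] -= alpha * s2[i];
        }
        for (int i = 0; i < S1; ++i) {
            for (int j = 0; j < R; ++j) W1[i][j] -= alpha * s1[i] * p[j];
            b1[i] -= alpha * s1[i];
        }
    }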
Before the BP network can run, we must have enough data to train it. Using only the 10 groups of "pure", noise-free data for 0, 1, ..., 9 is absolutely inadequate: with only these 10 inputs the network converges quickly and reaches a small error, but on our actual recognition targets, the noisy data, it loses almost all discriminating ability.
So, after repeated comparisons, I selected 60 groups of data as training samples. These 60 groups consist of:
1. 10 groups of pure digits 0 to 9;
2. 50 groups produced by adding random noise to the 10 pure groups, each carrying 5% to 15% noise at random. In this way the training samples have sufficient robustness and fault tolerance. (A sketch of generating such a training set is given below.)
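As an illustration of how such a training set might be generated (a sketch under the same assumptions as the earlier snippets; PURE, trainIn, and trainTarget are hypothetical names, and the clean digit patterns are elided):

    #include <cstdlib>
    #include <cstring>

    const int N_IN = 64, N_OUT = 4, N_SAMPLES = 60;

    extern const double PURE[10][N_IN];    // the 10 clean digit patterns (data elided)
    void AddNoise(double pattern[], int nFlips);   // bit-flip sketch from above

    double trainIn[N_SAMPLES][N_IN];
    double trainTarget[N_SAMPLES][N_OUT];

    // Build the 60 training samples: 10 pure digits plus 50 noisy copies with
    // roughly 5%..15% noise (3 to 9 of the 64 bits flipped). Targets are the
    // 4-bit 8421 (BCD) codes of the digits.
    void BuildTrainingSet()
    {
        for (int s = 0; s < N_SAMPLES; ++s) {
            int d = s % 10;                           // digit of sample s
            memcpy(trainIn[s], PURE[d], sizeof(trainIn[s]));
            if (s >= 10)                              // the 50 noisy samples
                AddNoise(trainIn[s], 3 + rand() % 7); // 3..9 flips, ~5%..15%
            for (int i = 0; i < N_OUT; ++i)           // 8421 code, MSB first
                trainTarget[s][i] = (d >> (N_OUT - 1 - i)) & 1;
        }
    }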
The next step is to determine the number of hidden-layer neurons, the allowed error, and the learning step. As mentioned earlier, the number of hidden neurons is key to the design of the BP algorithm. Too many hidden neurons greatly lengthen the training time and also reduce the noise immunity. In my program, when the number of hidden neurons exceeded 25, the training time of the network became unbearable and the noise immunity dropped greatly; when there were fewer than 7 neurons, the system error could not converge to a satisfactory value, and the recognition rate was too low. After much comparative analysis, I finally chose a hidden layer of 9 neurons. This turned out to be a good choice (as the training times and recognition rates given later will show).
The allowed error is the target of training: training stops when the output error of the system falls below this allowed error. If the allowed error is chosen too large, the recognition rate will be too low; if it is chosen too small, the training time is greatly extended, which is wasteful. Here too, the optimal allowed error was obtained through repeated comparative analysis: I finally settled on an allowed error between 0.02 and 0.2.
The choice of the learning step is also key to the BP network design. If the learning step is too large, the algorithm may fail to converge; if it is too small, first the training time grows, and second it becomes easy to fall into a so-called local minimum. Here I initially set the learning step between 0.02 and 0.3.
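Putting the pieces together, the outer training loop can be sketched as follows (again illustrative, reusing the names from the sketches above; the stopping test against the allowed error and the fixed learning step follow the text):

    // Train until the mean squared output error drops below the allowed error.
    // allowedError is chosen in 0.02..0.2 and alpha in 0.02..0.3, as discussed.
    void Train(double allowedError, double alpha)
    {
        double mse;
        do {
            mse = 0.0;
            for (int s = 0; s < N_SAMPLES; ++s) {
                double a1[S1], a2[S2];
                Forward(trainIn[s], a1, a2);              // information forward
                for (int i = 0; i < S2; ++i) {            // accumulate error
                    double e = trainTarget[s][i] - a2[i];
                    mse += e * e;
                }
                Backward(trainIn[s], trainTarget[s], a1, a2, alpha); // error backward
            }
            mse /= N_SAMPLES;
        } while (mse > allowedError);
    }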
Finally, let's look at how the initial weights and biases are set.
There are two points in designing the initial weights and biases. First, W and b should not be too large; otherwise the network will likely start on a flat region of the error surface, where training stalls. Second, the weights must not all be set to the same value.
My implementation sets each weight and bias to a random number between (-1, 1).
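A minimal sketch of this initialization (using the W1, b1, W2, b2 arrays from the earlier sketch; RandWeight is a name I chose):

    #include <cstdlib>

    // Random value approximately uniform in [-1, 1].
    double RandWeight()
    {
        return 2.0 * rand() / RAND_MAX - 1.0;
    }

    // Give every weight and bias a distinct small random start, so the network
    // neither begins on a flat region of the error surface nor with equal weights.
    void InitNet()
    {
        for (int i = 0; i < S1; ++i) {
            b1[i] = RandWeight();
            for (int j = 0; j < R; ++j) W1[i][j] = RandWeight();
        }
        for (int i = 0; i < S2; ++i) {
            b2[i] = RandWeight();
            for (int j = 0; j < S1; ++j) W2[i][j] = RandWeight();
        }
    }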
At this point, the design of the network is complete. Next I introduce the program I developed and briefly explain it.
The following figures show the program: the main interface, the menu interface, the training error curve, and the recognition results. Note: the current version of the program also adds output file recording, shown in the last figure.