Notes on Neural Network Learning and Training
1.0 Introduction
A Neural Network (NN) can be defined as an interconnection of simple processing elements whose functionality is based on the biological neuron. The biological neuron (Figure 1) is a unique piece of equipment that carries information, or a bit of knowledge, and transfers it to other neurons in a chain of networks. The artificial neuron imitates these functions and their unique process of learning. Basically, the biological neuron has three types of components: dendrites, soma and axon. Dendrites are the sensitive part of the neuron that receive signals from other neurons. The soma sums the incoming signals, and the result is transmitted to other cells through the axon.
The simple neuron (Figure 2), introduced by McCulloch and Pitts in the 1940s, consists of an input layer, an activation function, and an output layer. The input layer receives input signals from the external environment (or from other neurons). The activation function is the neuron's internal state, which sums the input signals and computes the result. The signals are then transmitted to the output layer. The input layer, activation function and output layer in the artificial neuron are analogous to the dendrites, soma and axon of the biological neuron.
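The neuron described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a binary step activation and hand-picked connection weights and threshold; the function names, the weights and the AND example are not from the article.

```python
from typing import Sequence


def step(net_input: float, threshold: float) -> int:
    """Binary step activation: output 1 if the net input reaches the threshold."""
    return 1 if net_input >= threshold else 0


def neuron_output(inputs: Sequence[float],
                  weights: Sequence[float],
                  threshold: float) -> int:
    """Sum the weighted input signals, then apply the activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return step(net_input, threshold)


# Illustrative values (assumed): with both weights 1.0 and threshold 2.0,
# the neuron fires only when both inputs are 1, i.e. it behaves as logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_THRESHOLD = 2.0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, neuron_output([x1, x2], AND_WEIGHTS, AND_THRESHOLD))
```

Changing the threshold to 1.0 with the same weights would make the same neuron behave as logical OR, which shows how the stored weights and threshold determine what the neuron computes.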
2.0 Learning in Neural Network
Assume we have n input units, X1,…,Xn, with input signals x1,…,xn. When the network receives the signals (xi) from the input units (Xi), the net input to output unit j (y_inj) is calculated by summing the weighted input signals:
$$y\_in_j = \sum_{i=1}^{n} x_i w_{ij}$$
where wij is the weight on the connection from input unit Xi to output unit j. The network output (yj) is calculated using the activation function f(x), that is, yj = f(x), where x is y_inj. The weights computed during training are stored and become the information, or knowledge, used in future applications.
3.0 Training the Network
Training the network is time consuming. The network usually learns only after several epochs, depending on how large it is; thus, a large network requires more training time than a smaller one. Basically, the network is trained for several epochs and stopped after reaching the maximum epoch. For the same purpose, a minimum error tolerance can be used: training stops once the difference between the network output and the known outcome is less than a specified value (see for example Pofahl et al., 1998). We could also stop the training after the network meets certain other stopping criteria.
During training the network might learn too much. This problem is referred to as overfitting. Overfitting is a critical problem in almost all standard NN architectures, and NNs, like other machine learning models, are prone to it (Lawrence et al., 1997). One of the solutions is early stopping (Sarle, 1995), but this approach needs careful attention, as the problem is harder than expected (Lawrence et al., 1997). The choice of stopping criterion is another issue to consider in preventing overfitting (Prechelt, 1998). Hence, to decide when to stop, a validation set is used instead of the training data set. After every few epochs the network is tested with the validation data. Training is stopped as soon as the error on the validation set is higher than it was the last time it was checked (Prechelt, 1998). Figure 3 shows that the training should stop at time t, when the validation error starts to increase.
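The three stopping rules discussed in this section (maximum epoch, minimum error tolerance, and a rise in validation error) can be combined in one training loop. The sketch below is hedged: it assumes a single linear unit trained with the delta rule on synthetic data, which is a stand-in and not the network used in the cited studies. The names `train`, `make_data`, `check_every` and all numeric values are hypothetical.

```python
import random


def predict(weights, x):
    """Net input of a single linear unit: the sum of the weighted inputs."""
    return sum(w * xi for w, xi in zip(weights, x))


def mean_squared_error(weights, data):
    """Average squared difference between known outcome and unit output."""
    return sum((t - predict(weights, x)) ** 2 for x, t in data) / len(data)


def make_data(rng, n):
    """Synthetic (assumed) data: target = 2*x1 - x2 plus a little noise."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        data.append((x, 2.0 * x[0] - x[1] + rng.gauss(0.0, 0.1)))
    return data


def train(train_set, val_set, max_epochs=500, tolerance=1e-4,
          check_every=5, learning_rate=0.05):
    """Train until one of the three stopping rules fires.

    Returns (weights, epochs_run, reason).
    """
    weights = [0.0] * len(train_set[0][0])
    best_weights = list(weights)
    last_val_error = float("inf")

    for epoch in range(1, max_epochs + 1):
        # One epoch: delta-rule update for every training pattern.
        for x, t in train_set:
            error = t - predict(weights, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]

        # Rule 1: training error is below the tolerance.
        if mean_squared_error(weights, train_set) < tolerance:
            return weights, epoch, "error tolerance reached"

        # Rule 2: every few epochs, test on the validation set and stop
        # if the error is higher than at the previous check.
        if epoch % check_every == 0:
            val_error = mean_squared_error(weights, val_set)
            if val_error > last_val_error:
                return best_weights, epoch, "validation error increased"
            last_val_error = val_error
            best_weights = list(weights)

    # Rule 3: maximum epoch reached.
    return weights, max_epochs, "maximum epoch reached"


rng = random.Random(0)
weights, epochs_run, reason = train(make_data(rng, 20), make_data(rng, 10))
print(epochs_run, reason, weights)
```

When the validation rule fires, the sketch returns the weights saved at the previous check rather than the current ones, since those are the last weights known to have a lower validation error. This simple rule can stop too early on a noisy validation curve, which is one reason the choice of criterion needs care.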
Discussion and Conclusion
Constructing a program for a Neural Network is not a difficult task. Basically, it involves only a few algorithmic steps that are easily followed even by novice practitioners. However, preparing the network for training is a difficult task, since the network deals with a large amount of data. Another problem is when to stop the training. Overtraining could cause memorization, where the network simply memorizes the data patterns and fails to recognize other sets of patterns. Thus, early stopping is recommended to ensure that the network learns accordingly.
References
Fausett, L. (1994). Fundamentals of Neural Networks: Architectures, Algorithms and Applications. Englewood Cliffs: Prentice Hall.
Sarle, W. S. (1997). Neural Network FAQ, part 1 of 7: Introduction. Periodic posting to the Usenet newsgroup comp.ai.neural-nets, URL: ftp://ftp.sas.com/pub/neural/FAQ.html Downloaded on 30 Nov. 1999.
Pofahl, W. E., Walczak, S. M., Rhone, E., and Izenberg, S. D. (1998). Use of an Artificial Neural Network to Predict Length of Stay in Acute Pancreatitis. American Surgeon, Sep98, Vol. 64 Issue 9, (pp. 868-872).
Lawrence, S., Giles, C. L., and Tsoi, A. C. (1997). Lessons in Neural Network Training: Overfitting May be Harder than Expected. Proceedings of the Fourteenth National Conference on Artificial Intelligence, AAAI-97, (pp. 540-545), Menlo Park, California: AAAI Press.
Sarle, W. (1995). Stopped Training and Other Remedies for Overfitting. Proceedings of the 27th Symposium on the Interface of Computing Science and Statistics, (pp. 352-360). Retrieved March 18, 2002 from World Wide Web: ftp://ftp.sas.com/pub/neural/
Prechelt, L. (1998). Early Stopping - but when? Neural Networks: Tricks of the Trade, (pp. 55-69). Retrieved March 28, 2002 from World Wide Web: http://wwwipd.ira.uka.de/~prechelt/Biblio/
Submitted: 14/03/2004 Article content copyright © Wan Hussain Wan Ishak, 2004.
All content copyright © 1998-2007, Generation5 unless otherwise noted.