Complete Research Information in the University Data Repository

Research Title (Papers / Research Title)


Training Wave-Net based on Extended Kalman filter


Author / Editor / Publisher

 
Hayder Mahdi Abul-Ridha Al-Khafaji

Citation Information


Hayder Mahdi Abul-Ridha Al-Khafaji, Training Wave-Net based on Extended Kalman filter, 5/24/2011 6:05:35 AM, College of Engineering

Abstract


In this paper, we use the extended Kalman filter as an efficient tool for training the Wave-Net.

Full Abstract


Training Wave-Net based on Extended Kalman Filter
Hayder Mahdi Abul-Ridha, PhD in Electrical Engineering, drenghaider@yahoo.com
Hilal A. Hussain Abood, MSc in Electrical Engineering, hilal_hussain@uobabylon.edu.iq
University of Babylon
Abstract
Wave-Net is a promising network which utilizes the wavelet transform in building a new structure of neural network. Training is the major problem facing researchers working with neural networks. In this paper, we use the extended Kalman filter as an efficient tool for training the Wave-Net. The results show the ability of the extended Kalman filter to serve as a training algorithm for the Wave-Net in a classification problem, with good results, especially in reducing the number of iterations in the training phase. A comparison between the extended Kalman filter and conventional back propagation, both applied to a classification problem, shows that training with the Kalman filter is better than training with gradient descent.
   
Index terms: Wave-Net, extended Kalman filter, wavelet, neural network.
Summary:
The wavelet neural network is a promising network that uses the wavelet transform to build a new structure for the neural network. The most important problem facing researchers is the problem of training the neural network. In this work, the extended Kalman filter was used as an efficient tool for training the wavelet neural network. The results showed the success of the extended Kalman filter in training the wavelet neural network to solve the classification problem with good results, especially in reducing the number of time steps required for training. A comparison between two training algorithms used on the classification problem, the extended Kalman filter algorithm and the conventional back propagation algorithm, showed the superiority of the extended Kalman filter algorithm over the gradient descent algorithm in terms of the number of time steps required for training.
1. Introduction
Since the emergence of neural networks, this field has gained great interest from researchers, and new structures of neural networks have been proposed. Wave-Net is one of the attractive structures that combines the wavelet with the neural network to give a new model [Deghani, Ahmadi, and Eghtesad 2008].
Many structures of neural networks have been introduced, beginning with the perceptron and continuing through the Hopfield network, the feed-forward neural network, the radial basis neural network, etc. A newer direction tries to combine two or more fields into a new model, all toward building an efficient neural network, such as the hybrid neural fuzzy network and other hybrid structures [Cheng-Jian and Cheng-Hung 2005]. These combined neural networks have been used extensively in classification, recognition, control, and other problems [Lin, Quan, and Yao 2006], [Puskorius, Feldkamp 1994].
The wavelet transform (WT) is a good tool for the analysis of signals and images. The main merit of the wavelet transform over the Fourier transform is its ability to specify the time-frequency position [Kannan, Martin 1996]. Recent feature extraction methods utilize the WT to get the best components to be used as input vectors to classifiers [Michael, Perz, Black, and Sammer 1993].
The learning process, by which the effective parameters are specified, is the major problem with neural networks. While back propagation represents a nice procedure for training neural networks, it carries drawbacks, one of which is the problem of local minima [Dan 1996]. Many algorithms have been proposed for training neural networks in an attempt to find the best and fastest way to define the weights and other parameters.
Kalman filtering plays an essential role in systems theory and has found a wide range of applications in engineering and non-engineering systems [Xiao, Xie and Minyue 2008], [Dan 2002].
This paper presents a way of training the Wave-Net using the extended Kalman filter, together with a comparison against another learning algorithm, the gradient descent method.
2. Wave-Net Structures
Wave-Net is a hierarchical network formed from an artificial neural network (ANN) with a mother wavelet function as its basis functions, producing a powerful computational system for complex mapping and identification problems.
Fig. 1 shows the network designed to solve an identification problem: an input layer and one hidden layer with a variable number of neurons, using the Mexican hat mother wavelet as the basis of the hidden layer, and producing the outputs as a summation through the weights of the network.



 
[Figure omitted: input layer (x1, x2, ..., xn), hidden layer of c neurons with Mexican hat bases, and output layer (y1, ...), connected through the weights w11 ... w1m, wc1 ... wcm, w01 ... w0m.]
Fig. (1) Wave-Net structure
3. Network and Training Procedures
A notation (W) will be used to describe the weight matrix of the network, which is a matrix of (c x m) dimensions, where c is the number of neurons in the hidden layer and m is the number of outputs, as in Eq. (1):

W = \begin{bmatrix} w_{11} & \cdots & w_{1m} \\ \vdots & \ddots & \vdots \\ w_{c1} & \cdots & w_{cm} \end{bmatrix}   (1)
The output of this network is the sum of the multiplication of the weights with the responses of the wavelet functions in the hidden layer, as shown in Eq. (2), for a set of l input-output training vectors:

\hat{y}_i = \sum_{k=1}^{c} w_k \, \psi(u_{ik}), \qquad u_{ik} = \frac{\lVert x_i - d_k \rVert}{b}   (2)

where
\hat{y}_i: the ith output vector (w_k denotes the kth row of W)
x_i: the ith input vector
\psi: the Mexican hat mother wavelet function, \psi(u) = (1 - u^2)\, e^{-u^2/2}, which works here as the basis function of the hidden neurons
u_{ik}: the input to the Mexican hat function for the ith input vector and the kth neuron of the hidden layer
d_k: the translation parameter of the mother wavelet of the kth hidden neuron
b: the dilation parameter of the mother wavelet
Eq. (3) gives the network (plant) description in matrix form:

\hat{Y} = \Psi W   (3)

where \Psi is the (l x c) matrix of hidden-layer responses, \Psi_{ik} = \psi(u_{ik}), and \hat{Y} = [\hat{y}_1, ..., \hat{y}_l]^T.
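To make the mapping concrete, here is a minimal Python/NumPy sketch of the forward pass of Eqs. (1-3). It is an illustration, not the authors' MATLAB code; in particular, taking the wavelet argument as the scaled Euclidean distance ||x_i - d_k|| / b is an assumption, since the extracted text does not spell the form out.

import numpy as np

def mexican_hat(u):
    """Mexican hat mother wavelet, psi(u) = (1 - u^2) * exp(-u^2 / 2)."""
    return (1.0 - u**2) * np.exp(-u**2 / 2.0)

def wavenet_forward(X, W, d, b=1.0):
    """Forward pass of Eqs. (2)-(3).

    X : (l, n) input vectors, W : (c, m) weight matrix,
    d : (c, n) translation parameters, b : scalar dilation (fixed to 1 here).
    Returns the (l, m) outputs and the (l, c) hidden response matrix Psi.
    """
    # u_ik = ||x_i - d_k|| / b, one scalar per (input, hidden neuron) pair
    U = np.linalg.norm(X[:, None, :] - d[None, :, :], axis=2) / b
    Psi = mexican_hat(U)          # (l, c) hidden-layer responses
    Y_hat = Psi @ W               # Eq. (3): outputs as weighted sums
    return Y_hat, Psi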
Now, to train this network, we have to find the error function and try to minimize the error, as shown in Eq. (4):

E = \frac{1}{2} \sum_{i=1}^{l} \lVert y_i - \hat{y}_i \rVert^2   (4)

where y_i is the desired output of the network. To get the best performance from the network, we should optimize it with respect to its parameters (the weights and the translation parameters of the mother wavelet). Here we take the dilation parameter to be constant (b = 1). The update of the first parameter (the weights) is shown in Eq. (5) [Karayiannis, 1999]:
w_{kj}(t+1) = w_{kj}(t) + \eta \sum_{i=1}^{l} (y_{ij} - \hat{y}_{ij})\, \psi(u_{ik})   (5)
while the update of the translation parameter of the Mexican hat mother wavelet is described in Eqs. (6-8):

d_k(t+1) = d_k(t) - \eta \, \frac{\partial E}{\partial d_k}   (6)

where:

\frac{\partial E}{\partial d_k} = \sum_{i=1}^{l} \sum_{j=1}^{m} (y_{ij} - \hat{y}_{ij})\, w_{kj}\, \psi'(u_{ik})\, \frac{x_i - d_k}{b^2 u_{ik}}   (7)

\psi'(u) = (u^3 - 3u)\, e^{-u^2/2}   (8)
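For the gradient-descent baseline, here is a matching sketch of one batch update implementing Eqs. (5-8), reusing wavenet_forward from above. The learning rate eta and the vectorized handling of the translation gradient are illustrative assumptions.

def mexican_hat_deriv(u):
    """Derivative of the Mexican hat: psi'(u) = (u^3 - 3u) * exp(-u^2 / 2)."""
    return (u**3 - 3.0 * u) * np.exp(-u**2 / 2.0)

def gd_step(X, Y, W, d, eta=0.05, b=1.0):
    """One batch gradient-descent update of the weights (Eq. 5) and
    translations (Eqs. 6-8). Returns the updated W and d."""
    Y_hat, Psi = wavenet_forward(X, W, d, b)
    E = Y - Y_hat                                   # (l, m) output errors
    W_new = W + eta * Psi.T @ E                     # Eq. (5)

    # Chain rule for d_k (Eqs. 6-8): distances and difference vectors
    diff = X[:, None, :] - d[None, :, :]            # (l, c, n)
    U = np.linalg.norm(diff, axis=2) / b            # (l, c)
    coeff = (E @ W.T) * mexican_hat_deriv(U) / (b**2 * np.maximum(U, 1e-12))
    grad_d = np.einsum('lc,lcn->cn', coeff, diff)   # dE/dd_k, Eq. (7)
    return W_new, d - eta * grad_d                  # Eq. (6)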
The Kalman filter addresses the general problem of trying to estimate the state of a discrete-time linear system process governed by the linear stochastic difference equations in Eqs. (9 & 10) [Kalman, 1960]:

x_k = A x_{k-1} + w_{k-1}   (9)

and the measurement equation:

y_k = H x_k + v_k   (10)

The random variables w_k and v_k represent the process and measurement noise, respectively. They are assumed to be independent of each other, white, and with Gaussian probability distributions:

p(w) ~ N(0, Q)
p(v) ~ N(0, R)

Q: the process noise covariance
R: the measurement noise covariance
But in our work the wavelet function is nonlinear, so we use the extended Kalman filter to overcome the nonlinearity problem. We can linearize the estimation around the current estimate, using the partial derivatives of the process and measurement functions to compute estimates even in the face of nonlinear relationships, as in Eqs. (11 & 12):

x_k = f(x_{k-1}) + w_k   (11)
y_k = h(x_k) + v_k   (12)
Eq. (13) shows that the extended Kalman filter can be derived by using first-order Taylor series approximations (neglecting higher-order terms):

f(x_{k-1}) \approx f(\hat{x}_{k-1}) + F_{k-1}\,(x_{k-1} - \hat{x}_{k-1})   (13)

where

F_{k-1} = \left.\frac{\partial f}{\partial x}\right|_{x=\hat{x}_{k-1}}, \qquad H_k = \left.\frac{\partial h}{\partial x}\right|_{x=\hat{x}_k^-}
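Evaluating the Jacobians F and H at each step is the practical core of the filter. As one hedged option (the analytic derivatives of Eqs. (6-8) would serve equally well; the paper does not say which it used), a finite-difference approximation:

def numerical_jacobian(h, s, eps=1e-6):
    """Finite-difference Jacobian H = dh/ds, used to linearize the
    measurement function around the current state estimate (Eq. 13)."""
    y0 = h(s)
    H = np.zeros((y0.size, s.size))
    for j in range(s.size):
        s_pert = s.copy()
        s_pert[j] += eps
        H[:, j] = (h(s_pert) - y0) / eps
    return H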
We can then update the error and obtain the best prediction, and so the desired estimation, by the recursion of Eqs. (14 & 15):

K_k = P_k^- H_k^T \big( H_k P_k^- H_k^T + R \big)^{-1}   (14)

\hat{x}_k = \hat{x}_k^- + K_k \big( y_k - h(\hat{x}_k^-) \big), \qquad P_k = (I - K_k H_k)\, P_k^-   (15)

where
K_k: the Kalman gain
P_k: the covariance matrix of the state estimation error
\hat{x}_k: the state estimate
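A minimal sketch of one EKF correction step built on Eqs. (11-15) follows. Treating the network parameters as a random-walk state, f(s) = s, is a common choice in Kalman-based training [Dan 2002] and is assumed here; the numerical Jacobian from the previous sketch stands in for H_k.

def ekf_step(s, P, x_batch, y_obs, h, Q, R):
    """One extended Kalman filter update (Eqs. 11-15) for a random-walk
    state model f(s) = s.

    s : state vector, P : state error covariance,
    h : measurement function (state, inputs) -> flattened network outputs,
    Q, R : process and measurement noise covariances.
    """
    # Prediction: f is the identity, so only the covariance grows
    s_pred = s
    P_pred = P + Q
    # Linearize the measurement around the predicted state (Eq. 13)
    H = numerical_jacobian(lambda st: h(st, x_batch), s_pred)
    # Gain, state correction, and covariance update (Eqs. 14-15)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    s_new = s_pred + K @ (y_obs - h(s_pred, x_batch))
    P_new = (np.eye(len(s)) - K @ H) @ P_pred
    return s_new, P_new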
Now, to cast the optimization problem in a form that fits Kalman filtering, we let the elements of the weight matrix W and the translation parameters of the mother wavelet d be the state of a nonlinear system, and we identify the output of the wavelet network with the output of the nonlinear system to which the Kalman filter is applied.
The state of the nonlinear system can then be written as in Eq. (16):

s = [w_{11}, ..., w_{cm}, d_1, ..., d_c]^T   (16)
The block diagram of our proposed system is outlined in Figure (2). The state of the system is modeled from the weights of the Wave-Net and the translation parameters of the Mexican hat basis function. The initial weights are all given the same value, while the translation parameters are set to zero; after that, a system noise (uncertainty) is added.
The system predicts the output from the input at the current state and compares it with the training set output; the error corrects the state of the system and the system error covariance by changing the Kalman gain and the system error covariance. A sketch of this loop is given after Fig. 2.

   
[Figure omitted: the training loop. Input data (an n x l matrix: "n" features for "l" input vectors from the training set) feeds the system state model (weight matrix and bias plus system noise); the state prediction drives the Wave-Net with Mexican hat basis to produce an output prediction, which is compared with the training-set output data; the resulting correction updates the state estimate, the Kalman gain, and the system noise.]
The system is initialized with the weights, translations, and noise. This process continues until we get an acceptable error; then the network is ready to use for solving the classification problem.
Fig. (2) Wave-Net training block diagram
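Tying the pieces together, here is a hedged end-to-end sketch of the loop in Fig. 2. The equal initial weight value (0.1), the covariance magnitudes, and the iteration cap are illustrative assumptions; the stopping threshold follows the 0.001 error criterion reported in the results.

def pack(W, d):
    return np.concatenate([W.ravel(), d.ravel()])

def unpack(s, c, m, n):
    return s[:c * m].reshape(c, m), s[c * m:].reshape(c, n)

def train_wavenet_ekf(X, Y, c, tol=1e-3, max_iter=200, q=1e-4, r=1e-2, p0=40.0):
    """EKF training loop of Fig. 2: predict, compare with the training
    outputs, and correct the state (weights + translations) until the
    error is acceptable."""
    l, n = X.shape
    m = Y.shape[1]
    # Equal initial weights, zero translations; uncertainty enters via P
    s = pack(np.full((c, m), 0.1), np.zeros((c, n)))
    P = p0 * np.eye(s.size)
    Q = q * np.eye(s.size)
    R = r * np.eye(l * m)

    def h(state, Xb):
        W, d = unpack(state, c, m, n)
        return wavenet_forward(Xb, W, d)[0].ravel()

    for _ in range(max_iter):
        s, P = ekf_step(s, P, X, Y.ravel(), h, Q, R)
        err = 0.5 * np.sum((Y.ravel() - h(s, X))**2)   # Eq. (4)
        if err < tol:
            break
    return unpack(s, c, m, n)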
   
4. Results
We use Fisher's iris data set [Fisher, 1936] to train and test our Wave-Net. The dataset consists of 50 samples from each of three species of iris flowers (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepal and the petal, in centimeters. A MATLAB R2008a program is used to train the network on 50% of the data, chosen randomly; it is then tested on the remaining randomly chosen data, and the weights of the Wave-Net are initialized to zeros. After some experiments we found that the best performance for the network was obtained with an error rate of 0.001 as the criterion to terminate training.
Here we test the network to check its performance. We vary the number of hidden neurons and start with multiple values of the covariances P, Q and R; the results were reported by measuring the average CPU time over 10 trials, together with the average number of correctly classified input vectors.
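The experiments themselves were run in MATLAB R2008a; purely as an illustration of this protocol (10 trials, a random 50/50 split, average CPU time and accuracy), an equivalent harness in Python using scikit-learn's iris loader and the helpers above:

import time
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

def evaluate(c=6, trials=10):
    """Average CPU time and accuracy over random 50/50 splits,
    mirroring the evaluation protocol described above."""
    iris = load_iris()
    times, accs = [], []
    for seed in range(trials):
        X_tr, X_te, y_tr, y_te = train_test_split(
            iris.data, iris.target, test_size=0.5, random_state=seed)
        Y_tr = np.eye(3)[y_tr]                  # one-hot class targets
        t0 = time.process_time()
        W, d = train_wavenet_ekf(X_tr, Y_tr, c)
        times.append(time.process_time() - t0)
        pred = wavenet_forward(X_te, W, d)[0].argmax(axis=1)
        accs.append((pred == y_te).mean())
    return np.mean(times), np.mean(accs)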
       
As we notice from the results, the network succeeds in the classification problem with acceptable performance and good CPU time before the training converges or gets locked in a local minimum from which the error can no longer improve.
It is also clear that the initial covariances did not affect the training process within the limits we took in our work (30, 40 and 60), while increasing the number of hidden neurons usually improves the performance until it reaches 6 or 7 cells, after which the improvement is no longer noticeable.
The success of Kalman training is very clear in reducing the computational effort in comparison with the conventional back propagation algorithm.
   
   
5. Conclusions
Our work shows that the Kalman filter can be applied to training the Wave-Net and that this network succeeds in a classification problem. Furthermore, the results we obtained show the big role of the Kalman filter in reducing the load on computer processing by reducing the number of iterations in the learning phase when compared with the back propagation learning algorithm.
For future work, it is suggested to improve the performance of the network by using the dilation parameter, in addition to the translation parameter and the weights, to minimize the error in the training process when solving the classification problem.
 
References
Cheng-Jian Lin and Cheng-Hung Chen, "A self-constructing compensatory neural fuzzy system and its applications", Elsevier, Mathematical and Computer Modeling, vol. 42, pp. 339-351, 2005.
Dan Simon, "Training radial basis neural networks with the extended Kalman filter", Neurocomputing, vol. 48, issues 1-4, pp. 455-475, 2002.
Dan W., "Artificial Neural Networks", Prentice Hall, 1996.
Deghani M., Ahmadi M., and Eghtesad M., "Wavelet based network solution for forward kinematics problem of HEXA parallel robot", IEEE Conference on Intelligent Engineering Systems, pp. 63-70, 2008.
Kalman R. E., "A new approach to linear filtering and prediction problems", Transactions of the ASME, Journal of Basic Engineering, pp. 35-45, March 1960.
Karayiannis N., "Reformulated radial basis neural networks trained by gradient descent", IEEE Trans. Neural Networks, vol. 3, pp. 657-671, 1999.
Kannan Ramchandran, Martin Vetterli, and Cormac Herley, "Wavelets, subband coding, and best bases", Proceedings of the IEEE, vol. 84, pp. 541-560, 1996.
Lin Mei, Quan Taifan, and Yao Tianbin, "Tracking maneuvering target based on neural fuzzy network with incremental neural learning", Journal of Systems Engineering and Electronics, vol. 17, no. 2, pp. 343-349, 2006.
Michael M., Perz S., Black C., and Sammer G., "Detection and classification of P waves using Gabor wavelets", IEEE Computers in Cardiology, pp. 531-534, 1993.
Nan Xiao, Lihua Xie, and Minyue Fu, "Kalman filtering over unreliable communication networks with bounded Markovian packet dropouts", Int. J. Robust Nonlinear Control, 2008, published online in Wiley InterScience (www.interscience.wiley.com), DOI: 10.1002/rnc.1389.
Puskorius G., Feldkamp L., "Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks", IEEE Trans. Neural Networks, pp. 279-297, 1994.
