5. Set $p = p + \delta_1$ and $\Delta t = p\,\Delta t$, and go to step 6.
6. If $s < m$, go to step 7; otherwise set $\Delta t = \Delta t/2$, $w_k = (w_k + w_{k-1})/2$, $v_k = (v_k + v_{k-1})/4$ and $s = 0$, and go to step 7.
7. Set $w_{k+1} = w_k + v_k \Delta t$.
8. Compute $a_{k+1} = -\nabla E(w_{k+1})$ and $v_{k+1} = v_k + a_{k+1} \Delta t$.
9. If $a_{k+1}^T a_k > 0$, then set $s = 0$ and go to step 10; otherwise set $s = s + 1$ and $p = 1$, and go to step 10.
10. If $\|a_{k+1}\| < \epsilon$ then stop; otherwise go to step 11.
11. If $\|v_{k+1}\| > \|v_k\|$, then set $i = 0$ and go to step 3; otherwise set $w_{k+2} = (w_{k+1} + w_k)/2$ and $i = i + 1$, and go to step 12.
12. Perform a restart: if $i < j$, then set $v_{k+1} = (v_{k+1} + v_k)/4$ and $k = k + 1$, and go to step 8; otherwise set $v_{k+1} = 0$, $j = 1$ and $k = k + 1$, and go to step 8.
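The core of the update in steps 7, 8 and 10 can be illustrated with a deliberately stripped-down sketch. It is not the full method: the time-step control of steps 5 and 6 is left out, and the averaged restart of steps 11 and 12 is replaced by simply resetting the velocity to zero when the speed stops growing. The function names, the quadratic test error, the starting velocity of zero and the parameter values are all assumptions made for illustration.

```python
import math

def leapfrog_minimize(grad, w0, dt=0.1, eps=1e-6, max_iter=10_000):
    """Simplified dynamic (leapfrog-style) minimizer; not the full 12-step method."""
    w = list(w0)
    v = [0.0] * len(w)                      # assumption: the particle starts at rest
    for _ in range(max_iter):
        # Step 7: move along the current velocity.
        w = [wi + vi * dt for wi, vi in zip(w, v)]
        # Step 8: acceleration is the negative gradient; update the velocity.
        a = [-g for g in grad(w)]
        v_new = [vi + ai * dt for vi, ai in zip(v, a)]
        # Step 10: stop once the gradient norm is small enough.
        if math.sqrt(sum(ai * ai for ai in a)) < eps:
            break
        # Simplified stand-in for steps 11-12: if the speed did not grow,
        # reset the velocity to zero instead of averaging and restarting.
        if sum(x * x for x in v_new) <= sum(x * x for x in v):
            v_new = [0.0] * len(w)
        v = v_new
    return w

def quadratic_grad(w):
    """Gradient of a hypothetical test error E(w) = 0.5 * sum(w_i^2)."""
    return list(w)

w_min = leapfrog_minimize(quadratic_grad, [1.0, -2.0])
```

On this test error the particle accelerates toward the origin, overshoots slightly, is brought to rest, and repeats from a much smaller distance, so `w_min` ends up close to the zero vector.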
Particle Swarm Optimization
Particle swarm optimization (PSO) is a global optimization approach, modeled on the social behavior of flocks of birds and schools of fish [Eberhart et al. 1996, Kennedy and Eberhart 1995]. PSO is a population-based search procedure where the individuals, referred to as particles, are grouped into a swarm. Each particle in the swarm represents a candidate solution to the optimization problem. In a PSO system, each particle is "flown" through the multidimensional search space, adjusting its position according to its own experience and that of neighboring particles. A particle therefore uses the best position it has encountered itself and the best position found by its neighbors to move toward the global minimum. The effect is that particles "fly" toward an optimum, while still searching a wide area around the best solution. The performance of each particle (i.e. the "closeness" of a particle to the global minimum) is measured by a predefined fitness function that is related to the problem being solved.
For the purposes of this study, a particle represents the weight vector of a NN, including all biases. The dimension of the search space is therefore the total number of weights and biases. The fitness function is the mean squared error (MSE) over the training set, or over the test set (as a measure of generalization). The PSO algorithm is summarized below to illustrate its simplicity. The interested reader is referred to [Corne et al. 1999, Eberhart et al. 1996, Van den Bergh 2002] for more information on the swarm approach to optimization. PSO will also be revisited in Part III of this book.
1. Initialize a swarm of P D-dimensional particles, where D is the number of weights and biases.
2. Evaluate the fitness $f_p$ of each particle p, where $f_p$ is the MSE of the associated network over a given data set.
3. If $f_p < BEST_p$ then $BEST_p = f_p$ and $BESTx_p = x_p$, where $BEST_p$ is the current best fitness achieved by particle p, $x_p$ is the current coordinates of particle p in D-dimensional weight space, and $BESTx_p$ is the coordinates corresponding to particle p's best fitness so far.
4. If $f_p < BEST_{GBEST}$ then $GBEST = p$, where GBEST is the particle having the overall best fitness over all particles in the swarm.
5. Change the velocity $V_p$ of each particle p:

$$V_p = w V_p + c_1 r_1 (BESTx_p - x_p) + c_2 r_2 (BESTx_{GBEST} - x_p)$$

where $c_1$ and $c_2$ are acceleration constants, and $r_1, r_2 \sim U(0, 1)$; $w$ is the inertia weight.
6. Fly each particle p to $x_p + V_p$.
7. Loop to step 2 until convergence.
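The seven steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the sphere function stands in for the MSE of a network, the initial positions are drawn from [-1, 1], initial velocities are zero, a fixed iteration budget replaces the convergence test of step 7, and the parameter values (`w=0.7`, `c1=c2=1.4`) and all function names are illustrative choices.

```python
import random

def sphere(x):
    """Hypothetical stand-in fitness; the text uses the MSE of a network instead."""
    return sum(xi * xi for xi in x)

def pso(fitness, dim, n_particles=20, iters=200, w=0.7, c1=1.4, c2=1.4, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize positions (assumed range [-1, 1]) and zero velocities.
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    best_f = [float("inf")] * n_particles      # BEST_p
    best_x = [x[:] for x in xs]                # BESTx_p
    gbest = 0                                  # GBEST: index of the best particle
    for _ in range(iters):                     # Step 7, with a fixed budget
        for p in range(n_particles):
            f_p = fitness(xs[p])               # Step 2: evaluate fitness
            if f_p < best_f[p]:                # Step 3: personal best
                best_f[p], best_x[p] = f_p, xs[p][:]
            if f_p < best_f[gbest]:            # Step 4: global best
                gbest = p
        for p in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 5: inertia term plus pulls toward BESTx_p and BESTx_GBEST.
                vs[p][d] = (w * vs[p][d]
                            + c1 * r1 * (best_x[p][d] - xs[p][d])
                            + c2 * r2 * (best_x[gbest][d] - xs[p][d]))
                xs[p][d] += vs[p][d]           # Step 6: fly the particle
    return best_x[gbest], best_f[gbest]

best_position, best_fitness = pso(sphere, dim=3)
```

Here `r1` and `r2` are redrawn for every dimension of every particle, which is one common reading of step 5; drawing one pair per particle is an equally plausible interpretation of the text.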
In step 5, the coordinates $BESTx_p$ and $BESTx_{GBEST}$ are used to pull the particles toward a minimum, and the acceleration constants control how far particles fly from one another. Convergence tests are the same as for standard training algorithms, such as GD. In a similar way, genetic algorithms, evolutionary computing and cultural algorithms can be used to train NNs.
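For NN training, the fitness evaluation in step 2 has to decode a particle's position into network weights and compute the MSE over a data set. The sketch below shows one way this could look for a network with one hidden layer and a single linear output. The layout of the flat weight vector, the tanh hidden units and the function name `mse_fitness` are assumptions of this example, not details given in the text.

```python
import math

def mse_fitness(weights, data, n_in, n_hidden):
    """MSE of a one-hidden-layer, single-output network encoded as a flat vector.

    Assumed layout: for each hidden unit, n_in input weights followed by its
    bias; then n_hidden output weights followed by the output bias.
    """
    expected = n_hidden * (n_in + 1) + n_hidden + 1   # search-space dimension D
    if len(weights) != expected:
        raise ValueError(f"expected {expected} weights, got {len(weights)}")
    total = 0.0
    for inputs, target in data:
        pos = 0
        hidden = []
        for _ in range(n_hidden):
            net = sum(wi * xi for wi, xi in zip(weights[pos:pos + n_in], inputs))
            net += weights[pos + n_in]                # hidden-unit bias
            hidden.append(math.tanh(net))
            pos += n_in + 1
        out = sum(wi * hi for wi, hi in zip(weights[pos:pos + n_hidden], hidden))
        out += weights[pos + n_hidden]                # output bias, linear output
        total += (out - target) ** 2
    return total / len(data)

# Hypothetical data set: the XOR problem with two inputs and two hidden units (D = 9).
xor_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
zero_particle = [0.0] * 9
print(mse_fitness(zero_particle, xor_data, n_in=2, n_hidden=2))  # 0.5
```

With all weights at zero the network outputs 0 for every pattern, so two of the four XOR targets are missed by 1 and the MSE is 0.5. A function of this shape can be passed as the fitness to any population-based trainer that works on flat vectors.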
