hyperplane and the MLP separate the two classes (Figure 9.4 plot 1). Note that LDFs are not robust: as a point is moved, its influence on the LDF is proportional to its distance from its class mean, which can be made arbitrarily large (e.g., see Figure 9.4 plot 5). As in Experiment 1, the MLP is small enough (of size 2.1.1, with 5 adjustable weights) to investigate its actions in some detail. In plot 1 (Figure 9.4), the limits of linear separability are shown. As we shall see, these are quite important in describing the behavior of the MLP, whereas for the LDF the question of the separability of the classes is not relevant. It is clear that the MLP does have some robustness property, evidenced by the fact that in plot 3 the MLP follows the aberrant point, whereas in plots 4 and 5, where the classes are not linearly separable, it no longer does so. We note that the final MLP models in plots 4 and 5 are very similar to that in plot 1.
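The non-robustness of the LDF can be illustrated with a minimal sketch. This is not the code used for the experiments reported here: the data points, the function names and the use of Fisher's form of the LDF (direction proportional to the inverse pooled scatter matrix times the difference of the class means) are all assumptions made for illustration. Moving a single point a distance t shifts its class mean by t/n, without bound, and the fitted direction changes with it.

```python
import math

def class_mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about the mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def ldf_direction(class0, class1):
    """Unit vector proportional to S^{-1}(m1 - m0), S the pooled scatter."""
    m0, m1 = class_mean(class0), class_mean(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1
    det = a * c - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Explicit inverse of the symmetric 2x2 matrix [[a, b], [b, c]].
    wx = (c * dx - b * dy) / det
    wy = (-b * dx + a * dy) / det
    norm = math.hypot(wx, wy)
    return wx / norm, wy / norm

# Hypothetical data: two well-separated square clusters.
class0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
class1 = [(3.0, 0.0), (4.0, 0.0), (3.0, 1.0), (4.0, 1.0)]

t = 10.0  # distance by which one point of class 1 is moved
moved = class1[:-1] + [(4.0, 1.0 + t)]

mean_shift = class_mean(moved)[1] - class_mean(class1)[1]
w_before = ldf_direction(class0, class1)
w_after = ldf_direction(class0, moved)

print("shift of class mean:", mean_shift)   # equals t / n
print("LDF direction before:", w_before)
print("LDF direction after: ", w_after)
```

Increasing t increases the shift of the class mean in proportion, which is the sense in which a single point can exert arbitrarily large influence on the fitted LDF.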
Behaviour of the MLP: separable versus inseparable classes (Experiment 2)
Figure 9.5 The final decision boundary for the data of Experiment 2. As we move the point we can see the resulting change in the fitted model. Note that the MLP is insensitive to the point when it is masked by other data points (plots 2, 3 and 4). Note also the large difference in the decision boundary between plots 5 and 6, for a very small movement in the point.
We look more closely (Figure 9.5) at what happens when the point is moved across the boundary between separability and non-separability. The MLP separates the two classes whenever it can, even when this leads to what might be considered an unsatisfactory fitted model (Figure 9.5, plots 1 and 5). However, when it is no longer possible to produce linear separation (Figure 9.5, plots 2, 3 and 4), we see that the influence of the moving point is markedly reduced because it is masked by other points. In the plot of the SC for w2 (Figure 9.6), we see that it has a distinctive ridge that follows the line of linear separability and gradually decays. Any failure to separate two linearly separable classes in practice is due to the limitations of the function
3 The MLPs were always started from the same random weights, and have the maximum iterations set to 5000 and the tolerance set to [...]. Unless otherwise specified, a scaled conjugate gradient minimization routine was used (see the software documentation for further details).
Figure 9.6 Contour and perspective plots for the sensitivity curve for the weight w2.
minimization routine, the stopping criterion, or the precision of the computations. The classes are not separated in Figure 9.5 plot 6 because that model was fitted using single-precision arithmetic. When the model was refitted using double precision, the two classes were separated, as expected.⁴ In Figure 9.7 we see what happens when a second hidden-layer unit is added. Clearly any robustness property that the MLP model might have is lost when the model is over-parameterized.
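One way in which arithmetic precision can affect a fitted MLP is through saturation of the logistic activation. The sketch below is an illustration only and is not claimed to be the mechanism at work in Figure 9.5 plot 6: the helper names (`f32`, `sigmoid32`, `sigmoid64`) and the input value 20 are assumptions, and single precision is mimicked by rounding each intermediate result to a 4-byte float with the standard `struct` module.

```python
import math
import struct

def f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sigmoid64(z):
    """Logistic function evaluated in double precision."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid32(z):
    """Logistic function with every intermediate result rounded to single precision."""
    e = f32(math.exp(-f32(z)))
    return f32(1.0 / f32(1.0 + e))

z = 20.0  # a moderately large net input to a unit (hypothetical value)

s64 = sigmoid64(z)
s32 = sigmoid32(z)

# Derivative of the logistic function, s * (1 - s), which scales the gradient.
grad64 = s64 * (1.0 - s64)
grad32 = f32(s32 * f32(1.0 - s32))

print("double: 1 - sigmoid =", 1.0 - s64, " gradient factor =", grad64)
print("single: 1 - sigmoid =", 1.0 - s32, " gradient factor =", grad32)
```

In single precision the output rounds to exactly 1, so the gradient factor is exactly zero and a gradient-based minimizer receives no signal from that unit, whereas in double precision the factor is small but nonzero.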
9.3.3 Experiment 3
We consider a data set consisting of two classes that are not linearly separable (Figure 9.8) and note a number of things about the fitted MLP model. As the IC and the SC are quite complex in this case, we start by plotting the decision boundary. We consider three different cases:
case 1 an MLP of size 2.3.1, with 2 different starting values (Figure 9.8, plots 1 and 2);
case 2 an MLP of size 2.6.1, so that there are more hidden-layer units than are necessary to separate the classes (Figure 9.8, plot 3);
case 3 an MLP of size 2.6.1, with one point moved over into the region of the other class (i.e., one point is misclassified) (Figure 9.8, plots 4 and 5).
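To give a sense of how quickly the number of parameters grows across the architectures used in Experiments 2 and 3, the following sketch counts the adjustable weights of a fully connected MLP. It assumes one bias weight per hidden and output unit, an assumption that agrees with the count of 5 quoted above for the 2.1.1 network; the function name is hypothetical.

```python
def mlp_weight_count(layers):
    """Count adjustable weights, with one bias per non-input unit.

    `layers` lists the number of units per layer, e.g. (2, 3, 1) for a
    network of size 2.3.1.
    """
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))

for size in [(2, 1, 1), (2, 2, 1), (2, 3, 1), (2, 6, 1)]:
    label = ".".join(str(n) for n in size)
    print(label, mlp_weight_count(size))
# 2.1.1 5
# 2.2.1 9
# 2.3.1 13
# 2.6.1 25
```

The 2.6.1 network of cases 2 and 3 therefore has roughly twice as many adjustable weights as the 2.3.1 network of case 1.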
4 All the fitted models reported here were fitted using double-precision arithmetic. This was the only model fitted using single precision. The single-precision code is no longer supported. However, it is very interesting that a change in precision can affect a statistical model so markedly. See Section 10.3.1, p. 163, for another example of this.