Freund, Y. and Schapire, R. E. (1996). Experiments with a new boosting algorithm. In pp. 148-156.
Friedman, J., Hastie, T., and Tibshirani, R. (1998). Additive logistic regression: a statistical view of boosting. Technical report, Department of Statistics, Stanford University.
Friedman, J. H. (1984). A variable span smoother. Tech. Rep. No. 5, Laboratory for Computational Statistics, Dept. of Statistics, Stanford Univ., California.
Friedman, J. H. (1989). Regularized discriminant analysis. 84:165-175.
Friedman, J. H. (1991a). Multivariate adaptive regression splines (with discussion). 19:1-141.
Friedman, J. H. (1991b). Unpublished lecture notes, CSIRO, Sydney, Australia.
Friedman, J. H. (1996). On bias, variance, 0/1-loss, and the curse-of-dimensionality. Available on
Friedman, J. H. (1999). Stochastic gradient boosting. Available at
Friedman, J. H. (2000). Greedy function approximation: A gradient boosting machine. The 1999 IMS Reitz Lecture. Available at .
Friedman, J. H. and Silverman, B. W. (1989). Flexible parsimonious smoothing and additive modeling. 31(1):3-21.
Friedman, J. H. and Stuetzle, W. (1981). Projection pursuit regression. 76:817-823.
Fukumizu, K. (1996). A regularity condition of the information matrix of a multilayer perceptron network. 9(5):871-879.
Fukunaga, K. (1990). Academic Press, Boston, second edition. [First edition, 1972].
Funahashi, K. (1989). On the approximate realization of continuous mappings by neural networks. 2:183-192.
Furby, S., Kiiveri, H., and Campbell, N. (1990). The analysis of high dimensional spectral curves. In of pp. 175-184.
Gallinari, P., Thiria, S., and Fogelman-Soulie, F. (1988). Multilayer perceptrons and data analysis. In on volume 1, pp. 391-399.
Geman, S., Bienenstock, E., and Doursat, R. (1992). Neural networks and the bias/variance dilemma. 4(1):1-58.
Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. on 6(6):721-741.
Gibson, G. J. and Cowan, C. F. N. (1990). On the decision regions of multilayer perceptrons. of 78(10):1590-1594.
Girosi, F., Jones, M., and Poggio, T. (1993). Priors, stabilizers and basis functions: from regularization to radial, tensor and additive splines. A. I. Memo No. 1430 C.B.C.L. Paper No. 75, Massachusetts Institute of Technology.
Gish, H. (1990). A probabilistic approach to the understanding and training of neural network classifiers. In on pp. 1361-1364.
Golub, G. H. and van Loan, C. F. (1982). Johns Hopkins University Press, Baltimore.
Golub, G. H. and Wilkinson, J. H. (1966). Note on iterative refinement of least squares solution. 9:139-148.
Goodacre, R., Kell, D. B., and Bianchi, G. (1992). Neural networks and olive oil. 359:594.
Goodall, C. (1983). M-estimators of location: An outline of the theory. In Hoaglin, D. C., Mosteller, F., and Tukey, J. W., editors, pp. 339-403. Wiley, New York.
Gorman, R. and Sejnowski, T. (1988). Analysis of hidden units in a layered network trained to classify sonar targets. 1:75-89.
Goutte, C. and Hansen, L. K. (1997). Regularization with a pruning prior. 10(6):1053-1059.
Halmos, P. R. (1974). Springer-Verlag, New York.
Hamamoto, Y., Uchimura, S., and Tomita, S. (1996). On the behavior of artificial neural network classifiers in high-dimensional spaces. on 18(5):571-574.
Hampel, F. (1974). The influence curve and its role in robust estimation. 69:383-393.
Hampel, F. R. (1985). The breakdown points of the mean combined with some rejection rules. 27(2):95-107.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A. (1986). on Wiley, New York.
Hampshire, II, J. B. and Perlmutter, B. A. (1990). Equivalence proofs for multilayer perceptron classifiers and the Bayesian discriminant function. In Touretzky, D., Elman, A., Sejnowski, T., and Hinton, G., editors, 1990 San Mateo, CA. Morgan Kaufmann.
Hand, D. J. (1997). Chichester.
Hand, D. J. (2006). Classifier technology and the illusion of progress. 21(1):1-14. With discussion.
Hassibi, B. and Stork, D. G. (1991). Second order derivatives for network pruning: Optimal brain surgeon. In Lippmann et al. (1991), pp. 164-171.
Hastie, T. (1994). Nonparametric discriminant analysis. Unpublished lecture notes, AT&T Bell Laboratories.
Hastie, T., Buja, A., and Tibshirani, R. (1995). Penalized discriminant analysis. 23:73-102.
Hastie, T. and Tibshirani, R. (1996). Discriminant analysis by Gaussian mixtures. 58:158-176.
Hastie, T., Tibshirani, R., and Buja, A. (1994). Flexible discriminant analysis by optimal scoring. of 89:1255-1270.
Hastie, T., Tibshirani, R., and Buja, A. (1999). Flexible discriminant and mixture models. In Kay, J. and Titterington, D. M., editors, Oxford University Press.
Hastie, T., Tibshirani, R., and Friedman, J. (2001). Springer.
Hastie, T., Tibshirani, R., Leisch, F., Hornik, K., and Ripley, B. D. (2006). original by Trevor Hastie and Robert Tibshirani. R port by Friedrich Leisch and Kurt Hornik and Brian D. Ripley. R package version 0.3-2.
Hastie, T. J. and Pregibon, D. (1992). Generalized linear models. In Chambers, J. M. and Hastie, T. J., editors, chapter 9. Wadsworth and Brooks, Pacific Grove, CA.
Haykin, S. (1999). Prentice-Hall, Englewood Cliffs, NJ.
Hebb, D. O. (1949). Wiley, New York.
Hestenes, M. R. and Stiefel, E. (1952). Methods of conjugate gradients for solving linear systems. of 49(6):409-436.
Himmelblau, D. M. (1990). Introducing efficient second order effects into back propagation learning. In on volume 1, pp. 631-634, Hillsdale, NJ. Lawrence Erlbaum.
Hogg, R. V. (1979). An introduction to robust estimation. In Launer, R. L. and Wilkinson, G. N., editors, Academic Press, Boston.
Hornik, K., Stinchcombe, M., and White, H. (1989). Multilayer feedforward networks are universal approximators. 2(5):359-365.
