Approximation theory of the MLP model in neural networks

Allan Pinkus

doi:10.1017/S0962492900002919

Published online by Cambridge University Press: 07 November 2008

Affiliation: Department of Mathematics, Technion – Israel Institute of Technology, Haifa 32000, Israel. E-mail: pinkus@tx.technion.ac.il

Abstract

In this survey we discuss various approximation-theoretic problems that arise in the multilayer feedforward perceptron (MLP) model in neural networks. The MLP model is one of the more popular and practical of the many neural network models. Mathematically it is also one of the simpler models. Nonetheless the mathematics of this model is not well understood, and many of these problems are approximation-theoretic in character. Most of the research we will discuss is of very recent vintage. We will report on what has been done and on various unanswered questions. We will not be presenting practical (algorithmic) methods. We will, however, be exploring the capabilities and limitations of this model.

Information

Type: Research Article
Information: Acta Numerica, Volume 8, January 1999, pp. 143–195

DOI: https://doi.org/10.1017/S0962492900002919
Copyright: © Cambridge University Press 1999
