Book description

This book is for anyone who has biomedical data and needs to identify variables that predict a two-group outcome, such as tumor/not-tumor, survival/death, or response to treatment. Statistical learning machines are ideally suited to these prediction problems, especially when the variables under study may not meet the assumptions of traditional techniques. Learning machines come from the worlds of probability and computer science but are not yet widely used in biomedical research. This introduction brings learning machine techniques to the biomedical world in an accessible way, explaining the underlying principles in nontechnical language and using extensive examples and figures. The authors connect these new methods to familiar techniques by showing how to use the learning machine models to generate smaller, more easily interpretable traditional models. Coverage includes single decision trees, multiple-tree techniques such as Random Forests™, neural nets, support vector machines, nearest neighbors, and boosting.
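
To make that workflow concrete, here is a minimal sketch of two-group prediction with a random forest, followed by the kind of smaller traditional model the description mentions: rank the variables by forest importance, then refit a shallow decision tree on the top predictors. It assumes Python with scikit-learn and its bundled breast-cancer dataset; it illustrates the general approach and is not code from the book.

    # A minimal sketch of the two-group workflow (Python, scikit-learn).
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Two-group biomedical outcome: malignant versus benign tumor.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit the learning machine and rank variables by importance.
    forest = RandomForestClassifier(n_estimators=500, random_state=0)
    forest.fit(X_train, y_train)
    print(f"forest accuracy: {forest.score(X_test, y_test):.3f}")

    # Keep the five most important variables and refit a small,
    # easily interpretable tree on just those predictors.
    top = forest.feature_importances_.argsort()[::-1][:5]
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X_train[:, top], y_train)
    print(f"small-tree accuracy: {tree.score(X_test[:, top], y_test):.3f}")

On data like these the shallow tree usually gives up little accuracy while reducing the model to a handful of readable decision rules, which is the trade-off the description refers to.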

Reviews

'The book is well written and provides nice graphics and numerous applications.'

Michael R. Chernick, Technometrics

References
Adami, C (2006). Reducible complexity. Science, 312: 61–3.
Agarwal, S, Niyogi, P (2009). Generalization bounds for ranking algorithms via algorithmic stability. Journal of Machine Learning Research, 10: 441–74.
Agresti, A (1996). An Introduction to Categorical Data Analysis. John Wiley & Sons.
Agresti, A, Franklin, C (2009). Statistics: The Art and Science of Learning from Data. Second Edition. Prentice-Hall.
Agresti, A, Min, Y (2005). Simple improved confidence intervals for comparing matched proportions. Statistics in Medicine, 24: 729–40.
Ailon, N, Mohri, M (2008). An efficient reduction of ranking to classification. In Proceedings of the 21st Annual Conference on Learning Theory.
Alqallaf, FA, Konis, KP, Martin, RD, Zamar, RH (2002). Scalable robust covariance and correlation estimates for data mining. Proceedings of the SIGKDD Conference 2002, Edmonton, Alberta, Canada: 1–10.
Anderson-Sprecher, R (1994). Model comparisons and R2. American Statistician, 48(2): 113–17.
Archer, KJ, Mas, VR (2009). Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation dataset. Statistics in Medicine, in press.
Banzhaf, W, Beslon, G, Christensen, S, Foster, JA, Képès, F, Lefort, V, Miller, JF, Radman, M, Ramsden, JJ (2006). From artificial evolution to computational evolution: a research agenda. Nature Reviews Genetics, 7: 729–35.
Bartlett, PL, et al. (2004). Discussion of three boosting papers. Annals of Statistics, 32(1): 85–134.
Berger, JO, Wolpert, RL (1988). The Likelihood Principle. Institute of Mathematical Statistics, Lecture Notes – Monograph Series, Vol. 6.
Berk, RA (2008). Statistical Learning from a Regression Perspective. Springer-Verlag.
Berlinet, A, Biau, G, Rouvière, L (2008). Functional supervised classification with wavelets. Annales de l'ISUP, 52: 61–80.
Biau, G (2010). Analysis of a random forests model. Submitted to Journal of Machine Learning Research.
Biau, G, Bleakley, K (2006). Statistical inference on graphs. Statistics & Decisions, 24: 209–32.
Biau, G, Bleakley, K, Györfi, L, Ottucsák, G (2010a). Nonparametric sequential prediction of time series. Journal of Nonparametric Statistics, 22: 297–317.
Biau, G, Cérou, F, Guyader, A (2010b). On the rate of convergence of the bagged nearest neighbor estimate. Journal of Machine Learning Research, 11: 687–712.
Biau, G, Cérou, F, Guyader, A (2010c). Rates of convergence of the functional k-nearest neighbor estimate. IEEE Transactions on Information Theory, 56: 2034–40.
Biau, G, Devroye, L (2005). Density estimation by the penalized combinatorial method. Journal of Multivariate Analysis, 94: 196–208.
Biau, G, Devroye, L (2010). On the layered nearest neighbor estimate, the bagged nearest neighbor estimate and the random forest method in regression and classification. Journal of Multivariate Analysis, in press.
Biau, G, Devroye, L, Lugosi, G (2008). Consistency of random forests and other averaging classifiers. Journal of Machine Learning Research, 9: 2015–33.
Blanchard, G, Lugosi, G, Vayatis, N (2003). On the rate of convergence of regularized boosting methods. Journal of Machine Learning Research 4: 861–94.
Bleakley, K, Biau, G, Vert, J-P (2007). Supervised reconstruction of biological networks with local models. Bioinformatics, 23: 157–65.
Bonita, R, Beaglehole, R (1988). Recovery of motor function after stroke. Stroke, 19: 1497–500.
Borg, I, Groenen, PJF (2005). Modern Multidimensional Scaling. Second Edition. Springer-Verlag.
Bossuyt, PM, Irwig, L, Craig, J, Glasziou, P (2006). Comparative accuracy: assessing new tests against existing diagnostic pathways. British Medical Journal, 332: 1089–92.
Breiman, L (2004). The 2002 Wald Memorial Lectures. Population theory for boosting ensembles. Annals of Statistics, 32(1): 1–11.
Breiman, L, Friedman, J, Olshen, RA, Stone, CJ (1984, 1993). Classification and Regression Trees. Chapman & Hall.
Carlin, BP, Louis, TA (2000). Bayes and Empirical Bayes Methods for Data Analysis. Second Edition. Chapman & Hall/CRC Press.
Casale, S, Russo, A, Scebba, G, Serrano, S (2008). Speech Emotion Classification Using Machine Learning Algorithms. 2008 IEEE International Conference on Semantic Computing: 158–65.
Claeskens, G, Hjort, NL (2008). Model Selection and Model Averaging. Cambridge University Press.
Cover, T, Hart, P (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13: 21–7.
Cox, DR (1958). Two further applications of a model for binary regression. Biometrika, 45: 562–5.
Cox, TF, Cox, MAA (2001). Multidimensional Scaling. Second Edition. Springer-Verlag.
Cristianini, N, Shawe-Taylor, J (2000). Support Vector Machines, and Other Kernel-Based Learning Methods. Cambridge University Press.
Davison, AC, Hinkley, DV (1997). Bootstrap Methods and their Application. Cambridge University Press.
Devroye, L, Györfi, L, Lugosi, G (1996). A Probabilistic Theory of Pattern Recognition. Springer-Verlag.
Devroye, L, Lugosi, G (2001). Combinatorial Methods in Density Estimation. Springer-Verlag.
Devroye, L, Wagner, T (1976). A distribution-free performance bound in error estimation. IEEE Transactions on Information Theory, 22: 586–7.
Díaz-Uriarte, R, Alvarez de Andrés, S (2006). Gene selection and classification of microarray data using random forest. BMC Bioinformatics, 7: 3. doi:10.1186/1471-2105-7-3.
Draper, NR (1984). The Box–Wetz criterion versus R2. Journal of the Royal Statistical Society, Series A (General), 147(1): 100–103.
Draper, NR, Smith, H (1998). Applied Regression Analysis. John Wiley & Sons.
Edgington, ES (1995). Randomization Tests. Third Edition. Marcel-Dekker.
Efron, B, Tibshirani, R (1997). Improvement on cross-validation: the 632+ bootstrap method. Journal of the American Statistical Association, 92: 548–60.
Elashoff, JD, Elashoff, RM, Goldman, GE (1967). On the choice of variables in classification problems with dichotomous variables. Biometrika, 54: 668–70.
Fagin, R, Kumar, R, Sivakumar, D (2003). Comparing top k lists. SIAM Journal on Discrete Mathematics, 17(1): 134–60.
Fagin, R, Kumar, R, Mahdian, M, Sivakumar, D, Vee, E (2006). Comparing partial rankings. SIAM Journal on Discrete Mathematics, 20(3): 628–48.
Forti, A, Foresti, GL (2006). Growing hierarchical tree SOM: an unsupervised neural network with dynamic topology. Neural Networks, 19(10): 1568–80.
Freedman, D (1991). Statistical models and shoe leather. Sociological Methodology, 21: 291–313.
Freedman, D (1995). Some issues in the foundation of statistics. Foundations of Science, 1: 19–39.
Freedman, D (1997). From association to causation via regression. Advances in Applied Mathematics, 18: 59–110.
Freedman, D (1999). From association to causation: some remarks on the history of statistics. Statistical Science, 14(3): 243–58.
Freedman, D (2009). Statistical Models: Theory and Practice. Revised Edition. Cambridge University Press.
Garcia-Pedrajas, N, Ortiz-Boyer, D (2008). Boosting random subspace method. Neural Networks, 21: 1344–62.
Gelman, A, Carlin, JB, Stern, HS, Rubin, DB (2004). Bayesian Data Analysis. Second Edition. Chapman & Hall/CRC Press.
Getoor, L, Taskar, B (eds) (2007). Introduction to Statistical Relational Learning. MIT Press.
Ghosh, D, Chinnaiyan, AM (2005). Classification and selection of biomarkers in genomic data using LASSO. Journal of Biomedicine and Biotechnology, 2: 147–54.
Gintis, H (2009). The Bounds of Reason: Game Theory and the Unification of the Behavioural Sciences. Princeton University Press.
Glantz, SA (2002). Primer of Biostatistics. Sixth Edition. McGraw-Hill.
Gneiting, T, Raftery, AE (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477): 359–78.
Good, P (2005). Permutation, Parametric, and Bootstrap Tests of Hypotheses. Third Edition. Springer-Verlag.
Guyatt, GH, et al. (1992). Evidence-based medicine. Journal of the American Medical Association, 268(17): 2420–25.
Györfi, L, Kohler, M, Krzyżak, A, Walk, H (2002, 2010). A Distribution-Free Theory of Nonparametric Regression. Springer-Verlag.
Hamedani, GG, Volkmer, HW (2009). Letter to the Editor. American Statistician, 63(3): 295.
Hand, DJ (1997). Construction and Assessment of Classification Rules. John Wiley & Sons.
Hand, DJ (2006). Classifier technology and the illusion of progress (with Comments and Rejoinder). Statistical Science, 21(1): 1–34.
Hardin, J, Mitani, A, Hicks, L, VanKoten, B (2007). A robust measure of correlation between two genes on a microarray. BMC Bioinformatics, 8: 220–33.
Harrell, FE, Jr. (2001). Regression Modeling Strategies. Springer-Verlag.
Hastie, T, Tibshirani, R, Friedman, J (2009). Elements of Statistical Learning. Second Edition. Springer-Verlag.
Haunsperger, DB (1992). Dictionaries of paradoxes for statistical tests on k samples. Journal of the American Statistical Association, 87(417): 149–55.
Haunsperger, DB (2003). Aggregated statistical rankings are arbitrary. Social Choice and Welfare, 20: 261–72.
Haunsperger, DB, Saari, DG (1991). The lack of consistency for statistical decision problems. The American Statistician, 45(3): 252–5.
Healy, MJR (1984). The use of R2 as a measure of goodness of fit. Journal of the Royal Statistical Society, Series A, 147(4): 608–9.
Hilbe, JM (2009). Logistic Regression Models. Chapman & Hall/CRC Press.
Hilden, J (2003). Book review of Parmigiani, G (2002), Modeling in Medical Decision Making: A Bayesian Approach (John Wiley & Sons). Statistics in Medicine, 23: 663–4.
Hilden, J (2004). Evaluation of diagnostic tests – the schism. (Notes on ROC methods and the value of information approach.) Available at: http://staff.pubhealth.ku.dk/~jh/.
Ho, TK (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8): 832–44.
Holden, ZA, Crimmins, M, Luce, C, Heyerdahl, EK, Morgan, P (2008). Analysis of Climate and Topographic Controls on Burn Severity in the Western United States (1984–2005), American Geophysical Union, Fall Meeting 2008, abstract #GC21A-0708.
Hosmer, DW, Lemeshow, S (2000). Applied Logistic Regression. Second Edition. John Wiley & Sons.
Hothorn, T, Hornik, K, Zeileis, A (2006). Unbiased recursive partitioning: a conditional inference framework. Journal of Computational and Graphical Statistics, 15: 651–74.
Hu, B, Palta, M, Shao, J (2006). Properties of R2 statistics for logistic regression. Statistics in Medicine, 25: 1383–95.
Ingsrisawang, L, Ingsriswang, S, Somchit, S, Aungsuratana, P, Khantiyanan, W (2008). Machine Learning Techniques for Short-Term Rain Forecasting System in the Northeastern Part of Thailand. Proceedings of the World Academy of Science, Engineering and Technology, 31 July: 248–53.
Ishwaran, H, James, LF, Zarepour, M (2009). An alternative to the m out of n bootstrap. Journal of Statistical Planning and Inference, 139: 788–801.
Ishwaran, H, Kogalur, UB, Blackstone, EH, Lauer, MS (2008). Random survival forests. The Annals of Applied Statistics, 2(3): 841–60.
Ishwaran, H, Kogalur, UB, Gorodeski, EZ, Minn, AJ, Lauer, MS (2010). High-dimensional variable selection for survival data. Journal of the American Statistical Association, 105(489): 205–17.
Japkowicz, N (2009). Bibliography for imbalanced (unbalanced) data analysis. http://www.site.uottawa.ca/~nat/Research/class_imbalance_bibli.html.
Jerebko, AK, Malley, JD, Franaszek, M, Summers, RM (2003). Multiple neural network classification scheme for detection of colonic polyps in CT colonography data sets. Academic Radiology, 10: 154–60.
Jerebko, AK, Malley, JD, Franaszek, M, Summers, RM (2005). Support vector machines committee classification method for computer-aided polyp detection in CT colonography. Academic Radiology, 12: 479–86.
Jiang, W (2004). Process consistency for adaboost. Annals of Statistics, 32(1): 13–29.
Jiang, W, Simon, R (2007). A comparison of bootstrap methods and an adjusted bootstrap approach for estimating prediction error in microarray classification. Statistics in Medicine, 26: 5320–34.
Jo, T, Japkowicz, N (2004). A multiple resampling method for learning from imbalanced data sets. Computational Intelligence, 20(1): 18–36.
Kabaila, P, Leeb, H (2006). On the large-sample minimal coverage probability of confidence intervals after model selection. Journal of the American Statistical Association, 101(474): 619–29.
Kendall, MG (1980). Multivariate Analysis. Second Edition. Charles Griffin & Company.
Kendall, M, Stuart, A (1979). The Advanced Theory of Statistics. Volume 2, Fourth Edition. Macmillan Publishing Company.
Koltchinskii, V, Yu, B (2004). Three papers on boosting: an introduction. Annals of Statistics, 32(1): 12.
König, I, Malley, JD, Pajevic, S, Weimar, C, Diener, H, Ziegler, A (2008). Patient-centered yes/no prognosis using learning machines. International Journal of Data Mining and Bioinformatics, 2(4): 289–341.
König, I, Malley, JD, Weimar, C, Diener, H, Ziegler, A (2007). Practical experiences on the necessity of external validation. Statistics in Medicine, 26(30): 5499–511.
Kooperberg, C, Ruczinski, I (2005). Identifying interacting SNPs using Monte Carlo logic regression. Genetic Epidemiology, 28: 157–70.
Körner, TW (2008). Naïve Decision Making: Mathematics Applied to the Social World. Cambridge University Press.
Kshirsagar, AM (1972). Multivariate Analysis. Marcel Dekker, Inc.
Kuss, O (2002). Global goodness-of-fit tests in logistic regression with sparse data. Statistics in Medicine, 21: 3789–801.
Lane, T (2000). Machine learning techniques for the computer security domain of anomaly detection. PhD thesis, Purdue University, West Lafayette, IN (August 2000). Available online at: http://www.cs.unm.edu/~terran/publications.
Lawless, JF, Yuan, Y (2010). Estimation of prediction error for survival models. Statistics in Medicine, 29: 262–74.
Leeb, H (2005). The distribution of a linear predictor after model selection: conditional finite-sample distributions and asymptotic approximations. Journal of Statistical Planning and Inference, 134: 64–89.
Leeb, H, Pötscher, BM (2005). Can one estimate the unconditional distribution of the post-model-selection estimators? Online at http://mpra.ub.uni-muenchen.de/1895.
Leeb, H, Pötscher, BM (2006). Can one estimate the conditional distribution of post-model-selection estimators? Annals of Statistics, 34: 2554–91.
Lin, Y, Jeon, Y (2006). Random forests and adaptive nearest neighbors. Journal of the American Statistical Association, 101(474): 578–90.
Lippert, C, Stegle, O, Ghahramani, Z, Borgwardt, KM (2009). A kernel method for unsupervised structured network inference. Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS) 2009, Clearwater Beach, Florida, USA. Vol. 5 of Journal of Machine Learning Research: W&CP 5.
Lugosi, G, Vayatis, N (2004). On the Bayes-risk consistency of regularized boosting methods. Annals of Statistics, 32(1): 30–55.
Ma, S, Huang, J (2008). Penalized feature selection and classification in bioinformatics. Briefings in Bioinformatics, 9(5): 392–403.
MacKay, DJC (1992a). A practical Bayesian framework for backpropagation networks. Neural Computation, 4: 448–72.
MacKay, DJC (1992b). The evidence framework applied to classification networks. Neural Computation, 4: 720–36.
Mahoney, FI, Barthel, D (1965). Functional evaluation: the Barthel Index. Maryland State Medical Journal, 14: 56–61.
Malley, JD, Jerebko, AK, Miller, MT, Summers, RM (2003). Variance reduction for error estimation when classifying colon polyps from CT colonography. Medical Imaging 2003: Physiology and Function: Methods, Systems, and Applications (Clough, AV, Amini, AA, Eds). Proceedings of SPIE, 5031: 570–78.
Mease, D, Wyner, A (2008). Evidence contrary to the statistical view of boosting. Journal of Machine Learning Research, 9: 131–56; Response and Rejoinder: 157–201.
Mease, D, Wyner, AJ, Buja, A (2007). Boosted classification trees and class probability/quantile estimation. Journal of Machine Learning Research, 8: 409–39.
Meinshausen, N (2006). Quantile regression forests. Journal of Machine Learning Research, 7: 983–99.
Melnik, O, Vardi, Y, Zhang, C-H (2007). Concave learners for Rankboost. Journal of Machine Learning Research, 8: 791–812.
Melvin, I, Ie, E, Weston, J, Noble, WS, Leslie, C (2007). Multi-class protein classification using adaptive codes. Journal of Machine Learning Research, 8: 1557–81.
Mielke, PW, Berry, KJ (2001). Permutation Methods: A Distance Function Approach. Springer-Verlag.
Miller, MT, Jerebko, AK, Malley, JD, Summers, RM (2003). Feature selection for computer-aided polyp detection using genetic algorithms. Medical Imaging 2003: Physiology and Function: Methods, Systems, and Applications (Clough, AV, Amini, AA, Eds), Proceedings of SPIE, 5031: 102–10.
Miller, ME, Langefeld, CD, Tierney, WM, Hui, SL, McDonald, CJ (1993). Validation of probabilistic predictions. Medical Decision Making, 13(1): 49–57.
Minku, FL, Ludermir, TB (2008). Clustering and co-evolution to construct neural network ensembles: an experimental study. Neural Networks, 21: 1363–79.
Mojirsheibani, M (1999). Combining classifiers via discretization. Journal of the American Statistical Association, 94: 600–609.
Mojirsheibani, M (2002). An almost surely optimal combined classification rule. Journal of Multivariate Analysis, 81: 28–46.
Molinaro, A, Simon, R, Pfeiffer, R (2005). Prediction error estimation: a comparison of resampling methods. Bioinformatics, 21: 3301–7.
Montori, VM, Guyatt, GH (2008). Progress in evidence-based medicine. Journal of the American Medical Association, 300(15): 1814–16.
Moons, KGM, Donders, ART, Steyerberg, EW, Harrell, FE (2004). Penalized maximum likelihood estimation to directly adjust diagnostic and prognostic prediction models for overoptimism: a clinical example. Journal of Clinical Epidemiology, 57: 1262–70.
Muirhead, RJ (1982). Aspects of Multivariate Statistical Theory. John Wiley & Sons.
Mukhopadhyay, N (2009). Letter to the Editor. American Statistician, 63(1): 102–3.
Newcombe, RG (1998a). Improved confidence intervals for the difference between binomial proportions based on paired data. Statistics in Medicine, 17: 2635–50.
Newcombe, RG (1998b). Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine, 17: 857–72.
Newcombe, RG (1998c). Interval estimation for the difference between independent proportions. Statistics in Medicine, 17: 873–90.
Newcombe, RG (2006a). Confidence intervals for an effect size measure based on the Mann–Whitney statistic. Part 1: General issues and tail area based methods. Statistics in Medicine, 25(4): 543–57.
Newcombe, RG (2006b). Confidence intervals for an effect size measure based on the Mann–Whitney statistic. Part 2: Asymptotic methods and evaluation. Statistics in Medicine, 25(4): 559–73.
Newcombe, RG (2007). Comments on “Confidence intervals for a ratio of binomial proportions based on paired data.” Statistics in Medicine, 26(25): 4684–5.
Nicodemus, K, Malley, JD (2009). Predictor correlation impacts machine learning algorithms: implications for genomic studies. Bioinformatics, 25(15): 1884–90.
Nicodemus, KK, Malley, JD, Strobl, C, Ziegler, A (2010). The behaviour of random forest permutation-based variable importance measures under predictor correlation. BMC Bioinformatics, 11: 110–23.
Omurlu, IK, Ture, M, Tokatli, F (2009). The comparisons of random survival forests and Cox regression analysis with simulation and an application related to breast cancer. Expert Systems with Applications, 36: 8582–8.
Ottenbacher, KJ, Ottenbacher, HR, Tooth, L, Ostir, GV (2004). A review of two journals found that articles using multivariable logistic regression frequently did not report commonly recommended assumptions. Journal of Clinical Epidemiology, 57: 1147–52.
Ottenbacher, KJ, Smith, PM, Illig, SB, Linn, RT, Fiedler, RC, Granger, CV (2001). Comparison of logistic regression and neural networks to predict rehospitalization in patients with stroke. Journal of Clinical Epidemiology, 54: 1159–65.
Pace, L, Salvan, A (1997). Principles of Statistical Inference. World Scientific.
Pepe, MS (2003). The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press.
Potter, DM (2005). A permutation test for inference in logistic regression with small- and moderate-sized data sets. Statistics in Medicine, 24: 693–708.
Potter, DM (2008). Notes for CRAN package “logregperm”: Inference in Logistic Regression. Date/Publication: 2008-04-22 07:36:27.
Predd, J, Seiringer, R, Lieb, EH, Osherson, D, Poor, V, Kulkarni, S (2009). Probabilistic coherence and proper scoring rules. IEEE Transactions on Information Theory, in press.
Ramsay, JO, Silverman, BW (1997). Functional Data Analysis. Springer-Verlag.
Ratkowsky, DA (1990). Handbook of Nonlinear Regression Models. Marcel Dekker.
Rogers, W, Wagner, T (1978). A finite sample distribution-free performance bound for local discrimination rules. Annals of Statistics, 6: 506–14.
Ruczinski, I, Kooperberg, C, LeBlanc, ML (2003). Logic regression. Journal of Computational and Graphical Statistics, 12(3): 475–511.
Rutjes, AWS, Reitsma, JB, Nisio, M Di, Smidt, N, Rijn, JC, Bossuyt, PMM (2006). Evidence of bias and variation in diagnostic accuracy studies. Canadian Medical Association Journal, 174(4): 1–12 (online).
Saari, DG (1995). A chaotic exploration of aggregation paradoxes. SIAM Review, 37(1): 37–52.
Saari, DG (2008). Disposing Dictators, Demystifying Voting Paradoxes: Social Choice Analysis. Cambridge University Press.
Sabato, S, Shalev-Shwartz, S (2008). Ranking categorical features using generalization properties. Journal of Machine Learning Research, 9: 1083–114.
Schölkopf, B, Smola, AJ (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
Schölkopf, B, Herbrich, R, Smola, AJ, Williamson, RC (2001). A generalized representer theorem. Proceedings of the 14th Annual Conference on Computational Learning Theory, COLT 2001. Lecture Notes in Computer Science (Springer): 416–26.
Schwarz, DF, König, IR, Ziegler, A (2010). On safari to Random Jungle: a fast implementation of Random Forests for high dimensional data. Bioinformatics, in press.
Seber, GAF, Wild, CJ (1989). Nonlinear Regression. John Wiley & Sons.
Severini, TA (2000). Likelihood Methods in Statistics. Oxford University Press.
Shao, H, Burrage, LC, Sinasac, DS, Hill, AE, Ernest, SR, O'Brien, W, Courtland, HW, Jepsen, KJ, Kirby, A, Kulbokas, EJ, Daly, MJ, Broman, KW, Lander, ES, Nadeau, JH (2008). Genetic architecture of complex traits: large phenotypic effects and pervasive epistasis. Proceedings of the National Academy of Sciences USA, 105(50): 19910–4.
Schapire, RE, Singer, Y (2000). BoosTexter: a boosting-based system for text categorization. Machine Learning, 39(2/3): 135–68.
Shi, T, Seligson, D, Belldegrun, AS, Palotie, A, Horvath, S (2005). Tumor classification by tissue microarray profiling: random forest clustering applied to renal cell carcinoma. Modern Pathology, 18: 547–57.
Shieh, G (2001). The inequality between the coefficient of determination and the sum of squared simple correlation coefficients. The American Statistician, 55(2): 121–4.
Spanos, A (2010). The discovery of argon: a case for learning from data? Philosophy of Science, 77(3): 359–80.
Spiegelhalter, DJ, Marshall, EC (2006). Strategies for inference robustness in focused modelling. Journal of Applied Statistics, 33(2): 217–31.
Sprent, P, Smeeton, NC (2001). Applied Nonparametric Statistical Methods. Third Edition. Chapman & Hall/CRC Press.
Steinwart, I (2005). Consistency of support vector machines and other regularized kernel classifiers. IEEE Transactions on Information Theory, 51(1): 128–42.
Steyerberg, EW, Borsboom, GJJM, Houwelingen, HC, Eijkemans, MJC, Habbema, JDF (2004). Validation and updating of predictive logistic regression models: a study on sample size and shrinkage. Statistics in Medicine, 23: 2567–86.
Stone, CJ (1977). Consistent nonparametric regression. Annals of Statistics, 5: 595–645.
Strobl, C, Boulesteix, A-L, Kneib, T, Augustin, T, Zeileis, A (2008). Conditional variable importance for random forests. BMC Bioinformatics, 9: 307–18.
Strobl, C, Boulesteix, A-L, Zeileis, A, Hothorn, T (2007). Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics, 8: 25–46.
Strobl, C, Malley, JD, Tutz, G (2009). An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological Methods, 14(4): 323–48.
Tango, T (1998). Equivalence test and confidence interval for the difference in proportions for the paired-sample design. Statistics in Medicine, 17: 891–908.
Tango, T (1999). Improved confidence intervals for the difference between binomial proportions based on paired data (with Author's Reply). Statistics in Medicine, 18: 3511–13.
Tango, T (2000). Confidence intervals for differences in correlated binary proportions (with Author's Reply). Statistics in Medicine, 19: 133–9.
Tibshirani, R (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1): 267–88.
Toussaint, GT (1971). Note on optimal selection of independent binary-valued features for pattern recognition. IEEE Transactions on Information Theory, 17: 618.
Twisk, JWR (2006). Applied Multilevel Analysis. Cambridge University Press.
Übeyli, ED (2006). Combining neural network models for automated diagnostic systems. Journal of Medical Systems, 30: 483–8.
Vapnik, V (2000). The Nature of Statistical Learning Theory. Second Edition. Springer-Verlag.
Varma, S, Simon, R (2006). Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics, 7: 91–8.
Walker, JA, Miller, JF (2008). The automatic acquisition, evolution and reuse of modules in cartesian genetic programming. IEEE Transactions on Evolutionary Computation, 12(4): 397–417.
Wang, BX, Japkowicz, N (2008). Boosting support vector machines for imbalanced data sets. Proceedings of the 17th International Symposium on Methodologies for Intelligent Systems (ISMIS 2008).
Ward, MM, Hendrey, MR, Malley, JD, Learch, TJ, Davis, JC Jr, Reveille, JD, Weisman, MH (2009). Clinical and immunogenetic prognostic factors for radiographic severity in ankylosing spondylitis. Arthritis & Rheumatism (Arthritis Care & Research), 61(7): 859–66.
Ward, MM, Pajevic, S, Dreyfuss, J, Malley, JD (2006). Short-term prediction of mortality in patients with systemic lupus erythematosus: classification of outcomes using random forests. Arthritis & Rheumatism (Arthritis Care & Research), 55(1): 74–80.
Welsh, AH (1996). Aspects of Statistical Inference. John Wiley & Sons.
Willan, AR, Pinto, EM (2005). The value of information and optimal clinical trial design. Statistics in Medicine, 24: 1791–806.
Willett, JB, Singer, JD (1988). Another cautionary note about R2: its use in weighted least-square regression analysis. American Statistician, 42(3): 236–8.
Wolpert, DH (1992). Stacked generalization. Neural Networks, 5: 241–59.
Wolpert, DH, Macready, WG (1996). Combining stacking with bagging to improve a learning algorithm. Santa Fe Institute technical report, August 1996, SFI-TR-96-03-123.
Zakai, A, Ritov, Y (2008). How local should a learning method be? 21st Annual Conference on Learning Theory (COLT): 205–16.
Zakai, A, Ritov, Y (2009). Consistency and localizability. Journal of Machine Learning Research, 10: 827–56.
Zhang, T (2004a). Statistical behavior and consistency of classification methods based on convex risk minimization. Annals of Statistics, 32(1): 56–85.
Zhang, H (2004b). The optimality of naive Bayes. Proceedings of the 17th International FLAIRS (Florida Artificial Intelligence Research Society) Conference. Association for the Advancement of Artificial Intelligence Press.
Zhao, Z, Liu, H (2007). Searching for interacting features. Proceedings of the International Joint Conference on Artificial Intelligence: 1156–61.
Zhou, X-H, Qin, G (2003). A new confidence interval for the difference between two binomial proportions of paired data. Technical Report No. 205, University of Washington.
Zhou, X-H, Obuchowski, NA, McClish, DK (2002). Statistical Methods in Diagnostic Medicine. John Wiley & Sons.
