
Error-weighted maximum likelihood (EWML): a new statistically based method to cluster quantitative micropaleontological data

Published online by Cambridge University Press: 20 May 2016

Evan Fishbein
Affiliation:
Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, 91109
R. Timothy Patterson
Affiliation:
Ottawa-Carleton Geoscience Center and Department of Earth Sciences, Carleton University, Ottawa, Ontario, K1S 5B6, Canada

Abstract

The advent of readily available computer-based clustering packages has created some controversy in the micropaleontological community concerning the use and interpretation of computer-based biofacies discrimination. This is because dramatically different results can be obtained depending on methodology. The analysis of various clustering techniques reveals that, in most instances, no statistical hypothesis is contained in the clustering model and no basis exists for accepting one biofacies partitioning over another. Furthermore, most techniques do not consider standard error in species abundances and generate results that are not statistically relevant. When many rare species are present, statistically insignificant differences in rare species can accumulate and overshadow the significant differences in the major species, leading to biofacies containing members having little in common.
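The point about rare species can be illustrated with a small sketch. It assumes, as an illustration only, that specimen counts follow binomial sampling, so the standard error of a fractional abundance p from a count of n specimens is sqrt(p(1 - p)/n). The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def proportion_standard_error(count, total):
    """Binomial standard error of a species' fractional abundance (assumed model)."""
    p = count / total
    return math.sqrt(p * (1.0 - p) / total)


def difference_in_sigma(count_a, total_a, count_b, total_b):
    """Absolute difference of two proportions in units of its combined standard error."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    combined_se = math.hypot(
        proportion_standard_error(count_a, total_a),
        proportion_standard_error(count_b, total_b),
    )
    return abs(p_a - p_b) / combined_se


# Hypothetical rare species: 2 versus 5 specimens in counts of 300.
rare = difference_in_sigma(2, 300, 5, 300)
# Hypothetical major species: 90 versus 150 specimens in counts of 300.
major = difference_in_sigma(90, 300, 150, 300)
print(f"rare species differ by {rare:.1f} standard errors")
print(f"major species differ by {major:.1f} standard errors")
```

In this made-up case the rare species differs by a factor of 2.5 between samples, yet the difference is only about one standard error, while the major species differs by about five. A similarity measure that ignores the standard error would treat many such rare-species differences as real signal.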

A statistically based “error-weighted maximum likelihood” (EWML) clustering method is described that determines biofacies by assuming that samples from a common biofacies are normally distributed. Species variability is weighted to be inversely proportional to measurement uncertainty. The method has been applied to samples collected from the Fraser River Delta marsh and shows that five distinct biofacies can be resolved in the data. Similar results were obtained from readily available packages when the data set was preprocessed to reduce the number of degrees of freedom. Based on the sample results from the new algorithm, and on tests using a representative micropaleontological data set, a more conventional iterative processing method is recommended. This method, although not statistical in nature, produces results similar to those of EWML (which is not yet commercially available) using readily available analysis packages. Finally, some of the more common clustering techniques are discussed and strategies for their proper utilization are recommended.
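The abstract does not give the EWML algorithm itself, so the following is only a minimal sketch of the general idea of error weighting, not the authors' method. It assumes independent, normally distributed errors with a known standard deviation for each species in each sample; under that assumption the most likely cluster for a sample is the one that minimizes the sum of squared differences divided by the variances. The k-means-style loop, the function names, and the data are all hypothetical.

```python
def weighted_sq_distance(sample, center, sigma):
    """Sum over species of ((x - mu) / sigma)^2.

    For independent normal errors with known sigma this equals
    -2 * log-likelihood up to an additive constant.
    """
    return sum(((x - m) / s) ** 2 for x, m, s in zip(sample, center, sigma))


def assign(samples, sigmas, centers):
    """Label each sample with the index of its most likely cluster center."""
    return [
        min(range(len(centers)),
            key=lambda k: weighted_sq_distance(x, centers[k], s))
        for x, s in zip(samples, sigmas)
    ]


def update(samples, sigmas, labels, centers):
    """Recompute each center as the inverse-variance weighted mean of its members."""
    new_centers = []
    for k, old_center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == k]
        if not members:  # keep the previous center if a cluster empties
            new_centers.append(list(old_center))
            continue
        center = []
        for j in range(len(old_center)):
            weights = [1.0 / sigmas[i][j] ** 2 for i in members]
            total = sum(w * samples[i][j] for w, i in zip(weights, members))
            center.append(total / sum(weights))
        new_centers.append(center)
    return new_centers


# Hypothetical fractional abundances of two species in four samples,
# with an assumed standard error for each value.
samples = [[0.30, 0.01], [0.32, 0.03], [0.50, 0.02], [0.52, 0.00]]
sigmas = [[0.03, 0.01]] * 4

centers = [list(samples[0]), list(samples[2])]  # arbitrary starting centers
for _ in range(10):
    labels = assign(samples, sigmas, centers)
    centers = update(samples, sigmas, labels, centers)
print(labels)
```

With these made-up numbers the samples split on the major species, whose differences are several standard errors, while the rare species contributes differences of only one or two standard errors. A real application would also need a rule for choosing the number of clusters, which this sketch does not address.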

Type
Research Article
Copyright
Copyright © The Paleontological Society 
