Abstract
A method for identifying clusters of points in a multidimensional Euclidean space is described and its application to taxonomy considered. It reconciles, in a sense, two different approaches to the investigation of the spatial relationships between the points, viz., the agglomerative and the divisive methods. A graph, the shortest dendrite of Florek et al. (1951a), is constructed on a nearest-neighbour basis and then divided into clusters by applying the criterion of minimum within-cluster sum of squares. This procedure ensures an effective reduction of the number of possible splits. The method may be applied to a dichotomous division, but is also perfectly suitable for a global division into any number of clusters. An informal indicator of the "best number" of clusters is suggested. It is a "variance ratio criterion" giving some insight into the structure of the points. The method is illustrated by three examples, one of which is original. The results obtained by the dendrite method are compared with those obtained by using the agglomerative method of Ward (1963) and the divisive method of Edwards and Cavalli-Sforza (1965).
Keywords: numerical taxonomy; cluster analysis; minimum variance (WGSS) criterion for optimal grouping; approximate grouping procedure; shortest dendrite = minimum spanning tree; variance ratio criterion for best number of groups
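The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It builds the shortest dendrite (minimum spanning tree), tries every way of removing k - 1 of its edges, keeps the split with the smallest within-group sum of squares (WGSS), and then computes a variance ratio. Several things here are assumptions, not details taken from the abstract: the function names are hypothetical, the exhaustive search over edge removals is only practical for small examples and may differ from the search used in the paper, and the variance ratio is written in its commonly cited form, (BGSS / (k - 1)) / (WGSS / (n - k)).

```python
from itertools import combinations

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def minimum_spanning_tree(points):
    """Prim's algorithm; returns the tree as a list of (i, j) index pairs."""
    # best[j] = (squared distance to the tree, nearest tree vertex)
    best = {j: (sq_dist(points[0], points[j]), 0) for j in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=lambda v: best[v][0])
        _, i = best.pop(j)
        edges.append((i, j))
        for v in best:
            d = sq_dist(points[j], points[v])
            if d < best[v][0]:
                best[v] = (d, j)
    return edges

def components(n, edges):
    """Connected components of a graph on vertices 0..n-1 (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

def sum_of_squares(points, members):
    """Sum of squared distances from the members to their centroid."""
    dim = len(points[0])
    centroid = [sum(points[m][d] for m in members) / len(members)
                for d in range(dim)]
    return sum(sq_dist(points[m], centroid) for m in members)

def best_split(points, k):
    """Remove k - 1 tree edges so that WGSS is minimal (exhaustive search)."""
    tree = minimum_spanning_tree(points)
    best = None
    for removed in combinations(range(len(tree)), k - 1):
        kept = [e for idx, e in enumerate(tree) if idx not in removed]
        clusters = components(len(points), kept)
        wgss = sum(sum_of_squares(points, c) for c in clusters)
        if best is None or wgss < best[0]:
            best = (wgss, clusters)
    return best

def variance_ratio(points, wgss, k):
    """(BGSS / (k - 1)) / (WGSS / (n - k)); undefined when WGSS is zero."""
    n = len(points)
    bgss = sum_of_squares(points, range(n)) - wgss
    return (bgss / (k - 1)) / (wgss / (n - k))

# Toy data (made up for illustration): two well-separated triples.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
wgss, clusters = best_split(points, 2)
vrc = variance_ratio(points, wgss, 2)
```

Restricting the candidate splits to those obtained by cutting tree edges is what keeps the search small: a tree on n points has only n - 1 edges, so a two-group division needs at most n - 1 evaluations. Comparing `variance_ratio` across several values of k is one way to use it as the informal indicator of the "best number" of clusters.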
Related Publications
The Effect of Cluster Size, Dimensionality, and the Number of Clusters on Recovery of True Cluster Structure
An evaluation of four clustering methods and four external criterion measures was conducted with respect to the effect of the number of clusters, dimensionality, and relative cl...
Comparing three classification strategies for use in ecology
Abstract. We compare three common types of clustering algorithms for use with community data. TWINSPAN is divisive hierarchical, flexible‐UPGMA is agglomerative and hierarchical...
Integration K-Means Clustering Method and Elbow Method For Identification of The Best Customer Profile Cluster
Clustering is a data mining technique used to analyse data that has variations and the number of lots. Clustering was process of grouping data into a cluster, so they contained ...
A Method for Cluster Analysis
A method for investigating the relation of points in multidimensional space is described. Using an analysis of variance technique, the points are divided into the two most-compa...
Computational clustering for viral reference proteomes
Abstract Motivation: The enormous number of redundant sequenced genomes has hindered efforts to analyze and functionally annotate proteins. As the taxonomy of viruses is not uni...
Publication Info
- Year: 1974
- Type: article
- Volume: 3
- Issue: 1
- Pages: 1-27
- Citations: 6351
- Access: Closed
Identifiers
- DOI: 10.1080/03610927408827101