Some Results for a Class of Similarity Indices Used in Cluster Analysis
by
Ahmed N. Albatineh
Nova Southeastern University
Albatineh et al. (2006) defined a family L of similarity indices that are linear functions of the matching counts matrix [mij] obtained by clustering the same data with two clustering algorithms, where mij is the number of elements common to the ith cluster of method A and the jth cluster of method B. This paper derives the mean and variance of any member of the family L under fixed marginal totals of the matching counts matrix and independence of the clustering algorithms. It thereby generalizes the derivation of Fowlkes and Mallows (1983) for the Rand (1971) measure and for a measure they called Bk, which is actually attributed to Ochiai (1957). Simulations using the bivariate normal distribution are presented for comparison under different settings.
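As an illustration of the quantities named in the abstract, the following minimal Python sketch builds the matching counts [mij] from two label vectors and evaluates the Rand (1971) measure and the Fowlkes and Mallows Bk. The abstract does not state these formulas, so the code assumes the commonly used definitions written in terms of the sum of squared mij and the squared marginal totals. The function names matching_counts and rand_and_bk are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import comb, sqrt


def matching_counts(labels_a, labels_b):
    """Return {(i, j): m_ij}, the number of elements placed in
    cluster i by method A and in cluster j by method B."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same elements")
    return Counter(zip(labels_a, labels_b))


def rand_and_bk(labels_a, labels_b):
    """Compute the Rand index and the Fowlkes-Mallows B_k.

    Assumed (standard) definitions, not quoted from the paper:
      Rand = 1 + (sum m_ij^2 - (sum m_i.^2 + sum m_.j^2) / 2) / C(n, 2)
      B_k  = (sum m_ij^2 - n) / sqrt((sum m_i.^2 - n) * (sum m_.j^2 - n))
    """
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two elements")

    m = matching_counts(labels_a, labels_b)
    sum_sq = sum(v * v for v in m.values())                  # sum of m_ij^2
    row_sq = sum(v * v for v in Counter(labels_a).values())  # sum of m_i.^2
    col_sq = sum(v * v for v in Counter(labels_b).values())  # sum of m_.j^2

    rand = 1 + (sum_sq - (row_sq + col_sq) / 2) / comb(n, 2)

    # B_k is undefined when either clustering consists only of singletons.
    denom = (row_sq - n) * (col_sq - n)
    bk = (sum_sq - n) / sqrt(denom) if denom > 0 else float("nan")
    return rand, bk


if __name__ == "__main__":
    a = [0, 0, 1, 1]
    b = [0, 1, 0, 1]
    print(rand_and_bk(a, a))  # identical clusterings
    print(rand_and_bk(a, b))  # clusterings that cross each other
```

With the marginal totals held fixed, both expressions in this sketch depend on the data only through the sum of squared mij, which is the sense in which such indices can be treated together.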
Date received: January 31, 2008
Copyright © 2008 by the author(s). The author(s) of this document and the organizers of the conference have granted their consent to include this abstract in Atlas Conferences Inc. Document # cavi-82.