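Given a set of <math>n</math> elements <math>S = \{o_1, \ldots, o_n\}</math> and two partitions of <math>S</math> to compare, <math>X = \{X_1, \ldots, X_r\}</math> and <math>Y = \{Y_1, \ldots, Y_s\}</math>, we define the following:
<math>a</math>, the number of pairs of elements in <math>S</math> that are in the same set in <math>X</math> and in the same set in <math>Y</math>
<math>b</math>, the number of pairs of elements in <math>S</math> that are in different sets in <math>X</math> and in different sets in <math>Y</math>
<math>c</math>, the number of pairs of elements in <math>S</math> that are in the same set in <math>X</math> and in different sets in <math>Y</math>
<math>d</math>, the number of pairs of elements in <math>S</math> that are in different sets in <math>X</math> and in the same set in <math>Y</math>
The Rand index, <math>R</math>, is:
:<math>R = \frac{a + b}{a + b + c + d} = \frac{a + b}{\binom{n}{2}}</math>
Intuitively, one can think of <math>a + b</math> as the number of agreements between <math>X</math> and <math>Y</math> and <math>c + d</math> as the number of disagreements between <math>X</math> and <math>Y</math>.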
The Rand index has a value between 0 and 1, with 0 indicating that the two clusterings do not agree on any pair of points and 1 indicating that the clusterings are exactly the same.
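The pair counting behind the index can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes each partition is given as a list of cluster labels, one per element, in the same element order. The function names `pair_counts` and `rand_index` are hypothetical and chosen for this example only. The four counters are: pairs placed together by both partitions, pairs separated by both, pairs together only in the first, and pairs together only in the second.

```python
from itertools import combinations


def pair_counts(labels_x, labels_y):
    """Return (a, b, c, d) for two labelings of the same elements.

    a: pairs in the same set in both partitions
    b: pairs in different sets in both partitions
    c: pairs in the same set in the first, different sets in the second
    d: pairs in different sets in the first, the same set in the second
    """
    if len(labels_x) != len(labels_y):
        raise ValueError("both labelings must cover the same elements")
    a = b = c = d = 0
    # Visit every unordered pair of element positions exactly once.
    for i, j in combinations(range(len(labels_x)), 2):
        same_x = labels_x[i] == labels_x[j]
        same_y = labels_y[i] == labels_y[j]
        if same_x and same_y:
            a += 1
        elif not same_x and not same_y:
            b += 1
        elif same_x:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(labels_x, labels_y):
    """Rand index: the fraction of pairs on which the two partitions agree."""
    a, b, c, d = pair_counts(labels_x, labels_y)
    total = a + b + c + d  # equals n * (n - 1) / 2
    if total == 0:
        raise ValueError("at least two elements are required")
    return (a + b) / total


# Four elements; the second partition splits the last cluster of the first.
print(pair_counts([0, 0, 1, 1], [0, 0, 1, 2]))  # (1, 4, 1, 0)
print(rand_index([0, 0, 1, 1], [0, 0, 1, 2]))   # 5 agreements out of 6 pairs
```

Only the grouping matters, not the label values, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1. The loop examines all pairs and is therefore quadratic in the number of elements, which is fine for illustration but slow for large data sets.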
REFERENCES
W. M. Rand, ''Objective criteria for the evaluation of clustering methods''. Journal of the American Statistical Association, 66, pp. 846–850 (1971).
K. Y. Yeung, W. L. Ruzzo, ''Details of the Adjusted Rand index and Clustering algorithms''. Bioinformatics.