The task of clustering text documents
M.V. Khachumov 1), 2)
1) Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow, Russia
2) Russian Institute of Peoples' Friendship, Moscow, Russia
Abstract
This paper considers ways to improve text document clustering: optimizing the number of clusters, choosing their initial placement, and selecting the most adequate metric. Experimental results confirm the effectiveness of the proposed approach.
Keywords
text, clustering, class, vector, metric, cluster centre, heading, experiment