The problem of clustering text documents

    M.V. Khachumov 1), 2)

    1) Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow, Russia
    2) Russian Institute of Peoples' Friendship, Moscow, Russia
    Abstract

    This paper considers improvements to text document clustering technology based on optimizing the number of clusters and their initial allocation, as well as on choosing the most adequate metric. The experimental results confirm the effectiveness of the proposed approach.

    Keywords

    text, clustering, class, vector, metrics, cluster centre, heading, experiment