
A classification-based approach to semi-supervised clustering with pairwise constraints

Smieja, M. ; Struski, Ł. ; Figueiredo, M. A. T.

Neural Networks, Vol. 127, pp. 193-203, July 2020.

ISSN (print): 0893-6080
ISSN (online): 1879-2782

Scimago Journal Ranking: 1.40 (in 2020)

Digital Object Identifier: 10.1016/j.neunet.2020.04.017

In this paper, we introduce a neural network framework for semi-supervised clustering with pairwise (must-link or cannot-link) constraints. In contrast to existing approaches, we decompose semi-supervised clustering into two simpler classification tasks. The first stage uses a pair of Siamese neural networks to label the unlabeled pairs of points as must-link or cannot-link. The second stage uses the fully pairwise-labeled dataset produced by the first stage in a supervised neural-network-based clustering method. The proposed approach is motivated by the observation that binary classification (such as assigning pairwise relations) is usually easier than multi-class clustering with partial supervision. Moreover, being classification-based, our method solves only well-defined classification problems, rather than less well-specified clustering tasks. Extensive experiments on various datasets demonstrate the high performance of the proposed method.
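
The two-stage decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the Siamese networks of the first stage are replaced here by a single learned distance threshold, and the supervised neural clustering of the second stage is replaced by a union-find merge over predicted must-link pairs. All function names, the toy points, and the constraint lists are hypothetical and chosen only to show the data flow (a few constraints, then labels for every pair, then cluster assignments).

```python
import math
from itertools import combinations


def fit_distance_threshold(points, must_link, cannot_link):
    """Stage 1 stand-in: fit a binary pair classifier from the constraints.

    Assumption: the classifier is one distance threshold, placed midway
    between the mean must-link and the mean cannot-link distance.
    """
    ml = [math.dist(points[i], points[j]) for i, j in must_link]
    cl = [math.dist(points[i], points[j]) for i, j in cannot_link]
    return (sum(ml) / len(ml) + sum(cl) / len(cl)) / 2.0


def label_all_pairs(points, threshold):
    """Label every pair of points: True = must-link, False = cannot-link."""
    return {
        (i, j): math.dist(points[i], points[j]) < threshold
        for i, j in combinations(range(len(points)), 2)
    }


def cluster_from_pairs(n, pair_labels):
    """Stage 2 stand-in: merge points joined by predicted must-link pairs."""
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), linked in pair_labels.items():
        if linked:
            parent[find(i)] = find(j)

    # Renumber the roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical toy data: two well-separated groups of three 2-D points.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.2)]
must_link = [(0, 1), (3, 4)]   # the few constraints given as supervision
cannot_link = [(0, 3)]

threshold = fit_distance_threshold(points, must_link, cannot_link)
pair_labels = label_all_pairs(points, threshold)        # stage 1 output
labels = cluster_from_pairs(len(points), pair_labels)   # stage 2 output
print(labels)
```

The point of the sketch is the interface between the stages: the first stage turns three constraints into a label for all fifteen pairs, and the second stage consumes only those pairwise labels. In the paper both stages are neural networks trained as classifiers; the threshold and union-find used here are simplifications that work only because the toy groups are well separated.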