Spike and slab biclustering
; Bicego, M.
; Farinelli, A.
Figueiredo, M. A. T.
Pattern Recognition, Vol. 72, pp. 186-195, December 2017.
ISSN (print): 0031-3203
Journal Impact Factor: 3.279 (in 2008)
Digital Object Identifier: 10.1016/j.patcog.2017.07.021
Biclustering refers to the problem of simultaneously clustering the rows and columns of a given data matrix, with the goal of obtaining submatrices in which the selected rows behave coherently across the selected columns, and vice versa. To address this intrinsically difficult problem, we propose a novel generative model in which biclustering is approached from a sparse low-rank matrix factorization perspective. The main idea is to design a probabilistic model describing the factorization of a given data matrix into two other matrices, from which information about the rows and columns belonging to the sought-for biclusters can be obtained. One crucial ingredient of the proposed model is a spike and slab sparsity-inducing prior; we therefore term the approach spike and slab biclustering (SSBi). To estimate the parameters of the SSBi model, we propose an expectation-maximization (EM) algorithm, termed SSBiEM, which solves a low-rank factorization problem at each iteration using a recently proposed augmented Lagrangian algorithm. Experiments with both synthetic and real data show that the SSBi approach compares favorably with the state of the art.
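The factorization view described in the abstract can be illustrated with a small sketch. It is not the SSBi model or the SSBiEM algorithm: it only draws two sparse factors from a generic spike and slab prior (each entry is exactly zero with some probability, Gaussian otherwise), multiplies them into a noisy low-rank data matrix, and reads bicluster membership off the nonzero entries of the factors. All names and values here (`sample_spike_and_slab`, `biclusters_from_factors`, the inclusion probability, the noise level) are assumptions made for illustration.

```python
import numpy as np


def sample_spike_and_slab(shape, inclusion_prob, slab_std, rng):
    """Draw entries that are exactly 0 (the spike) with probability
    1 - inclusion_prob, and N(0, slab_std^2) (the slab) otherwise."""
    active = rng.random(shape) < inclusion_prob
    slab = rng.normal(0.0, slab_std, size=shape)
    return np.where(active, slab, 0.0)


def biclusters_from_factors(row_factors, col_factors, tol=1e-12):
    """Bicluster k collects the rows with a nonzero entry in column k of
    row_factors and the columns with a nonzero entry in row k of col_factors."""
    found = []
    for k in range(row_factors.shape[1]):
        rows = np.flatnonzero(np.abs(row_factors[:, k]) > tol)
        cols = np.flatnonzero(np.abs(col_factors[k, :]) > tol)
        found.append((rows, cols))
    return found


# Hypothetical sizes and hyperparameters, chosen only for the illustration.
rng = np.random.default_rng(0)
n_rows, n_cols, n_biclusters = 30, 20, 2

A = sample_spike_and_slab((n_rows, n_biclusters), 0.2, 1.0, rng)  # row factors
B = sample_spike_and_slab((n_biclusters, n_cols), 0.2, 1.0, rng)  # column factors

# Data matrix: sparse low-rank signal plus small Gaussian noise.
X = A @ B + 0.01 * rng.normal(size=(n_rows, n_cols))

biclusters = biclusters_from_factors(A, B)
for k, (rows, cols) in enumerate(biclusters):
    print(f"bicluster {k}: {rows.size} rows x {cols.size} columns")
```

In this toy setting the factors are known, so the biclusters can be read directly from their supports. In the setting of the abstract only the data matrix is observed, and the factors have to be estimated, which is the role the abstract assigns to the EM procedure.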