Swamy, Ranganath Biligere Narayana (2016) Sparse topical analysis of dyadic data using matrix tri-factorization. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery, SEP 19-23, 2016, Riva del Garda, ITALY, pp. 441-466.
Abstract
Many applications involve dyadic data, where associations between one pair of domain entities, such as (documents, words), and associations between another pair, such as (documents, users), are completely observed. We motivate the analysis of such dyadic data by introducing an additional discrete dimension, which we call topics, and explore sparse relationships between the domain entities and the topics, such as user-topic and document-topic relationships. For this problem of sparse topical analysis of dyadic data, we propose a formulation using sparse matrix tri-factorization. This formulation requires sparsity constraints not only on the individual factor matrices, but also on the product of two of the factors. To the best of our knowledge, this problem of sparse matrix tri-factorization has not been studied before. We propose a solution that introduces a surrogate for the product of factors and enforces sparsity on this surrogate, as well as on the individual factors, through L1-regularization. The resulting optimization problem is efficiently solvable in an alternating minimization framework over sub-problems involving individual factors, using the well-known FISTA algorithm. For the sub-problems that are constrained, we use a projected variant of the FISTA algorithm. We also show that our formulation leads to independent sub-problems when solving for a factor matrix, thereby supporting a parallel implementation and leading to a scalable solution. We perform experiments on bibliographic and product review data to show that the proposed framework, based on the sparse tri-factorization formulation, yields better generalization ability and factorization accuracy than baselines that use sparse bi-factorization.
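As a rough illustration of the kind of L1-regularized sub-problem the abstract mentions, the following minimal sketch applies FISTA to a generic lasso problem, min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1. It is not the paper's tri-factorization algorithm: the function names (`soft_threshold`, `fista_lasso`), the plain-list matrix representation, and the fixed step size are all assumptions made for this example, and the projected variant used for constrained sub-problems is not shown.

```python
# Illustrative sketch only: generic FISTA for a lasso sub-problem,
# not the tri-factorization method described in the abstract.

def soft_threshold(v, t):
    """Elementwise proximal operator of t * ||.||_1."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def fista_lasso(A, b, lam, step, iters=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with FISTA.

    `step` is assumed to be at most 1 / L, where L is the largest
    eigenvalue of A^T A (the Lipschitz constant of the gradient).
    """
    n = len(A[0])
    x = [0.0] * n   # current iterate
    y = x[:]        # extrapolated point
    t = 1.0         # momentum parameter
    for _ in range(iters):
        # Gradient of the smooth term at the extrapolated point y.
        residual = [p - q for p, q in zip(matvec(A, y), b)]
        grad = rmatvec(A, residual)
        # Gradient step followed by the L1 proximal step.
        x_new = soft_threshold(
            [yi - step * g for yi, g in zip(y, grad)], step * lam
        )
        # Nesterov-style momentum update.
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        y = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

In an alternating minimization scheme, a routine of this shape would be called once per factor (or per independent row or column sub-problem) while the other factors are held fixed; how the sub-problems are actually formed is described in the paper itself, not here.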
Item Type: | Conference Proceedings |
---|---|
Publication: | MACHINE LEARNING |
Additional Information: | Copyright for this article belongs to SPRINGER, VAN GODEWIJCKSTRAAT 30, 3311 GZ DORDRECHT, NETHERLANDS |
Department/Centre: | Division of Electrical Sciences > Computer Science & Automation |
Date Deposited: | 22 Oct 2016 10:13 |
Last Modified: | 22 Oct 2016 10:13 |
URI: | http://eprints.iisc.ac.in/id/eprint/55087 |