Sparse topical analysis of dyadic data using matrix tri-factorization

Swamy, Ranganath Biligere Narayana (2016) Sparse topical analysis of dyadic data using matrix tri-factorization. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery, SEP 19-23, 2016, Riva del Garda, ITALY, pp. 441-466.


Official URL: http://dx.doi.org/10.1007/s10994-015-5537-5

Abstract

Many applications involve dyadic data, where associations between one pair of domain entities, such as (documents, words), and associations between another pair, such as (documents, users), are completely observed. We motivate the analysis of such dyadic data by introducing an additional discrete dimension, which we call topics, and explore sparse relationships between the domain entities and the topics, such as user-topic and document-topic relationships. For this problem of sparse topical analysis of dyadic data, we propose a formulation using sparse matrix tri-factorization. This formulation requires sparsity constraints not only on the individual factor matrices but also on the product of two of the factors. To the best of our knowledge, this problem of sparse matrix tri-factorization has not been studied before. We propose a solution that introduces a surrogate for the product of factors and enforces sparsity on this surrogate, as well as on the individual factors, through L1-regularization. The resulting optimization problem is efficiently solvable in an alternating minimization framework over sub-problems involving individual factors, using the well-known FISTA algorithm. For the sub-problems that are constrained, we use a projected variant of the FISTA algorithm. We also show that our formulation leads to independent sub-problems when solving for a factor matrix, which supports a parallel implementation and leads to a scalable solution. We perform experiments on bibliographic and product review data to show that the proposed framework, based on the sparse tri-factorization formulation, achieves better generalization ability and factorization accuracy than baselines that use sparse bi-factorization.
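
As a rough illustration of the kind of L1-regularized sub-problem the abstract mentions, the sketch below applies FISTA to a generic problem, min over X of 0.5*||A X - B||_F^2 + lam*||X||_1. This is not the authors' formulation: the names A, B, lam, fista_l1 and soft_threshold are assumptions made for this example, and the surrogate for the product of factors and the projected variant are not shown.

```python
import numpy as np

def soft_threshold(X, t):
    # Proximal operator of t * ||X||_1 (elementwise shrinkage toward zero).
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def fista_l1(A, B, lam, n_iter=200):
    """Generic FISTA for min_X 0.5*||A X - B||_F^2 + lam*||X||_1.

    Illustrative only; A, B and lam are hypothetical inputs, not
    quantities defined in the paper.
    """
    # Lipschitz constant of the gradient of the smooth term:
    # the squared spectral norm of A (assumed nonzero here).
    L = np.linalg.norm(A, 2) ** 2
    X = np.zeros((A.shape[1], B.shape[1]))
    Y = X.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ Y - B)                       # gradient at the extrapolated point
        X_new = soft_threshold(Y - grad / L, lam / L)  # proximal gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = X_new + ((t - 1.0) / t_new) * (X_new - X)  # momentum (extrapolation) step
        X, t = X_new, t_new
    return X
```

In this generic objective, both the squared error and the L1 penalty separate across the columns of X, so each column could be solved independently. That is one simple way in which independent sub-problems of the kind the abstract describes can arise, although the paper's own decomposition may differ.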

Item Type:	Conference Proceedings
Publication:	MACHINE LEARNING
Additional Information:	Copyright for this article belongs to Springer, Van Godewijckstraat 30, 3311 GZ Dordrecht, Netherlands
Department/Centre:	Division of Electrical Sciences > Computer Science & Automation
Date Deposited:	22 Oct 2016 10:13
Last Modified:	22 Oct 2016 10:13
URI:	http://eprints.iisc.ac.in/id/eprint/55087
