SVD (B. Sarwar, Karypis, Konstan & Riedl 2000b; Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2000). Application of dimensionality reduction in recommender system-a case study. In ACM WebKDD 2000 Workshop.) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into three lower-rank matrices, viz., a user feature matrix, a singular-value matrix, and an item feature matrix • Sparse entries in the user-item preference (rating) matrix must be filled by imputation before decomposition • Not scalable
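The steps above can be sketched in a few lines of numpy. The rating matrix, the imputation rule (item means), and the rank k are all hypothetical choices for illustration, not part of the cited method's specification:

```python
import numpy as np

# Hypothetical 4x3 user-item rating matrix; 0 marks a missing rating.
R = np.array([[5., 3., 0.],
              [4., 0., 1.],
              [1., 1., 5.],
              [0., 1., 4.]])

# Simple imputation: fill each missing entry with its item's mean rating
# (assumes every item has at least one observed rating).
mask = R > 0
item_means = R.sum(axis=0) / mask.sum(axis=0)
R_filled = np.where(mask, R, item_means)

# Full SVD, then keep the top-k singular values (rank-k approximation).
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # predicted ratings
```

The rank-k product `R_hat` is what the recommender reads predictions from; keeping all singular values would simply reproduce the filled matrix.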
Incremental SVD (B. Sarwar, Karypis, Konstan & Riedl 2002; Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2002). Incremental singular value decomposition algorithms for highly scalable recommender systems. Fifth International Conference on Computer and Information Science, 27-28.) |
Deterministic |
• Decomposes the user-item preference (rating) matrix in the same way as SVD • Made scalable and faster by the folding-in technique, which adds new users and items without recomputing the decomposition • Folding-in can result in loss of quality
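Folding-in projects a new user's rating row into the existing latent space using the already-computed item factors. A minimal numpy sketch (the rating matrix and the new user's ratings are made up for illustration):

```python
import numpy as np

# Assume an existing rank-k SVD of the (filled) rating matrix R ≈ U S V^T.
R = np.array([[5., 3., 1.],
              [4., 2., 1.],
              [1., 1., 5.]])
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
Uk, Sk, Vk = U[:, :k], np.diag(s[:k]), Vt[:k, :].T

# Folding-in: project a new user's rating row into the existing latent
# space instead of recomputing the SVD from scratch.
r_new = np.array([4., 3., 1.])
u_new = r_new @ Vk @ np.linalg.inv(Sk)   # new user's latent coordinates

# Predictions for the new user from the (unchanged) item factors.
preds = u_new @ Sk @ Vk.T
```

Because the item factors are frozen, folding in many users gradually drifts from the exact SVD, which is the quality loss noted above.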
SVD+ANN (Billsus & Pazzani 1998; Billsus, D., & Pazzani, M. J. (1998). Learning Collaborative Information Filters. In Proceedings of the Fifteenth International Conference on Machine Learning (pp. 46-54). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. Retrieved from http://dl.acm.org/citation.cfm?id=645527.657311) |
Deterministic |
• Converts the user-item preference (rating) matrix into Boolean form, so the matrix is filled with zeros (dislike) and ones (like) • Computes SVD in the same way as above • Trains an ANN on the user and item feature vectors computed by SVD, which is then used for prediction
Regularized SVD (Paterek 2007; Paterek, A. (2007). Improving regularized singular value decomposition for collaborative filtering. In Proc. KDD Cup Workshop at SIGKDD'07, 13th ACM Int. Conf. on Knowledge Discovery and Data Mining (pp. 39-42). Retrieved from http://serv1.ist.psu.edu:8080/viewdoc/summary;jsessionid=CBC0A80E61E800DE518520F9469B2FD1?doi=10.1.1.96.7652) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into two lower-rank matrices, a user feature matrix and an item feature matrix • Parameters are estimated by minimizing the sum of squared residuals against the observed ratings, one feature at a time, using gradient descent with regularization and early stopping
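A minimal sketch of the regularized-gradient-descent idea in numpy. Note two deliberate simplifications against the description above: all k features are updated jointly rather than one at a time, and early stopping is omitted; the toy ratings, learning rate, and regularization strength are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed (user, item, rating) triples from a 4x3 matrix.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 0, 1.0), (2, 2, 5.0), (3, 1, 1.0), (3, 2, 4.0)]

n_users, n_items, k = 4, 3, 2
P = 0.1 * rng.standard_normal((n_users, k))   # user feature matrix
Q = 0.1 * rng.standard_normal((n_items, k))   # item feature matrix

lr, reg = 0.02, 0.05     # learning rate and regularization strength
for epoch in range(500):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]                  # residual on this rating
        pu = P[u].copy()
        P[u] += lr * (err * Q[i] - reg * pu)   # regularized gradient steps
        Q[i] += lr * (err * pu - reg * Q[i])

rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))
```

Only observed entries enter the loss, so no imputation is needed; the `reg` term shrinks the factors and is what distinguishes this from a plain least-squares fit.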
SVD++ (Koren 2008; Koren, Y. (2008). Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 426-434). ACM.) |
Deterministic |
• Integrates implicit preference (e.g., purchase behavior) with regularized SVD • Regarded as the best-performing single model in the Netflix Prize for prediction accuracy
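The SVD++ prediction rule from Koren (2008) makes the integration concrete; here $N(u)$ is the set of items for which user $u$ has shown implicit preference:

```latex
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top}\!\left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)
```

where $\mu$ is the global mean rating, $b_u$ and $b_i$ are user and item biases, $p_u$ and $q_i$ are the explicit user and item factors, and the $y_j$ are item factors learned from implicit feedback.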
SVD + Demographic data (Vozalis & Margaritis 2007; Vozalis, M., & Margaritis, K. (2007). Using SVD and demographic data for the enhancement of generalized Collaborative Filtering. Information Sciences, 177(15), 3017-3037. https://doi.org/10.1016/j.ins.2007.02.036) |
Deterministic |
• Combines demographic data and SVD to predict ratings • Uses SVD as an augmenting technique and demographic data as a source of additional information, in order to enhance the efficiency and accuracy of the generated predictions
Probabilistic latent semantic analysis (pLSA) (Hofmann 2004; Hofmann, T. (2004). Latent Semantic Models for Collaborative Filtering. ACM Transactions on Information Systems, 22(1), 89-115. https://doi.org/10.1145/963770.963774) |
Probabilistic |
• Introduces latent class variables in a mixture-model setting to discover user communities and prototypical interest profiles via statistical modeling • Can be thought of as a probabilistic counterpart of SVD • The expectation-maximization (EM) algorithm is used to learn the user communities and prototypical interest profiles
Probabilistic matrix factorization (PMF) (Salakhutdinov & Mnih 2008; Salakhutdinov, R., & Mnih, A. (2008). Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th international conference on Machine learning (pp. 880-887).) |
Probabilistic |
• Performs a full Bayesian analysis by placing prior distributions over the latent factors of items and users • To avoid over-fitting, the parameters are trained with the Markov chain Monte Carlo (MCMC) technique rather than by point estimation • Improves accuracy in comparison to SVD
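In sketch form, PMF models the observed ratings and the latent factors as Gaussians (symbols follow the usual PMF presentation; $I_{ij}$ indicates whether rating $R_{ij}$ is observed):

```latex
p(R \mid U, V, \sigma^2) = \prod_{i=1}^{N} \prod_{j=1}^{M}
  \left[ \mathcal{N}\!\left(R_{ij} \mid U_i^{\top} V_j,\, \sigma^2\right) \right]^{I_{ij}},
\qquad
U_i \sim \mathcal{N}(0, \sigma_U^2 I), \quad V_j \sim \mathcal{N}(0, \sigma_V^2 I)
```

In the fully Bayesian variant cited here, hyperpriors are additionally placed over the prior parameters, and predictions are formed by averaging over MCMC (Gibbs) samples of $U$ and $V$ instead of a single point estimate.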
Regression-based latent factor model (RLFM) (Agarwal & Chen 2009; Agarwal, D., & Chen, B.-C. (2009). Regression-based latent factor models. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 19-28). ACM.) |
Probabilistic |
• Predicts ratings using features of users and items as well as latent features learned from the data via SVD • Whereas PMF uses a zero-mean prior over the latent factors, RLFM estimates the prior by regressing over the features of items and users • Suitable for both cold-start and warm-start situations in RS
Latent Factor Augmented with User preference Model (LFUM) (Ahmed et al. 2013; Ahmed, A., Kanagal, B., Pandey, S., Josifovski, V., Pueyo, L. G., & Yuan, J. (2013). Latent factor models with additive and hierarchically-smoothed user preferences. Proceedings of the Sixth ACM International Conference on Web Search and Data Mining - WSDM '13. https://doi.org/10.1145/2433396.2433445) |
Probabilistic |
• A hybrid model that combines observed item attributes with a latent factor model • Rather than learning a regression function over item attributes, it learns a user-specific probability distribution over item attributes • Training uses discriminative Bayesian personalized ranking (BPR), which takes both purchased and non-purchased items into account
Latent Dirichlet Allocation (LDA) (Blei, Ng & Jordan 2003; Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993-1022. Retrieved from http://dl.acm.org/citation.cfm?id=944919.944937) |
Probabilistic |
• While pLSA places no prior distribution over the parameters of its hidden variables, LDA assumes Dirichlet priors over them • Gibbs sampling or expectation maximization (EM) is used to estimate the parameters of the LDA model
Probabilistic factor analysis (Canny 2002; Canny, J. (2002). Collaborative Filtering with Privacy via Factor Analysis. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 238-245). New York, NY, USA: ACM. https://doi.org/10.1145/564376.564419) |
Probabilistic |
• Factor analysis is a probabilistic formulation of a linear fit, which generalizes SVD and linear regression. • EM is used to learn the factors of the model. |
Eigentaste (Goldberg, Roeder, Gupta & Perkins 2001; Goldberg, K., Roeder, T., Gupta, D., & Perkins, C. (2001). Eigentaste: A Constant Time Collaborative Filtering Algorithm. Information Retrieval, 4(2), 133-151. https://doi.org/10.1023/A:1011419012209) |
Deterministic |
• Offline phase: uses principal component analysis (PCA) for optimal dimensionality reduction and then clusters users in the lower-dimensional subspace • Online phase: uses the eigenvectors to project new users into clusters and a lookup table to recommend appropriate items
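The two phases can be sketched in numpy. The data are random, the two-cluster k-means-style step stands in for Eigentaste's actual recursive rectangular clustering, and recommending the cluster's best-rated item stands in for its lookup table, so this is only the shape of the algorithm, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)

# Offline phase: dense ratings on a small "gauge set" of items.
R = rng.integers(1, 6, size=(20, 5)).astype(float)

# PCA: centre the data and project onto the top-2 eigenvectors of the covariance.
mean = R.mean(axis=0)
X = R - mean
eigvals, eigvecs = np.linalg.eigh(X.T @ X / (len(X) - 1))
W = eigvecs[:, -2:]              # top-2 principal directions
Z = X @ W                        # users in the 2-D subspace

# Toy clustering in the subspace (2 clusters, a few k-means-style iterations).
centroids = Z[:2].copy()
for _ in range(10):
    labels = np.argmin(((Z[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    for c in range(2):
        if (labels == c).any():
            centroids[c] = Z[labels == c].mean(axis=0)

# Online phase: project a new user with the stored eigenvectors,
# find their cluster, and recommend that cluster's best-rated item.
r_new = np.array([5., 4., 1., 2., 3.])
z_new = (r_new - mean) @ W
c = int(np.argmin(((centroids - z_new) ** 2).sum(-1)))
best_item = int(np.argmax(R[labels == c].mean(axis=0)))
```

The online phase is constant time in the number of users: it is one projection, one centroid comparison, and one table lookup.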
Maximum-margin Matrix Factorization (MMF) (Rennie & Srebro 2005; Rennie, J. D. M., & Srebro, N. (2005). Fast Maximum Margin Matrix Factorization for Collaborative Prediction. In Proceedings of the 22nd International Conference on Machine Learning (pp. 713-719). New York, NY, USA: ACM. https://doi.org/10.1145/1102351.1102441) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into two matrices, a user feature matrix and an item feature matrix • Works by penalizing the norms of the factor matrices instead of reducing their rank
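The norm-penalty principle rests on an identity used in the MMF literature relating the trace norm of the prediction matrix $X$ to the Frobenius norms of its factors:

```latex
\|X\|_{\Sigma} \;=\; \min_{X = U V^{\top}} \tfrac{1}{2}\left( \|U\|_F^{2} + \|V\|_F^{2} \right)
```

so minimizing the factor norms plus a loss on the observed entries acts as a convex surrogate for minimizing rank, which is what lets the margin-based formulation avoid an explicit rank constraint.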
Non-parametric matrix factorization (Yu, Zhu, Lafferty & Gong 2009; Yu, K., Zhu, S., Lafferty, J., & Gong, Y. (2009). Fast Nonparametric Matrix Factorization for Large-scale Collaborative Filtering. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 211-218). New York, NY, USA: ACM. https://doi.org/10.1145/1571941.1571979) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into two matrices, a user feature matrix and an item feature matrix • The number of factors is learned from the data rather than fixed in advance to a lower rank, as in the case of RSVD
Discrete wavelet transform (DWT) (Russell & Yoon 2008; Russell, S., & Yoon, V. (2008). Applications of Wavelet Data Reduction in a Recommender System. Expert Systems with Applications, 34(4), 2316-2325. https://doi.org/10.1016/j.eswa.2007.03.009) |
Deterministic |
• The Haar wavelet transformation is applied to the original user-item preference matrix • A k-nearest-neighbor model over the transformed matrix is used to predict the ratings of a test user
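A minimal sketch of the idea: one level of the Haar transform splits each rating row into pairwise averages (approximation) and differences (detail), and keeping only the approximation half gives a reduced space for neighbor lookup. The rating matrix and the choice of a single transform level are illustrative assumptions:

```python
import numpy as np

# Hypothetical user-item rating matrix (item count padded to a power of two).
R = np.array([[5., 3., 4., 2.],
              [4., 4., 1., 1.],
              [1., 2., 5., 4.]])

def haar_1level(row):
    """One level of the orthonormal Haar transform: pairwise averages
    (approximation) followed by pairwise differences (detail)."""
    avg = (row[0::2] + row[1::2]) / np.sqrt(2)
    diff = (row[0::2] - row[1::2]) / np.sqrt(2)
    return np.concatenate([avg, diff])

T = np.apply_along_axis(haar_1level, 1, R)   # transformed matrix

# Data reduction: keep only the approximation coefficients, then find the
# nearest neighbor of a test user in the reduced space.
approx = T[:, :R.shape[1] // 2]
test_user = haar_1level(np.array([5., 2., 4., 3.]))[:2]
nearest = int(np.argmin(((approx - test_user) ** 2).sum(axis=1)))
```

Because the 1/√2 scaling makes the transform orthonormal, distances and energy are preserved before truncation; the reduction comes only from dropping the detail coefficients.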
Restricted Boltzmann Machine (RBM) (Salakhutdinov, Mnih & Hinton 2007; Salakhutdinov, R., Mnih, A., & Hinton, G. (2007). Restricted Boltzmann machines for collaborative filtering. In Proceedings of the 24th international conference on Machine learning (pp. 791-798). ACM.) |
Probabilistic |
• A two-layer undirected graphical model whose hidden units learn features of users and items • A scalable method for rating prediction
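For a flavor of the two-layer model, here is a generic binary RBM trained with one step of contrastive divergence (CD-1) on made-up like/dislike data. Note this is only a sketch: the cited paper uses conditional multinomial (softmax) visible units, one RBM per user over that user's rated items, and shared weights, none of which is modeled here:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Hypothetical binary like/dislike vectors for 6 users over 4 items.
V = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)

n_vis, n_hid = 4, 3
W = 0.1 * rng.standard_normal((n_vis, n_hid))
b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)

# CD-1: one up-down-up pass; the weight update is the difference between
# data-driven and reconstruction-driven visible-hidden correlations.
lr = 0.1
for epoch in range(500):
    ph = sigmoid(V @ W + b_hid)                    # hidden probabilities
    h = (rng.random(ph.shape) < ph).astype(float)  # sampled hidden states
    pv = sigmoid(h @ W.T + b_vis)                  # reconstruction
    ph2 = sigmoid(pv @ W + b_hid)
    W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
    b_vis += lr * (V - pv).mean(axis=0)
    b_hid += lr * (ph - ph2).mean(axis=0)

# Prediction: mean-field reconstruction probabilities for the visible units.
recon = sigmoid(sigmoid(V @ W + b_hid) @ W.T + b_vis)
```

Scalability comes from the update cost being linear in the number of observed entries per user, independent of the full matrix size.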