SVD (B. Sarwar, Karypis, Konstan & Riedl 2000b; Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2000). Application of dimensionality reduction in recommender system-a case study. In ACM WebKDD 2000 Workshop.) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into three lower-rank matrices, viz., a user feature matrix, a singular-value matrix, and an item feature matrix • Sparse entries in the user-item preference (rating) matrix must be filled by imputation before decomposition • Not scalable
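The steps above can be sketched in a few lines of numpy. The rating matrix, the imputation rule (item means), and the rank k are all hypothetical choices for illustration, not part of the cited method's specification:

```python
import numpy as np

# Hypothetical 4x3 user-item rating matrix; 0 marks a missing rating.
R = np.array([[5., 3., 0.],
              [4., 0., 1.],
              [1., 1., 5.],
              [0., 1., 4.]])

# Simple imputation: fill each missing entry with its item's mean rating
# (assumes every item has at least one observed rating).
mask = R > 0
item_means = R.sum(axis=0) / mask.sum(axis=0)
R_filled = np.where(mask, R, item_means)

# Full SVD, then keep the top-k singular values (rank-k approximation).
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # predicted ratings
```

The rank-k product `R_hat` is what the recommender reads predictions from; keeping all singular values would simply reproduce the filled matrix.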
Incremental SVD (B. Sarwar, Karypis, Konstan & Riedl 2002; Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2002). Incremental singular value decomposition algorithms for highly scalable recommender systems. Fifth International Conference on Computer and Information Science, 27-28.) |
Deterministic |
• Decomposes the user-item preference (rating) matrix in the same way as SVD • Made scalable and faster by the folding-in technique, which adds new users and items without recomputing the decomposition • Folding-in can result in loss of quality
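Folding-in projects a new user's rating row into the existing latent space using the already-computed item factors. A minimal numpy sketch (the rating matrix and the new user's ratings are made up for illustration):

```python
import numpy as np

# Assume an existing rank-k SVD of the (filled) rating matrix R ≈ U S V^T.
R = np.array([[5., 3., 1.],
              [4., 2., 1.],
              [1., 1., 5.]])
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
Uk, Sk, Vk = U[:, :k], np.diag(s[:k]), Vt[:k, :].T

# Folding-in: project a new user's rating row into the existing latent
# space instead of recomputing the SVD from scratch.
r_new = np.array([4., 3., 1.])
u_new = r_new @ Vk @ np.linalg.inv(Sk)   # new user's latent coordinates

# Predictions for the new user from the (unchanged) item factors.
preds = u_new @ Sk @ Vk.T
```

Because the item factors are frozen, folding in many users gradually drifts from the exact SVD, which is the quality loss noted above.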
SVD+ANN (Billsus & Pazzani 1998; Billsus, D., & Pazzani, M. J. (1998). Learning Collaborative Information Filters. In Proceedings of the Fifteenth International Conference on Machine Learning (pp. 46-54). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. Retrieved from http://dl.acm.org/citation.cfm?id=645527.657311) |
Deterministic |
• Converts the user-item preference (rating) matrix into Boolean form, so the matrix is filled with zeros (dislike) and ones (like) • Computes SVD in the same way as above • Trains an ANN on the user and item feature vectors computed by SVD, which is then used for prediction
Regularized SVD (Paterek 2007; Paterek, A. (2007). Improving regularized singular value decomposition for collaborative filtering. In Proc. KDD Cup Workshop at SIGKDD'07, 13th ACM Int. Conf. on Knowledge Discovery and Data Mining (pp. 39-42). Retrieved from http://serv1.ist.psu.edu:8080/viewdoc/summary;jsessionid=CBC0A80E61E800DE518520F9469B2FD1?doi=10.1.1.96.7652) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into two lower-rank matrices, a user feature matrix and an item feature matrix • Parameters are estimated by minimizing the sum of squared residuals against the observed ratings, one feature at a time, using gradient descent with regularization and early stopping
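A minimal sketch of the regularized-gradient-descent idea in numpy. Note two deliberate simplifications against the description above: all k features are updated jointly rather than one at a time, and early stopping is omitted; the toy ratings, learning rate, and regularization strength are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed (user, item, rating) triples from a 4x3 matrix.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 0, 1.0), (2, 2, 5.0), (3, 1, 1.0), (3, 2, 4.0)]

n_users, n_items, k = 4, 3, 2
P = 0.1 * rng.standard_normal((n_users, k))   # user feature matrix
Q = 0.1 * rng.standard_normal((n_items, k))   # item feature matrix

lr, reg = 0.02, 0.05     # learning rate and regularization strength
for epoch in range(500):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]                  # residual on this rating
        pu = P[u].copy()
        P[u] += lr * (err * Q[i] - reg * pu)   # regularized gradient steps
        Q[i] += lr * (err * pu - reg * Q[i])

rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))
```

Only observed entries enter the loss, so no imputation is needed; the `reg` term shrinks the factors and is what distinguishes this from a plain least-squares fit.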
SVD++ (Koren 2008; Koren, Y. (2008). Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 426-434). ACM.) |
Deterministic |
• Integrates implicit preference (e.g., purchase behavior) with regularized SVD • Regarded as the best-performing single model in the Netflix Prize for prediction accuracy
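The SVD++ prediction rule from Koren (2008) makes the integration concrete; here $N(u)$ is the set of items for which user $u$ has shown implicit preference:

```latex
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top}\!\left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)
```

where $\mu$ is the global mean rating, $b_u$ and $b_i$ are user and item biases, $p_u$ and $q_i$ are the explicit user and item factors, and the $y_j$ are item factors learned from implicit feedback.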
SVD + Demographic data (Vozalis & Margaritis 2007; Vozalis, M., & Margaritis, K. (2007). Using SVD and demographic data for the enhancement of generalized Collaborative Filtering. Information Sciences, 177(15), 3017-3037. https://doi.org/10.1016/j.ins.2007.02.036) |
Deterministic |
• Combines demographic data and SVD to predict ratings • Uses SVD as an augmenting technique and demographic data as a source of additional information, in order to enhance the efficiency and accuracy of the generated predictions
Probabilistic latent semantic analysis (pLSA) (Hofmann 2004; Hofmann, T. (2004). Latent Semantic Models for Collaborative Filtering. ACM Transactions on Information Systems, 22(1), 89-115. https://doi.org/10.1145/963770.963774) |
Probabilistic |
• Introduces latent class variables in a mixture-model setting to discover user communities and prototypical interest profiles via statistical modeling • Can be thought of as a probabilistic counterpart of SVD • The expectation-maximization (EM) algorithm is used to learn the user communities and prototypical interest profiles
Probabilistic matrix factorization (PMF) (Salakhutdinov & Mnih 2008; Salakhutdinov, R., & Mnih, A. (2008). Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th international conference on Machine learning (pp. 880-887).) |
Probabilistic |
• Performs a full Bayesian analysis by placing prior distributions over the latent factors of items and users • To avoid over-fitting, the parameters are trained with the Markov chain Monte Carlo (MCMC) technique rather than by point estimation • Improves accuracy in comparison to SVD
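In sketch form, PMF models the observed ratings and the latent factors as Gaussians (symbols follow the usual PMF presentation; $I_{ij}$ indicates whether rating $R_{ij}$ is observed):

```latex
p(R \mid U, V, \sigma^2) = \prod_{i=1}^{N} \prod_{j=1}^{M}
  \left[ \mathcal{N}\!\left(R_{ij} \mid U_i^{\top} V_j,\, \sigma^2\right) \right]^{I_{ij}},
\qquad
U_i \sim \mathcal{N}(0, \sigma_U^2 I), \quad V_j \sim \mathcal{N}(0, \sigma_V^2 I)
```

In the fully Bayesian variant cited here, hyperpriors are additionally placed over the prior parameters, and predictions are formed by averaging over MCMC (Gibbs) samples of $U$ and $V$ instead of a single point estimate.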
Regression-based latent factor model (RLFM) (Agarwal & Chen 2009; Agarwal, D., & Chen, B.-C. (2009). Regression-based latent factor models. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 19-28). ACM.) |
Probabilistic |
• Predicts ratings using features of users and items as well as latent features learned from the data via SVD • Whereas PMF uses a zero-mean prior over the latent factors, RLFM estimates the prior by regressing over the features of items and users • Suitable for both cold-start and warm-start situations in RS
Latent Factor Augmented with User preference Model (LFUM) (Ahmed et al. 2013; Ahmed, A., Kanagal, B., Pandey, S., Josifovski, V., Pueyo, L. G., & Yuan, J. (2013). Latent factor models with additive and hierarchically-smoothed user preferences. Proceedings of the Sixth ACM International Conference on Web Search and Data Mining - WSDM '13. https://doi.org/10.1145/2433396.2433445) |
Probabilistic |
• A hybrid model that combines observed item attributes with a latent factor model • Rather than learning a regression function over item attributes, it learns a user-specific probability distribution over item attributes • Training uses discriminative Bayesian personalized ranking (BPR), which takes both purchased and non-purchased items into account
Latent Dirichlet Allocation (LDA) (Blei, Ng & Jordan 2003; Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993-1022. Retrieved from http://dl.acm.org/citation.cfm?id=944919.944937) |
Probabilistic |
• While pLSA places no prior distribution over the parameters of its hidden variables, LDA assumes Dirichlet priors over them • Gibbs sampling or expectation maximization (EM) is used to estimate the parameters of the LDA model
Probabilistic factor analysis (Canny 2002; Canny, J. (2002). Collaborative Filtering with Privacy via Factor Analysis. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 238-245). New York, NY, USA: ACM. https://doi.org/10.1145/564376.564419) |
Probabilistic |
• Factor analysis is a probabilistic formulation of a linear fit, which generalizes SVD and linear regression. • EM is used to learn the factors of the model. |
Eigentaste (Goldberg, Roeder, Gupta & Perkins 2001; Goldberg, K., Roeder, T., Gupta, D., & Perkins, C. (2001). Eigentaste: A Constant Time Collaborative Filtering Algorithm. Information Retrieval, 4(2), 133-151. https://doi.org/10.1023/A:1011419012209) |
Deterministic |
• Offline phase: uses principal component analysis (PCA) for optimal dimensionality reduction and then clusters users in the lower-dimensional subspace • Online phase: uses the eigenvectors to project new users into clusters and a lookup table to recommend appropriate items
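The two phases can be sketched in numpy. The data are random, the two-cluster k-means-style step stands in for Eigentaste's actual recursive rectangular clustering, and recommending the cluster's best-rated item stands in for its lookup table, so this is only the shape of the algorithm, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)

# Offline phase: dense ratings on a small "gauge set" of items.
R = rng.integers(1, 6, size=(20, 5)).astype(float)

# PCA: centre the data and project onto the top-2 eigenvectors of the covariance.
mean = R.mean(axis=0)
X = R - mean
eigvals, eigvecs = np.linalg.eigh(X.T @ X / (len(X) - 1))
W = eigvecs[:, -2:]              # top-2 principal directions
Z = X @ W                        # users in the 2-D subspace

# Toy clustering in the subspace (2 clusters, a few k-means-style iterations).
centroids = Z[:2].copy()
for _ in range(10):
    labels = np.argmin(((Z[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    for c in range(2):
        if (labels == c).any():
            centroids[c] = Z[labels == c].mean(axis=0)

# Online phase: project a new user with the stored eigenvectors,
# find their cluster, and recommend that cluster's best-rated item.
r_new = np.array([5., 4., 1., 2., 3.])
z_new = (r_new - mean) @ W
c = int(np.argmin(((centroids - z_new) ** 2).sum(-1)))
best_item = int(np.argmax(R[labels == c].mean(axis=0)))
```

The online phase is constant time in the number of users: it is one projection, one centroid comparison, and one table lookup.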
Maximum-margin Matrix Factorization (MMF) (Rennie & Srebro 2005; Rennie, J. D. M., & Srebro, N. (2005). Fast Maximum Margin Matrix Factorization for Collaborative Prediction. In Proceedings of the 22nd International Conference on Machine Learning (pp. 713-719). New York, NY, USA: ACM. https://doi.org/10.1145/1102351.1102441) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into two matrices, a user feature matrix and an item feature matrix • Works by penalizing the norms of the factor matrices instead of reducing their rank
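The norm-penalty principle rests on an identity used in the MMF literature relating the trace norm of the prediction matrix $X$ to the Frobenius norms of its factors:

```latex
\|X\|_{\Sigma} \;=\; \min_{X = U V^{\top}} \tfrac{1}{2}\left( \|U\|_F^{2} + \|V\|_F^{2} \right)
```

so minimizing the factor norms plus a loss on the observed entries acts as a convex surrogate for minimizing rank, which is what lets the margin-based formulation avoid an explicit rank constraint.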
Non-parametric matrix factorization (Yu, Zhu, Lafferty & Gong 2009; Yu, K., Zhu, S., Lafferty, J., & Gong, Y. (2009). Fast Nonparametric Matrix Factorization for Large-scale Collaborative Filtering. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 211-218). New York, NY, USA: ACM. https://doi.org/10.1145/1571941.1571979) |
Deterministic |
• Decomposes the user-item preference (rating) matrix into two matrices, a user feature matrix and an item feature matrix • The number of factors is learned from the data rather than fixed in advance to a lower rank, as in the case of RSVD
Discrete wavelet transform (DWT) (Russell & Yoon 2008; Russell, S., & Yoon, V. (2008). Applications of Wavelet Data Reduction in a Recommender System. Expert Systems with Applications, 34(4), 2316-2325. https://doi.org/10.1016/j.eswa.2007.03.009) |
Deterministic |
• The Haar wavelet transformation is applied to the original user-item preference matrix • A k-nearest-neighbor model over the transformed matrix is used to predict the ratings of a test user
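A minimal sketch of the idea: one level of the Haar transform splits each rating row into pairwise averages (approximation) and differences (detail), and keeping only the approximation half gives a reduced space for neighbor lookup. The rating matrix and the choice of a single transform level are illustrative assumptions:

```python
import numpy as np

# Hypothetical user-item rating matrix (item count padded to a power of two).
R = np.array([[5., 3., 4., 2.],
              [4., 4., 1., 1.],
              [1., 2., 5., 4.]])

def haar_1level(row):
    """One level of the orthonormal Haar transform: pairwise averages
    (approximation) followed by pairwise differences (detail)."""
    avg = (row[0::2] + row[1::2]) / np.sqrt(2)
    diff = (row[0::2] - row[1::2]) / np.sqrt(2)
    return np.concatenate([avg, diff])

T = np.apply_along_axis(haar_1level, 1, R)   # transformed matrix

# Data reduction: keep only the approximation coefficients, then find the
# nearest neighbor of a test user in the reduced space.
approx = T[:, :R.shape[1] // 2]
test_user = haar_1level(np.array([5., 2., 4., 3.]))[:2]
nearest = int(np.argmin(((approx - test_user) ** 2).sum(axis=1)))
```

Because the 1/√2 scaling makes the transform orthonormal, distances and energy are preserved before truncation; the reduction comes only from dropping the detail coefficients.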
Restricted Boltzmann Machine (RBM) (Salakhutdinov, Mnih & Hinton 2007; Salakhutdinov, R., Mnih, A., & Hinton, G. (2007). Restricted Boltzmann machines for collaborative filtering. In Proceedings of the 24th international conference on Machine learning (pp. 791-798). ACM.) |
Probabilistic |
• A two-layer undirected graphical model whose hidden units learn features of users and items • A scalable method for rating prediction
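For a flavor of the two-layer model, here is a generic binary RBM trained with one step of contrastive divergence (CD-1) on made-up like/dislike data. Note this is only a sketch: the cited paper uses conditional multinomial (softmax) visible units, one RBM per user over that user's rated items, and shared weights, none of which is modeled here:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Hypothetical binary like/dislike vectors for 6 users over 4 items.
V = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)

n_vis, n_hid = 4, 3
W = 0.1 * rng.standard_normal((n_vis, n_hid))
b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)

# CD-1: one up-down-up pass; the weight update is the difference between
# data-driven and reconstruction-driven visible-hidden correlations.
lr = 0.1
for epoch in range(500):
    ph = sigmoid(V @ W + b_hid)                    # hidden probabilities
    h = (rng.random(ph.shape) < ph).astype(float)  # sampled hidden states
    pv = sigmoid(h @ W.T + b_vis)                  # reconstruction
    ph2 = sigmoid(pv @ W + b_hid)
    W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
    b_vis += lr * (V - pv).mean(axis=0)
    b_hid += lr * (ph - ph2).mean(axis=0)

# Prediction: mean-field reconstruction probabilities for the visible units.
recon = sigmoid(sigmoid(V @ W + b_hid) @ W.T + b_vis)
```

Scalability comes from the update cost being linear in the number of observed entries per user, independent of the full matrix size.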