Summary of Approaches

MODEL           DETAILS
Content Based    Cosine similarity between the one-hot genre vectors (18 genres) of two movies, multiplied by an exponential-decay similarity on their years: math.exp(-diff / 10.0)
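As a minimal sketch of the content-based score in the row above, the snippet below multiplies a genre cosine similarity by the year decay. It assumes diff is the absolute difference between the two movies' years, and it uses 4-genre vectors instead of 18 to keep the example short. The function names are illustrative and do not come from the original code.

```python
import math

def genre_similarity(genres_a, genres_b):
    """Cosine similarity between two one-hot (0/1) genre vectors."""
    dot = sum(x * y for x, y in zip(genres_a, genres_b))
    norm_a = math.sqrt(sum(x * x for x in genres_a))
    norm_b = math.sqrt(sum(y * y for y in genres_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a movie with no genres is similar to nothing
    return dot / (norm_a * norm_b)

def year_similarity(year_a, year_b):
    """Exponential decay on the year gap (assumed: absolute difference)."""
    diff = abs(year_a - year_b)
    return math.exp(-diff / 10.0)

def content_similarity(genres_a, year_a, genres_b, year_b):
    """Genre similarity multiplied by year similarity."""
    return genre_similarity(genres_a, genres_b) * year_similarity(year_a, year_b)

# Same genres, 10 years apart: the genre score is 1, so only the decay remains.
score = content_similarity([1, 0, 1, 0], 1995, [1, 0, 1, 0], 2005)
```

With the divisor 10.0, a ten-year gap scales the genre similarity by exp(-1), roughly 0.37, so movies from the same era are favoured without excluding older ones.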
User-Based Collaborative Filtering (for Top-N)    Recommend the movies rated highest by the 10 most similar users
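One possible reading of the Top-N row above is sketched here. It assumes users are compared with cosine similarity over their rating vectors and that candidate movies are scored by the similarity-weighted sum of the neighbours' ratings; the source only says "highest rated", so the scoring rule is an assumption. The toy ratings and all names are hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (missing = 0)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend_top_n(ratings, target, k=10, n=5):
    """Return up to n unseen movies, scored over the k most similar users."""
    sims = [(cosine(ratings[target], other_ratings), other)
            for other, other_ratings in ratings.items() if other != target]
    neighbours = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k]

    scores = defaultdict(float)
    for sim, other in neighbours:
        for movie, rating in ratings[other].items():
            if movie not in ratings[target]:  # skip movies already seen
                scores[movie] += sim * rating
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [movie for movie, _ in ranked[:n]]

ratings = {
    "alice": {"Toy Story": 5, "Heat": 1},
    "bob": {"Toy Story": 5, "Heat": 1, "Jumanji": 5, "Casino": 2},
    "carol": {"Toy Story": 1, "Heat": 5, "Casino": 5},
}
top = recommend_top_n(ratings, "alice", k=2, n=2)
```

Here bob's tastes match alice's much more closely than carol's, so the movie bob rated highly is ranked first.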
User-Based KNN (for Rating)    Predict every missing rating from the 10 most similar users (by cosine similarity) who have watched that movie
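The user-based prediction in the row above could look roughly like this. The sketch assumes the predicted rating is the similarity-weighted average of the neighbours' ratings, which is a common choice but is not stated in the source. The toy data and function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (missing = 0)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def predict_user_knn(ratings, user, movie, k=10):
    """Predict user's rating for movie from the k most similar users who rated it."""
    candidates = [(cosine(ratings[user], other_ratings), other_ratings[movie])
                  for other, other_ratings in ratings.items()
                  if other != user and movie in other_ratings]
    neighbours = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        return None  # nobody comparable has rated this movie
    return sum(sim * rating for sim, rating in neighbours) / total_sim

ratings = {
    "alice": {"Toy Story": 5, "Heat": 1},
    "bob": {"Toy Story": 5, "Heat": 1, "Jumanji": 5, "Casino": 2},
    "carol": {"Toy Story": 1, "Heat": 5, "Casino": 5},
}
prediction = predict_user_knn(ratings, "alice", "Casino")
```

The prediction lands closer to bob's rating of 2 than to carol's 5, because bob is the more similar user.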
Item-Based KNN (for Rating)    Predict every missing rating from the 10 most similar movies (by cosine similarity) watched by the user
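The item-based variant in the row above flips the comparison: movies are compared through the users who rated them. This sketch again assumes a similarity-weighted average as the prediction rule; the toy data and names are hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (missing = 0)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def predict_item_knn(ratings, user, movie, k=10):
    """Predict user's rating for movie from the k most similar movies the user rated."""
    # Invert the data: movie -> {user: rating}.
    by_movie = defaultdict(dict)
    for u, user_ratings in ratings.items():
        for m, r in user_ratings.items():
            by_movie[m][u] = r
    if movie not in by_movie:
        return None  # no one has rated the target movie

    candidates = [(cosine(by_movie[movie], by_movie[m]), r)
                  for m, r in ratings[user].items() if m != movie]
    neighbours = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        return None
    return sum(sim * rating for sim, rating in neighbours) / total_sim

ratings = {
    "alice": {"Toy Story": 5, "Heat": 1},
    "bob": {"Toy Story": 5, "Heat": 1, "Jumanji": 5, "Casino": 2},
    "carol": {"Toy Story": 1, "Heat": 5, "Casino": 5},
}
prediction = predict_item_knn(ratings, "alice", "Casino")
```

In this toy data "Casino" is rated much like "Heat", which alice disliked, so the prediction is pulled toward her low rating for "Heat".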
SVD Based Matrix Factorization (for Rating)
# Principal Component Analysis
* PCA on R gives U (Users x Latent Features)
* PCA on R^T gives M (Movies x Latent Features)

# R = U Σ M^T
* U = User Matrix
* Σ = Diagonal Matrix (its entries give the strength of each latent factor)
* M^T = Movie Matrix (transposed)

# Singular Value Decomposition
* It is a way of computing U Σ M^T in one shot
* SVD doesn't work with missing values, so either:
* Null Values - fill missing values with a default such as the mean, or
* use SGD or ALS to fit an SVD-inspired factorization on the observed ratings only
* Rating = dot product of (User x Latent Factors) & (Movie x Latent Factors)
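To make the SGD option above concrete, here is a minimal sketch of an SVD-inspired factorization trained only on observed ratings. It is not a true SVD: it learns one latent vector per user and per movie and predicts a rating as their dot product. It omits bias terms, and the hyperparameters (n_factors, n_epochs, lr, reg) and the toy data are illustrative assumptions.

```python
import random

def factorize(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit user and movie latent vectors by SGD on (user, movie, rating) triples."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    movies = sorted({m for _, m, _ in ratings})
    P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    Q = {m: [rng.gauss(0, 0.1) for _ in range(n_factors)] for m in movies}

    for _ in range(n_epochs):
        for u, m, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[m]))
            for f in range(n_factors):
                pu, qm = P[u][f], Q[m][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qm - reg * pu)
                Q[m][f] += lr * (err * pu - reg * qm)
    return P, Q

def predict(P, Q, user, movie):
    """Rating = dot product of the user's and the movie's latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[user], Q[movie]))

ratings = [
    ("alice", "Toy Story", 5), ("alice", "Heat", 1),
    ("bob", "Toy Story", 5), ("bob", "Heat", 1), ("bob", "Jumanji", 5),
    ("carol", "Toy Story", 1), ("carol", "Heat", 5),
]
P, Q = factorize(ratings)
estimate = predict(P, Q, "alice", "Jumanji")  # a rating alice never gave
```

Because the loop only visits observed triples, no default value has to be filled in for missing ratings.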
SVD++ Based Matrix Factorization (for Rating)    An extension of SVD that also takes implicit ratings into account (which movies a user rated, regardless of the rating value)
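The snippet below sketches how implicit ratings enter an SVD++ prediction, following the commonly stated rule: global mean plus user and movie biases plus the dot product of the movie factors with the user factors, where the user factors are shifted by the normalized sum of implicit factors of the movies the user rated. It shows prediction only, not training, and the source does not confirm that the author's implementation matches it. The parameter names are hypothetical.

```python
import math

def svdpp_predict(mu, b_u, b_i, p_u, q_i, y_rated):
    """SVD++ prediction for one (user, movie) pair.

    mu      : global mean rating
    b_u, b_i: user and movie biases
    p_u, q_i: explicit latent factors of the user and the movie
    y_rated : implicit factor vectors of every movie the user has rated
    """
    k = len(p_u)
    implicit = [0.0] * k
    if y_rated:
        scale = 1.0 / math.sqrt(len(y_rated))  # |N(u)| ** -0.5
        for y_j in y_rated:
            for f in range(k):
                implicit[f] += scale * y_j[f]
    interaction = sum(q * (p + z) for q, p, z in zip(q_i, p_u, implicit))
    return mu + b_u + b_i + interaction

# A user who rated four movies, each with the same implicit factor vector.
rating = svdpp_predict(3.0, 0.5, -0.2, [0.5, 0.0], [1.0, 0.0], [[0.2, 0.0]] * 4)
```

With an empty y_rated the implicit term vanishes and the rule reduces to the biased SVD prediction.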
Deep Learning (for Rating)    Learn user and movie embeddings and feed them to an artificial neural network (ANN) that models the ratings
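To show the shape of such a model, here is a forward pass only, written in plain Python. It assumes the user and movie embeddings are concatenated and passed through one ReLU hidden layer to a single output. The weights are random and untrained, and the sizes, names, and architecture are illustrative assumptions, not the author's network.

```python
import random

rng = random.Random(0)

def init_matrix(rows, cols):
    """Small random weights; a real model would learn these by training."""
    return [[rng.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

N_USERS, N_MOVIES, EMB_DIM, HIDDEN = 3, 4, 2, 3

user_emb = init_matrix(N_USERS, EMB_DIM)    # one embedding row per user
movie_emb = init_matrix(N_MOVIES, EMB_DIM)  # one embedding row per movie
W1 = init_matrix(HIDDEN, 2 * EMB_DIM)       # hidden layer weights
b1 = [0.0] * HIDDEN
W2 = [rng.gauss(0, 0.1) for _ in range(HIDDEN)]  # output layer weights
b2 = 0.0

def predict_rating(user_id, movie_id):
    """Look up both embeddings, concatenate them, and run the small network."""
    x = user_emb[user_id] + movie_emb[movie_id]  # list concatenation
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)  # ReLU
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2

raw_output = predict_rating(0, 1)  # not meaningful until the weights are trained
```

In practice the embeddings and layer weights would be trained jointly on observed ratings with a deep learning framework.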
