Summary of Approaches
MODEL | DETAILS |
---|---|
Content Based | Cosine similarity between the one-hot encoded genre vectors (18 genres), multiplied by an exponential-decay similarity on the years: math.exp(-diff / 10.0), where diff is the difference between the two movies' years |
User-Based Collaborative Filtering (for Top-N) | Recommend the movies rated highest by the 10 most similar users |
User-Based KNN (for Rating) | Predict every missing rating from the 10 most similar users (by cosine similarity) who have watched that movie |
Item-Based KNN (for Rating) | Predict every missing rating from the 10 most similar movies (by cosine similarity) that the user has watched |
SVD Based Matrix Factorization (for Rating) | Principal Component Analysis: PCA on R gives U (Users x Latent Features); PCA on R^T gives M (Movies x Latent Features). Factorization: R = U Σ M^T, where U is the user matrix, Σ is a diagonal matrix (which gives the strength of each latent factor), and M^T is the transposed movie matrix. Singular Value Decomposition: a way of computing U Σ M^T in one shot. Since SVD doesn't work with missing values, either fill them with a default such as the mean, or use SGD or ALS to fit an SVD-inspired algorithm. Rating = dot product of (User x Latent Factors) and (Movie x Latent Factors). |
SVD++ Based Matrix Factorization (for Rating) | An extension of SVD that also takes implicit ratings into account |
Deep Learning (for Rating) | Learn user and movie embeddings and use an artificial neural network (ANN) to model ratings from them |
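
As a rough illustration of the content-based row, here is a minimal sketch of how the genre and year similarities might be combined. Only the 18-genre vector and the math.exp(-diff / 10.0) term come from the table; the function names, the dict layout of a movie, the use of an absolute year difference, and the helper encode_genres are assumptions made for the example.

```python
import math

import numpy as np

NUM_GENRES = 18  # from the table; the order of genre indices is an assumption


def encode_genres(indices):
    """Hypothetical helper: build a 0/1 genre vector from genre indices."""
    vec = np.zeros(NUM_GENRES)
    vec[list(indices)] = 1.0
    return vec


def genre_similarity(genres_a, genres_b):
    """Cosine similarity between two genre vectors (0.0 if either is empty)."""
    a = np.asarray(genres_a, dtype=float)
    b = np.asarray(genres_b, dtype=float)
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    if norm == 0.0:
        return 0.0
    return float(a @ b / norm)


def year_similarity(year_a, year_b):
    """Exponential decay on the year gap, as given in the table."""
    diff = abs(year_a - year_b)  # assumption: absolute difference in years
    return math.exp(-diff / 10.0)


def content_similarity(movie_a, movie_b):
    """Product of genre and year similarity for two movie dicts."""
    return genre_similarity(movie_a["genres"], movie_b["genres"]) * year_similarity(
        movie_a["year"], movie_b["year"]
    )


# Made-up movies, purely for illustration.
movie_a = {"genres": encode_genres([0, 1]), "year": 1995}
movie_b = {"genres": encode_genres([0, 1]), "year": 2005}
movie_c = {"genres": encode_genres([2]), "year": 1995}

print(content_similarity(movie_a, movie_b))  # same genres, 10 years apart
print(content_similarity(movie_a, movie_c))  # no genres in common
```

With a decay constant of 10.0, two movies with identical genres that are ten years apart score exp(-1), and movies with no genre in common score 0 regardless of year.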