Popularity-based (i.e. non-personalized) recommendations can be a good fallback for cold-start users. Some possible approaches:

  • By Release Year - show new users a selection of highly rated and popular movies (those with a minimum number of user ratings) from recent or past decades. The year can be extracted from the title text in the MovieLens dataset.
  • By Genre - show new users a selection of popular, highly rated movies from the top genres, after removing duplicate titles that show up in multiple genres.
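The queries in this section rely on a per-movie summary table, `rating_count_per_movie`, with `year`, `genres`, `rating_count` and `rating_mean` columns. A minimal sketch of how such a table could be built is shown here. The helper name `build_rating_stats` and the toy rows are illustrative assumptions, not part of the original notebook; the `movieId`, `title`, `genres` and `rating` columns follow the MovieLens CSV layout.

```python
import pandas as pd


def build_rating_stats(movies, ratings):
    """Return one row per movie with year, rating_count and rating_mean.

    Hypothetical helper: assumes MovieLens-style columns
    (movies: movieId, title, genres; ratings: movieId, rating).
    """
    movies = movies.copy()
    # Pull the four-digit year out of a trailing "(YYYY)" in the title.
    # Titles without a year end up as NaN.
    movies["year"] = pd.to_numeric(
        movies["title"].str.extract(r"\((\d{4})\)\s*$", expand=False)
    )
    # Number of ratings and mean rating per movie.
    stats = (
        ratings.groupby("movieId")["rating"]
        .agg(rating_count="count", rating_mean="mean")
        .reset_index()
    )
    return movies.merge(stats, on="movieId", how="inner")


# Made-up rows standing in for the real MovieLens files.
toy_movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Inception (2010)", "Untitled Movie"],
    "genres": ["Animation|Children|Comedy", "Action|Sci-Fi", "Drama"],
})
toy_ratings = pd.DataFrame({
    "userId": [1, 2, 3, 1, 2, 3],
    "movieId": [1, 1, 1, 2, 2, 3],
    "rating": [4.0, 5.0, 3.0, 4.5, 3.5, 2.0],
})

rating_count_per_movie = build_rating_stats(toy_movies, toy_ratings)
```

Movies whose title carries no year get `NaN` in `year`, so they drop out of any `year >= ...` filter automatically.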

2010s

rating_count_per_movie[(rating_count_per_movie['year'] >= 2010) &\
                       (rating_count_per_movie['rating_count'] > 30)]\
.sort_values(by=['rating_mean'], ascending=False).head(5).reset_index(drop=True)

2000s

rating_count_per_movie[(rating_count_per_movie['year'] >= 2000) &\
                       (rating_count_per_movie['year'] < 2010) &\
                       (rating_count_per_movie['rating_count'] > 30)]\
.sort_values(by=['rating_mean'], ascending=False).head(5).reset_index(drop=True)

1990s

rating_count_per_movie[(rating_count_per_movie['year'] >= 1990) &\
                       (rating_count_per_movie['year'] < 2000) &\
                       (rating_count_per_movie['rating_count'] > 30)]\
.sort_values(by=['rating_mean'], ascending=False).head(5).reset_index(drop=True)
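The three decade queries differ only in their year bounds, so they could be folded into one helper. The following is a sketch under that assumption. The function name `top_by_decade`, its parameters and the toy rows are hypothetical; the column names and the `> 30` threshold are taken from the queries above.

```python
import pandas as pd


def top_by_decade(df, start_year, end_year=None, min_ratings=30, n=5):
    """Top-n movies by mean rating with start_year <= year < end_year.

    Hypothetical helper: end_year=None leaves the range open-ended,
    matching the 2010s query.
    """
    mask = (df["year"] >= start_year) & (df["rating_count"] > min_ratings)
    if end_year is not None:
        mask &= df["year"] < end_year
    return (
        df[mask]
        .sort_values(by="rating_mean", ascending=False)
        .head(n)
        .reset_index(drop=True)
    )


# Made-up rows, only to show the call pattern.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "year": [1995, 1999, 2003, 2012],
    "rating_count": [50, 10, 40, 31],
    "rating_mean": [4.1, 4.9, 3.8, 4.0],
})

nineties = top_by_decade(toy, 1990, 2000)
twenty_tens = top_by_decade(toy, 2010)
```

In the toy data, movie "B" has the highest mean rating but only 10 ratings, so the minimum-count filter keeps it out of the 1990s list.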

Action

rating_count_per_movie[(rating_count_per_movie['genres'].str.contains('Action')) & \
                       (rating_count_per_movie['rating_count'] > 100) & \
                       (rating_count_per_movie['year'] >= 1990)] \
.sort_values(by=['rating_mean'], ascending=False).head(5).reset_index(drop=True)

Comedy

rating_count_per_movie[(rating_count_per_movie['genres'].str.contains('Comedy')) & \
                       (~rating_count_per_movie['genres'].str.contains('Action')) & \
                       (rating_count_per_movie['rating_count'] > 100) & \
                       (rating_count_per_movie['year'] >= 1990)] \
.sort_values(by=['rating_mean'], ascending=False).head(5).reset_index(drop=True)

Romance

rating_count_per_movie[(rating_count_per_movie['genres'].str.contains('Romance')) & \
                       (~rating_count_per_movie['genres'].str.contains('Action')) & \
                       (~rating_count_per_movie['genres'].str.contains('Comedy')) & \
                       (rating_count_per_movie['rating_count'] > 100) & \
                       (rating_count_per_movie['year'] >= 1990)] \
.sort_values(by=['rating_mean'], ascending=False).head(5).reset_index(drop=True)
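The genre queries follow one pattern: match a genre, then exclude every genre already shown so that a title tagged with several genres appears only once. A sketch of that pattern as a single helper is given here. The name `top_by_genre`, its parameters and the toy rows are hypothetical; the thresholds (`> 100` ratings, year from 1990) mirror the queries above.

```python
import pandas as pd


def top_by_genre(df, genre, exclude=(), min_ratings=100, min_year=1990, n=5):
    """Top-n movies by mean rating in `genre`, skipping genres in `exclude`.

    Hypothetical helper: genres are matched as plain substrings
    (regex=False) of the pipe-separated `genres` column.
    """
    mask = (
        df["genres"].str.contains(genre, regex=False)
        & (df["rating_count"] > min_ratings)
        & (df["year"] >= min_year)
    )
    # Drop titles that also belong to a genre that was already shown.
    for other in exclude:
        mask &= ~df["genres"].str.contains(other, regex=False)
    return (
        df[mask]
        .sort_values(by="rating_mean", ascending=False)
        .head(n)
        .reset_index(drop=True)
    )


# Made-up rows, only to show the call pattern.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genres": ["Action|Comedy", "Comedy", "Comedy|Romance", "Romance"],
    "year": [1995, 2001, 2008, 2014],
    "rating_count": [150, 150, 150, 150],
    "rating_mean": [4.5, 4.0, 4.2, 3.9],
})

action = top_by_genre(toy, "Action")
comedy = top_by_genre(toy, "Comedy", exclude=["Action"])
romance = top_by_genre(toy, "Romance", exclude=["Action", "Comedy"])
```

One consequence of this design is that the order of genres matters: a title tagged "Comedy|Romance" can only appear in the Comedy list, because Romance is queried after it.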
