cold-start problem

Cold-Start : new user solutions

  • use implicit data (as soon as the user looks at their first item or the landing page)
  • use cookies (carefully) - maybe they’re an existing customer who is not logged in, and you can use browser cookies to map the user session to a user account
  • geo-ip - an IP address can be mapped to geography, so you can recommend items uniquely popular in that location (though the relationship is tenuous)
  • recommend top-sellers, promotions or ads
  • interview the user

Cold-Start : new item solutions

  • just don’t worry about it (new items can still show up in search results, promotions or ads)
  • use content-based attributes (item title, attributes, description, category) - never a good idea to completely rely on content attributes
  • map attributes to latent features (see LearnAROMA) - associate latent features (learned via matrix factorization or deep learning) with content attributes, and use the behavior patterns you learned from existing items to inform relationships with new items
  • random exploration - use extra slots in Top-N recommendations to surface new items to a user
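
The random exploration idea above can be sketched as follows. This is only a minimal illustration, not a reference implementation: the function name `fill_top_n`, the `explore_slots` parameter, and the choice of uniform random sampling are all assumptions made for the example.

```python
import random


def fill_top_n(ranked_items, new_items, n=10, explore_slots=2, rng=None):
    """Return a Top-N list whose last few slots surface new items.

    ranked_items: item ids already ranked by the recommender (best first).
    new_items: item ids with little or no behavior data yet.
    explore_slots: hypothetical knob for how many slots to give to exploration.
    """
    rng = rng or random.Random()
    # Keep the strongest recommendations in the leading slots.
    core = list(ranked_items[: n - explore_slots])
    # Pick new items at random for the remaining slots.
    candidates = [item for item in new_items if item not in core]
    explore = rng.sample(candidates, min(explore_slots, len(candidates)))
    result = core + explore
    # If there were not enough new items, backfill from the ranked list.
    for item in ranked_items[n - explore_slots:]:
        if len(result) >= n:
            break
        if item not in result:
            result.append(item)
    return result
```

Clicks or purchases on the explored items then become the behavior data that lets the regular recommender pick them up later.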

stoplists

  • easy to offend people through RS
  • example - Walmart accidentally recommended Planet of the Apes alongside a movie about Martin Luther King
  • build stoplists from keywords or terms in titles, descriptions or categories
    • exclude titles associated with a race
    • prevent controversy
  • stoplist topics:
    • adult-oriented content
    • vulgarity
    • legally prohibited topics (example - Mein Kampf)
    • terrorism / political extremism
    • bereavement / medical
    • competing products
    • drug use
    • religion
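
A keyword stoplist like the one described above might be applied as in this sketch. The terms in `STOPLIST`, the item field names (`title`, `description`, `category`), and the function names are placeholders assumed for illustration; a real stoplist would be curated per business and market.

```python
import re

# Hypothetical stoplist terms; placeholders only.
STOPLIST = {"vulgar", "extremism", "funeral"}


def is_blocked(item, stoplist=STOPLIST):
    """Return True if any stoplist term appears as a whole word in the item text."""
    text = " ".join(
        item.get(field, "") for field in ("title", "description", "category")
    ).lower()
    # Whole-word matching, so a short term does not match inside a longer word.
    tokens = set(re.findall(r"[a-z0-9']+", text))
    return not tokens.isdisjoint(stoplist)


def apply_stoplist(items, stoplist=STOPLIST):
    """Drop blocked items before they reach training data or recommendations."""
    return [item for item in items if not is_blocked(item, stoplist)]
```

Filtering before training as well as before display keeps blocked items from influencing the similarities the model learns.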

filter bubbles

  • extreme right or left wing ideology
  • keep amplifying pre-existing interests
  • bad from an ethical standpoint, though not necessarily bad from a business standpoint
  • add extra-diversity; add items with broader universal appeal to whole population

trust & transparency

  • explain why the model recommended something (transparency as an antidote to spurious results)
  • show something familiar / popular for trust
  • Amazon example - we’re recommending Science Fiction & Fantasy Books because you purchased

outlier users or items

  • example - some users may actually be bots or web crawlers and can have an outsized influence
  • example - frequent users get too much weight OR institutional buyers

gaming the system

  • people game the system to promote their business, push a malicious agenda, or for amusement
  • purchase data (people voting with their wallets) is far more resistant to this problem than click data
  • click-based recommendations can end up being a pornography detection system

international markets and laws

  • example - do you pool data together or separate the data
    • separating the data is the better approach because of foreign languages, different movie release dates, and local restrictions (Nazi content in Germany or political topics in China)

temporal effects

  • take ratings recency into account
  • value-aware recommender systems - hit rate VERSUS profitability
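
Taking rating recency into account can be as simple as down-weighting older ratings. The sketch below uses exponential decay with a half-life, which is one common choice rather than the only one; the function names and the 180-day default half-life are assumptions for illustration.

```python
def recency_weight(age_days, half_life_days=180.0):
    """Weight that halves every `half_life_days` (assumed default: 180 days)."""
    return 0.5 ** (age_days / half_life_days)


def weighted_mean_rating(ratings, half_life_days=180.0):
    """Recency-weighted average of (rating, age_days) pairs."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    weights = [recency_weight(age, half_life_days) for _rating, age in ratings]
    total = sum(w * rating for w, (rating, _age) in zip(weights, ratings))
    return total / sum(weights)
```

For example, a 5-star rating from today and a 1-star rating from 180 days ago average to about 3.67 instead of 3.0, because the older rating counts half as much.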
