cold-start problem
Cold-Start : new user solutions
- use implicit data (as soon as the user looks at their first item or the landing page)
- use cookies (carefully) - the visitor may be an existing customer who is not logged in, and browser cookies can map the session to a user account
- geo-ip - an IP address can be mapped to a geography, so you can recommend items uniquely popular in that location (though the relationship between location and taste is tenuous)
- recommend top-sellers, promotions or ads
- interview the user
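The new-user options above can be combined into a simple fallback chain. The following is only a sketch under stated assumptions: the function name and the lookup tables `similar_items`, `popular_by_country` and `top_sellers` are hypothetical, and a real system would build them from its own data.

```python
def recommend_for_new_user(session_items, country, similar_items,
                           popular_by_country, top_sellers, n=5):
    """Fallback chain for a user with no history (all inputs are hypothetical).

    Order of preference: implicit session signals, then geo-popular items,
    then global top-sellers.
    """
    candidates = []
    # 1. Implicit data: items similar to whatever was viewed in this session.
    for item in session_items:
        candidates.extend(similar_items.get(item, []))
    # 2. Geo-IP: items popular in the visitor's (estimated) country.
    candidates.extend(popular_by_country.get(country, []))
    # 3. Last resort: global top-sellers.
    candidates.extend(top_sellers)

    # De-duplicate while keeping priority order; skip items already viewed.
    seen = set(session_items)
    result = []
    for item in candidates:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result[:n]
```

A visitor with no session events and an unknown country simply receives the top-sellers.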
Cold-Start : new item solutions
- just don’t worry about it (show up in search results, promotions or ads)
- use content-based attributes (item title, attributes, description, category) - never a good idea to completely rely on content attributes
- map attributes to latent features (see LearnAROMA) - associate the latent features (learned via matrix factorization or deep learning) with content attributes, then use the behavior patterns you learned from existing items to inform relationships with new items
- random exploration - use extra slots in Top-N recommendations to surface new items to a user
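The random-exploration idea above might look like the sketch below. The function name, the `explore_slots` parameter and the input lists are assumptions made for illustration, not part of any particular library.

```python
import random

def top_n_with_exploration(ranked_items, new_items, n=10, explore_slots=2, rng=None):
    """Fill most of the Top-N from the ranked list; reserve a few slots for new items.

    ranked_items: items ordered best-first by the recommender (hypothetical input)
    new_items:    items with little or no behavior data yet (hypothetical input)
    """
    rng = rng or random.Random()
    # Only explore items the ranking does not already contain.
    candidates = [item for item in new_items if item not in ranked_items]
    k = min(explore_slots, len(candidates), n)
    explored = rng.sample(candidates, k)   # random pick without replacement
    exploited = ranked_items[: n - k]      # best-ranked items fill the rest
    return exploited + explored
```

Appending the explored items at the end keeps the strongest recommendations in the most visible positions, and any clicks on the new items become behavior data for later.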
stoplists
- easy to offend people through RS
- example - Walmart accidentally recommended Planet of the Apes alongside a movie about Martin Luther King
- check for keywords or terms in titles, descriptions or categories
- exclude titles associated with a race
- prevent controversy
- stoplists :
- adult-oriented content
- vulgarity
- legally prohibited topics (example - Mein Kampf)
- terrorism / political extremism
- bereavement / medical
- competing products
- drug use
- religion
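A keyword stoplist can be applied as a filter before any item reaches the recommender or the final Top-N. This is a minimal sketch: the item dictionaries, their field names and the placeholder terms are assumptions, and a real stoplist would be much longer and carefully curated.

```python
import re

# Hypothetical placeholder terms; single words only in this sketch.
STOP_TERMS = {"vulgar", "extremism"}

def is_blocked(item, stop_terms=STOP_TERMS):
    """Return True if any stop term appears as a whole word in the item's text fields."""
    text = " ".join([
        item.get("title", ""),
        item.get("description", ""),
        item.get("category", ""),
    ]).lower()
    words = set(re.findall(r"[a-z0-9']+", text))
    return any(term in words for term in stop_terms)

def apply_stoplist(items, stop_terms=STOP_TERMS):
    """Drop blocked items from a candidate list."""
    return [item for item in items if not is_blocked(item, stop_terms)]
```

Matching whole words rather than substrings avoids blocking harmless titles that merely contain a stop term inside a longer word.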
filter bubbles
- extreme right or left wing ideology
- keep amplifying pre-existing interests
- bad from an ethical standpoint, though not necessarily bad from a business standpoint
- add extra-diversity; add items with broader universal appeal to whole population
trust & transparency
- why the model recommended something? (transparency as antidote to spurious results)
- show something familiar / popular for trust
- Amazon example - we’re recommending Science Fiction & Fantasy Books because you purchased
outlier users or items
- example - what if some users are actually bots or web crawlers and may have large influence
- example - frequent users get too much weight OR institutional buyers
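One simple way to limit the influence of bots and very heavy users is to drop users whose activity count is far above the norm before training. The sketch below uses a z-score on ratings per user; the function name, the `(user_id, item_id, rating)` tuple layout and the threshold of 3 standard deviations are assumptions, and the threshold would need tuning on real data.

```python
from collections import Counter
from statistics import mean, pstdev

def filter_outlier_users(ratings, max_z=3.0):
    """Remove all ratings from users whose rating count is an upper outlier.

    ratings: list of (user_id, item_id, rating) tuples (hypothetical layout)
    max_z:   users more than this many standard deviations above the mean
             rating count are dropped
    """
    counts = Counter(user for user, _, _ in ratings)
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every user has the same activity level; nothing to filter.
        return list(ratings)
    keep = {user for user, c in counts.items() if (c - mu) / sigma <= max_z}
    return [r for r in ratings if r[0] in keep]
```

Only the upper tail is removed, since users with few ratings are a cold-start matter rather than an outlier problem.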
gaming the system
- to promote their business or malicious agenda or amusement
- purchase data (people voting with their wallets) is much harder to game than click data, because faking purchases costs money
- click-based recommendations can end up being a pornography detection system
international markets and laws
- example - do you pool data together or separate the data
- separating the data is usually the better approach because of foreign languages, different movie release schedules, and local restrictions (Nazi content in Germany or political topics in China)
temporal effects
- take ratings recency into account
- value-aware recommender systems - hit rate VERSUS profitability
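Taking rating recency into account can be as simple as down-weighting old ratings with an exponential decay. This is a sketch under assumptions: the 30-day half-life, the function names and the `(rating, day)` tuple layout are illustrative choices, not a prescribed method.

```python
def recency_weight(age_days, half_life_days=30.0):
    """Weight that halves every `half_life_days` (assumed decay schedule)."""
    return 0.5 ** (age_days / half_life_days)

def recency_weighted_mean(ratings, now_day, half_life_days=30.0):
    """Average rating in which recent ratings count more.

    ratings: list of (rating, day) pairs, where day is a day number
             on the same scale as now_day (hypothetical layout)
    """
    weights = [recency_weight(now_day - day, half_life_days) for _, day in ratings]
    total = sum(weights)
    if total == 0:
        return None  # no ratings to average
    return sum(w * r for w, (r, _) in zip(weights, ratings)) / total
```

For example, a 5-star rating from today and a 1-star rating from 30 days ago average to about 3.67 instead of the unweighted 3.0.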