Real-World Challenges -

cold-start problem

use implicit data (as soon as the user looks at their first item or the landing page)
use cookies (carefully) - maybe they’re an existing customer who is not logged in, and browser cookies can map the session to a user account
geo-ip - an IP address can be mapped to geography, so you can recommend items uniquely popular in that location (though the relationship is tenuous)
recommend top-sellers, promotions or ads
interview the user
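
A minimal sketch of how the first and fourth options above might be combined: lean on implicit session data where it exists, then back-fill with top-sellers. The function name, the `similar_items` lookup and the data shapes are assumptions made for illustration, not part of any particular library.

```python
from collections import Counter


def recommend_for_new_user(session_views, similar_items, purchase_log, n=5):
    """Recommend for a user with no rating or purchase history.

    session_views: item ids viewed in the current session (implicit data).
    similar_items: hypothetical dict mapping an item id to related item ids.
    purchase_log:  item ids purchased across all users, one entry per sale.
    """
    recs = []
    # 1) Implicit data first: items related to what was just viewed.
    for item in session_views:
        for candidate in similar_items.get(item, []):
            if candidate not in session_views and candidate not in recs:
                recs.append(candidate)
    # 2) Back-fill any remaining slots with top-sellers.
    for item, _count in Counter(purchase_log).most_common():
        if len(recs) >= n:
            break
        if item not in session_views and item not in recs:
            recs.append(item)
    return recs[:n]
```

With an empty `session_views` this degrades to a plain top-sellers list, which is the fallback for a user who has only seen the landing page.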

just don’t worry about it (new items still show up in search results, promotions or ads)
use content-based attributes (item title, attributes, description, category) - never a good idea to rely completely on content attributes
map attributes to latent features (see LearnAROMA) - associate latent features (learned via matrix factorization or deep learning) with content attributes, and use the behavior patterns learned from existing items to inform relationships with new items
random exploration - use extra slots in Top-N recommendations to surface new items to a user
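
The random exploration idea can be sketched as follows. The slot counts and the function name are assumptions for illustration; the notes do not say how many slots to reserve.

```python
import random


def add_exploration_slots(ranked_items, new_items, n=10, explore_slots=2, rng=None):
    """Return a Top-N list whose last few slots surface randomly chosen new items.

    ranked_items: recommendations ordered best-first.
    new_items:    items with little or no behavior data yet.
    rng:          optional random.Random instance, useful for reproducible tests.
    """
    rng = rng or random.Random()
    # Only explore items that are not already in the ranked list.
    pool = [item for item in new_items if item not in ranked_items]
    k = min(explore_slots, len(pool), n)
    explored = rng.sample(pool, k)
    # Keep the best-ranked items on top; new items take the last k slots.
    return ranked_items[: n - k] + explored
```

Putting the explored items in the trailing slots keeps the strongest recommendations visible while still generating behavior data for new items.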

easy to offend people through RS
example - Walmart accidentally recommended a movie about Martin Luther King alongside Planet of the Apes
key words or terms in titles, description or categories
- exclude titles associated with a race
- prevent controversy
stoplists:
- adult-oriented content
- vulgarity
- legally prohibited topics (example - Mein Kampf)
- terrorism / political extremism
- bereavement / medical
- competing products
- drug use
- religion
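
A stoplist of this kind can be sketched as a keyword filter applied to candidate items before they are used for training or shown to users. The field names (`title`, `description`, `category`) follow the list above; the function names and the dict-per-item shape are assumptions for illustration.

```python
import re


def build_stoplist_pattern(terms):
    """Compile one case-insensitive, whole-word pattern covering every term."""
    if not terms:
        # An empty stoplist should block nothing: (?!) never matches.
        return re.compile(r"(?!)")
    escaped = (re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def filter_items(items, pattern, fields=("title", "description", "category")):
    """Drop every item whose text fields contain a stoplisted term."""
    kept = []
    for item in items:
        text = " ".join(str(item.get(field, "")) for field in fields)
        if not pattern.search(text):
            kept.append(item)
    return kept
```

Whole-word matching avoids dropping items where a term only appears inside a longer, unrelated word, at the cost of missing misspellings and inflected forms.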

extreme right or left wing ideology
keep amplifying pre-existing interests
bad from an ethical standpoint, though not necessarily bad from a business standpoint
add extra diversity; add items with broader, universal appeal to the whole population
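
One way to add that diversity is to reserve every few slots for a broadly popular item instead of a personalized one. This is a sketch under assumptions: the interleaving interval `every` and the function name are illustrative choices, not something the notes prescribe.

```python
def blend_with_popular(personalized, popular, n=10, every=4):
    """Fill every `every`-th slot with a broadly popular item.

    personalized: the user's own recommendations, best-first.
    popular:      items with wide appeal across the whole population, best-first.
    """
    personal = list(personalized)
    # Skip popular items the user would already see in the personalized list.
    broad = [item for item in popular if item not in personal]
    out = []
    while len(out) < n and (personal or broad):
        slot = len(out) + 1
        if slot % every == 0 and broad:
            out.append(broad.pop(0))
        elif personal:
            out.append(personal.pop(0))
        else:
            out.append(broad.pop(0))
    return out
```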

why did the model recommend something? (transparency as an antidote to spurious results)
show something familiar / popular to build trust
Amazon example - we’re recommending Science Fiction & Fantasy Books because you purchased

example - what if some users are actually bots or web crawlers and have an outsized influence
example - frequent users or institutional buyers get too much weight
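
A simple guard against such users is to drop anyone whose activity count sits far above the mean before training. The sketch below assumes a dict of ratings per user and a three-standard-deviation cutoff; both the data shape and the threshold are illustrative assumptions.

```python
from statistics import mean, pstdev


def filter_outlier_users(ratings_by_user, max_sigma=3.0):
    """Remove users whose rating count exceeds mean + max_sigma * std dev.

    ratings_by_user: dict mapping user id to that user's list of ratings.
    """
    counts = {user: len(ratings) for user, ratings in ratings_by_user.items()}
    if len(counts) < 2:
        return dict(ratings_by_user)
    mu = mean(counts.values())
    sigma = pstdev(counts.values())
    if sigma == 0:
        # Every user has the same activity level, so nobody is an outlier.
        return dict(ratings_by_user)
    limit = mu + max_sigma * sigma
    return {u: r for u, r in ratings_by_user.items() if counts[u] <= limit}
```

On small user populations a single extreme user inflates the standard deviation and can hide behind its own cutoff, so a fixed cap on events per user may be a safer alternative there.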

people game the system to promote their business, a malicious agenda, or for amusement
purchase data (users vote with their wallets) is far less vulnerable to this problem than click data
click-based recommendations can end up being a pornography detection system

example - do you pool the data together or separate it by market?
- separating the data is the better approach because of foreign languages, different movie releases, and local restrictions (Nazi content in Germany or political topics in China)