Large-scale Information Processing, Summer 2015: Recommender Systems (part 2)

Similar documents
Bandit Algorithms. Zhifeng Wang ... Department of Statistics Florida State University

Administration. CSCI567 Machine Learning (Fall 2018) Outline. HW5 is available, due on 11/18. Practice final will also be available soon.

The Multi-Armed Bandit Problem

Collaborative Filtering. Radek Pelánek

The Multi-Armed Bandit Problem

Bandits and Exploration: How do we (optimally) gather information? Sham M. Kakade

COS 402 Machine Learning and Artificial Intelligence Fall Lecture 22. Exploration & Exploitation in Reinforcement Learning: MAB, UCB, Exp3

Sparse Linear Contextual Bandits via Relevance Vector Machines

New Algorithms for Contextual Bandits

Active Learning and Optimized Information Gathering

1 [15 points] Frequent Itemsets Generation With Map-Reduce

Sequential Recommender Systems

Alireza Shafaei. Machine Learning Reading Group The University of British Columbia Summer 2017

Advanced Machine Learning

Topics we covered. Machine Learning. Statistics. Optimization. Systems! Basics of probability Tail bounds Density Estimation Exponential Families

Multi-Armed Bandits. Credit: David Silver. Google DeepMind. Presenter: Tianlu Wang

Evaluation of multi armed bandit algorithms and empirical algorithm

Bandits for Online Optimization

Ad Placement Strategies

Learning with Exploration

Bayesian Contextual Multi-armed Bandits

Stochastic Contextual Bandits with Known Reward Functions

Lecture 2: Learning from Evaluative Feedback. or Bandit Problems

Tutorial: PART 2. Online Convex Optimization, A Game-Theoretic Approach to Learning

Decoupled Collaborative Ranking

Online Learning with Feedback Graphs

Counterfactual Evaluation and Learning

The Multi-Arm Bandit Framework

Collaborative topic models: motivations cont

Lecture 14 : Online Learning, Stochastic Gradient Descent, Perceptron

Matrix Factorization Techniques for Recommender Systems

The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits John Langford and Tong Zhang

Learning to play K-armed bandit problems

Basics of reinforcement learning

Spectral Bandits for Smooth Graph Functions with Applications in Recommender Systems

An Estimation Based Allocation Rule with Super-linear Regret and Finite Lock-on Time for Time-dependent Multi-armed Bandit Processes

Support Vector Machines. Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar

STA141C: Big Data & High Performance Statistical Computing

SVAN 2016 Mini Course: Stochastic Convex Optimization Methods in Machine Learning

COMP3702/7702 Artificial Intelligence Lecture 11: Introduction to Machine Learning and Reinforcement Learning. Hanna Kurniawati

Exploration. 2015/10/12 John Schulman

1 MDP Value Iteration Algorithm

\|y - Xw\|_2^2, \quad \|y - Xw\|_2^2 + \lambda \|w\|_2^2

Learning Optimal Online Advertising Portfolios with Periodic Budgets

Outline. Offline Evaluation of Online Metrics Counterfactual Estimation Advanced Estimators. Case Studies & Demo Summary

Stochastic Analogues to Deterministic Optimizers

Hybrid Machine Learning Algorithms

CSE 417T: Introduction to Machine Learning. Lecture 11: Review. Henry Chai 10/02/18

Matrix Factorization and Recommendation Systems

Lecture 10 : Contextual Bandits

Counterfactual Model for Learning Systems

Profile-Based Bandit with Unknown Profiles

Reinforcement Learning

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Part I. Sébastien Bubeck Theory Group

An Online Actor Critic Algorithm and a Statistical Decision Procedure for Personalizing Intervention

Support Vector Machines: Training with Stochastic Gradient Descent. Machine Learning Fall 2017

Multi-task Linear Bandits

Ad Placement Strategies

Collaborative Filtering Matrix Completion Alternating Least Squares

Lecture 15: Bandit problems. Markov Processes. Recall: Lotteries and utilities

Contextual Bandits in A Collaborative Environment

Annealing-Pareto Multi-Objective Multi-Armed Bandit Algorithm

ECE521 lecture 4: 19 January Optimization, MLE, regularization

Reducing contextual bandits to supervised learning

Contextual Combinatorial Bandit and its Application on Diversified Online Recommendation

A Gradient-based Adaptive Learning Framework for Efficient Personal Recommendation

Lecture 19: UCB Algorithm and Adversarial Bandit Problem. Announcements Review on stochastic multi-armed bandit problem

Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

Case Study 1: Estimating Click Probabilities. Kakade Announcements: Project Proposals: due this Friday!

Introduction to Logistic Regression

The Knowledge Gradient for Sequential Decision Making with Stochastic Binary Feedbacks

Sequential and reinforcement learning: Stochastic Optimization I

Big Data Analytics. Special Topics for Computer Science CSE. Feb 24

Crowd-Learning: Improving the Quality of Crowdsourcing Using Sequential Learning

Graphs in Machine Learning

Discover Relevant Sources : A Multi-Armed Bandit Approach

Talk on Bayesian Optimization

Linear Regression (continued)

Online Learning under Full and Bandit Information

LogUCB: An Explore-Exploit Algorithm For Comments Recommendation

An Experimental Evaluation of High-Dimensional Multi-Armed Bandits

Online Learning: Bandit Setting

Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent

Reinforcement Learning

Online Learning and Sequential Decision Making

RL 3: Reinforcement Learning

Linear classifiers: Overfitting and regularization

ECS289: Scalable Machine Learning

CS6375: Machine Learning Gautam Kunapuli. Support Vector Machines

Machine Learning Basics: Stochastic Gradient Descent. Sargur N. Srihari

Multi-armed Bandits in the Presence of Side Observations in Social Networks

COMP 551 Applied Machine Learning Lecture 21: Bayesian optimisation

Machine Learning & Data Mining CMS/CS/CNS/EE 155. Lecture 2: Perceptron & Gradient Descent

Transferable Contextual Bandit for Cross-Domain Recommendation

Contextual Online Learning for Multimedia Content Aggregation

An Adaptive Algorithm for Selecting Profitable Keywords for Search-Based Advertising Services

1 A Support Vector Machine without Support Vectors

Estimation Considerations in Contextual Bandits

Bandit models: a tutorial

Transcription:

Large-scale Information Processing, Summer 2015, 5th Exercise: Recommender Systems (part 2). Emmanouil Tzouridis tzouridis@kma.informatik.tu-darmstadt.de Knowledge Mining & Assessment

SVM question: When a point has ξ > 0, is it a support vector?

SVM question: Use the equations from the dual.

SVM question: From the equations of the dual, ξ > 0 implies α = C, so the point is a support vector.
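
For reference, the relevant soft-margin KKT conditions (the standard ones, not spelled out on the slide):

\alpha_i \left( y_i (w^\top x_i + b) - 1 + \xi_i \right) = 0, \qquad (C - \alpha_i)\, \xi_i = 0

If \xi_i > 0, the second condition forces \alpha_i = C > 0, so x_i has a nonzero dual coefficient and is a support vector.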

Line search: What is line search?

Line search: What is line search? Given an update direction d, we need to find a satisfactory step size η: (approximately) find the minimizer of g(η) = f(x + ηd).

Line search: (Approximately) find the minimizer of g. Exact line search is expensive! We need a good step size without spending too much effort on finding it.

Line search: (Approximately) find the minimizer of g. [Figure: examples of bad steps vs. good steps along g.]

Line search: Wolfe conditions

Line search: Wolfe conditions. Sufficient decrease: the decrease is proportional to the step length and the directional derivative.

Line search: Wolfe conditions. Sufficient decrease: the decrease is proportional to the step length and the directional derivative. But this alone does not rule out very small steps.

Line search: Wolfe conditions. Sufficient decrease + strong curvature condition: ensures that the slope at the new point is sufficiently greater than the earlier slope, and does not allow the slope to become too positive.
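
For completeness, the standard Wolfe conditions with constants 0 < c_1 < c_2 < 1 (the textbook formulation, cf. Nocedal & Wright, not copied from the slides):

f(x + \eta d) \le f(x) + c_1 \eta\, \nabla f(x)^\top d \quad \text{(sufficient decrease)}

\left| \nabla f(x + \eta d)^\top d \right| \le c_2 \left| \nabla f(x)^\top d \right| \quad \text{(strong curvature)}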

Line search: Goldstein conditions

Line search: Goldstein conditions. Sufficient decrease + a lower bound that prevents overly small steps. Drawback: they might exclude the minimizers of g.
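
The Goldstein conditions in their standard form, with 0 < c < 1/2 (again the textbook statement, not taken from the slides):

f(x) + (1 - c)\, \eta\, \nabla f(x)^\top d \le f(x + \eta d) \le f(x) + c\, \eta\, \nabla f(x)^\top d

The right inequality is the sufficient decrease; the left one rules out overly small steps.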

Line Search: Backtracking?

Line Search: Backtracking. Start with a big η and decrease it until the sufficient-decrease condition is met.
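
A minimal backtracking sketch in Python; the function names and constants (c1, the shrink factor) are illustrative, not from the exercise:

import numpy as np

def backtracking(f, grad_f, x, d, eta=1.0, c1=1e-4, shrink=0.5):
    """Shrink eta until the sufficient-decrease (Armijo) condition holds."""
    fx = f(x)
    slope = grad_f(x) @ d    # directional derivative; < 0 for a descent direction
    while f(x + eta * d) > fx + c1 * eta * slope:
        eta *= shrink        # step too big: decrease it
    return eta

# usage: one gradient step on f(x) = ||x||^2
f = lambda x: x @ x
grad_f = lambda x: 2 * x
x = np.array([3.0, -2.0])
d = -grad_f(x)               # descent direction
x = x + backtracking(f, grad_f, x, d) * d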

Multi-armed Bandits: Many slot machines. Which machine to play? Maximize gains/rewards.

Multi-armed Bandits: Many slot machines. Which machine to play? Maximize gains/rewards. This can model recommender systems: the slot machines (arms) are the items; maximize click-through rate (or another metric).

Multi-armed Bandits: The exploration-exploitation dilemma. Should we keep using the current best arm? (Exploitation.) Should we pick arms at random to gather data? (Exploration.) Exploitation only: we commit to assumptions based on only a few data points. Exploration only: we gather data without ever using the knowledge gained from it. Trade-off strategies: ε-greedy, UCB.

ε-greedy: With probability ε, explore; with probability 1-ε, exploit.
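
A minimal ε-greedy sketch in Python; the simulated Bernoulli click rewards are illustrative, not part of the exercise:

import numpy as np

rng = np.random.default_rng(0)
true_ctr = [0.02, 0.05, 0.03]      # hypothetical click-through rate per arm (item)
n_arms, eps = len(true_ctr), 0.1
counts = np.zeros(n_arms)          # number of pulls per arm
means = np.zeros(n_arms)           # running mean reward per arm

for _ in range(10_000):
    if rng.random() < eps:
        a = int(rng.integers(n_arms))        # explore: random arm
    else:
        a = int(np.argmax(means))            # exploit: current best arm
    r = float(rng.random() < true_ctr[a])    # simulated click (Bernoulli reward)
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]   # incremental mean update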

UCB (Upper Confidence Bound): Use the upper bound of the reward estimate for picking an arm. [Figure: two confidence intervals; pick the arm whose expected reward is smaller but whose upper bound is larger.]

UCB (Upper Confidence Bound): Use the upper bound of the reward estimate for picking an arm. Implications? [Figure: pick the arm whose expected reward is smaller but whose upper bound is larger.]

UCB (Upper Confidence Bound): Use the upper bound of the reward estimate for picking an arm. Implications? If an arm has huge variance, it gets picked (explore); if it has a huge expected reward, it gets picked (exploit).
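
For reference, the standard UCB1 selection rule (Auer et al. 2002; the slide shows the idea only graphically):

a_t = \arg\max_i \left( \hat{\mu}_i + \sqrt{\frac{2 \ln t}{n_i}} \right)

where \hat{\mu}_i is the empirical mean reward of arm i and n_i the number of times it has been played. The confidence term dominates for rarely played arms (exploration), the mean term for well-performing arms (exploitation).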

LinUCB: Contextual, personalized recommendations. The context can vary: user profile, item content, season information. E.g., how likely is it that a Bavarian guy wants a Lederhose just before Oktoberfest?

LinUCB: The reward is linear in the context. Use ridge regression to learn the parameters, model the variance of the estimate, and select arms by their upper confidence bounds.
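
A minimal sketch of disjoint-model LinUCB in Python (following Li et al. 2010; for simplicity one shared context vector is scored against every arm, and the value of alpha is illustrative):

import numpy as np

d, n_arms, alpha = 5, 3, 1.0
A = [np.eye(d) for _ in range(n_arms)]     # per-arm ridge regression matrices (start at identity)
b = [np.zeros(d) for _ in range(n_arms)]   # per-arm response vectors

def select_arm(x):
    """Pick the arm with the highest upper confidence bound for context x."""
    scores = []
    for a in range(n_arms):
        A_inv = np.linalg.inv(A[a])
        theta = A_inv @ b[a]               # ridge estimate of the arm's parameters
        scores.append(theta @ x + alpha * np.sqrt(x @ A_inv @ x))  # mean + confidence width
    return int(np.argmax(scores))

def update(a, x, r):
    """Incorporate the observed reward r for arm a in context x."""
    A[a] += np.outer(x, x)
    b[a] += r * x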

LinUCB

UCB: New items in UCB? Cold-start problem?

UCB: New items in UCB? Cold-start problem? No: a new item means high uncertainty, hence a high upper confidence bound, so it will be tried.

UCB: New items in UCB? Cold-start problem? No: a new item means high uncertainty, hence a high upper confidence bound. UCB also achieves logarithmic regret (regret: the difference between the optimal reward and the reward that we actually got).
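
Written out (the standard definition, not on the slide): the expected regret after T rounds is

R_T = T \mu^* - \mathbb{E}\left[ \sum_{t=1}^{T} \mu_{a_t} \right]

and UCB1 guarantees R_T = O\left( \sum_{i : \Delta_i > 0} \frac{\ln T}{\Delta_i} \right), where \Delta_i = \mu^* - \mu_i is the gap of arm i, i.e. regret logarithmic in T.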

Matrix Factorization: Sophisticated collaborative filtering. Reconstruct the rating matrix using latent factors for user preferences and item attributes: R = QP.

Matrix Factorization: Objective function. Both q and p are unknown, so the objective is non-convex. Assuming q is known, the minimization over p is convex (and vice versa): alternating least squares.
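
The usual regularized objective (standard form, cf. Koren, Bell & Volinsky 2009, assumed to match the formula on the slide):

\min_{q_*,\, p_*} \sum_{(u,i) \in \kappa} \left( r_{ui} - q_i^\top p_u \right)^2 + \lambda \left( \|q_i\|^2 + \|p_u\|^2 \right)

where \kappa is the set of observed (user, item) pairs. Fixing all q_i turns the problem into a ridge regression in the p_u (and vice versa), which is exactly what alternating least squares exploits.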

Matrix Factorization: Deriving an SGD learning rule.

Matrix Factorization: Deriving an SGD learning rule. Work on one specific rating r_ui at a time.

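A minimal SGD sketch for this update in Python; the learning rate, regularization constant, and initialization are illustrative:

import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 100, 50, 10
gamma, lam = 0.01, 0.1                          # learning rate, regularization
P = rng.normal(scale=0.1, size=(n_users, k))    # user factors p_u
Q = rng.normal(scale=0.1, size=(n_items, k))    # item factors q_i

def sgd_step(u, i, r_ui):
    """One stochastic gradient update on a single observed rating r_ui."""
    p_u, q_i = P[u].copy(), Q[i].copy()
    e = r_ui - q_i @ p_u                        # prediction error
    P[u] += gamma * (e * q_i - lam * p_u)
    Q[i] += gamma * (e * p_u - lam * q_i)

# usage: one pass over a few observed ratings (toy data)
for u, i, r in [(0, 3, 4.0), (2, 7, 1.5)]:
    sgd_step(u, i, r)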

Matrix Factorization: Extensions. Bias: different users rate on different scales (and the same holds for items). Implicit feedback: there are no ratings; the number of clicks is a measure of confidence in a user preference. Use this confidence to weight the loss. (Standard formulas are sketched below.)
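
For reference, the standard forms of both extensions (bias terms from Koren, Bell & Volinsky 2009; confidence weighting from Hu, Koren & Volinsky 2008; neither formula is spelled out on the slide):

\hat{r}_{ui} = \mu + b_u + b_i + q_i^\top p_u

\min_{x_*,\, y_*} \sum_{u,i} c_{ui} \left( p_{ui} - x_u^\top y_i \right)^2 + \lambda \left( \sum_u \|x_u\|^2 + \sum_i \|y_i\|^2 \right), \qquad c_{ui} = 1 + \alpha\, n_{ui}

where n_{ui} is e.g. the click count, p_{ui} = 1 if n_{ui} > 0 and 0 otherwise, and the confidence c_{ui} weights the loss.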

Thank you for your attention!