Large-scale Information Processing, Summer 2015
5th Exercise: Recommender Systems (part 2)
Emmanouil Tzouridis, tzouridis@kma.informatik.tu-darmstadt.de
Knowledge Mining & Assessment
SVM question: When a point has ξ > 0, is it a support vector?
SVM question: Use the equations from the dual.
SVM question: From the dual, ξ > 0 implies α = C, so the point is a support vector.
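To fill in the step (the slide's equations did not survive extraction, so this uses the standard soft-margin KKT conditions): stationarity gives \alpha_i + \mu_i = C and complementary slackness gives \mu_i \xi_i = 0, hence

    \xi_i > 0 \;\Rightarrow\; \mu_i = 0 \;\Rightarrow\; \alpha_i = C > 0

and any point with \alpha_i > 0 is a support vector.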
Line search: What is line search? Given an update direction, we need to find a satisfactory step size, i.e. (approximately) find the minimizer of g, the objective restricted to the search direction: g(η) = f(x + η·d).
Line search: Exact line search (finding the exact minimizer of g) is expensive! We need a good step size without spending too much effort on finding it.
Line search: [figure: the graph of g, with examples of bad steps and good steps marked]
Line search: Wolfe conditions
Sufficient decrease: the decrease is proportional to the step length and the directional derivative. But this alone does not rule out very small steps.
Strong curvature condition: ensures that the slope at the new point is sufficiently greater than the earlier slope, and does not allow the slope to become too positive.
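Written out (the formulas are not in the extracted text, so this is their standard form): with search direction p and constants 0 < c_1 < c_2 < 1,

    f(x + \eta p) \le f(x) + c_1 \eta \nabla f(x)^\top p          (sufficient decrease)
    |\nabla f(x + \eta p)^\top p| \le c_2 |\nabla f(x)^\top p|    (strong curvature)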
Line search: Goldstein conditions
Sufficient decrease plus a lower bound that prevents too-small steps. Drawback: they might exclude the minimizers of g.
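In their standard form (again filling in the missing formulas), with 0 < c < 1/2:

    f(x) + (1 - c)\,\eta \nabla f(x)^\top p \;\le\; f(x + \eta p) \;\le\; f(x) + c\,\eta \nabla f(x)^\top p

The left inequality rules out tiny steps; the right one is the sufficient-decrease condition.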
Line search: Backtracking
Start with a big η and decrease it until the sufficient-decrease condition is met (a sketch follows).
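A minimal backtracking sketch in Python (the function names, constants, and the toy objective are illustrative, not from the slides):

    import numpy as np

    def backtracking(f, grad_f, x, d, eta=1.0, c1=1e-4, shrink=0.5):
        """Shrink eta until the sufficient-decrease (Armijo) condition holds."""
        fx = f(x)
        slope = grad_f(x) @ d      # directional derivative, negative for a descent direction
        while f(x + eta * d) > fx + c1 * eta * slope:
            eta *= shrink          # step too big: decrease it
        return eta

    # usage: f(x) = ||x||^2, steepest-descent direction from x0
    f = lambda x: x @ x
    grad_f = lambda x: 2 * x
    x0 = np.array([3.0, -4.0])
    print(backtracking(f, grad_f, x0, -grad_f(x0)))   # accepts eta = 0.5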
Multi-armed bandits: many slot machines. Which machine should we play? Maximize the gains/rewards.
Multi-armed bandits can model recommender systems: the slot machines (or arms) are the items; maximize the click-through rate (or another metric).
Multi-armed bandits: the exploration-exploitation dilemma. Should we keep using the current best arm? Exploitation. Should we pick arms at random to gather data? Exploration. Exploitation only: we base our choices on only a little data. Exploration only: we gather data without using the knowledge from them. Trade-off strategies: ε-greedy, UCB.
ε-greedy: with probability ε, explore; with probability 1-ε, exploit.
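A minimal ε-greedy sketch in Python (the Bernoulli reward model and the arm probabilities are made-up illustration, not from the slides):

    import random

    def epsilon_greedy(true_probs, steps=10000, eps=0.1):
        n_arms = len(true_probs)
        counts = [0] * n_arms            # pulls per arm
        values = [0.0] * n_arms          # running mean reward per arm
        total = 0.0
        for _ in range(steps):
            if random.random() < eps:    # explore: random arm
                arm = random.randrange(n_arms)
            else:                        # exploit: current best arm
                arm = max(range(n_arms), key=lambda a: values[a])
            reward = 1.0 if random.random() < true_probs[arm] else 0.0
            counts[arm] += 1
            values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean
            total += reward
        return total, values

    print(epsilon_greedy([0.2, 0.5, 0.7]))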
UCB (Upper Confidence Bound): use the upper bound of the reward for picking an arm, i.e. pick the arm whose expected reward may be lower but whose upper bound is larger. Implications? An arm with huge variance gets picked: explore. An arm with huge expected reward gets picked: exploit.
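Concretely, the classic UCB1 rule (a standard instantiation; the formula itself is not in the extracted slides) picks at round t

    a_t = \arg\max_i \left( \bar{x}_i + \sqrt{\frac{2 \ln t}{n_i}} \right)

where \bar{x}_i is the empirical mean reward of arm i and n_i is how often it has been played; the bonus term is large for rarely played, high-uncertainty arms.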
LinUCB: contextual, personalized recommendations. The context can vary: user profile, item content, season information. E.g., how likely is it that a Bavarian guy wants a Lederhose before Oktoberfest?
LinUCB: the reward is linear in the contexts. Use ridge regression to learn the parameters, model the variance, and use both for arm selection (see the sketch below).
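A minimal sketch of LinUCB with disjoint per-arm models, following the form in Li et al. (2010); the variable names are illustrative:

    import numpy as np

    def linucb_choose(arms, x, alpha=1.0):
        """arms: dict arm_id -> (A, b) with A = D^T D + I and b = D^T c.
        Returns the arm with the largest upper confidence bound for context x."""
        best_arm, best_ucb = None, -np.inf
        for arm_id, (A, b) in arms.items():
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                                   # ridge-regression estimate
            ucb = theta @ x + alpha * np.sqrt(x @ A_inv @ x)    # mean + confidence width
            if ucb > best_ucb:
                best_arm, best_ucb = arm_id, ucb
        return best_arm

    def linucb_update(arms, arm_id, x, reward):
        A, b = arms[arm_id]
        arms[arm_id] = (A + np.outer(x, x), b + reward * x)

    # usage: two arms, 3-dimensional context
    d = 3
    arms = {a: (np.eye(d), np.zeros(d)) for a in range(2)}
    x = np.array([1.0, 0.0, 0.5])
    a = linucb_choose(arms, x)
    linucb_update(arms, a, x, reward=1.0)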
UCB: Do new items cause a cold-start problem? No: a new item means high uncertainty and therefore a high upper confidence bound, so it gets explored right away. UCB achieves logarithmic regret, where regret is the difference between the optimal reward and the reward that we actually got.
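Formally (the standard definition, not spelled out on the slide): with optimal mean reward \mu^* over T rounds,

    R_T = T\,\mu^* - \mathbb{E}\!\left[ \sum_{t=1}^{T} r_t \right], \qquad R_T = O(\log T) \text{ for UCB}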
Matrix factorization: sophisticated collaborative filtering. Reconstruct the ratings using latent factors: preferences of users (P) and item attributes (Q), so that R ≈ QP, i.e. each rating is predicted as the inner product of an item factor q_i and a user factor p_u.
Matrix factorization, objective function: q and p are both unknown, so the objective is non-convex. But assuming q is known, the minimization over p is convex (and vice versa), which motivates alternating least squares. The objective is reconstructed below.
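The standard regularized objective (reconstructed; the formula did not survive extraction): over the set K of observed ratings,

    \min_{q, p} \sum_{(u,i) \in K} \left( r_{ui} - q_i^\top p_u \right)^2 + \lambda \left( \lVert q_i \rVert^2 + \lVert p_u \rVert^2 \right)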
Matrix factorization: deriving an SGD learning rule. Work on one specific rating r_ui at a time; the resulting updates are reconstructed below.
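The standard per-rating updates (the derivation did not survive extraction; this is the usual result): with error e_ui = r_ui - q_i^T p_u and learning rate γ, a gradient step on the single-rating regularized loss gives

    q_i \leftarrow q_i + \gamma \left( e_{ui}\, p_u - \lambda\, q_i \right)
    p_u \leftarrow p_u + \gamma \left( e_{ui}\, q_i - \lambda\, p_u \right)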
Matrix factorization, extensions. Bias: different users rate differently (and the same holds for items), so add bias terms to the prediction. Implicit feedback: there are no ratings; the number of clicks is a metric of confidence in the user's preference; use this confidence to adjust the loss.
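In formulas (the standard versions of these extensions, filled in here): with global mean \mu and biases b_u, b_i,

    \hat{r}_{ui} = \mu + b_u + b_i + q_i^\top p_u

and for implicit feedback, one standard choice (Hu et al., 2008) weights each term of the loss by a confidence that grows with the click count:

    c_{ui} = 1 + \alpha \cdot \#\text{clicks}_{ui}, \qquad \min_{q,p} \sum_{u,i} c_{ui} \left( \mathbb{1}[\text{clicks}_{ui} > 0] - q_i^\top p_u \right)^2 + \lambda(\dots)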
Thank you for your attention!