Large-scale Information Processing, Summer Recommender Systems (part 2)

1 Large-scale Information Processing, Summer th Exercise Recommender Systems (part 2) Emmanouil Tzouridis tzouridis@kma.informatik.tu-darmstadt.de Knowledge Mining & Assessment

2 SVM question: When a point has ξ > 0, is it a support vector?

3 SVM question: Use the equations from the dual.

4 SVM question: Use the equations from the dual. ξ > 0 ⇒ α = C ⇒ the point is a support vector.
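
The step from ξ > 0 to α = C can be spelled out with the KKT conditions of the soft-margin SVM. The sketch below assumes the standard primal with slack penalty C Σ ξ_i, multipliers α_i for the margin constraints and μ_i for the constraints ξ_i ≥ 0; these symbols are assumptions, since the equations on the slides did not survive extraction.

```latex
\begin{aligned}
&\text{Stationarity in } \xi_i: && C - \alpha_i - \mu_i = 0 \\
&\text{Complementary slackness:} && \mu_i\,\xi_i = 0 \\
&\text{Hence:} && \xi_i > 0 \;\Rightarrow\; \mu_i = 0 \;\Rightarrow\; \alpha_i = C > 0
\end{aligned}
```

Because α_i > 0, the point enters the expansion w = Σ α_i y_i x_i, which is what makes it a support vector.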

5 Line search: What is line search?

6 Line search: What is line search? Given an update direction, we need to find a satisfactory step size, i.e. (approximately) find the minimizer of g.

7 Line search: (Approximately) find the minimizer of g. Exact line search is expensive! We need a good step size without spending too much effort on finding it.

8 Line search: (Approximately) find the minimizer of g. Figure: bad steps vs. good steps.

9 Line search: Wolfe conditions

10 Line search, Wolfe conditions: Sufficient decrease. The decrease is proportional to the step length and the derivative.

11 Line search, Wolfe conditions: Sufficient decrease. The decrease is proportional to the step length and the derivative. But this does not eliminate very small steps.

12 Line search, Wolfe conditions: Sufficient decrease. Strong curvature condition: ensures that the slope at the new point is sufficiently greater than the initial slope, and does not allow the slope to be too positive.
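
As a rough illustration of the two checks on slide 12, the sketch below tests a candidate step size against the strong Wolfe conditions. It assumes g(η) = f(x + ηd) is the objective along the update direction d and that its derivative dg is available. The function name and the constants c1 = 1e-4 and c2 = 0.9 are common textbook choices, not values taken from the slides.

```python
def satisfies_strong_wolfe(g, dg, eta, c1=1e-4, c2=0.9):
    """Check a step size eta against the strong Wolfe conditions.

    g  : callable, g(eta) = f(x + eta * d)   (assumed notation)
    dg : callable, derivative of g with respect to eta
    """
    g0, slope0 = g(0.0), dg(0.0)  # slope0 < 0 for a descent direction

    # Sufficient decrease: the decrease is proportional to the step
    # length eta and the initial derivative slope0.
    sufficient_decrease = g(eta) <= g0 + c1 * eta * slope0

    # Strong curvature: the new slope must be much flatter than the
    # initial one, which rules out tiny steps and overly positive slopes.
    curvature = abs(dg(eta)) <= c2 * abs(slope0)

    return sufficient_decrease and curvature


# Hypothetical one-dimensional example: g(eta) = (eta - 1)^2.
g = lambda eta: (eta - 1.0) ** 2
dg = lambda eta: 2.0 * (eta - 1.0)
print(satisfies_strong_wolfe(g, dg, 1.0))   # step to the minimizer
print(satisfies_strong_wolfe(g, dg, 1e-6))  # very small step
```

In this example the very small step passes the sufficient decrease test but fails the curvature test, which is the gap slide 11 points out.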

13 Line search: Goldstein conditions

14 Line search, Goldstein conditions: Sufficient decrease + prevent steps that are too small. Might exclude minimizers.
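
For reference, the Goldstein conditions are usually written as a two-sided bound on g(η). The notation is an assumption: g(η) = f(x + ηd) is the objective along a descent direction d, so g'(0) < 0, and c is a user-chosen constant.

```latex
g(0) + (1 - c)\,\eta\,g'(0) \;\le\; g(\eta) \;\le\; g(0) + c\,\eta\,g'(0),
\qquad 0 < c < \tfrac{1}{2}
```

The right-hand inequality is the sufficient decrease condition. The left-hand inequality bounds the step from below, and it is this bound that can cut off the exact minimizer of g.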

15 Line search: Backtracking?

16 Line search, Backtracking: Start with a big η and decrease it until the condition is met.
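
A minimal sketch of backtracking, assuming the condition to be met is the sufficient decrease condition and that g(η) = f(x + ηd) with initial slope g'(0) < 0. The parameter names and default values (eta0, shrink, c1, max_iter) are illustrative choices, not taken from the slides.

```python
def backtracking_line_search(g, slope0, eta0=1.0, shrink=0.5, c1=1e-4, max_iter=50):
    """Shrink eta until g(eta) <= g(0) + c1 * eta * slope0.

    g      : callable, objective along the update direction
    slope0 : derivative of g at eta = 0 (negative for a descent direction)
    """
    g0 = g(0.0)
    eta = eta0
    for _ in range(max_iter):
        if g(eta) <= g0 + c1 * eta * slope0:
            return eta       # condition met: accept this step size
        eta *= shrink        # step too big: decrease it and try again
    return eta               # fall back to the last (very small) step


# Hypothetical example: g(eta) = (eta - 1)^2 with g'(0) = -2.
g = lambda eta: (eta - 1.0) ** 2
print(backtracking_line_search(g, slope0=-2.0, eta0=4.0))
```

Starting from a large η means the first accepted step is never far below the largest acceptable one, so no separate safeguard against very small steps is needed in this scheme.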

17 Multi-armed bandits: Many slot machines. Which machine to play? Maximize gains/rewards.

18 Multi-armed bandits: Many slot machines. Which machine to play? Maximize gains/rewards. Can model recommender systems: slot machines (or arms) are the items; maximize click-through rate (or another metric).

19 Multi-armed bandits: Exploration-exploitation dilemma. Exploitation: should we keep using the current best arm? Exploration: should we pick arms at random to gather data? Only exploitation: makes assumptions based on very little data. Only exploration: only gathers data without using the knowledge gained from it. Trade-off: ε-greedy, UCB.

20 ε-greedy: With probability ε, explore. With probability 1 - ε, exploit.
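
The ε-greedy rule can be sketched in a few lines. The simulation below assumes Bernoulli arms, i.e. each item is clicked with a fixed probability. The function names and the click probabilities are hypothetical and only serve the illustration.

```python
import random


def epsilon_greedy_select(means, epsilon, rng):
    """Pick an arm index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))                    # explore: random arm
    return max(range(len(means)), key=lambda a: means[a])   # exploit: best estimate


def run_epsilon_greedy(click_probs, epsilon, steps, seed=0):
    """Simulate the bandit on arms with the given (hypothetical) click probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(click_probs)   # how often each arm was played
    means = [0.0] * len(click_probs)  # estimated click-through rate per arm
    for _ in range(steps):
        arm = epsilon_greedy_select(means, epsilon, rng)
        reward = 1.0 if rng.random() < click_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]   # running average
    return counts, means


counts, means = run_epsilon_greedy([0.2, 0.8], epsilon=0.1, steps=500)
print(counts, means)
```

Setting epsilon to 0 gives pure exploitation and setting it to 1 gives pure exploration, the two extremes from slide 19.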

21 UCB (Upper Confidence Bound): Use the upper bound of the reward for picking an arm. Figure: pick the arm whose expected reward is lower but whose upper bound is larger.

22 UCB (Upper Confidence Bound): Use the upper bound of the reward for picking an arm. Implications? Figure: pick the arm whose expected reward is lower but whose upper bound is larger.

23 UCB (Upper Confidence Bound): Use the upper bound of the reward for picking an arm. Implications? Huge variance: the arm is picked (explore). Huge expected reward: the arm is picked (exploit).
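
One concrete instance of this idea is the UCB1 rule, which adds a bonus of sqrt(2 ln t / n_a) to the estimated reward of an arm that has been played n_a times after t rounds. The slides do not state which bound they use, so treating it as UCB1 is an assumption; the function name is hypothetical.

```python
import math


def ucb1_select(counts, means, t):
    """Pick the arm with the largest upper confidence bound (UCB1 form).

    counts : number of times each arm has been played
    means  : estimated reward of each arm
    t      : total number of rounds played so far
    """
    # An arm that was never played has unbounded uncertainty: play it first.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=lambda a: scores[a])


# Arm 1 has the lower estimate but was played only twice, so its bound is larger.
print(ucb1_select(counts=[100, 2], means=[0.5, 0.4], t=102))
```

With equal play counts the bonus is the same for every arm and the rule reduces to picking the largest estimate, which is the exploit case from slide 23.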

24 LinUCB: Contextual recommendations, personalized. Context can vary: user profile, item content, season information. E.g. how likely is it that a Bavarian guy wants a Lederhose before Oktoberfest?

26 LinUCB: Reward is linear in the context. Use ridge regression to learn the parameters. Model variance. Arm selection.
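
The four ingredients on this slide can be written out as follows. This is a sketch of the commonly used LinUCB variant with one parameter vector per arm; the notation (x_{t,a} for the context of arm a in round t, θ_a for its parameters, α for the exploration weight) is an assumption, since the formulas on the slide did not survive extraction.

```latex
\begin{aligned}
&\text{Linear reward:} && \mathbb{E}[r_{t,a} \mid x_{t,a}] = x_{t,a}^\top \theta_a \\
&\text{Ridge regression:} && A_a = I_d + \sum_{\tau} x_{\tau,a} x_{\tau,a}^\top, \quad
   b_a = \sum_{\tau} r_{\tau,a}\, x_{\tau,a}, \quad
   \hat\theta_a = A_a^{-1} b_a \\
&\text{Model variance:} && s_{t,a} = \sqrt{x_{t,a}^\top A_a^{-1} x_{t,a}} \\
&\text{Arm selection:} && a_t = \arg\max_a \big( x_{t,a}^\top \hat\theta_a + \alpha\, s_{t,a} \big)
\end{aligned}
```

The sums run over the rounds τ in which arm a was played, so an arm with few observations keeps a large s_{t,a} and is favoured for exploration.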

27 LinUCB

28 UCB: New items in UCB? Cold start problem?

29 UCB: New items in UCB? Cold start problem? No. A new item means high uncertainty, hence a high upper confidence bound.

30 UCB: New items in UCB? Cold start problem? No. A new item means high uncertainty, hence a high upper confidence bound. Logarithmic regret. Regret: the difference between the optimal reward and the reward that we got.
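
In symbols, the regret after T rounds is commonly defined as below. The notation is an assumption: μ_a is the expected reward of arm a, μ* the best expected reward, and a_t the arm picked in round t.

```latex
R_T = T\,\mu^{*} - \mathbb{E}\Big[\sum_{t=1}^{T} \mu_{a_t}\Big],
\qquad \mu^{*} = \max_a \mu_a,
\qquad R_T = O(\log T)
```

Logarithmic regret means the gap to the optimal policy grows only like log T, so the average loss per round goes to zero.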

31 Matrix factorization: Sophisticated collaborative filtering. Reconstruct ratings using latent factors: preferences of users, item attributes. R = Q P

32 Matrix factorization, objective function: q and p are unknown, so the objective is non-convex. Assuming q is known, minimizing over p is convex. This gives alternating least squares.
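
A common form of this objective, and the closed-form update that alternating least squares uses for one user, are sketched below. The notation is an assumption: p_u is the latent vector of user u, q_i that of item i, K the set of observed (u, i) pairs, and λ a regularization weight that the slides do not mention.

```latex
\begin{aligned}
&\min_{p,\,q} \sum_{(u,i) \in K} \big(r_{ui} - q_i^\top p_u\big)^2
   + \lambda \Big( \sum_u \lVert p_u \rVert^2 + \sum_i \lVert q_i \rVert^2 \Big) \\
&\text{With all } q_i \text{ fixed:} \quad
   p_u = \Big( \sum_{i:\,(u,i) \in K} q_i q_i^\top + \lambda I \Big)^{-1}
         \sum_{i:\,(u,i) \in K} r_{ui}\, q_i
\end{aligned}
```

The update for q_i with all p_u fixed is symmetric, and the algorithm alternates between the two until the objective stops improving.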

33 Matrix factorization: Deriving an SGD learning rule

34 Matrix factorization: Deriving an SGD learning rule. Work on specific ratings (r_ui).
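
A minimal sketch of SGD on single ratings, assuming the per-rating loss (r_ui - q_i·p_u)^2 + λ(‖q_i‖^2 + ‖p_u‖^2). Its gradient gives the updates p_u += γ(e·q_i - λ·p_u) and q_i += γ(e·p_u - λ·q_i) with error e = r_ui - q_i·p_u, where the constant factor 2 is absorbed into the learning rate γ. The function names, hyperparameters and toy ratings are hypothetical.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def sgd_matrix_factorization(ratings, n_users, n_items, k=2,
                             lr=0.05, reg=0.02, epochs=300, seed=0):
    """ratings: list of (user, item, rating) triples, i.e. the observed r_ui."""
    rng = random.Random(seed)
    # Small random initialization of user factors P and item factors Q.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)                  # visit ratings in random order
        for u, i, r in samples:
            err = r - predict(P, Q, u, i)     # error on this single rating
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]     # keep old values for both updates
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Toy data: 2 users, 2 items, all four ratings observed.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 1, 1.0)]
P, Q = sgd_matrix_factorization(ratings, n_users=2, n_items=2)
for u, i, r in ratings:
    print(u, i, r, round(predict(P, Q, u, i), 2))
```

Each update touches only the two vectors involved in one rating, which is why the method scales to large, sparse rating matrices.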

37 Matrix factorization, extensions. Bias: different users rate differently (same for items). Implicit feedback: no ratings; #clicks is a measure of confidence in the user's preference; use the confidence to adjust the loss.
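
Both extensions are commonly formalized as below. This is a sketch under assumed notation: μ is the global mean rating, b_u and b_i are user and item biases, y_ui is a binary preference (1 if the user interacted with the item), and c_ui is a confidence that grows with the number of clicks, with α a scaling constant.

```latex
\begin{aligned}
&\text{Bias:} && \hat r_{ui} = \mu + b_u + b_i + q_i^\top p_u \\
&\text{Implicit feedback:} && \min_{p,\,q} \sum_{u,i} c_{ui}\,\big(y_{ui} - q_i^\top p_u\big)^2,
   \qquad c_{ui} = 1 + \alpha \cdot \#\text{clicks}_{ui}
\end{aligned}
```

A regularization term on p, q and the biases is usually added to both objectives.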

38 Thank you for your attention!
