Collective Intelligence


1 Collective Intelligence

2 Collective Intelligence Prediction

3

4 A Tale of Two Models. Lu Hong and Scott Page, "Interpreted and Generated Signals," Journal of Economic Theory, 2009

5 Generated Signals Interpreted Signals

6 Generated Signal: disturbance or interference (social scientists/statisticians) Interpreted Signal: prediction from a model (computer scientists/psychologists)

7 Fundamental Question; Generated Signals; Collective Intelligence via Generated Signals; Interpreted Signals; Collective Intelligence via Interpreted Signals; The Netflix Prize

8 Democracy: information aggregation. Markets: prices are forecasts (rational expectations).

9 Generated Signals

10 Generated Signal: Outcome → noise → Signal

11 UCSC!

12 L L-ε L+ε

13 Collective Intelligence 1.0

14 Outcome: θ in Θ

15 Signal: $s_i$

16 Distribution: $f(s_i \mid \theta)$

17 Error $= (s_i - \theta)^2$

18 AveError $= \frac{1}{n}\sum_{i=1}^{n}(s_i - \theta)^2$

19 $c = \frac{1}{n}\sum_{i=1}^{n} s_i$

20 Crowd Error $= (c - \theta)^2$

21 Div $= \frac{1}{n}\sum_{i=1}^{n}(s_i - c)^2$

22 Diversity Prediction Theorem Crowd Error = Average Error - Diversity

23 Diversity Prediction Theorem: Crowd Error = Average Error - Diversity, that is, $(c - \theta)^2 = \frac{1}{n}\sum_{i=1}^{n}(s_i - \theta)^2 - \frac{1}{n}\sum_{i=1}^{n}(s_i - c)^2$
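As a quick numerical check of the identity on this slide, the sketch below computes crowd error, average error, and diversity for a small crowd. The three signal values and the outcome θ = 16 are made-up illustrations, not data from the slides.

```python
import math


def diversity_decomposition(signals, theta):
    """Return (crowd error, average error, diversity) for a list of signals."""
    n = len(signals)
    c = sum(signals) / n                                   # crowd prediction
    crowd_error = (c - theta) ** 2
    average_error = sum((s - theta) ** 2 for s in signals) / n
    diversity = sum((s - c) ** 2 for s in signals) / n
    return crowd_error, average_error, diversity


# Hypothetical signals and outcome, chosen only for illustration.
crowd, avg, div = diversity_decomposition([10.0, 14.0, 21.0], theta=16.0)
print(crowd, avg, div)
print(math.isclose(crowd, avg - div))  # the identity holds for any signals
```

The identity is algebraic, so it holds for every choice of signals and outcome, not only on average.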

24 Crowd Error = Average Error - Diversity

25 Crowd Error

26 Crowd Error = Average Error

27 Crowd Error = Average Error - Diversity

28

29 0.6 = 2,

30 Collective Intelligence 2.0

31 Signals as Random Variables
Mean of i's signal: $\mu_i(\theta)$
Bias of i's signal: $b_i = \mu_i(\theta) - \theta$
Variance of i's signal: $v_i = E[(s_i - \mu_i(\theta))^2]$
Average Bias: $B = \frac{1}{n}\sum_{i=1}^{n} b_i$
Average Variance: $V = \frac{1}{n}\sum_{i=1}^{n} v_i$
Average Covariance: $C = \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j \neq i} E[(s_i - \mu_i)(s_j - \mu_j)]$

32 Bias Variance Decomposition: $E[\mathrm{SqE}(c)] = B^2 + \frac{1}{n}V + \frac{n-1}{n}C$ (B: average bias, V: average variance, C: average covariance)
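A sketch of the standard argument behind this decomposition, assuming $b_i = \mu_i(\theta) - \theta$ is the bias of signal $i$, $v_i$ is its variance around $\mu_i(\theta)$, and $c = \frac{1}{n}\sum_i s_i$ is the crowd's prediction:

```latex
\begin{align*}
E[(c-\theta)^2] &= (E[c]-\theta)^2 + \mathrm{Var}(c) \\
E[c]-\theta &= \frac{1}{n}\sum_{i=1}^{n}\bigl(\mu_i(\theta)-\theta\bigr) = B \\
\mathrm{Var}(c) &= \frac{1}{n^2}\Bigl(\sum_{i=1}^{n} v_i
   + \sum_{i=1}^{n}\sum_{j\neq i} E[(s_i-\mu_i)(s_j-\mu_j)]\Bigr)
   = \frac{1}{n}V + \frac{n-1}{n}C \\
E[\mathrm{SqE}(c)] &= B^2 + \frac{1}{n}V + \frac{n-1}{n}C
\end{align*}
```

The last step of the variance line uses $\sum_i v_i = nV$ and the fact that the double sum has $n(n-1)$ terms averaging to $C$.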

33 Resolving the Paradox. Diversity Prediction Theorem: predictive diversity is realized diversity, which improves accuracy. Bias Variance Decomposition: variance corresponds to noisier signals, which reduces accuracy. Negative covariance implies realized diversity and improves the expected accuracy of collective predictions.

34 Large Population Accuracy: If the signals are independent and unbiased with bounded variance, then as n approaches infinity the crowd's error goes to zero: $E[\mathrm{SqE}(c)] = \frac{1}{n}V + \frac{n-1}{n}\cdot 0 = \frac{V}{n} \to 0$
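To see the large-population claim numerically, the sketch below evaluates the decomposition $B^2 + V/n + \frac{n-1}{n}C$ for growing n. The function name and the values V = 4 and C = 1 are illustrative assumptions, not figures from the slides.

```python
def expected_crowd_sq_error(avg_bias, avg_var, avg_cov, n):
    """Expected squared error of the crowd mean: B^2 + V/n + ((n-1)/n) C."""
    return avg_bias ** 2 + avg_var / n + (n - 1) / n * avg_cov


sizes = (1, 10, 100, 1000)

# Independent, unbiased signals (B = 0, C = 0): the error shrinks like V / n.
independent = [expected_crowd_sq_error(0.0, 4.0, 0.0, n) for n in sizes]

# Positively correlated signals (C = 1): the error levels off near C instead.
correlated = [expected_crowd_sq_error(0.0, 4.0, 1.0, n) for n in sizes]

print(independent)
print(correlated)
```

With positive average covariance, adding more signals cannot push the expected error below roughly C, which is why independence matters for the limit on the slide.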

35 Ecologies of Models: Suppose there are K possible models and a distribution across those models, with $p_i$ = proportion of the population using model i.

36 Collective Accuracy: Diverse Types. $D = \frac{1}{\sum_i p_i^2}$; $\frac{1}{D}V + \frac{D-1}{D}C$ (Economo, Hong, Page)
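A small sketch of how the quantity D on this slide behaves, assuming $D = 1/\sum_i p_i^2$ and the term $\frac{1}{D}V + \frac{D-1}{D}C$. The population shares and the values of V and C are invented for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def effective_types(proportions):
    """D = 1 / sum(p_i^2): equals K when K models are used equally often."""
    return 1 / sum(p * p for p in proportions)


def collective_term(d, avg_var, avg_cov):
    """(1/D) V + ((D-1)/D) C, the expression shown on the slide."""
    return avg_var / d + (d - 1) / d * avg_cov


uneven = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # hypothetical shares
even = [Fraction(1, 4)] * 4

d_uneven = effective_types(uneven)
d_even = effective_types(even)
print(d_uneven, d_even)
print(collective_term(d_uneven, 2, Fraction(1, 2)))
```

D plays the role that n plays in the bias-variance decomposition: concentrating the population on a few models lowers D even if the number of people is large.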

37 Weighting

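38 Weighting by Accuracy: Accuracy $A = 1/\sigma^2$

39 Weighting by Accuracy: Accuracy $A = 1/\sigma^2$. Weights: $w_i = A_i/(A_1 + A_2 + \cdots + A_n)$
$E\left[\left(\sum_{i=1}^{M}\frac{A_i}{\sum_{j=1}^{M}A_j}\,s_i\right)^2\right] = \sum_{i=1}^{M}\frac{A_i^2}{\left(\sum_{k=1}^{M}A_k\right)^2}\,\mathrm{Var}(s_i) + \sum_{i=1}^{M}\sum_{j \neq i}\frac{A_i A_j}{\left(\sum_{k=1}^{M}A_k\right)^2}\,\mathrm{Cov}(s_i, s_j)$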

40 Example: Three predictors with variances 1, 2, and 4 (accuracies 1, 0.5, 0.25). Equally weighted: $E[\mathrm{SqE}(c)] = 7/9$. Accuracy weighted: $E[\mathrm{SqE}(c_a)] = 4/7$.
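The two numbers in this example can be reproduced with exact fractions. The sketch assumes the three predictors are unbiased and independent, which is the reading under which the slide's values come out; the function name is hypothetical.

```python
from fractions import Fraction


def expected_sq_error(weights, variances):
    """Error of a weighted average of independent, unbiased predictors."""
    return sum(w * w * v for w, v in zip(weights, variances))


variances = [Fraction(1), Fraction(2), Fraction(4)]

# Equal weights: 1/3 each.
equal_weights = [Fraction(1, 3)] * 3

# Accuracy weights: A_i = 1 / variance_i, normalized to sum to one.
accuracies = [1 / v for v in variances]
accuracy_weights = [a / sum(accuracies) for a in accuracies]

print(accuracy_weights)                                # 4/7, 2/7, 1/7
print(expected_sq_error(equal_weights, variances))     # 7/9
print(expected_sq_error(accuracy_weights, variances))  # 4/7
```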

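41 Accuracy and Covariance. $\Sigma$ = variance-covariance matrix, $u = (1,1,1,1,1,1)$. Weights: $w = \frac{1}{n}\,\Sigma^{-1}u$, where the scalar $n$ is read here as the normalizer $u^{\top}\Sigma^{-1}u$ so that the weights sum to one. Error: $(u^{\top}\Sigma^{-1}u)^{-1}$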

42 Example: Two Models. Weight on model a: $\frac{\sigma_b^2 - \mathrm{cov}(a,b)}{\sigma_a^2 + \sigma_b^2 - 2\,\mathrm{cov}(a,b)}$
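The general rule (weights proportional to $\Sigma^{-1}u$, normalized to sum to one) and the two-model formula can be checked against each other. This is a minimal sketch in pure Python with exact fractions; the helper names and the covariance values (variances 1 and 4, covariance 1/2) are assumptions made for the illustration.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination (nonsingular input assumed)."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def optimal_weights(cov):
    """Return (weights, error): Sigma^-1 u normalized, and 1 / (u' Sigma^-1 u)."""
    raw = solve(cov, [1] * len(cov))
    total = sum(raw)  # equals u' Sigma^-1 u
    return [x / total for x in raw], 1 / total


def two_model_weight(var_a, var_b, cov_ab):
    """Closed-form weight on model a from the two-model slide."""
    return (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)


half = Fraction(1, 2)
weights, error = optimal_weights([[1, half], [half, 4]])
print(weights, error)
print(two_model_weight(Fraction(1), Fraction(4), half))
```

With a diagonal covariance matrix the same function reduces to accuracy weighting, so `optimal_weights([[1, 0, 0], [0, 2, 0], [0, 0, 4]])` gives weights 4/7, 2/7, 1/7 and error 4/7.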

43 Forecast Standard Use equal weights unless you have strong evidence to support unequal weighting of forecasts (Armstrong 2001)

44 Interpreted Signals

45 Interpretive Signal: Attributes → model → Prediction

46 Concepts and Categorization: "Categorization enables a wide variety of subordinate functions because classifying something as a category member allows people to bring their knowledge of the category to bear on the new instance. Once people categorize some novel entity, for example, they can use relevant knowledge for understanding and prediction." (Medin and Rips)

47 Interpretive Signal Example [grid: Charisma (H, MH, ML, L) by Experience (H, MH, ML, L), cells G or B]

48 Experience Interpretation, 75% correct [grid of G/B predictions by Experience level]

49 Interpreted Signals. Accuracy: number of boxes. Diversity: different boxes.

50 Binary Interpreted Signals Model. Set of objects $X$, $|X| = N$. Set of outcomes $S = \{G, B\}$. Interpretation: $I_j = \{m_{j,1}, m_{j,2}, \ldots, m_{j,n_j}\}$ is a partition of $X$. $P(m_{j,i})$ = probability $m_{j,i}$ arises.

51 Collective Intelligence 3.0

52 Interpretive Signals and Collective Accuracy [grid: Charisma (H, MH, ML, L) by Experience (H, MH, ML, L), cells G or B]

53 Experience Interpretation, 75% correct [grid of G/B predictions by Experience level]

54 Charisma Interpretation, 75% correct [grid of G/B predictions by Charisma level]

55 Balanced Interpretation, 75% correct: extreme on one measure, moderate on the other [grid of G/B predictions]

56 Voting Outcome [grid: Charisma (H, MH, ML, L) by Experience (H, MH, ML, L), cells showing the G votes cast]

57 Reality [grid: Charisma (H, MH, ML, L) by Experience (H, MH, ML, L), cells G or B]

58 Collective Measurability: The outcome function F is measurable with respect to $\sigma(\{M_i\}_{i \in N})$, the smallest sigma field such that all $M_i$ are measurable. Proposition: F satisfies collective measurability if and only if $F(x) = G(M_1(x), \ldots, M_n(x))$ for all x in X.

59 Agent 1 Outcome Function Agent 2

60 Agent 1 Outcome Function Agent 2

61 Threshold Separable Additivity: Given F, $\{M_i\}_{i \in N}$, and $G: \{0,1\}^N \to \{0,1\}$, there exist an integer k and a set of functions $h_i: \{0,1\} \to \{0,1\}$ such that $G(M_1(x), \ldots, M_n(x)) = 1$ if and only if $\sum_{i \in N} h_i(M_i(x)) > k$. This does not mean that the function is linear in the models, only that it can be expressed this way!

62 Theorem: A classification problem can be solved by a threshold voting mechanism if and only if it satisfies collective measurability and threshold separable additivity with respect to the agents' models.
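A threshold voting mechanism of the kind named in the theorem can be sketched in a few lines: each agent's binary model output is passed through a function $h_i$ from {0,1} to {0,1}, and the collective says 1 when the sum exceeds k. The three-agent inputs and the choice k = 1 are hypothetical examples, and the function names are not from the source.

```python
def threshold_vote(model_outputs, transforms, k):
    """Return 1 if the sum of h_i(M_i(x)) exceeds k, else 0."""
    total = sum(h(m) for h, m in zip(transforms, model_outputs))
    return 1 if total > k else 0


def identity(m):
    return m


def flip(m):
    """A transform that counts an agent's 0 as a vote for 1."""
    return 1 - m


# Simple majority of three agents: k = 1 with identity transforms.
majority_yes = threshold_vote([1, 0, 1], [identity] * 3, k=1)
majority_no = threshold_vote([1, 0, 0], [identity] * 3, k=1)

# Same outputs, but agent 2's signal is read in reverse.
with_flip = threshold_vote([1, 0, 0], [identity, flip, identity], k=1)

print(majority_yes, majority_no, with_flip)
```

The `flip` case illustrates the slide's remark that the rule need not be linear in the raw models; it only has to be expressible as a thresholded sum after the $h_i$ are applied.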

63 Interpreted Signal Example: V(a,b,c,d) = a + b + c + d, with a, b, c, d independent N(0,1). Model 1: a+b. Model 2: c. Model 3: d. Optimal statistical weighting: 3/5, 1/5, 1/5. Optimal weighting: 1, 1, 1.

64 Interpreted Signal Example: V(a,b,c,d) = a + b + c + d, with a, b, c, d independent N(0,1). Model 1: a+b. Model 2: c+d. Model 3: a+b+d. Optimal statistical weighting: 0, 1/3, 2/3. Optimal weighting: 1, 1, 0.
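Both examples can be checked by writing each model as a coefficient vector on (a, b, c, d). Because a, b, c, d are independent N(0,1), the covariance of two model errors is the dot product of their error vectors. The sketch assumes "statistical weighting" means the error-minimizing weights that sum to one; for such weights $\Sigma w$ must have equal entries (the Lagrange condition), which is what `stationarity_values` tests. All names are hypothetical.

```python
from fractions import Fraction as F

TARGET = (1, 1, 1, 1)  # coefficients of V = a + b + c + d


def combined_error_variance(models, weights):
    """Variance of (weighted sum of models) - V for independent N(0,1) inputs."""
    coeffs = [sum(w * m[k] for w, m in zip(weights, models)) for k in range(4)]
    return sum((ck - tk) ** 2 for ck, tk in zip(coeffs, TARGET))


def stationarity_values(models, weights):
    """Entries of Sigma @ w, where Sigma is the covariance of the model errors."""
    errors = [[mk - tk for mk, tk in zip(m, TARGET)] for m in models]
    cov = [[sum(x * y for x, y in zip(ei, ej)) for ej in errors] for ei in errors]
    n = len(weights)
    return [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]


slide_63 = [(1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
slide_64 = [(1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 0, 1)]
stat_63 = (F(3, 5), F(1, 5), F(1, 5))
stat_64 = (F(0), F(1, 3), F(2, 3))

print(stationarity_values(slide_63, stat_63))        # equal entries
print(combined_error_variance(slide_63, stat_63))    # 8/5
print(combined_error_variance(slide_63, (1, 1, 1)))  # 0
print(stationarity_values(slide_64, stat_64))        # equal entries
print(combined_error_variance(slide_64, stat_64))    # 2/3
print(combined_error_variance(slide_64, (1, 1, 0)))  # 0
```

The weights 1, 1, 1 and 1, 1, 0 do not sum to one, so they fall outside the averaging family; they bundle the pieces of V that each model captures and leave no error at all.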

65

66

67 Some Details: Netflix users rate movies from 1 to 5. Six years of data. Half a million users. 17,700 movies. Data divided into (training, testing). Testing data divided into (probe, quiz, test).

68 Singular Value Decomposition. Each movie represented by a vector: $p = (p_1, p_2, p_3, p_4, \ldots, p_n)$. Each person represented by a vector: $q = (q_1, q_2, q_3, q_4, \ldots, q_n)$. Rating: $r_{ij} = m_i + a_j + p \cdot q$. Training: choose p, q to minimize $\sum_{ij}(\mathrm{actual}_{ij} - r_{ij})^2 + c(\lVert p \rVert^2 + \lVert q \rVert^2)$
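The training objective on this slide can be sketched with plain stochastic gradient descent. This is an illustration under stated assumptions, not the method any Netflix Prize team used: $m_i$ is read as a per-movie offset and $a_j$ as a per-person offset, the six ratings are invented, and the dimension, learning rate, and penalty c are arbitrary choices.

```python
import random


def predict(m, a, p, q, i, j):
    """r_ij = m_i + a_j + p_i . q_j"""
    return m[i] + a[j] + sum(x * y for x, y in zip(p[i], q[j]))


def sse(ratings, m, a, p, q):
    """Sum of squared errors over the observed (movie, person, rating) triples."""
    return sum((actual - predict(m, a, p, q, i, j)) ** 2 for i, j, actual in ratings)


def train(ratings, n_movies, n_users, dim=2, c=0.02, lr=0.05, epochs=200, seed=0):
    rng = random.Random(seed)
    m = [0.0] * n_movies
    a = [0.0] * n_users
    p = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_movies)]
    q = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    for _ in range(epochs):
        for i, j, actual in ratings:
            err = actual - predict(m, a, p, q, i, j)
            m[i] += lr * err
            a[j] += lr * err
            for k in range(dim):
                p_ik, q_jk = p[i][k], q[j][k]
                # Gradient step on squared error plus the c * (|p|^2 + |q|^2)
                # penalty; the constant factor 2 is absorbed into lr.
                p[i][k] += lr * (err * q_jk - c * p_ik)
                q[j][k] += lr * (err * p_ik - c * q_jk)
    return m, a, p, q


# Hypothetical (movie, person, rating) triples on the 1 to 5 scale.
RATINGS = [(0, 0, 5), (0, 1, 4), (1, 0, 1), (1, 2, 2), (2, 1, 5), (2, 2, 3)]

zeros = [[0.0, 0.0] for _ in range(3)]
before = sse(RATINGS, [0.0] * 3, [0.0] * 3, zeros, zeros)
m, a, p, q = train(RATINGS, n_movies=3, n_users=3)
after = sse(RATINGS, m, a, p, q)
print(before, after)
```

The penalty applies only to p and q, matching the objective on the slide; the offsets are left unregularized in this sketch.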

69 BellKor: 50 dimensions in each of 107 models. Best Model: 6.8% improvement. Combination of Models: 8.4% improvement.

70 BellKor's Pragmatic Chaos. Best Model: 8.4%. Ensemble: 10.1%.

71 Enter "The Ensemble": 23 teams, 30 countries

72 And The Winner is RMSE for The Ensemble: RMSE for BellKor's Pragmatic Chaos:

73 WEIGHT vs RMSE

74 Interpreted Signal Example: V(a,b,c,d) = a + b + c + d, with a, b, c, d independent N(0,1). Model 1: a+b+c. Model 2: b+c+d. Model 3: b+c. Optimal weighting: 1, 1, -1.
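The negative weight in this example can be found by a small search. The sketch below writes each model as a coefficient vector on (a, b, c, d) and looks for weights in {-1, 0, 1} whose combination equals V exactly; restricting the search to those three values is an assumption made to keep the illustration short.

```python
from itertools import product

MODELS = [(1, 1, 1, 0), (0, 1, 1, 1), (0, 1, 1, 0)]  # a+b+c, b+c+d, b+c
TARGET = (1, 1, 1, 1)                                # V = a + b + c + d


def exact_weightings(models, target, candidates=(-1, 0, 1)):
    """All candidate weight tuples whose weighted sum of models equals target."""
    found = []
    for weights in product(candidates, repeat=len(models)):
        combo = tuple(
            sum(w * m[k] for w, m in zip(weights, models))
            for k in range(len(target))
        )
        if combo == target:
            found.append(weights)
    return found


solutions = exact_weightings(MODELS, TARGET)
print(solutions)  # [(1, 1, -1)]
```

Models 1 and 2 each count b+c, so adding them double counts it; model 3 gets weight -1 to subtract the overlap.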

75 +

76 + +

77 + + -

78 Weighting question: context dependent

79 Generated Signals: Errors Cancel. Interpreted Signals: Bundling.

80 Ability: accuracy

81 Ability: accuracy Diversity: correlation or partitions?


More information

Forecast comparison of principal component regression and principal covariate regression

Forecast comparison of principal component regression and principal covariate regression Forecast comparison of principal component regression and principal covariate regression Christiaan Heij, Patrick J.F. Groenen, Dick J. van Dijk Econometric Institute, Erasmus University Rotterdam Econometric

More information

Deep Learning Basics Lecture 7: Factor Analysis. Princeton University COS 495 Instructor: Yingyu Liang

Deep Learning Basics Lecture 7: Factor Analysis. Princeton University COS 495 Instructor: Yingyu Liang Deep Learning Basics Lecture 7: Factor Analysis Princeton University COS 495 Instructor: Yingyu Liang Supervised v.s. Unsupervised Math formulation for supervised learning Given training data x i, y i

More information

Describing Contingency tables

Describing Contingency tables Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds

More information

Chap 1. Overview of Statistical Learning (HTF, , 2.9) Yongdai Kim Seoul National University

Chap 1. Overview of Statistical Learning (HTF, , 2.9) Yongdai Kim Seoul National University Chap 1. Overview of Statistical Learning (HTF, 2.1-2.6, 2.9) Yongdai Kim Seoul National University 0. Learning vs Statistical learning Learning procedure Construct a claim by observing data or using logics

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Linear Algebra Review

Linear Algebra Review Chapter 1 Linear Algebra Review It is assumed that you have had a beginning course in linear algebra, and are familiar with matrix multiplication, eigenvectors, etc I will review some of these terms here,

More information

Multi-dimensional Human Development Measures : Trade-offs and Inequality

Multi-dimensional Human Development Measures : Trade-offs and Inequality Multi-dimensional Human Development Measures : Trade-offs and Inequality presented by Jaya Krishnakumar University of Geneva UNDP Workshop on Measuring Human Development June 14, 2013 GIZ, Eschborn, Frankfurt

More information

Evaluation requires to define performance measures to be optimized

Evaluation requires to define performance measures to be optimized Evaluation Basic concepts Evaluation requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain (generalization error) approximation

More information

LESSON 8.1 RATIONAL EXPRESSIONS I

LESSON 8.1 RATIONAL EXPRESSIONS I LESSON 8. RATIONAL EXPRESSIONS I LESSON 8. RATIONAL EXPRESSIONS I 7 OVERVIEW Here is what you'll learn in this lesson: Multiplying and Dividing a. Determining when a rational expression is undefined Almost

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Context-based Reasoning in Ambient Intelligence - CoReAmI -

Context-based Reasoning in Ambient Intelligence - CoReAmI - Context-based in Ambient Intelligence - CoReAmI - Hristijan Gjoreski Department of Intelligent Systems, Jožef Stefan Institute Supervisor: Prof. Dr. Matjaž Gams Co-supervisor: Dr. Mitja Luštrek Background

More information

Ensemble learning 11/19/13. The wisdom of the crowds. Chapter 11. Ensemble methods. Ensemble methods

Ensemble learning 11/19/13. The wisdom of the crowds. Chapter 11. Ensemble methods. Ensemble methods The wisdom of the crowds Ensemble learning Sir Francis Galton discovered in the early 1900s that a collection of educated guesses can add up to very accurate predictions! Chapter 11 The paper in which

More information

UNC Charlotte 2005 Comprehensive March 7, 2005

UNC Charlotte 2005 Comprehensive March 7, 2005 March 7, 2005 1 The numbers x and y satisfy 2 x = 15 and 15 y = 32 What is the value xy? (A) 3 (B) 4 (C) 5 (D) 6 (E) none of A, B, C or D 2 Suppose x, y, z, and w are real numbers satisfying x/y = 4/7,

More information

ECON0702: Mathematical Methods in Economics

ECON0702: Mathematical Methods in Economics ECON0702: Mathematical Methods in Economics Yulei Luo SEF of HKU January 12, 2009 Luo, Y. (SEF of HKU) MME January 12, 2009 1 / 35 Course Outline Economics: The study of the choices people (consumers,

More information

Lecture 3 Linear Algebra Background

Lecture 3 Linear Algebra Background Lecture 3 Linear Algebra Background Dan Sheldon September 17, 2012 Motivation Preview of next class: y (1) w 0 + w 1 x (1) 1 + w 2 x (1) 2 +... + w d x (1) d y (2) w 0 + w 1 x (2) 1 + w 2 x (2) 2 +...

More information

Section Summary. Definition of a Function.

Section Summary. Definition of a Function. Section 2.3 Section Summary Definition of a Function. Domain, Codomain Image, Preimage Injection, Surjection, Bijection Inverse Function Function Composition Graphing Functions Floor, Ceiling, Factorial

More information

Fast Adaptive Algorithm for Robust Evaluation of Quality of Experience

Fast Adaptive Algorithm for Robust Evaluation of Quality of Experience Fast Adaptive Algorithm for Robust Evaluation of Quality of Experience Qianqian Xu, Ming Yan, Yuan Yao October 2014 1 Motivation Mean Opinion Score vs. Paired Comparisons Crowdsourcing Ranking on Internet

More information

Oligopoly Theory 2 Bertrand Market Games

Oligopoly Theory 2 Bertrand Market Games 1/10 Oligopoly Theory 2 Bertrand Market Games May 4, 2014 2/10 Outline 1 Bertrand Market Game 2 Bertrand Paradox 3 Asymmetric Firms 3/10 Bertrand Duopoly Market Game Discontinuous Payoff Functions (1 p

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions

More information

HMMT November 2017 November 11, 2017

HMMT November 2017 November 11, 2017 HMMT November 017 November 11, 017 Theme Round 1. Two ordered pairs (a, b) and (c, d), where a, b, c, d are real numbers, form a basis of the coordinate plane if ad bc. Determine the number of ordered

More information

Construction of Mixed-Level Orthogonal Arrays for Testing in Digital Marketing

Construction of Mixed-Level Orthogonal Arrays for Testing in Digital Marketing Construction of Mixed-Level Orthogonal Arrays for Testing in Digital Marketing Vladimir Brayman Webtrends October 19, 2012 Advantages of Conducting Designed Experiments in Digital Marketing Availability

More information