Nearest-Neighbor Matchup Effects: Predicting March Madness
|
|
- Laura Miles
- 6 years ago
- Views:
Transcription
1 Nearest-Neighbor Matchup Effects: Predicting March Madness Andrew Hoegh, Marcos Carzolio, Ian Crandell, Xinran Hu, Lucas Roberts, Yuhyun Song, Scotland Leman Department of Statistics, Virginia Tech September 26, 2015
2 NCAA Bracket Competition Who is has filled out an NCAA bracket before? Introduction Motivation 2 / 36
3 NCAA Bracket Competition Who has won an NCAA bracket competition? Introduction Motivation 3 / 36
4 NCAA Bracket Competition Who has lost a bracket competition to someone that does not know what a basketball looks like? Introduction Motivation 4 / 36
5 Talk Overview General modeling strategies for NCAA tournament competitions Matchup effects modeling framework Uncertainty inherent in NCAA tournament & competitions Introduction Overview 5 / 36
6 Competition Types General Modeling Competition Types 6 / 36
7 Loss Functions: Bracket L(y, ŷ) = cδ(y ŷ) L(y, ŷ, r) = c r δ(y ŷ) General Modeling Decision Theory 7 / 36
8 Loss Functions: Probabilistic Prediction Kaggle loss function L(y, p) = y log(p) (1 y) log(1 p) This is a proper scoring rule. L(y=1,p) P General Modeling Decision Theory 8 / 36
9 Relevant Data: Team Characteristics There are many important characteristics useful for modeling the strength of an NCAA basketball team: Winning percentage, Point differential, Strength of schedule, Conference affiliation,... Rebounding percentage, Adjusted offensive efficiency, and Adjusted defensive efficiency. General Modeling Data 9 / 36
10 Relevant Data: Ratings and Rankings Rather than using team characteristics, there are many available rating and ranking systems: ESPN BPI, Sagarin, RPI, Pomeroy, and Logistic Regression/Markov Chain (LRMC). General Modeling Data 10 / 36
11 Relative Strength Models Typically the model framework for predicting winner of games (or winning probabilities) can be formulated as a relative strength model. Formally this can be expressed as: y ij = f (θ i, θ j ) General Modeling Data 11 / 36
12 Relative Strength Models Typically the model framework for predicting winner of games (or winning probabilities) can be formulated as a relative strength model. Formally this can be expressed as: y ij = f (θ i, θ j ) such as linear model: y ij = β home + (θ i θ j )β D + ɛ ij, where ɛ ij N (0, σ 2 ) General Modeling Data 11 / 36
13 Relative Strength Models Typically the model framework for predicting winner of games (or winning probabilities) can be formulated as a relative strength model. Formally this can be expressed as: y ij = f (θ i, θ j ) such as linear model: y ij = β home + (θ i θ j )β D + ɛ ij, where ɛ ij N (0, σ 2 ) logistic regression: y ij Bernoulli(p ij ) where or logit(p ij ) = β home + (θ i θ j )β D General Modeling Data 11 / 36
14 Transitivity Note that relative strength models of this type are strictly transitive, where P A>B denotes the probability that team A beats team B. Under transitive models, {P A>B > 0.5 P B>C > 0.5} P A>C > 0.5. General Modeling Data 12 / 36
15 Bayesian Relative Strength Models Bayesian Linear Model Prior β N (β; m, s) Likelihood β Y, X N (β; ˆβ, Σ) Posterior β Y, X N (β; µ, Σ β ) General Modeling Data 13 / 36
16 Bayesian Relative Strength Models Bayesian Linear Model Prior β N (β; m, s) Likelihood β Y, X N (β; ˆβ, Σ) Posterior β Y, X N (β; µ, Σ β ) General Modeling Data 14 / 36
17 Bayesian Relative Strength Models Bayesian Linear Model Prior β N (β; m, s) Likelihood β Y, X N (β; ˆβ, Σ) Posterior β Y, X N (β; µ, Σ β ) General Modeling Data 15 / 36
18 Bayesian Relative Strength Models Bayesian Linear Model Predictive: p(y X, Y, X) = p(y X, β)p(β Y, X)dβ P(Y < 0) = 0 p(y X, Y, X)dY General Modeling Data 16 / 36
19 Analytics for Bracket Competitions General Modeling Data 17 / 36
20 Analytics for Kaggle-Style Competitions A substantial challenge in predictive modeling of sports competitions, and major discussion point, is predicting upsets. Consider predicting upsets as a comparison of the probabilistic predictions for competitors. General Modeling Data 18 / 36
21 Analytics for Kaggle-Style Competitions Distribution of Predictions A substantial challenge in predictive modeling of sports competitions, and major discussion point, is predicting upsets. Consider predicting upsets as a comparison of the probabilistic predictions for competitors. Your Prediction P(Upset) General Modeling Data 18 / 36
22 Matchup Effects Motivating Quote Well, it s hard to predict a particular team. You have to look at the higher seed and see, do they overwhelm the smaller school or the lower seed? Can they overwhelm someone with their athleticism or length or size or quickness or speed? Do they play a particular style of the pressure defense? And I think I look at the lower seed then and say, can they counter the higher seed s strengths? Do they shoot the three point shot well? Do they have athleticism at certain positions? Do they play a particular style that will give the higher seed trouble? Andy Enfield Nearest Neighbor Matchup Effects motivation 19 / 36
23 NNME Intuition Observed Results Games Nearest Neighbor Matchup Effects motivation 20 / 36
24 NNME Intuition Observed Results Estimated Strength Games Nearest Neighbor Matchup Effects motivation 21 / 36
25 NNME Intuition Observed Results Games Nearest Neighbor Matchup Effects motivation 22 / 36
26 NNME Intuition Observed Results Games Nearest Neighbor Matchup Effects motivation 23 / 36
27 Outline of the NNME Estimation of the nearest-neighbor matchup effects has three components: 1. Fit a relative strength model, 2. Identify neighbors for the matchup, and 3. Calibrate the matchup adjustment. Nearest Neighbor Matchup Effects Outline 24 / 36
28 Relative Strength Model Consider the simple relative strength model using the Sagarin ratings and home court effect. Y ij = β home + D ij β D + ɛ ij, ɛ ij N (0, σ 2 ) where D ij = Sagarin i Sagarin j Nearest Neighbor Matchup Effects Relative Strength Model 25 / 36
29 Relative Strength Model Consider the simple relative strength model using the Sagarin ratings and home court effect. Y ij = β home + D ij β D + ɛ ij, ɛ ij N (0, σ 2 ) where D ij = Sagarin i Sagarin j Note that relative strength models of this type are strictly transitive. Nearest Neighbor Matchup Effects Relative Strength Model 25 / 36
30 Identifying Neighbors Identifying neighbors first requires specifying team characteristics and a distance function between teams. We used a collection of data from Ken Pomeroy s including: Effective Height Adjusted Tempo Effective Field Goal Percentage Defense Offensive Rebound Percentage Block Percentage Steal Rate Three Point Field Goal Contribution Nearest Neighbor Matchup Effects Identifying Neighbors 26 / 36
31 Uniformly Weighting Characteristics Nearest Neighbor Matchup Effects Identifying Neighbors 27 / 36
32 Bayesian Visual Analytics Nearest Neighbor Matchup Effects Identifying Neighbors 28 / 36
33 Bayesian Visual Analytics Nearest Neighbor Matchup Effects Identifying Neighbors 29 / 36
34 Finding Neighbors For each particular matchup, say Dayton vs. Stanford, identify past opponents to see who is most like the current opponent. Teams Dayton played similar to Stanford Teams Stanford played similar to Dayton California George Mason Georgia Tech George Washington Gonzaga Cal Poly California Oregon Pittsburgh Utah Nearest Neighbor Matchup Effects Identifying Neighbors 30 / 36
35 Matchup Adjustment Notation Let R j (i) = 1 K k N j (Y ik µ ik ), where N j are the neighbors for team i with respect to team j, Y ik is the observed point differential between team i, and team k and µ ik is the expected point differential between team i and team k. Then φ ij = ρ(r i (j) R j (i)), where ρ controls how much information is passed from the neighbors. Nearest Neighbor Matchup Effects Matchup Adjustment 31 / 36
36 Predictive Distribution Then using the relative strength model previously described, the predictive distribution becomes: Y ij X ij = β home + D ij β D + φ ij + ɛ ij Effectively φ ij shifts the predictive distribution and results in a non-transitive model. Nearest Neighbor Matchup Effects Matchup Adjustment 32 / 36
37 Estimating Model Parameters Using data across NCAA tournaments from , the model parameters are estimated. β home β D σ 2 ρ Posterior mean Credible interval (3.83,3.91) (0.909,0.916) (120.0,123.2) (0.012,0.454) The positive credible interval suggests a moderate, but meaningful result from the matchup effect. Nearest Neighbor Matchup Effects Results 33 / 36
38 Demonstration: 2014 NCAA Tournament Largest shifts in expected point spread (φ ij ). Team 1 Team 1 R 1 (2) R 2 (1) φ 12 Cal Poly Wichita St UConn St. Joes Dayton Stanford Dayton Syracuse Kentucky Michigan UMass Tennessee Memphis Virginia Michigan Tennessee Michigan Texas Syracuse W.Mich Nearest Neighbor Matchup Effects Demonstration 34 / 36
39 Closer Look: 2014 NCAA Tournament Team Neighbor Point Diff. E[Point Diff] Residual Dayton California Dayton Gonzaga Dayton George Mason Dayton Georgia Tech Dayton George Washington Stanford California Stanford California Stanford Oregon Stanford Pittsburgh Stanford Cal Poly Stanford Utah Nearest Neighbor Matchup Effects Demonstration 35 / 36
40 NNME Concluding Thoughts 1. Evaluated on small number of data points leads to very uncertain outcomes, even with quality predictions 2. Does this contest actually have a proper scoring rule? 3. Alternative strategies (maximizing expected return). Nearest Neighbor Matchup Effects Demonstration 36 / 36
41 NNME Concluding Thoughts 1. Evaluated on small number of data points leads to very uncertain outcomes, even with quality predictions 2. Does this contest actually have a proper scoring rule? 3. Alternative strategies (maximizing expected return). Nearest Neighbor Matchup Effects Demonstration 36 / 36
Lesson Plan. AM 121: Introduction to Optimization Models and Methods. Lecture 17: Markov Chains. Yiling Chen SEAS. Stochastic process Markov Chains
AM : Introduction to Optimization Models and Methods Lecture 7: Markov Chains Yiling Chen SEAS Lesson Plan Stochastic process Markov Chains n-step probabilities Communicating states, irreducibility Recurrent
More informationA Better BCS. Rahul Agrawal, Sonia Bhaskar, Mark Stefanski
1 Better BCS Rahul grawal, Sonia Bhaskar, Mark Stefanski I INRODUCION Most NC football teams never play each other in a given season, which has made ranking them a long-running and much-debated problem
More informationColley s Method. r i = w i t i. The strength of the opponent is not factored into the analysis.
Colley s Method. Chapter 3 from Who s # 1 1, chapter available on Sakai. Colley s method of ranking, which was used in the BCS rankings prior to the change in the system, is a modification of the simplest
More informationAll-Time Conference Standings
All-Time Conference Standings Pac 12 Conference Conf. Matches Sets Overall Matches Team W L Pct W L Pct. Score Opp Last 10 Streak Home Away Neutral W L Pct. Arizona 6 5.545 22 19.537 886 889 6-4 W5 4-2
More informationRanking using Matrices.
Anjela Govan, Amy Langville, Carl D. Meyer Northrop Grumman, College of Charleston, North Carolina State University June 2009 Outline Introduction Offense-Defense Model Generalized Markov Model Data Game
More informationSwarthmore Honors Exam 2012: Statistics
Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may
More informationApproximate Inference in Practice Microsoft s Xbox TrueSkill TM
1 Approximate Inference in Practice Microsoft s Xbox TrueSkill TM Daniel Hernández-Lobato Universidad Autónoma de Madrid 2014 2 Outline 1 Introduction 2 The Probabilistic Model 3 Approximate Inference
More informationarxiv: v1 [math.pr] 14 Dec 2016
Random Knockout Tournaments arxiv:1612.04448v1 [math.pr] 14 Dec 2016 Ilan Adler Department of Industrial Engineering and Operations Research University of California, Berkeley Yang Cao Department of Industrial
More informationApproximate Bayesian binary and ordinal regression for prediction with structured uncertainty in the inputs
Approximate Bayesian binary and ordinal regression for prediction with structured uncertainty in the inputs Aleksandar Dimitriev Erik Štrumbelj Abstract We present a novel approach to binary and ordinal
More informationBayesian locally-optimal design of knockout tournaments
Bayesian locally-optimal design of knockout tournaments Mark E. Glickman Department of Health Services Boston University School of Public Health Abstract The elimination or knockout format is one of the
More informationSEEDING, COMPETITIVE INTENSITY AND QUALITY IN KNOCK-OUT TOURNAMENTS
Dmitry Dagaev, Alex Suzdaltsev SEEDING, COMPETITIVE INTENSITY AND QUALITY IN KNOCK-OUT TOURNAMENTS BASIC RESEARCH PROGRAM WORKING PAPERS SERIES: ECONOMICS WP BRP 91/EC/2015 This Working Paper is an output
More informationg-priors for Linear Regression
Stat60: Bayesian Modeling and Inference Lecture Date: March 15, 010 g-priors for Linear Regression Lecturer: Michael I. Jordan Scribe: Andrew H. Chan 1 Linear regression and g-priors In the last lecture,
More informationThe AGA Ratings System
The AGA Ratings System Philip Waldron June 3, 00 Basic description The AGA ratings system uses a Bayesian approach where a player s rating rating is updated based on additional information that is available
More informationRank University AMJ AMR ASQ JAP OBHDP OS PPSYCH SMJ SUM 1 University of Pennsylvania (T) Michigan State University
Rank University AMJ AMR ASQ JAP OBHDP OS PPSYCH SMJ SUM 1 University of Pennsylvania 4 1 2 0 2 4 0 9 22 2(T) Michigan State University 2 0 0 9 1 0 0 4 16 University of Michigan 3 0 2 5 2 0 0 4 16 4 Harvard
More informationLecture : Probabilistic Machine Learning
Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning
More informationBayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling
Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation
More informationBayesian model selection: methodology, computation and applications
Bayesian model selection: methodology, computation and applications David Nott Department of Statistics and Applied Probability National University of Singapore Statistical Genomics Summer School Program
More information12 Generalized linear models
12 Generalized linear models In this chapter, we combine regression models with other parametric probability models like the binomial and Poisson distributions. Binary responses In many situations, we
More informationRidge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation
Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking
More informationMaximum Likelihood, Logistic Regression, and Stochastic Gradient Training
Maximum Likelihood, Logistic Regression, and Stochastic Gradient Training Charles Elkan elkan@cs.ucsd.edu January 17, 2013 1 Principle of maximum likelihood Consider a family of probability distributions
More informationMultivariate Analysis
Multivariate Analysis Chapter 5: Cluster analysis Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2015/2016 Master in Business Administration and
More informationTime Series Prediction by Kalman Smoother with Cross-Validated Noise Density
Time Series Prediction by Kalman Smoother with Cross-Validated Noise Density Simo Särkkä E-mail: simo.sarkka@hut.fi Aki Vehtari E-mail: aki.vehtari@hut.fi Jouko Lampinen E-mail: jouko.lampinen@hut.fi Abstract
More informationA Bayesian Perspective on Residential Demand Response Using Smart Meter Data
A Bayesian Perspective on Residential Demand Response Using Smart Meter Data Datong-Paul Zhou, Maximilian Balandat, and Claire Tomlin University of California, Berkeley [datong.zhou, balandat, tomlin]@eecs.berkeley.edu
More informationBIOS 312: Precision of Statistical Inference
and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample
More informationSYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I
SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability
More informationSummary of Extending the Rank Likelihood for Semiparametric Copula Estimation, by Peter Hoff
Summary of Extending the Rank Likelihood for Semiparametric Copula Estimation, by Peter Hoff David Gerard Department of Statistics University of Washington gerard2@uw.edu May 2, 2013 David Gerard (UW)
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public
More informationPredictive Engineering and Computational Sciences. Research Challenges in VUQ. Robert Moser. The University of Texas at Austin.
PECOS Predictive Engineering and Computational Sciences Research Challenges in VUQ Robert Moser The University of Texas at Austin March 2012 Robert Moser 1 / 9 Justifying Extrapolitive Predictions Need
More informationHypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33
Hypothesis Testing Econ 690 Purdue University Justin L. Tobias (Purdue) Testing 1 / 33 Outline 1 Basic Testing Framework 2 Testing with HPD intervals 3 Example 4 Savage Dickey Density Ratio 5 Bartlett
More informationA Discussion of the Bayesian Approach
A Discussion of the Bayesian Approach Reference: Chapter 10 of Theoretical Statistics, Cox and Hinkley, 1974 and Sujit Ghosh s lecture notes David Madigan Statistics The subject of statistics concerns
More informationAMS-207: Bayesian Statistics
Linear Regression How does a quantity y, vary as a function of another quantity, or vector of quantities x? We are interested in p(y θ, x) under a model in which n observations (x i, y i ) are exchangeable.
More informationOptimal Seedings in Elimination Tournaments
Optimal Seedings in Elimination Tournaments Christian Groh Department of Economics, University of Bonn Konrad-Adenauerallee 24-42, 53131 Bonn, Germany groh@wiwi.uni-bonn.de Benny Moldovanu Department of
More informationDefault Priors and Effcient Posterior Computation in Bayesian
Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature
More informationBayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London
Bayesian methods for missing data: part 1 Key Concepts Nicky Best and Alexina Mason Imperial College London BAYES 2013, May 21-23, Erasmus University Rotterdam Missing Data: Part 1 BAYES2013 1 / 68 Outline
More informationStatistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018
Statistics Boot Camp Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 March 21, 2018 Outline of boot camp Summarizing and simplifying data Point and interval estimation Foundations of statistical
More informationLecture 3: Multiclass Classification
Lecture 3: Multiclass Classification Kai-Wei Chang CS @ University of Virginia kw@kwchang.net Some slides are adapted from Vivek Skirmar and Dan Roth CS6501 Lecture 3 1 Announcement v Please enroll in
More informationEdexcel GCE Statistics S1 Advanced/Advanced Subsidiary
physicsandmathstutor.com Centre No. Candidate No. Paper Reference(s) 6683/01 Edexcel GCE Statistics S1 Advanced/Advanced Subsidiary Friday 20 May 2011 Afternoon Time: 1 hour 30 minutes Materials required
More informationBayesian Linear Regression
Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective
More informationIE 4521 Midterm #1. Prof. John Gunnar Carlsson. March 2, 2010
IE 4521 Midterm #1 Prof. John Gunnar Carlsson March 2, 2010 Before you begin: This exam has 9 pages (including the normal distribution table) and a total of 8 problems. Make sure that all pages are present.
More informationHypothesis Registration: Structural Predictions for the 2013 World Schools Debating Championships
Hypothesis Registration: Structural Predictions for the 2013 World Schools Debating Championships Tom Gole * and Simon Quinn January 26, 2013 Abstract This document registers testable predictions about
More informationAnnouncements. CS 188: Artificial Intelligence Spring Mini-Contest Winners. Today. GamesCrafters. Adversarial Games
CS 188: Artificial Intelligence Spring 2009 Lecture 7: Expectimax Search 2/10/2009 John DeNero UC Berkeley Slides adapted from Dan Klein, Stuart Russell or Andrew Moore Announcements Written Assignment
More informationParameter Estimation. Industrial AI Lab.
Parameter Estimation Industrial AI Lab. Generative Model X Y w y = ω T x + ε ε~n(0, σ 2 ) σ 2 2 Maximum Likelihood Estimation (MLE) Estimate parameters θ ω, σ 2 given a generative model Given observed
More informationMeasuring Consensus in Binary Forecasts: NFL Game Predictions
Measuring Consensus in Binary Forecasts: NFL Game Predictions ChiUng Song Science and Technology Policy Institute 26F., Specialty Construction Center 395-70 Shindaebang-dong, Tongjak-ku Seoul 156-714,
More informationA hybrid heuristic for minimizing weighted carry-over effects in round robin tournaments
A hybrid heuristic for minimizing weighted carry-over effects Allison C. B. Guedes Celso C. Ribeiro Summary Optimization problems in sports Preliminary definitions The carry-over effects minimization problem
More informationWhy a Bayesian researcher might prefer observational data
Why a Bayesian researcher might prefer observational data Macartan Humphreys May 2, 206 Abstract I give an illustration of a simple problem in which a Bayesian researcher can choose between random assignment
More informationProblem Selected Scores
Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected
More informationClassification and Support Vector Machine
Classification and Support Vector Machine Yiyong Feng and Daniel P. Palomar The Hong Kong University of Science and Technology (HKUST) ELEC 5470 - Convex Optimization Fall 2017-18, HKUST, Hong Kong Outline
More informationFrailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.
Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk
More informationMulticlass Classification-1
CS 446 Machine Learning Fall 2016 Oct 27, 2016 Multiclass Classification Professor: Dan Roth Scribe: C. Cheng Overview Binary to multiclass Multiclass SVM Constraint classification 1 Introduction Multiclass
More informationST440/540: Applied Bayesian Statistics. (9) Model selection and goodness-of-fit checks
(9) Model selection and goodness-of-fit checks Objectives In this module we will study methods for model comparisons and checking for model adequacy For model comparisons there are a finite number of candidate
More informationplayers based on a rating system
A Bayesian regression approach to handicapping tennis players based on a rating system Timothy C. Y. Chan 1 and Raghav Singal,2 1 Department of Mechanical and Industrial Engineering, University of Toronto
More informationIntroduction: MLE, MAP, Bayesian reasoning (28/8/13)
STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this
More informationMaking rating curves - the Bayesian approach
Making rating curves - the Bayesian approach Rating curves what is wanted? A best estimate of the relationship between stage and discharge at a given place in a river. The relationship should be on the
More informationR-squared for Bayesian regression models
R-squared for Bayesian regression models Andrew Gelman Ben Goodrich Jonah Gabry Imad Ali 8 Nov 2017 Abstract The usual definition of R 2 (variance of the predicted values divided by the variance of the
More informationCS 4649/7649 Robot Intelligence: Planning
CS 4649/7649 Robot Intelligence: Planning Probability Primer Sungmoon Joo School of Interactive Computing College of Computing Georgia Institute of Technology S. Joo (sungmoon.joo@cc.gatech.edu) 1 *Slides
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationNaïve Bayes. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824
Naïve Bayes Jia-Bin Huang ECE-5424G / CS-5824 Virginia Tech Spring 2019 Administrative HW 1 out today. Please start early! Office hours Chen: Wed 4pm-5pm Shih-Yang: Fri 3pm-4pm Location: Whittemore 266
More informationBayesian Machine Learning
Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 2: Bayesian Basics https://people.orie.cornell.edu/andrew/orie6741 Cornell University August 25, 2016 1 / 17 Canonical Machine Learning
More informationLecture 3: Sports rating models
Lecture 3: Sports rating models David Aldous January 27, 2016 Sports are a popular topic for course projects usually involving details of some specific sport and statistical analysis of data. In this lecture
More informationRank and Rating Aggregation. Amy Langville Mathematics Department College of Charleston
Rank and Rating Aggregation Amy Langville Mathematics Department College of Charleston Hamilton Institute 8/6/2008 Collaborators Carl Meyer, N. C. State University College of Charleston (current and former
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationClass 26: review for final exam 18.05, Spring 2014
Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event
More informationBayesian Melding. Assessing Uncertainty in UrbanSim. University of Washington
Bayesian Melding Assessing Uncertainty in UrbanSim Hana Ševčíková University of Washington hana@stat.washington.edu Joint work with Paul Waddell and Adrian Raftery University of Washington UrbanSim Workshop,
More informationMachine Learning for NLP
Machine Learning for NLP Uppsala University Department of Linguistics and Philology Slides borrowed from Ryan McDonald, Google Research Machine Learning for NLP 1(50) Introduction Linear Classifiers Classifiers
More informationBe a Winner, Against All Odds by Rusty Withers
Be a Winner, Against All Odds by Rusty Withers Rusty Withers is coming to OCA s Lecture Hall introduce you to the Game Chart The Game Chart tells who has a chance to win against Las Vegas Point Spread.
More informationRuss Althof For the DBF Governing Committee
The 2013 AIAA/Cessna Aircraft Company/Raytheon Missile Systems Design/Build/Fly Competition Flyoff was held at TIMPA Field in Tucson, AZ on the weekend of April 19-21, 2013. This was the 17th year the
More informationWarwick Business School Forecasting System. Summary. Ana Galvao, Anthony Garratt and James Mitchell November, 2014
Warwick Business School Forecasting System Summary Ana Galvao, Anthony Garratt and James Mitchell November, 21 The main objective of the Warwick Business School Forecasting System is to provide competitive
More informationConvergence of Random Processes
Convergence of Random Processes DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Define convergence for random
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationModel comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection
Model comparison Patrick Breheny March 28 Patrick Breheny BST 760: Advanced Regression 1/25 Wells in Bangladesh In this lecture and the next, we will consider a data set involving modeling the decisions
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationarxiv: v1 [physics.soc-ph] 24 Sep 2009
Final version is published in: Journal of Applied Statistics Vol. 36, No. 10, October 2009, 1087 1095 RESEARCH ARTICLE Are soccer matches badly designed experiments? G. K. Skinner a and G. H. Freeman b
More informationProbabilistic Reasoning in Deep Learning
Probabilistic Reasoning in Deep Learning Dr Konstantina Palla, PhD palla@stats.ox.ac.uk September 2017 Deep Learning Indaba, Johannesburgh Konstantina Palla 1 / 39 OVERVIEW OF THE TALK Basics of Bayesian
More informationMachine Learning (CS 567) Lecture 5
Machine Learning (CS 567) Lecture 5 Time: T-Th 5:00pm - 6:20pm Location: GFS 118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More informationA few Murray connections: Extended Bradley-Terry models. Bradley-Terry model. Pair-comparison studies. Some data. Some data
Introduction A few Murray connections: Extended Bradley-Terry models David Firth (with Heather Turner) Department of Statistics University of Warwick psychometrics sport Lancaster missing data structured
More informationRecap Social Choice Fun Game Voting Paradoxes Properties. Social Choice. Lecture 11. Social Choice Lecture 11, Slide 1
Social Choice Lecture 11 Social Choice Lecture 11, Slide 1 Lecture Overview 1 Recap 2 Social Choice 3 Fun Game 4 Voting Paradoxes 5 Properties Social Choice Lecture 11, Slide 2 Formal Definition Definition
More informationSparse Linear Models (10/7/13)
STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine
More informationUsing Box-Scores to Determine a Position's Contribution to Winning Basketball Games
Brigham Young University BYU ScholarsArchive All Theses and Dissertations 2005-08-16 Using Box-Scores to Determine a Position's Contribution to Winning Basketball Games Garritt L. Page Brigham Young University
More informationStatistical learning. Chapter 20, Sections 1 4 1
Statistical learning Chapter 20, Sections 1 4 Chapter 20, Sections 1 4 1 Outline Bayesian learning Maximum a posteriori and maximum likelihood learning Bayes net learning ML parameter learning with complete
More informationSTAT Lecture 11: Bayesian Regression
STAT 491 - Lecture 11: Bayesian Regression Generalized Linear Models Generalized linear models (GLMs) are a class of techniques that include linear regression, logistic regression, and Poisson regression.
More informationLecture 4 How to Forecast
Lecture 4 How to Forecast Munich Center for Mathematical Philosophy March 2016 Glenn Shafer, Rutgers University 1. Neutral strategies for Forecaster 2. How Forecaster can win (defensive forecasting) 3.
More informationBayesian GLMs and Metropolis-Hastings Algorithm
Bayesian GLMs and Metropolis-Hastings Algorithm We have seen that with conjugate or semi-conjugate prior distributions the Gibbs sampler can be used to sample from the posterior distribution. In situations,
More informationLecture 3: Sports rating models
Lecture 3: Sports rating models David Aldous August 31, 2017 Sports are a popular topic for course projects usually involving details of some specific sport and statistical analysis of data. In this lecture
More informationThe linear model is the most fundamental of all serious statistical models encompassing:
Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x
More informationDiscrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10
EECS 70 Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10 Introduction to Basic Discrete Probability In the last note we considered the probabilistic experiment where we flipped
More informationBrian Richardet For the DBF Organizing Committee
The 2018 AIAA/Textron Aviation/Raytheon Missile Systems Design/Build/Fly Competition Flyoff was held at Cessna East Field in Wichita, KS on the weekend of April 19-22, 2018. This was the 22nd year for
More informationSpatial inference. Spatial inference. Accounting for spatial correlation. Multivariate normal distributions
Spatial inference I will start with a simple model, using species diversity data Strong spatial dependence, Î = 0.79 what is the mean diversity? How precise is our estimate? Sampling discussion: The 64
More informationMachine Learning. B. Unsupervised Learning B.2 Dimensionality Reduction. Lars Schmidt-Thieme, Nicolas Schilling
Machine Learning B. Unsupervised Learning B.2 Dimensionality Reduction Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University
More informationLinear Models in Machine Learning
CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationDEPARTMENT OF COMPUTER SCIENCE Autumn Semester MACHINE LEARNING AND ADAPTIVE INTELLIGENCE
Data Provided: None DEPARTMENT OF COMPUTER SCIENCE Autumn Semester 203 204 MACHINE LEARNING AND ADAPTIVE INTELLIGENCE 2 hours Answer THREE of the four questions. All questions carry equal weight. Figures
More informationEXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING
EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING DATE AND TIME: August 30, 2018, 14.00 19.00 RESPONSIBLE TEACHER: Niklas Wahlström NUMBER OF PROBLEMS: 5 AIDING MATERIAL: Calculator, mathematical
More informationProteomics and Variable Selection
Proteomics and Variable Selection p. 1/55 Proteomics and Variable Selection Alex Lewin With thanks to Paul Kirk for some graphs Department of Epidemiology and Biostatistics, School of Public Health, Imperial
More informationBayesian inference. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. April 10, 2017
Bayesian inference Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 10, 2017 1 / 22 Outline for today A genetic example Bayes theorem Examples Priors Posterior summaries
More informationBAYESIAN ESTIMATION OF LINEAR STATISTICAL MODEL BIAS
BAYESIAN ESTIMATION OF LINEAR STATISTICAL MODEL BIAS Andrew A. Neath 1 and Joseph E. Cavanaugh 1 Department of Mathematics and Statistics, Southern Illinois University, Edwardsville, Illinois 606, USA
More informationPima Community College Students who Enrolled at Top 200 Ranked Universities
Pima Community College Students who Enrolled at Top 200 Ranked Universities Institutional Research, Planning and Effectiveness Project #20170814-MH-60-CIR August 2017 Students who Attended Pima Community
More informationRegression diagnostics
Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationModel Checking and Improvement
Model Checking and Improvement Statistics 220 Spring 2005 Copyright c 2005 by Mark E. Irwin Model Checking All models are wrong but some models are useful George E. P. Box So far we have looked at a number
More information