COPS. Cluster Optimized Proximity Scaling


1 COPS: Cluster Optimized Proximity Scaling
Psychoco 2015

2 Outline
1. Objectives of Multidimensional Scaling
2. COPS: Cluster optimized proximity scaling
   - C-Clusteredness and an Index
   - The COPS Procedure
   - Optimization
   - Package
3. Conclusion and Outlook
This is joint work with Patrick Mair and Kurt Hornik.

3 Multidimensional Scaling (MDS) - I
MDS is a popular method for representing multivariate, high-dimensional proximities in a lower-dimensional space.
MDS uses a loss function, e.g., a least squares one,
\[ \sigma_{MDS}(X) = \sum_{i<j} w_{ij} \left[ f(\delta_{ij}) - g(d_{ij}(X)) \right]^2 \]
and minimizes it to find the configuration $\arg\min_X \sigma_{MDS}(X)$, where
- $d_{ij}(X)$ are the fitted distances,
- $\delta_{ij}$ are the proximities,
- $w_{ij}$ are finite weights,
- $g(\cdot)$, $f(\cdot)$ are transformation functions, usually the identity function $I(\cdot)$.
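To make the loss concrete, here is a minimal base R sketch of the least squares stress for identity transformations. The helper name `stress_ls` and the toy points are assumptions for illustration only; they are not taken from the talk or from any package.

```r
# Raw least-squares stress with identity transformations f and g.
# 'stress_ls' is a hypothetical helper name, not part of any package.
stress_ls <- function(delta, X, w = NULL) {
  d <- as.matrix(dist(X))              # fitted Euclidean distances d_ij(X)
  delta <- as.matrix(delta)            # proximities delta_ij
  if (is.null(w)) w <- matrix(1, nrow(d), ncol(d))  # unit weights by default
  idx <- upper.tri(d)                  # restrict the sum to pairs i < j
  sum(w[idx] * (delta[idx] - d[idx])^2)
}

# Toy data: proximities are Euclidean distances between 4 points in the plane
pts <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1))
delta <- dist(pts)

# Classical scaling reproduces exactly Euclidean proximities,
# so the stress of this configuration is numerically zero.
X0 <- cmdscale(delta, k = 2)
stress_ls(delta, X0)

# A stretched configuration fits worse
stress_ls(delta, X0 * 1.5)
```

Classical scaling (`cmdscale`) is used here only to obtain a configuration to evaluate; it is not the majorization approach the talk refers to.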

4 Multidimensional Scaling (MDS) - II
MDS provides an optimal map into the continuous space $\mathbb{R}^M$ and looks for directions of spread in the low-dimensional space (objective 1).
But often one is also interested in discrete structures of similarity between objects ("clusters"; objective 2).
MDS solves objective 1 but not objective 2. The latter is often inferred from the former by visual inspection of the configuration.
What is optimal for objective 1 may not be very useful for objective 2.

5 Illustration
"I'm a Republican, because..." from Mair et al. (2014)
- Supporters of the Republican Party were asked why they are Republican (254 statements).
- The natural language data were scraped and processed, resulting in a sparse data matrix (document-term matrix).
- Objects are the words (we use only words that appeared at least 10 times).
- We look for themes in the statements: mantras (words that often occur together).
- We use a cosine distance for word co-occurrences and apply standard least squares MDS (SMACOF) for representation.
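As a small illustration of the distance step, the following base R sketch computes a cosine distance between the word columns of a toy document-term matrix. It assumes the common definition "1 minus cosine similarity"; the matrix, the word labels and the helper `cosine_dist` are made up for this example and are not the data or code used in the talk.

```r
# Toy document-term matrix: rows = statements, columns = words (counts).
dtm <- rbind(
  c(1, 1, 0, 0),
  c(1, 1, 0, 0),
  c(0, 0, 1, 1),
  c(0, 1, 1, 0)
)
colnames(dtm) <- c("low", "taxes", "family", "values")

# Cosine distance between word (column) vectors: 1 - cosine similarity.
# 'cosine_dist' is a hypothetical helper name.
cosine_dist <- function(m) {
  cp <- crossprod(m)                    # t(m) %*% m: word-by-word inner products
  norms <- sqrt(diag(cp))               # Euclidean norm of each word vector
  as.dist(1 - cp / outer(norms, norms))
}

d_words <- cosine_dist(dtm)
round(as.matrix(d_words), 3)
```

Words that never occur in the same statement (here "low" and "family") get the maximal distance of 1, while words that mostly occur together get a distance close to 0.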

6 Illustration
[Figure: "Republican Mantras?" Two-dimensional MDS configuration plot of the 37 words: responsibility, personal, low, limited, military, taxes, defense, national, strong, fiscal, people, right, small, government, individual, constitution, liberty, freedom, life, principles, free, best, family, values, founding, market, conservative, country, party, great, america, will, american, god, nation, work, hard.]

7 Illustration
The optimal configuration does not have an all too obvious clustering structure.
One way out: fit metric MDS with a power transformation by setting, e.g., $f(\delta_{ij}) = \delta_{ij}^{20}$.
The clustering is clearer but the fit is now worse (0.946 versus 0.947).
[Figure: "Republican Mantras?!" Two-dimensional configuration plot of the same words after the power transformation.]

8 COPS to the Rescue
We propose a general solution to this problem that consists of the following steps:
- Use an MDS loss with $\theta$-parametrized, strictly monotonic nonlinear transformations of either the proximities or the fitted distances or both, e.g., power transformations (powerstress, $g(d_{ij}(X)) = d_{ij}(X)^{\kappa}$ and $f(\delta_{ij}) = \delta_{ij}^{\lambda}$, so $\theta = (\kappa, \lambda)$).
- Use an index of the degree of clusteredness obtained in the configuration (c-clusteredness) to quantify how clustered the result is.
- Combine the stress function, the transformations and the clusteredness index into a single target function and optimize over the parameters.
We call this COPS (Cluster Optimized Proximity Scaling; Rusch et al., 2015a).

9 C-Clusteredness
C-Clusteredness: the amount of clusteredness of a configuration.
[Figure: six example configurations with c-clusteredness = 0, 0.03, 0.23, 0.36, 0.61 and 1.]

10 OPTICS Cordillera - I
Index for clusteredness: OPTICS cordillera
- Employs OPTICS (Ankerst et al., 1999) with metaparameters $k$, $\epsilon$ on the configuration distances.
- For row vectors $x_j$ of $X$, OPTICS returns an ordering $R$ of these points, $R = \{x_{(i)}\}_{i=1,\dots,N}$. So $x_{(1)}$ is the $x_j$ that is at position 1 in the ordering.
- OPTICS also returns a reachability plot (dendrogram of the minimum reachabilities $r_{(i)}$ of the points $x_{(i)}$).
- Ordering and reachability represent the clustering structure.
We aggregate that to an index $OC(X)$ by defining (for metaparameter $q > 0$)
\[ OC(X) = \left( \frac{\sum_{i=2}^{N} \left| r_{(i)} - r_{(i-1)} \right|^q}{C} \right)^{1/q} \]
where $C$ is an (optional) normalizing constant.
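The aggregation step can be sketched in a few lines of base R, taking the index as the $q$-norm of successive reachability differences divided by $C$. The function `cordillera_raw` and the reachability values are hypothetical and only illustrate the arithmetic; this is not the implementation in the stops package, and the reachabilities are not produced by an actual OPTICS run.

```r
# Aggregate reachabilities (given in OPTICS order) into a cordillera-type index.
# 'cordillera_raw' is a hypothetical helper, not the function from the stops package.
cordillera_raw <- function(reach, q = 1, C = 1) {
  (sum(abs(diff(reach))^q) / C)^(1 / q)   # diff() gives r_(i) - r_(i-1) for i = 2..N
}

# Made-up reachabilities for illustration
flat   <- c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5)   # evenly spread points: no ups and downs
peaked <- c(0.1, 0.1, 0.9, 0.1, 0.1, 0.9)   # dense groups separated by large jumps

cordillera_raw(flat)     # 0
cordillera_raw(peaked)   # 0.8 + 0.8 + 0.8 = 2.4
```

A flat reachability profile yields an index of zero, while alternating valleys (dense groups) and peaks (gaps between groups) increase it.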

11 OPTICS Cordillera - II
[Figure: example configurations with c-clusteredness = 0, 0.03, 0.36 and 1.]

12 Properties of the OPTICS Cordillera
For given metaparameters $\epsilon$, $k$, $q$ the following applies (Rusch et al., 2015a):
- Upper bound for $OC(X)$ in the maximal c-clusteredness case:
\[ C(X, d_{max}, \epsilon, k, q) = d_{max}^{q} \left( \left\lceil \frac{N-1}{k} \right\rceil + \left\lfloor \frac{N-1}{k} \right\rfloor \right) \]
- Neither a cluster assignment nor an a priori defined number or shape of clusters is needed.
- $OC(X)$ typically increases when
  - distances between clusters increase (Emphasis Property),
  - points are more densely clustered (Density Property),
  - the number of clusters increases (Tally Property).
- It does not pick up unbalancedness in the number of points in a cluster as a sign of c-clusteredness (Balance Property).

13 The Full COPS Procedure
Combine the $\theta$-parametrized MDS loss measure $\sigma_{MDS}(X(\theta), \theta)$ and the OPTICS cordillera $OC(X)$ into the cluster optimized loss (coploss):
\[ \text{coploss}(\theta) = v_1 \, \sigma_{MDS}(X(\theta), \theta) - v_2 \, OC(X(\theta)) \qquad (1) \]
with $X(\theta) := \arg\min_X \sigma_{MDS}(X, \theta)$ and $v_1, v_2 \in \mathbb{R}$ controlling how much weight is given to the individual parts of coploss, e.g.,
\[ v_1 = 1, \qquad v_2 = \frac{\sigma_{MDS}(X(\theta_0), \theta_0)}{OC(X(\theta_0))} \]
with $\theta_0$ some reference solution, e.g., for powerstress $\theta_0 = (1, 1)$.
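A short worked check of what the example weights imply, using only the definition of coploss as $v_1$ times the MDS loss minus $v_2$ times the cordillera and assuming $OC(X(\theta_0)) > 0$: at the reference solution the two parts cancel.

```latex
\begin{align*}
\text{coploss}(\theta_0)
  &= 1 \cdot \sigma_{MDS}(X(\theta_0), \theta_0)
     - \frac{\sigma_{MDS}(X(\theta_0), \theta_0)}{OC(X(\theta_0))} \, OC(X(\theta_0)) \\
  &= \sigma_{MDS}(X(\theta_0), \theta_0) - \sigma_{MDS}(X(\theta_0), \theta_0) = 0
\end{align*}
```

With these weights, a parameter vector $\theta$ with $\text{coploss}(\theta) < 0$ is one where the gain in weighted clusteredness outweighs the loss in fit relative to the reference solution $\theta_0$.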

14 Optimization - I
We need to solve ($\theta$ is $t$-dimensional)
\[ \text{coploss}(\theta) \rightarrow \min_{\theta}! \]
- We use a nested algorithm that first solves for $X(\theta)$ and then minimizes (1) over $\theta$.
- For the inner part, i.e., finding $X(\theta)$, standard MDS optimization is used (e.g., majorization).
- The outer part of this optimization problem is complicated, so we employ metaheuristics.
- The inner minimization is costly, so a useful metaheuristic needs only few evaluations of the outer function (which is feasible if $t$ is small).
- Simulated annealing or population-based algorithms are not that well suited.
- We have had good experience with a customized Luus-Jaakola algorithm (it usually converges to a good solution in < 200 iterations for a minimal search space width (accd) of [value missing]).

15 Optimization - II
Adaptive Luus-Jaakola Algorithm (ALJ): an adaptation of Luus-Jaakola search (Luus & Jaakola, 1973)
- Sample $\theta^{(i)}$ from within the $t$-orthotope $[l, u]^t$, where $l$ and $u$ are the lower and upper boundaries.
- Set $d$ to be the length of the search space.
- Repeat until termination (accd, maxiter, acc):
  - Pick $a^{(i)} \sim U^t(-d, d)$.
  - Set $\theta^{(i+1)} \leftarrow \theta^{(i)} + a^{(i)}$.
  - If $\text{coploss}(\theta^{(i+1)}) < \text{coploss}(\theta^{(i)})$ set $\theta^{(opt)} = \theta^{(i+1)}$, else set $d = d \cdot s$.
Here (this is the customized part) the reduction factor $s$ depends on $o$ and on the ratio $(m + 1 - i)/m$, with
\[ m = \min\left( \frac{\log(\text{accd}) - \log(\max(u - l))}{\log(o)}, \ \text{maxiter} \right) \]
and $0 \le o \le 1$.
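The search loop can be sketched in base R as follows. This is a simplified, non-adaptive Luus-Jaakola search with a constant reduction factor `red`; the adaptive choice of $s$ is deliberately not reproduced. The function `lj_search`, the clamping of candidates to the box and the toy objective are assumptions made for illustration; in COPS the objective would be coploss, which requires an inner MDS fit for every candidate $\theta$.

```r
# Plain Luus-Jaakola random search with a constant reduction factor 'red'.
# 'lj_search' is a hypothetical helper; it is not the ALJ code from the stops package.
lj_search <- function(fn, lower, upper, red = 0.95, accd = 1e-4, maxiter = 1000) {
  t <- length(lower)
  theta <- runif(t, lower, upper)           # start inside the box [lower, upper]
  best <- fn(theta)
  d <- upper - lower                        # current search width per dimension
  for (i in seq_len(maxiter)) {
    if (max(d) < accd) break                # search space has collapsed
    cand <- theta + runif(t, -d, d)         # uniform step within (-d, d)
    cand <- pmin(pmax(cand, lower), upper)  # assumption: clamp candidate to the box
    val <- fn(cand)
    if (val < best) {
      theta <- cand                         # accept the improvement
      best <- val
    } else {
      d <- d * red                          # shrink the search width on failure
    }
  }
  list(par = theta, value = best, iterations = i)
}

set.seed(1)
# Stand-in objective with its minimum at (2, 5); in COPS this would be coploss(theta)
toy <- function(th) sum((th - c(2, 5))^2)
res <- lj_search(toy, lower = c(1, 1), upper = c(3, 20))
res$par
res$value
```

Because every evaluation of the real objective involves a full MDS fit, the appeal of this kind of search is that it needs only one function evaluation per iteration.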

16 R Package stops
All of this is implemented in the R package stops.
- High-level function cops(proximitymatrix, loss, ...)
- Prespecified MDS models are strain, symmetric SMACOF (smacofsym), Sammon mapping, elastic scaling, SMACOF on a sphere (smacofsphere), sstress, rstress, powerstress, and Sammon mapping and elastic scaling with powers (powersammon, powerelastic)
- Optimization with ALJ, simulated annealing (SANN) or a particle swarm algorithm (pso)
- Features cordillera and an interface to OPTICS in ELKI (optics)
- S3 methods: plot, summary, print, coef, residuals, plot3d, plot3dstatic

17 Example: Republicans
We now use COPS with powerstress on the "I'm a Republican, because..." data set:
R> resc <- cops(dt.dist, loss="powerstress",
+              lower=c(1,1), minpts=6, upper=c(3,20))
R> resc
Call: cops(dis = dt.dist, loss = "powerstress", theta = c(1, 1), minpts = 6, lower = c(1, 1), upper = c(3, 20))
Model: COPS with powerstress loss function and parameters kappa= lambda=
Number of objects: 37
MDS loss value:
OPTICS cordillera: Raw= Normed=
Cluster optimized loss (coploss):
MDS loss weight: 1, OPTICS cordillera weight:
Number of iterations of ALJ optimization: 117

18 Example: Republicans
R> plot(resc)
[Figure: "Republican Mantras!" Two-dimensional COPS configuration plot of the 37 words with labelled clusters: Paleocon+Populist Right, Neocon+Liberalism, Traditionalist+Compassionate, Fiscalcon+Libertarian, and Unclustered (cut at eps=0.6).]

19 Summary
COPS
- COPS works well when the objective is to obtain both a scaling and a clustering.
- It is easily adaptable to many other loss functions.
- It is particularly useful when there is only little variability in the proximities.
C-Clusteredness and OPTICS cordillera
- A concept and a measure of goodness-of-clustering in dimension reduction results that has appealing properties.
- Interesting beyond COPS.

20 Outlook
Beyond COPS
- C-clusteredness is an aspect of a more general idea which we coined c-structuredness (Rusch et al., 2015b).
- The idea of COPS can be generalized to Augmented Nonlinear Dimension Reduction and STOPS (Structure optimized proximity scaling) (Rusch et al., 2015b).
- Nearly there, only a few kinks to iron out.
Future research
- Issues with finding the global optimum.
- Speeding up the optimization problem (inner minimization).
- Inference is still unsolved (but we're working on that too).

21 References
Ankerst, M., Breunig, M., Kriegel, H.-P. & Sander, J. (1999) OPTICS: Ordering points to identify the clustering structure, ACM SIGMOD Record, 28.
Luus, R. & Jaakola, T. (1973) Optimization by direct search and systematic reduction of the size of search region, AIChE Journal, 19.
Mair, P., Rusch, T. & Hornik, K. (2014) The grand old party - A party of values? SpringerPlus, 3:697.
Rusch, T., Mair, P. & Hornik, K. (2015a) COPS: Cluster optimized proximity scaling, Report 2015/1, Discussion Paper Series, Center for Empirical Research Methods, WU Vienna University of Economics and Business.
Rusch, T., Mair, P. & Hornik, K. (2015b) Structuredness Indices and Augmented Nonlinear Dimension Reduction, Report 2015/X, Discussion Paper Series, Center for Empirical Research Methods, WU Vienna University of Economics and Business. Forthcoming.

22 Thank you for your attention
Thomas Rusch
Competence Center for Empirical Research Methods
WU Vienna University of Economics and Business
Welthandelsplatz 1, 1020 Vienna, Austria


More information

Urban Transportation Planning Prof. Dr.V.Thamizh Arasan Department of Civil Engineering Indian Institute of Technology Madras

Urban Transportation Planning Prof. Dr.V.Thamizh Arasan Department of Civil Engineering Indian Institute of Technology Madras Urban Transportation Planning Prof. Dr.V.Thamizh Arasan Department of Civil Engineering Indian Institute of Technology Madras Module #03 Lecture #12 Trip Generation Analysis Contd. This is lecture 12 on

More information

Large Scale Semi-supervised Linear SVMs. University of Chicago

Large Scale Semi-supervised Linear SVMs. University of Chicago Large Scale Semi-supervised Linear SVMs Vikas Sindhwani and Sathiya Keerthi University of Chicago SIGIR 2006 Semi-supervised Learning (SSL) Motivation Setting Categorize x-billion documents into commercial/non-commercial.

More information

Dynamic Macroeconomic Theory Notes. David L. Kelly. Department of Economics University of Miami Box Coral Gables, FL

Dynamic Macroeconomic Theory Notes. David L. Kelly. Department of Economics University of Miami Box Coral Gables, FL Dynamic Macroeconomic Theory Notes David L. Kelly Department of Economics University of Miami Box 248126 Coral Gables, FL 33134 dkelly@miami.edu Current Version: Fall 2013/Spring 2013 I Introduction A

More information

Descent methods. min x. f(x)

Descent methods. min x. f(x) Gradient Descent Descent methods min x f(x) 5 / 34 Descent methods min x f(x) x k x k+1... x f(x ) = 0 5 / 34 Gradient methods Unconstrained optimization min f(x) x R n. 6 / 34 Gradient methods Unconstrained

More information

Machine Learning for NLP

Machine Learning for NLP Machine Learning for NLP Uppsala University Department of Linguistics and Philology Slides borrowed from Ryan McDonald, Google Research Machine Learning for NLP 1(50) Introduction Linear Classifiers Classifiers

More information

Naïve Bayes. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824

Naïve Bayes. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824 Naïve Bayes Jia-Bin Huang ECE-5424G / CS-5824 Virginia Tech Spring 2019 Administrative HW 1 out today. Please start early! Office hours Chen: Wed 4pm-5pm Shih-Yang: Fri 3pm-4pm Location: Whittemore 266

More information

Empirical Risk Minimization

Empirical Risk Minimization Empirical Risk Minimization Fabrice Rossi SAMM Université Paris 1 Panthéon Sorbonne 2018 Outline Introduction PAC learning ERM in practice 2 General setting Data X the input space and Y the output space

More information

Regression I: Mean Squared Error and Measuring Quality of Fit

Regression I: Mean Squared Error and Measuring Quality of Fit Regression I: Mean Squared Error and Measuring Quality of Fit -Applied Multivariate Analysis- Lecturer: Darren Homrighausen, PhD 1 The Setup Suppose there is a scientific problem we are interested in solving

More information

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit V: Eigenvalue Problems Lecturer: Dr. David Knezevic Unit V: Eigenvalue Problems Chapter V.4: Krylov Subspace Methods 2 / 51 Krylov Subspace Methods In this chapter we give

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms CSE 0, Winter 08 Design and Analysis of Algorithms Lecture 8: Consolidation # (DP, Greed, NP-C, Flow) Class URL: http://vlsicad.ucsd.edu/courses/cse0-w8/ Followup on IGO, Annealing Iterative Global Optimization

More information

Classification and Regression Trees

Classification and Regression Trees Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity

More information

Introduction to Game Theory Lecture Note 2: Strategic-Form Games and Nash Equilibrium (2)

Introduction to Game Theory Lecture Note 2: Strategic-Form Games and Nash Equilibrium (2) Introduction to Game Theory Lecture Note 2: Strategic-Form Games and Nash Equilibrium (2) Haifeng Huang University of California, Merced Best response functions: example In simple games we can examine

More information

5. Simulated Annealing 5.2 Advanced Concepts. Fall 2010 Instructor: Dr. Masoud Yaghini

5. Simulated Annealing 5.2 Advanced Concepts. Fall 2010 Instructor: Dr. Masoud Yaghini 5. Simulated Annealing 5.2 Advanced Concepts Fall 2010 Instructor: Dr. Masoud Yaghini Outline Acceptance Function Initial Temperature Equilibrium State Cooling Schedule Stopping Condition Handling Constraints

More information

Fundamentals of Metaheuristics

Fundamentals of Metaheuristics Fundamentals of Metaheuristics Part I - Basic concepts and Single-State Methods A seminar for Neural Networks Simone Scardapane Academic year 2012-2013 ABOUT THIS SEMINAR The seminar is divided in three

More information

Training the linear classifier

Training the linear classifier 215, Training the linear classifier A natural way to train the classifier is to minimize the number of classification errors on the training data, i.e. choosing w so that the training error is minimized.

More information

Projection methods to solve SDP

Projection methods to solve SDP Projection methods to solve SDP Franz Rendl http://www.math.uni-klu.ac.at Alpen-Adria-Universität Klagenfurt Austria F. Rendl, Oberwolfach Seminar, May 2010 p.1/32 Overview Augmented Primal-Dual Method

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Trip Distribution Modeling Milos N. Mladenovic Assistant Professor Department of Built Environment

Trip Distribution Modeling Milos N. Mladenovic Assistant Professor Department of Built Environment Trip Distribution Modeling Milos N. Mladenovic Assistant Professor Department of Built Environment 25.04.2017 Course Outline Forecasting overview and data management Trip generation modeling Trip distribution

More information

Demand in Differentiated-Product Markets (part 2)

Demand in Differentiated-Product Markets (part 2) Demand in Differentiated-Product Markets (part 2) Spring 2009 1 Berry (1994): Estimating discrete-choice models of product differentiation Methodology for estimating differentiated-product discrete-choice

More information

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from Topics in Data Analysis Steven N. Durlauf University of Wisconsin Lecture Notes : Decisions and Data In these notes, I describe some basic ideas in decision theory. theory is constructed from The Data:

More information

Fitting Linear Statistical Models to Data by Least Squares II: Weighted

Fitting Linear Statistical Models to Data by Least Squares II: Weighted Fitting Linear Statistical Models to Data by Least Squares II: Weighted Brian R. Hunt and C. David Levermore University of Maryland, College Park Math 420: Mathematical Modeling April 21, 2014 version

More information

Iterative Matching Pursuit and its Applications in Adaptive Time-Frequency Analysis

Iterative Matching Pursuit and its Applications in Adaptive Time-Frequency Analysis Iterative Matching Pursuit and its Applications in Adaptive Time-Frequency Analysis Zuoqiang Shi Mathematical Sciences Center, Tsinghua University Joint wor with Prof. Thomas Y. Hou and Sparsity, Jan 9,

More information

Gradient Descent. Ryan Tibshirani Convex Optimization /36-725

Gradient Descent. Ryan Tibshirani Convex Optimization /36-725 Gradient Descent Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: canonical convex programs Linear program (LP): takes the form min x subject to c T x Gx h Ax = b Quadratic program (QP): like

More information

Lecture 2: Linear regression

Lecture 2: Linear regression Lecture 2: Linear regression Roger Grosse 1 Introduction Let s ump right in and look at our first machine learning algorithm, linear regression. In regression, we are interested in predicting a scalar-valued

More information

Rater agreement - ordinal ratings. Karl Bang Christensen Dept. of Biostatistics, Univ. of Copenhagen NORDSTAT,

Rater agreement - ordinal ratings. Karl Bang Christensen Dept. of Biostatistics, Univ. of Copenhagen NORDSTAT, Rater agreement - ordinal ratings Karl Bang Christensen Dept. of Biostatistics, Univ. of Copenhagen NORDSTAT, 2012 http://biostat.ku.dk/~kach/ 1 Rater agreement - ordinal ratings Methods for analyzing

More information

Chapter 2 Examples of Optimization of Discrete Parameter Systems

Chapter 2 Examples of Optimization of Discrete Parameter Systems Chapter Examples of Optimization of Discrete Parameter Systems The following chapter gives some examples of the general optimization problem (SO) introduced in the previous chapter. They all concern the

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

Sparse inverse covariance estimation with the lasso

Sparse inverse covariance estimation with the lasso Sparse inverse covariance estimation with the lasso Jerome Friedman Trevor Hastie and Robert Tibshirani November 8, 2007 Abstract We consider the problem of estimating sparse graphs by a lasso penalty

More information

Optimization. Charles J. Geyer School of Statistics University of Minnesota. Stat 8054 Lecture Notes

Optimization. Charles J. Geyer School of Statistics University of Minnesota. Stat 8054 Lecture Notes Optimization Charles J. Geyer School of Statistics University of Minnesota Stat 8054 Lecture Notes 1 One-Dimensional Optimization Look at a graph. Grid search. 2 One-Dimensional Zero Finding Zero finding

More information

Estimation for nonparametric mixture models

Estimation for nonparametric mixture models Estimation for nonparametric mixture models David Hunter Penn State University Research supported by NSF Grant SES 0518772 Joint work with Didier Chauveau (University of Orléans, France), Tatiana Benaglia

More information

Copositive Programming and Combinatorial Optimization

Copositive Programming and Combinatorial Optimization Copositive Programming and Combinatorial Optimization Franz Rendl http://www.math.uni-klu.ac.at Alpen-Adria-Universität Klagenfurt Austria joint work with I.M. Bomze (Wien) and F. Jarre (Düsseldorf) IMA

More information