Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis

1 Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 5

2 Topic Overview
1) Introduction/Univariate Statistics
2) Bootstrapping/Monte Carlo Simulation/Kernel Estimation
3) Distance Matrices/Point Pattern Analysis
4) Bivariate/Multivariate/Spatial Regression
5) Spatial Covariance and Covariance Models
   1) Spatial Stochastic Processes
6) Kriging and Spatial Estimation I
7) Kriging and Spatial Estimation II
8) Spatial Sampling Strategies
9) Principal Components and Ordination
10) Combining 1st- and 2nd-order effects

3 Spatial Point Patterns
Definition: A set of point locations with recorded "events" within a study region, e.g., locations of trees, disease, or crime incidents.
Point locations could correspond to all possible events or to subsets of them (mapped versus sampled point pattern).
Attribute values could also have been measured at event locations, e.g., tree diameter (marked point pattern).
Objective: Introduce statistical tools for quantifying spatial interaction of events, e.g., clustering versus randomness or regularity.

4 1D Kernel Density Estimation Flowchart
1. Choose a kernel function $k(x - x_i)$, i.e., a PDF, and a bandwidth parameter $b$ controlling the kernel extent and consequently the "smoothness" of the final estimated density profile $f(x)$; this amounts to choosing a scaled kernel function $k(x - x_i; b)$.
2. Discretize the 1D segment, i.e., choose a set of $P$ x-coordinates $\{x_p;\ p = 1, \ldots, P\}$ at which the density function $f(x)$ will be estimated.
3. For each datum coordinate $x_i$, evaluate the scaled kernel function $k(x_p - x_i; b)$ for all $P$ x-values; this yields $N$ scaled kernel profiles $\{k_i(b);\ i = 1, \ldots, N\}$, each one stemming from a particular event coordinate $x_i$.
4. For each discretization coordinate $x_p$, compute the estimated density $f(x_p)$ as the sum of the $N$ scaled kernel values $k(x_p - x_i; b)$, after weighting each such value by $1/N$: $f(x_p) = \frac{1}{N} \sum_{i=1}^{N} k(x_p - x_i; b)$.
Output: A $(P \times 1)$ vector of estimated density values $f(x_p)$ at the specified x-coordinates; the $N$ scaled and weighted kernels $\{(1/N)\,k_i(b);\ i = 1, \ldots, N\}$ can be regarded as $N$ elementary profiles whose superposition builds up the final estimated density profile.
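The four flowchart steps can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the Gaussian kernel, the event coordinates, the grid, and the function names `gaussian_kernel` and `kde_1d` are all assumptions chosen for the example.

```python
import math

def gaussian_kernel(t, b):
    """Scaled Gaussian kernel k(t; b): a PDF with standard deviation b (assumed kernel choice)."""
    return math.exp(-0.5 * (t / b) ** 2) / (b * math.sqrt(2.0 * math.pi))

def kde_1d(events, grid, b):
    """Step 4: f(x_p) is the sum of the N scaled kernel values, each weighted by 1/N."""
    n = len(events)
    return [sum(gaussian_kernel(xp - xi, b) for xi in events) / n for xp in grid]

events = [1.0, 1.5, 4.0]                       # illustrative event coordinates x_i
step = 0.01
grid = [i * step for i in range(-500, 1001)]   # step 2: P discretization coordinates x_p
density = kde_1d(events, grid, b=0.5)          # steps 3 and 4 combined

# Riemann-sum check: a density profile should integrate to roughly 1.
area = sum(density) * step
```

Because each kernel is a PDF and the weights sum to 1, `area` should come out very close to 1 as long as the grid extends well beyond the events.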

5 Constructing a Separable 2D Kernel
[Figure panels: two 1D Gaussian kernels for the x- and y-dimensions; replicated 1D Gaussian kernels and the 2D separable composite]
Anisotropic kernel = multidimensional kernel with different bandwidths along different directions.

6 2D Kernel Intensity Estimation
1. Center a circle $C(u; b)$ of radius $b$ at any arbitrary location $u$ in $D$.
2. Estimate the local intensity at $u$ as $\hat{\lambda}(u) = N(u; b) / |C(u; b)|$, where $N(u; b)$ = number of events within $C(u; b)$ and $|C(u; b)|$ = kernel measure, $\pi b^2$ in 2D. Note: steps 1 and 2 amount to choosing a 2D kernel function that plots like a cylinder with base radius $b$ and height $1/(\pi b^2)$.
3. Repeat the estimation for a set of points (typically arranged at the grid nodes of a regular raster) in the study region to create an intensity map.
Note: Looping over the grid nodes (P) instead of over the events (N) yields the same result as in the kernel density estimation case.
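A short sketch of the circle-counting estimator, looping over grid nodes. The event coordinates, the raster, and the names `local_intensity` and `intensity_map` are hypothetical; edge effects near the study-region boundary are ignored here.

```python
import math

def local_intensity(u, events, b):
    """Count events within distance b of u and divide by the circle area pi * b**2."""
    count = sum(1 for e in events if math.dist(u, e) <= b)
    return count / (math.pi * b ** 2)

def intensity_map(events, xs, ys, b):
    """Evaluate the local intensity at every node of a regular raster (rows follow ys)."""
    return [[local_intensity((x, y), events, b) for x in xs] for y in ys]

events = [(0.2, 0.2), (0.25, 0.3), (0.8, 0.7)]   # illustrative event locations
xs = [0.0, 0.25, 0.5, 0.75, 1.0]                 # raster node x-coordinates
ys = [0.0, 0.25, 0.5, 0.75, 1.0]                 # raster node y-coordinates
surface = intensity_map(events, xs, ys, b=0.2)
```

Each entry of `surface` is in events per unit area, so multiplying by the cell area gives an expected count for that cell.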

7 Recap (Lecture 3)
Event intensity of spatial point patterns: $\lambda(u)$ is the mean number of events over a unit area centered at $u$; the estimated overall intensity is $\hat{\lambda} = N / |D|$; local intensity is estimated via quadrat counts or kernel density estimation.
Kernel intensity estimation: conversion of point data (events) to raster format (intensity surface).
Statistical multidimensional (multivariate) density estimation methods are used to estimate the local density $f(u)$. Note: the density surface integrates to 1, so multiply every such estimate $f(u)$ by $N$ to convert it to an intensity value $\lambda(u)$.
The resulting intensity surface depends on (i) kernel type and (ii) bandwidth; the latter is more influential.
Alternative approaches for non-parametric multivariate density estimation include k-nearest neighbor and mixture-of-Gaussian-densities methods.
The intensity surface can be linked (via regression models) to explanatory variables, e.g., disease occurrence intensity as a function of air quality variables.

8 Outline: Lecture 5
Concepts & Notation
Distance & Distance Matrices
Distances Involved in Spatial Point Patterns
Statistical Tests: sampling distributions; bootstrap example; quadrat count example
Quantifying Spatial Interaction: Nearest Neighbor Distance Metrics
   G Function: proportion of minimum event-to-nearest-event distances no greater than a given distance cutoff d
   F Function: proportion of minimum point-to-nearest-event distances no greater than a given distance cutoff d
Measures based on the distribution of event intensities: the K function; quadrat counts and KS-test?
Quantifying Spatial Interaction: K Function
Points To Remember

9 Some Notation
Point events: Set of $N$ locations of events occurring in a study area $D$: $\{u_i;\ i = 1, \ldots, N\}$.
Variable of interest: $y(s)$ = number of events (a count) within an arbitrary domain or support $s$ with measure (length, area, volume) $|s|$; support $s$ is centered at an arbitrary location $u$ and can also be denoted as $s(u)$; in statistics, $y(s)$ is treated as a realization of a random variable (RV) $Y(s)$.
Objective: Quantify interaction, e.g., covariation, between outcomes of any two RVs $Y(s)$ and $Y(s')$. To do so, all RVs must lie in the same "environment"; in other words, the long-term average (expectation) of RV $Y(s)$ should be similar to that of $Y(s')$.

10 Intensity of Events
Local intensity $\lambda(u)$: Mean number of events per unit area at an arbitrary location or point $u$, formally defined as: $\lambda(u) = \lim_{|s| \to 0} E\{Y(s(u))\} / |s|$.
Overall intensity $\lambda$: estimated as $\hat{\lambda} = N / |D|$.
First-order stationarity: Any RV should have the same long-term average, for a fixed areal unit $s$. This implies a constant intensity: $\lambda(u) = \lambda$ for all $u$ in $D$.
The expected number of events within a region $s$ is then just a function of $|s|$: $E\{Y(s)\} = \lambda |s|$.

11 Interaction Between Count RVs
Second-order intensity: Long-term average (expectation) of products of counts per unit area at any two arbitrary points $u$ and $u'$, formally defined as: $\gamma(u, u') = \lim_{|s(u)|, |s(u')| \to 0} E\{Y(s(u))\, Y(s(u'))\} / (|s(u)|\,|s(u')|)$.
Some terminology:
second-order stationarity: the expectation of all RVs is constant (first-order stationarity), and the second-order intensity is a function of the separation vector between any two locations $u$ and $u'$;
isotropy: only the distance (not the orientation) of the separation vector matters.
Outlook: Quantifying interaction in spatial point patterns under the above assumptions or working hypotheses amounts to studying distances between events.

12 Distance
A measure of proximity (typically along a crow's flight path) between any two locations or spatial entities.
Euclidean distance: Consider two points in a 2D (geographical or other) space with coordinates $u_i = (x_i, y_i)$ and $u_j = (x_j, y_j)$. The Euclidean distance $d_{ij}$ between points $u_i$ and $u_j$ is computed via Pythagoras's theorem as: $d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$.

13 Distance Metric
Formal characteristics of a distance metric: A measure $d_{ij}$ of proximity between locations $u_i$ and $u_j$ is a valid distance metric if it satisfies the following requirements:
the distance between a point and itself is always zero: $d_{ii} = 0$;
the distance between a point and another one is always positive: $d_{ij} > 0$;
the distance between two points is the same no matter which point you consider first: $d_{ij} = d_{ji}$;
the triangle inequality holds: the sum of the lengths of two sides of a triangle cannot be smaller than the length of the third side: $d_{ij} \le d_{il} + d_{lj}$.
A measure $d_{ij}$ need not always be Euclidean, and hence should be checked to ensure that it is a valid distance metric.
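The four requirements can be checked mechanically on any candidate distance matrix. The sketch below is illustrative only: the function name `is_valid_metric`, the tolerance, and both example matrices (a 3-4-5 triangle and a made-up asymmetric travel-time table) are assumptions.

```python
def is_valid_metric(d, tol=1e-12):
    """Check zero diagonal, positivity, symmetry, and the triangle inequality on matrix d."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:                      # d_ii = 0
            return False
        for j in range(n):
            if i != j and d[i][j] <= 0:             # d_ij > 0 for distinct points
                return False
            if abs(d[i][j] - d[j][i]) > tol:        # d_ij = d_ji
                return False
            for l in range(n):
                if d[i][j] > d[i][l] + d[l][j] + tol:   # d_ij <= d_il + d_lj
                    return False
    return True

euclid = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]     # side lengths of a 3-4-5 triangle
travel = [[0, 10, 4], [12, 0, 5], [4, 5, 0]]   # hypothetical one-way travel times

euclid_ok = is_valid_metric(euclid)
travel_ok = is_valid_metric(travel)
```

The travel-time table fails because the trip from the first location to the second differs from the return trip, which is the kind of asymmetry that network or perceived distances can show.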

14 Non-Euclidean Distances
Alternative "distance" measures: (i) over a road or railway, (ii) along a river, (iii) over a network.
Even more exotic "distance" measures: (i) travel time over a network, (ii) perceived travel time between urban landmarks, (iii) volume of exports/imports.
Euclidean distances between network nodes $\ne$ actual or perceived distances on the network; the latter might not even be formal distance metrics, i.e., $d_{ij} \ne d_{ji}$.

15 Minkowski's Generalized Distance
Definition: Consider two points in a K-dimensional (geographical or other) space $R^K$ with coordinate vectors $u_i = [u_{i1}, \ldots, u_{ik}, \ldots, u_{iK}]$ and $u_j = [u_{j1}, \ldots, u_{jk}, \ldots, u_{jK}]$. The Minkowski distance of order $p$ (with $p \ge 1$), denoted as $d_{ij}(p)$, between points $u_i$ and $u_j$ is computed as: $d_{ij}(p) = \left( \sum_{k=1}^{K} |u_{ik} - u_{jk}|^p \right)^{1/p}$.
Particular cases:
Manhattan or city-block distance ($p = 1$): $d_{ij}(1) = \sum_{k=1}^{K} |u_{ik} - u_{jk}|$;
Euclidean distance ($p = 2$): $d_{ij}(2) = \sqrt{\sum_{k=1}^{K} (u_{ik} - u_{jk})^2}$.
Distances computed from points in multidimensional spaces are routinely used in statistical pattern recognition; points represent objects or cases, each described by K attribute values.
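As a small worked example, the Minkowski distance of order p can be written as a one-line function; the function name `minkowski` and the two test points are illustrative assumptions.

```python
def minkowski(u, v, p):
    """Minkowski distance of order p between two K-dimensional coordinate vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

u_i = (0.0, 0.0)
u_j = (3.0, 4.0)

manhattan = minkowski(u_i, u_j, p=1)   # city-block distance: |3| + |4|
euclidean = minkowski(u_i, u_j, p=2)   # straight-line distance: sqrt(9 + 16)
```

For these two points the city-block distance is 7 and the Euclidean distance is 5; larger orders p give progressively more weight to the largest coordinate difference.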

16 Euclidean Distance Matrix: Single Set of Points
Definition: Consider a set of $N$ points $\{u_1, \ldots, u_i, \ldots, u_N\}$ in a K-dimensional (geographical or other) space. The distance matrix $D$ is a square $(N \times N)$ matrix containing the distances $\{d(u_i, u_j);\ i = 1, \ldots, N;\ j = 1, \ldots, N\}$ between all $N \times N$ possible pairs of points in the set.
By convention, $u_1$ is the coordinate vector of the 1st point in the set (1st entry in the data file).
The i-th row (or column) contains the distances between the i-th point $u_i$ and all others (including itself).
$D$ is symmetric with zeros along its diagonal.

17 Euclidean Distance Matrix: Two Sets of Points
Consider 2 sets of points $\{u_1, \ldots, u_i, \ldots, u_N\}$ and $\{t_1, \ldots, t_j, \ldots, t_M\}$ in a K-dimensional (geographical or other) space. The distance matrix $D$ is an $(N \times M)$ matrix containing the Euclidean distances $\{d(u_i, t_j);\ i = 1, \ldots, N;\ j = 1, \ldots, M\}$ between all $N \times M$ possible pairs formed by these two sets of points.
By convention, $u_1$ is the coordinate vector of the 1st datum in data set #1, and similarly for $t_1$.
The i-th row contains the distances between the i-th point $u_i$ in set #1 and all points in set #2.
The j-th column contains the distances between the j-th point $t_j$ in set #2 and all points in set #1.
$D$ is not symmetric, i.e., $d_{12} \ne d_{21}$: pair $\{u_1, t_2\}$ is not the same as pair $\{u_2, t_1\}$.
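Both kinds of distance matrix come from the same nested loop over pairs. The following sketch uses nested lists rather than an array library; the function name `distance_matrix` and the coordinates are assumptions made for illustration.

```python
import math

def distance_matrix(set_a, set_b):
    """Return the (N x M) matrix of Euclidean distances between points of set_a and set_b."""
    return [[math.dist(u, t) for t in set_b] for u in set_a]

u = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]   # illustrative set #1 (N = 3)
t = [(0.0, 0.0), (3.0, 0.0)]               # illustrative set #2 (M = 2)

d_single = distance_matrix(u, u)   # square, symmetric, zeros on the diagonal
d_two = distance_matrix(u, t)      # rectangular: rows index set #1, columns index set #2
```

Passing the same set twice gives the single-set case, while two different sets give a rectangular matrix in which entry (1, 2) and entry (2, 1) refer to different pairs.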

18 Distances Between Events in a Point Pattern
Event-to-event distance: Distance $d_{ij}$ between an event at location $u_i$ and another event at location $u_j$: $d_{ij} = \|u_i - u_j\|$.
Point-to-event distance: Distance $d_{pj}$ between a randomly chosen point at location $t_p$ and an event at location $u_j$: $d_{pj} = \|t_p - u_j\|$.
Event-to-nearest-event distance: Distance $d_{\min}(u_i)$ between an event at location $u_i$ and its nearest neighbor event: $d_{\min}(u_i) = \min_{j \ne i} d_{ij}$.
Point-to-nearest-event distance: Distance $d_{\min}(t_p)$ between a randomly chosen point at location $t_p$ and its nearest neighbor event: $d_{\min}(t_p) = \min_{j} d_{pj}$.

19 Event-to-Nearest-Event Distances
Some events might be nearest neighbors of each other, e.g., $u_4$ and $u_5$, or have the same nearest neighbor, e.g., $u_2$, $u_3$, and $u_4$ all have $u_5$ as their nearest neighbor.
Mean nearest neighbor distance: Average of all $d_{\min}(u_i)$ values: $\bar{d}_{\min} = \frac{1}{N} \sum_{i=1}^{N} d_{\min}(u_i)$.
Drawback: a single number does not suffice to describe a point pattern.
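A brief sketch of the event-to-nearest-event distances and their mean, using three made-up collinear events; the function name `nearest_event_distances` is an assumption.

```python
import math

def nearest_event_distances(events):
    """For each event, the distance to its nearest neighbor among the other events."""
    return [
        min(math.dist(e, other) for j, other in enumerate(events) if j != i)
        for i, e in enumerate(events)
    ]

events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]   # illustrative event locations
d_min = nearest_event_distances(events)
mean_nn = sum(d_min) / len(d_min)               # mean nearest neighbor distance
```

Here the first two events are nearest neighbors of each other (distance 1), the third is 4 away from its nearest neighbor, and the mean of 2 hides that contrast, which illustrates the drawback of a single summary number.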

20 The G Function
Definition: Proportion of event-to-nearest-event distances $d_{\min}(u_i)$ no greater than a given distance cutoff $d$, estimated as: $\hat{G}(d) = \#\{d_{\min}(u_i) \le d;\ i = 1, \ldots, N\} / N$.
This is the cumulative distribution function (CDF) of all $N$ event-to-nearest-event distances.
[Example figure]
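The sample G function is just the empirical CDF of the nearest-neighbor distances. The sketch below assumes no edge correction and uses hypothetical names (`g_hat`, `nearest_event_distances`) and made-up events.

```python
import math

def nearest_event_distances(events):
    """Distance from each event to its nearest neighboring event."""
    return [
        min(math.dist(e, other) for j, other in enumerate(events) if j != i)
        for i, e in enumerate(events)
    ]

def g_hat(events, d):
    """Proportion of event-to-nearest-event distances no greater than cutoff d."""
    d_min = nearest_event_distances(events)
    return sum(1 for x in d_min if x <= d) / len(d_min)

events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]    # illustrative event locations
cutoffs = [0.5, 1.0, 4.0]
g_values = [g_hat(events, d) for d in cutoffs]   # one G value per cutoff
```

With these events the nearest-neighbor distances are 1, 1, and 4, so G is 0 below a cutoff of 1, jumps to 2/3 at 1, and reaches 1 at 4.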

21 Event-to-Nearest-Event (E2NE) Distance Histograms
For evenly-spaced events, E2NE distances are similar to the spacing of events.
For clustered events, there are more small E2NE distances and fewer large distances.

22 Point Pattern Analysis Metrics
G hat function: cumulative distribution function (CDF) of all N event-to-nearest-event distances.
F hat function: cumulative distribution function (CDF) of all M point-to-nearest-event distances.
K hat function: relative number of events within distance d, calculated around all events.
For event-to-nearest-event distances, use the sample G function G(d); for point-to-nearest-event distances, use the sample F function F(d); for event-to-event distances, use the sample K function K(d).

23 Sample G Function Examples
G hat = cumulative distribution function (CDF) of all N event-to-nearest-event distances.
For evenly-spaced events, G(d) rises gradually up to the distance at which most events are spaced, and then increases rapidly.
For clustered events, G(d) rises rapidly at short distances, and then levels off at larger d-values.

24 The F Function
Definition: Proportion of point-to-nearest-event distances $d_{\min}(t_j)$ no greater than a given distance cutoff $d$, estimated as: $\hat{F}(d) = \#\{d_{\min}(t_j) \le d;\ j = 1, \ldots, M\} / M$.
This is the cumulative distribution function (CDF) of all $M$ point-to-nearest-event distances.
For a larger number $M$ of random points, F(d) becomes even smoother.
Note: The F function provides information on event proximity to voids.
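The sample F function follows the same pattern as G, except that distances are measured from M randomly placed points rather than from the events. In this sketch the rectangle used for sampling, the seed, the events, and the name `f_hat` are all assumptions, and no edge correction is applied.

```python
import math
import random

def f_hat(points, events, d):
    """Proportion of point-to-nearest-event distances no greater than cutoff d."""
    d_min = [min(math.dist(p, e) for e in events) for p in points]
    return sum(1 for x in d_min if x <= d) / len(d_min)

events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]   # illustrative event locations

# M = 200 random points in an assumed study rectangle [0, 5] x [-1, 1].
rng = random.Random(0)
points = [(rng.uniform(0.0, 5.0), rng.uniform(-1.0, 1.0)) for _ in range(200)]

cutoffs = [0.5, 1.0, 2.0, 3.0]
f_values = [f_hat(points, events, d) for d in cutoffs]
```

Raising M makes the step curve `f_values` smoother, and the large gap between the second and third events (a void) delays the rise of F toward 1.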

25 Point-to-Nearest-Event (P2NE) Distance Histograms
For evenly-spaced events, there are more nearest events at small distances from randomly placed points.
For clustered events, P2NE distances are generally larger than in the previous case, and there are a few large such distances.

26 Sample F Function Examples
For evenly-spaced events, F(d) rises rapidly up to the distance at which most events are spaced, and then levels off (more nearest neighbors at small distances from randomly placed points).
For clustered events, F(d) rises rapidly at short distances, and then levels off at larger d-values.

27 The Sample K Function
Concept building:
1. Construct a set of concentric circles (of increasing radius d) around each event.
2. Count the number of events in each distance "band".
3. The cumulative number of events up to radius d around all events = sample K function, K(d).
Formal definition: [equation not recovered]

28 Interpreting the Sample K Function
In other words: function K(d) is the sample cumulative distribution function (CDF) of all $N^2 - N$ event-to-event distances, scaled by $|D|$.
Note: Ignore the bin at d = 0 (center plot) and the point at d = 0 (right plot).
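One way to make this interpretation concrete is the sketch below. It assumes the estimator is the proportion of the N(N-1) ordered event pairs with separation no greater than d, multiplied by the study-area measure |D|; this matches the verbal description here but is not necessarily the exact formula from the lecture, and it applies no edge correction. The name `k_hat`, the events, and the area value are hypothetical.

```python
import math

def k_hat(events, d, area):
    """Scaled CDF of event-to-event distances: area * (share of ordered pairs with d_ij <= d)."""
    n = len(events)
    close_pairs = sum(
        1
        for i, a in enumerate(events)
        for j, b in enumerate(events)
        if i != j and math.dist(a, b) <= d      # ordered pairs, self-pairs excluded
    )
    return area * close_pairs / (n * (n - 1))

events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]   # illustrative event locations
area = 10.0                                     # assumed measure |D| of the study region
k_values = [k_hat(events, d, area) for d in (1.0, 4.0, 5.0)]
```

Because the pair count can only grow with d, `k_values` is non-decreasing, and it reaches the full area once d exceeds the largest event-to-event distance.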

29 Event-to-Event Distance Histograms
For evenly-spaced events, there are more medium-sized E2E distances than small or large such distances.
For clustered events, the distribution of E2E distances is multi-modal.

30 Event-to-Event Distance CDFs
For clustered events, there are multiple bumps in the CDF of E2E distances due to the grouping of events in space.

31 Sample K Function Examples
The sample K function K(d) is monotonically increasing and is a scaled (by the domain measure $|D|$) version of the CDF of E2E distances.

32 Recap
Quantifying interaction in spatial point patterns:
for event-to-nearest-event distances, use the sample G function G(d);
for point-to-nearest-event distances, use the sample F function F(d);
for event-to-event distances, use the sample K function K(d);
the K function looks at information beyond nearest neighbors.
Caveats:
clustering is always a function of the overall intensity of a point pattern;
clustering might occur due to local intensity variations or due to interaction; it is very difficult to disentangle each contribution.
Watch out for:
boundaries and edge effects;
distance distortions due to map projections;
sampled versus mapped point patterns;
interactions of 1st- versus 2nd-order stationarity.

Interaction Analysis of Spatial Point Patterns

Interaction Analysis of Spatial Point Patterns Interaction Analysis of Spatial Point Patterns Geog 2C Introduction to Spatial Data Analysis Phaedon C Kyriakidis wwwgeogucsbedu/ phaedon Department of Geography University of California Santa Barbara

More information

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 4 Spatial Point Patterns Definition Set of point locations with recorded events" within study

More information

Overview of Spatial analysis in ecology

Overview of Spatial analysis in ecology Spatial Point Patterns & Complete Spatial Randomness - II Geog 0C Introduction to Spatial Data Analysis Chris Funk Lecture 8 Overview of Spatial analysis in ecology st step in understanding ecological

More information

GIST 4302/5302: Spatial Analysis and Modeling Point Pattern Analysis

GIST 4302/5302: Spatial Analysis and Modeling Point Pattern Analysis GIST 4302/5302: Spatial Analysis and Modeling Point Pattern Analysis Guofeng Cao www.spatial.ttu.edu Department of Geosciences Texas Tech University guofeng.cao@ttu.edu Fall 2018 Spatial Point Patterns

More information

Introduction. Spatial Processes & Spatial Patterns

Introduction. Spatial Processes & Spatial Patterns Introduction Spatial data: set of geo-referenced attribute measurements: each measurement is associated with a location (point) or an entity (area/region/object) in geographical (or other) space; the domain

More information

Points. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Points. Luc Anselin.   Copyright 2017 by Luc Anselin, All Rights Reserved Points Luc Anselin http://spatial.uchicago.edu 1 classic point pattern analysis spatial randomness intensity distance-based statistics points on networks 2 Classic Point Pattern Analysis 3 Classic Examples

More information

Non-parametric Methods

Non-parametric Methods Non-parametric Methods Machine Learning Torsten Möller Möller/Mori 1 Reading Chapter 2 of Pattern Recognition and Machine Learning by Bishop (with an emphasis on section 2.5) Möller/Mori 2 Outline Last

More information

Overview of Statistical Analysis of Spatial Data

Overview of Statistical Analysis of Spatial Data Overview of Statistical Analysis of Spatial Data Geog 2C Introduction to Spatial Data Analysis Phaedon C. Kyriakidis www.geog.ucsb.edu/ phaedon Department of Geography University of California Santa Barbara

More information

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining Data Mining: Data Lecture Notes for Chapter 2 Introduction to Data Mining by Tan, Steinbach, Kumar Similarity and Dissimilarity Similarity Numerical measure of how alike two data objects are. Is higher

More information

Uncertainty Quantification and Validation Using RAVEN. A. Alfonsi, C. Rabiti. Risk-Informed Safety Margin Characterization. https://lwrs.inl.

Uncertainty Quantification and Validation Using RAVEN. A. Alfonsi, C. Rabiti. Risk-Informed Safety Margin Characterization. https://lwrs.inl. Risk-Informed Safety Margin Characterization Uncertainty Quantification and Validation Using RAVEN https://lwrs.inl.gov A. Alfonsi, C. Rabiti North Carolina State University, Raleigh 06/28/2017 Assumptions

More information

Multivariate statistical methods and data mining in particle physics

Multivariate statistical methods and data mining in particle physics Multivariate statistical methods and data mining in particle physics RHUL Physics www.pp.rhul.ac.uk/~cowan Academic Training Lectures CERN 16 19 June, 2008 1 Outline Statement of the problem Some general

More information

Notation. Pattern Recognition II. Michal Haindl. Outline - PR Basic Concepts. Pattern Recognition Notions

Notation. Pattern Recognition II. Michal Haindl. Outline - PR Basic Concepts. Pattern Recognition Notions Notation S pattern space X feature vector X = [x 1,...,x l ] l = dim{x} number of features X feature space K number of classes ω i class indicator Ω = {ω 1,...,ω K } g(x) discriminant function H decision

More information

Chapter 1: Systems of Linear Equations

Chapter 1: Systems of Linear Equations Chapter : Systems of Linear Equations February, 9 Systems of linear equations Linear systems Lecture A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where

More information

An Algorithmist s Toolkit Nov. 10, Lecture 17

An Algorithmist s Toolkit Nov. 10, Lecture 17 8.409 An Algorithmist s Toolkit Nov. 0, 009 Lecturer: Jonathan Kelner Lecture 7 Johnson-Lindenstrauss Theorem. Recap We first recap a theorem (isoperimetric inequality) and a lemma (concentration) from

More information

Michael Harrigan Office hours: Fridays 2:00-4:00pm Holden Hall

Michael Harrigan Office hours: Fridays 2:00-4:00pm Holden Hall Announcement New Teaching Assistant Michael Harrigan Office hours: Fridays 2:00-4:00pm Holden Hall 209 Email: michael.harrigan@ttu.edu Guofeng Cao, Texas Tech GIST4302/5302, Lecture 2: Review of Map Projection

More information

5. Discriminant analysis

5. Discriminant analysis 5. Discriminant analysis We continue from Bayes s rule presented in Section 3 on p. 85 (5.1) where c i is a class, x isap-dimensional vector (data case) and we use class conditional probability (density

More information

BAYESIAN DECISION THEORY

BAYESIAN DECISION THEORY Last updated: September 17, 2012 BAYESIAN DECISION THEORY Problems 2 The following problems from the textbook are relevant: 2.1 2.9, 2.11, 2.17 For this week, please at least solve Problem 2.3. We will

More information

Spatial Analysis I. Spatial data analysis Spatial analysis and inference

Spatial Analysis I. Spatial data analysis Spatial analysis and inference Spatial Analysis I Spatial data analysis Spatial analysis and inference Roadmap Outline: What is spatial analysis? Spatial Joins Step 1: Analysis of attributes Step 2: Preparing for analyses: working with

More information

Basic Properties of Metric and Normed Spaces

Basic Properties of Metric and Normed Spaces Basic Properties of Metric and Normed Spaces Computational and Metric Geometry Instructor: Yury Makarychev The second part of this course is about metric geometry. We will study metric spaces, low distortion

More information

GIST 4302/5302: Spatial Analysis and Modeling

GIST 4302/5302: Spatial Analysis and Modeling GIST 4302/5302: Spatial Analysis and Modeling Lecture 2: Review of Map Projections and Intro to Spatial Analysis Guofeng Cao http://thestarlab.github.io Department of Geosciences Texas Tech University

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board of Trustees,

More information

Statistícal Methods for Spatial Data Analysis

Statistícal Methods for Spatial Data Analysis Texts in Statistícal Science Statistícal Methods for Spatial Data Analysis V- Oliver Schabenberger Carol A. Gotway PCT CHAPMAN & K Contents Preface xv 1 Introduction 1 1.1 The Need for Spatial Analysis

More information

Motivating the Covariance Matrix

Motivating the Covariance Matrix Motivating the Covariance Matrix Raúl Rojas Computer Science Department Freie Universität Berlin January 2009 Abstract This note reviews some interesting properties of the covariance matrix and its role

More information

Lecture 7. Econ August 18

Lecture 7. Econ August 18 Lecture 7 Econ 2001 2015 August 18 Lecture 7 Outline First, the theorem of the maximum, an amazing result about continuity in optimization problems. Then, we start linear algebra, mostly looking at familiar

More information

Point Pattern Analysis

Point Pattern Analysis Point Pattern Analysis Nearest Neighbor Statistics Luc Anselin http://spatial.uchicago.edu principle G function F function J function Principle Terminology events and points event: observed location of

More information

GIST 4302/5302: Spatial Analysis and Modeling Lecture 2: Review of Map Projections and Intro to Spatial Analysis

GIST 4302/5302: Spatial Analysis and Modeling Lecture 2: Review of Map Projections and Intro to Spatial Analysis GIST 4302/5302: Spatial Analysis and Modeling Lecture 2: Review of Map Projections and Intro to Spatial Analysis Guofeng Cao http://www.spatial.ttu.edu Department of Geosciences Texas Tech University guofeng.cao@ttu.edu

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods Prof. Daniel Cremers 11. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

More information

Lecture 10: Dimension Reduction Techniques

Lecture 10: Dimension Reduction Techniques Lecture 10: Dimension Reduction Techniques Radu Balan Department of Mathematics, AMSC, CSCAMM and NWC University of Maryland, College Park, MD April 17, 2018 Input Data It is assumed that there is a set

More information

Machine learning for pervasive systems Classification in high-dimensional spaces

Machine learning for pervasive systems Classification in high-dimensional spaces Machine learning for pervasive systems Classification in high-dimensional spaces Department of Communications and Networking Aalto University, School of Electrical Engineering stephan.sigg@aalto.fi Version

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Lecture 2: Linear Algebra Review

Lecture 2: Linear Algebra Review CS 4980/6980: Introduction to Data Science c Spring 2018 Lecture 2: Linear Algebra Review Instructor: Daniel L. Pimentel-Alarcón Scribed by: Anh Nguyen and Kira Jordan This is preliminary work and has

More information

Spatial Point Pattern Analysis

Spatial Point Pattern Analysis Spatial Point Pattern Analysis Jiquan Chen Prof of Ecology, University of Toledo EEES698/MATH5798, UT Point variables in nature A point process is a discrete stochastic process of which the underlying

More information

Clustering Lecture 1: Basics. Jing Gao SUNY Buffalo

Clustering Lecture 1: Basics. Jing Gao SUNY Buffalo Clustering Lecture 1: Basics Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced topics Clustering

More information

Algorithms for Picture Analysis. Lecture 07: Metrics. Axioms of a Metric

Algorithms for Picture Analysis. Lecture 07: Metrics. Axioms of a Metric Axioms of a Metric Picture analysis always assumes that pictures are defined in coordinates, and we apply the Euclidean metric as the golden standard for distance (or derived, such as area) measurements.

More information

Learning from Examples

Learning from Examples Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble

More information

Permutations and Combinations

Permutations and Combinations Permutations and Combinations Permutations Definition: Let S be a set with n elements A permutation of S is an ordered list (arrangement) of its elements For r = 1,..., n an r-permutation of S is an ordered

More information

Clustering compiled by Alvin Wan from Professor Benjamin Recht s lecture, Samaneh s discussion

Clustering compiled by Alvin Wan from Professor Benjamin Recht s lecture, Samaneh s discussion Clustering compiled by Alvin Wan from Professor Benjamin Recht s lecture, Samaneh s discussion 1 Overview With clustering, we have several key motivations: archetypes (factor analysis) segmentation hierarchy

More information

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu Dimension Reduction Techniques Presented by Jie (Jerry) Yu Outline Problem Modeling Review of PCA and MDS Isomap Local Linear Embedding (LLE) Charting Background Advances in data collection and storage

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1

More information

ENGRG Introduction to GIS

ENGRG Introduction to GIS ENGRG 59910 Introduction to GIS Michael Piasecki October 13, 2017 Lecture 06: Spatial Analysis Outline Today Concepts What is spatial interpolation Why is necessary Sample of interpolation (size and pattern)

More information

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted

More information

CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION

CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION 59 CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION 4. INTRODUCTION Weighted average-based fusion algorithms are one of the widely used fusion methods for multi-sensor data integration. These methods

More information

A Program for Data Transformations and Kernel Density Estimation

A Program for Data Transformations and Kernel Density Estimation A Program for Data Transformations and Kernel Density Estimation John G. Manchuk and Clayton V. Deutsch Modeling applications in geostatistics often involve multiple variables that are not multivariate

More information

Lecture 20 : Markov Chains

Lecture 20 : Markov Chains CSCI 3560 Probability and Computing Instructor: Bogdan Chlebus Lecture 0 : Markov Chains We consider stochastic processes. A process represents a system that evolves through incremental changes called

More information

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric





