Estimating the Number of Tables via Sequential Importance Sampling

Size: px
Start display at page:

Download "Estimating the Number of Tables via Sequential Importance Sampling"

Transcription

1 Estimating the Number of Tables via Sequential Importance Sampling Jing Xi Department of Statistics University of Kentucky Jing Xi, Ruriko Yoshida, David Haws

2 Introduction combinatorics social networks regulatory networks We consider one of the most generally used model: no three way interaction model. Model

3 Conditional Poisson Distribution Let Z = (Z,..., Z l ) be independent Bernoulli trials with probability of successes p = (p,..., p l ). Define w k = p k /( p k ) for k. Then Z follows a Conditional Poisson Distribution means, P(Z = z,..., Z l = z l S Z = c) l k= w z k k. () Z Z2 Z3 Z4 Z5 Z c=3

4 Conditional Poisson Distribution Let Z = (Z,..., Z l ) be independent Bernoulli trials with probability of successes p = (p,..., p l ). Define w k = p k /( p k ) for k. Then Z follows a Conditional Poisson Distribution means, P(Z = z,..., Z l = z l S Z = c) l k= w z k k. () Z Z2 Z3 Z4 Z5 Z c=3

5 Conditional Poisson Distribution Let Z = (Z,..., Z l ) be independent Bernoulli trials with probability of successes p = (p,..., p l ). Define w k = p k /( p k ) for k. Then Z follows a Conditional Poisson Distribution means, P(Z = z,..., Z l = z l S Z = c) l k= w z k k. () Z Z2 Z3 Z4 Z5 Z c=3

6 Conditional Poisson Distribution Let Z = (Z,..., Z l ) be independent Bernoulli trials with probability of successes p = (p,..., p l ). Define w k = p k /( p k ) for k. Then Z follows a Conditional Poisson Distribution means, P(Z = z,..., Z l = z l S Z = c) l k= w z k k. () Z Z2 Z3 Z4 Z5 Z c=3

7 Conditional Poisson Distribution Let Z = (Z,..., Z l ) be independent Bernoulli trials with probability of successes p = (p,..., p l ). Define w k = p k /( p k ) for k. Then Z follows a Conditional Poisson Distribution means, P(Z = z,..., Z l = z l S Z = c) Z l k= Z2 Z3 Z4 Z5 Z w z k k. () c=3

8 Sequential Importance Sampling (SIS) Σ, the set of all tables satisfying marginal conditions. p(x): Σ [, ], target distribution, the uniform distribution over Σ, p(x) = / Σ. q(x) > for all X Σ, the proposal distribution for sampling. We have E [ ] q(x) = X Σ which can be estimated Σ by q(x) = Σ q(x) Σ = N N i= q(x i ), where X,..., X N are tables drawn iid from q(x).

9 Sequential Importance Sampling (SIS) How to get the probability of the whole table X? Denote the columns of the table X as x,, x t. By the multiplication rule we have q(x = (x,, x t )) = q(x )q(x 2 x ) q(x t x t,..., x ). We can easily compute q(x i x i,..., x ) for i = 2, 3,..., t using Conditional Poisson distribution. What if we have rejections? Having rejections means that q(x): Σ [, ] where Σ Σ. The SIS estimator is still unbiased and consistent: [ ] IX Σ E = q(x) = Σ, q(x) q(x) X Σ I X Σ where I X Σ is an indicator function for the set Σ.

10 SIS for 2-way Table Theorem [Chen et. al., 25] For the uniform distribution over all m n - tables with given row sums r,..., r m and first column sum c, the marginal distribution of the first column is the same as the conditional distribution of Z given S Z = c with p i = r i /n

11 SIS for 2-way Table Theorem [Chen et. al., 25] For the uniform distribution over all m n - tables with given row sums r,..., r m and first column sum c, the marginal distribution of the first column is the same as the conditional distribution of Z given S Z = c with p i = r i /n

12 SIS for 2-way Table Theorem [Chen et. al., 25] For the uniform distribution over all m n - tables with given row sums r,..., r m and first column sum c, the marginal distribution of the first column is the same as the conditional distribution of Z given S Z = c with p i = r i /n

13 SIS for 2-way Table Theorem [Chen et. al., 25] For the uniform distribution over all m n - tables with given row sums r,..., r m and first column sum c, the marginal distribution of the first column is the same as the conditional distribution of Z given S Z = c with p i = r i /n. 2 qc 2

14 SIS for 2-way Table Theorem [Chen et. al., 25] For the uniform distribution over all m n - tables with given row sums r,..., r m and first column sum c, the marginal distribution of the first column is the same as the conditional distribution of Z given S Z = c with p i = r i /n. 2 qc * qc2 2

15 SIS for 2-way Tables with Structural Zero s A structural zero means a cell in the table that is fixed to be. Example Theorem [Yuguo Chen, 27] ["Hand Waving" version] Key: If (i, ) is not a structural : change p i = r i /n to p i = r i /(n g i ) where g i is the number of structural zeros in the ith row; otherwise p i =. Theorem [] n=6 [] [] p2=2/(6-2) p3= r2=2

16 SIS for 3-way Tables Theorem ["Hand Waving" version] For a cell (i, j, k ), 3 columns will go through it: (i, j, ), (i,, k ), (, j, k ). Key: When generating (i, j, ), let r = (i,, k ), c = (, j, k ), define r k = X i +k and c k = X +j k, then set: p k = r k c k r k c k + (n r k g r k )(m c k g c k ), where g r k, and g c k are the numbers of structural zeros in r and c, respectively. Theorem A similar strategy can be used in multi-way tables. Theorem

17 Algorithm Example: Semimagic Cube

18 Algorithm Example: Semimagic Cube

19 Algorithm Example: Semimagic Cube

20 Algorithm Example: Semimagic Cube

21 Algorithm Example: Semimagic Cube

22 Algorithm Example: Semimagic Cube

23 Simulations - Semimagic Cubes (Table 2) This tables lists the results from m m m tables with all marginals equals to. Dim m # tables Estimation cv 2 δ % % % e e % e e % e e % e e % δ: acceptance rate

24 Simulations - Semimagic Cubes (Table 3) We can also change marginal s. Dimension m s Estimation cv 2 δ e % e % e % e % e % e % e % e % e % δ: acceptance rate

25 Experiment - Sampson s Dataset It is a dataset about the social interactions among a group of monks recorded by Sampson. Data structure: Dimension: 8 8 Rows/Columns: the 8 monks. Levels: questions: liking (3 timepoints), disliking, esteem, disesteem, positive influenc, negative influence, praise and blame. Values: answers: 3 top choices were listed in original dataset, ranks were recorded. We set these ranks as an indicator ( if in top three choices, if not). N =, estimator is.74774e + 7, cv 2 = 62.4, acceptance rate is 3%.

26 Problems Still Open Our code performs good when marginals are close to each other. But for the opposite case, the acceptance rate can become very low. How can we reduce rejection rate? Possible idea: arrange the order of columns in different ways? How can Gale-Ryser Theorem be used for 3-way tables?

27 THANK YOU! Questions? Website for this paper:

28 Model Let X = (X ijk ) of size (m, n, l), where m, n, l N and N = {, 2,...}, be a table of counts whose entries are independent Poisson random variables with parameters, {θ ijk }. Here X ijk {, }. Consider the loglinear model, log(θ ijk ) = λ + λ M i + λ N j + λ L k + λmn ij + λ ML ik + λ NL jk (2) for i =,..., m, j =,..., n, and k =,..., l where M, N, and L denote the nominal-scale factors. This model is called no three-way interaction model. Notice that the sufficient statistics under the model in (2) are the two-way marginals. Back

29 Example for Structural Zero s How can structural zero s come? Different types of cancer separated by gender for Alaska in year 989: Type of cancer Female Male Total Lung Melanoma Ovarian 8 [] 8 Prostate [] Stomach 5 5 Total The structural zeros s are denoted by "[]" Back

30 Theorem [2-way Tables with Structural Zero s] Define the set of structural zeros Ω as: Ω = {(i, j) : (i, j) is a structural zero, i =,..., m, j =,..., n} Theorem [Yuguo Chen, 27] For the uniform distribution over all m n - tables with given row sums r,..., r m, first column sum c, and the set of structural zeros Ω, the marginal distribution of the first column is the same as the conditional distribution of Z given S Z = c with p i = I [(i,)/ Ω] r i /(n g i ) where g i is the number of structural zeros in the ith row. Back

31 Theorem [SIS for 3-way Tables] Theorem For the uniform distribution over all m n l - tables with structural zeros with given marginals r k = X i +k, c k = X +j k for k =, 2,..., l, and a fixed marginal for the factor L, l, the marginal distribution of the fixed marginal l is the same as the conditional distribution of Z given S Z = l with p k := r k c k r k c k + (n r k g r k )(m c k g c k ), where g r k is the number of structural zeros in the (r, k)th column and g c is the number of structural zeros in the (c, k)th column. k Back

32 Theorem [SIS for Multi-way Tables] Theorem For the uniform distribution over all d-way - contingency tables X = (X i...i d ) of size (n n d ), where n i N for i =,... d with marginals l = X i,...id +, and r j k = X i...i j +i j+...i d k for k {,..., n d }, the marginal distribution of the fixed marginal l is the same as the conditional distribution of Z given S Z = l with p k := d j= r j k d j= r j k + d j= (n j r j k gj k ) where g j k is the number of structural zeros in the (i,..., i j, i j+,..., i d, k)th column of X. Back

Estimating the number of zero-one multi-way tables via sequential importance sampling

Estimating the number of zero-one multi-way tables via sequential importance sampling Ann Inst Stat Math (2013) 65:763 783 DOI 10.1007/s10463-012-0392-7 Estimating the number of zero-one multi-way tables via sequential importance sampling Jing Xi Rurio Yoshida David Haws Received: 1 May

More information

STAC51: Categorical data Analysis

STAC51: Categorical data Analysis STAC51: Categorical data Analysis Mahinda Samarakoon January 26, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 32 Table of contents Contingency Tables 1 Contingency Tables Mahinda Samarakoon

More information

Introduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution

Introduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution Introduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis

More information

MA : Introductory Probability

MA : Introductory Probability MA 320-001: Introductory Probability David Murrugarra Department of Mathematics, University of Kentucky http://www.math.uky.edu/~dmu228/ma320/ Spring 2017 David Murrugarra (University of Kentucky) MA 320:

More information

Describing Contingency tables

Describing Contingency tables Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds

More information

Neville s Method. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Neville s Method

Neville s Method. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Neville s Method Neville s Method MATH 375 Numerical Analysis J. Robert Buchanan Department of Mathematics Fall 2013 Motivation We have learned how to approximate a function using Lagrange polynomials and how to estimate

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

STAT 135 Lab 10 Two-Way ANOVA, Randomized Block Design and Friedman s Test

STAT 135 Lab 10 Two-Way ANOVA, Randomized Block Design and Friedman s Test STAT 135 Lab 10 Two-Way ANOVA, Randomized Block Design and Friedman s Test Rebecca Barter April 13, 2015 Let s now imagine a dataset for which our response variable, Y, may be influenced by two factors,

More information

21.0 Two-Factor Designs

21.0 Two-Factor Designs 21.0 Two-Factor Designs Answer Questions 1 RCBD Concrete Example Two-Way ANOVA Popcorn Example 21.4 RCBD 2 The Randomized Complete Block Design is also known as the two-way ANOVA without interaction. A

More information

MSH3 Generalized linear model

MSH3 Generalized linear model Contents MSH3 Generalized linear model 7 Log-Linear Model 231 7.1 Equivalence between GOF measures........... 231 7.2 Sampling distribution................... 234 7.3 Interpreting Log-Linear models..............

More information

Sections 2.3, 2.4. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis 1 / 21

Sections 2.3, 2.4. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis 1 / 21 Sections 2.3, 2.4 Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 21 2.3 Partial association in stratified 2 2 tables In describing a relationship

More information

Statistics 3858 : Contingency Tables

Statistics 3858 : Contingency Tables Statistics 3858 : Contingency Tables 1 Introduction Before proceeding with this topic the student should review generalized likelihood ratios ΛX) for multinomial distributions, its relation to Pearson

More information

Multinomial Logistic Regression Models

Multinomial Logistic Regression Models Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word

More information

Computational Systems Biology: Biology X

Computational Systems Biology: Biology X Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#7:(Mar-23-2010) Genome Wide Association Studies 1 The law of causality... is a relic of a bygone age, surviving, like the monarchy,

More information

(c) P(BC c ) = One point was lost for multiplying. If, however, you made this same mistake in (b) and (c) you lost the point only once.

(c) P(BC c ) = One point was lost for multiplying. If, however, you made this same mistake in (b) and (c) you lost the point only once. Solutions to First Midterm Exam, Stat 371, Fall 2010 There are two, three or four versions of each problem. The problems on your exam comprise a mix of versions. As a result, when you examine the solutions

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative

More information

Lecture 28 Chi-Square Analysis

Lecture 28 Chi-Square Analysis Lecture 28 STAT 225 Introduction to Probability Models April 23, 2014 Whitney Huang Purdue University 28.1 χ 2 test for For a given contingency table, we want to test if two have a relationship or not

More information

Solution to Tutorial 7

Solution to Tutorial 7 1. (a) We first fit the independence model ST3241 Categorical Data Analysis I Semester II, 2012-2013 Solution to Tutorial 7 log µ ij = λ + λ X i + λ Y j, i = 1, 2, j = 1, 2. The parameter estimates are

More information

CSE 151 Machine Learning. Instructor: Kamalika Chaudhuri

CSE 151 Machine Learning. Instructor: Kamalika Chaudhuri CSE 151 Machine Learning Instructor: Kamalika Chaudhuri Linear Classification Given labeled data: (xi, feature vector yi) label i=1,..,n where y is 1 or 1, find a hyperplane to separate from Linear Classification

More information

Stat 315c: Transposable Data Rasch model and friends

Stat 315c: Transposable Data Rasch model and friends Stat 315c: Transposable Data Rasch model and friends Art B. Owen Stanford Statistics Art B. Owen (Stanford Statistics) Rasch and friends 1 / 14 Categorical data analysis Anova has a problem with too much

More information

Linear Equations and Matrix

Linear Equations and Matrix 1/60 Chia-Ping Chen Professor Department of Computer Science and Engineering National Sun Yat-sen University Linear Algebra Gaussian Elimination 2/60 Alpha Go Linear algebra begins with a system of linear

More information

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test

More information

Statistics 135 Fall 2007 Midterm Exam

Statistics 135 Fall 2007 Midterm Exam Name: Student ID Number: Statistics 135 Fall 007 Midterm Exam Ignore the finite population correction in all relevant problems. The exam is closed book, but some possibly useful facts about probability

More information

ENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition

ENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition ENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition Wing-Kin (Ken) Ma 2017 2018 Term 2 Department of Electronic Engineering The Chinese University of Hong Kong Lecture 8: QR Decomposition

More information

2 Describing Contingency Tables

2 Describing Contingency Tables 2 Describing Contingency Tables I. Probability structure of a 2-way contingency table I.1 Contingency Tables X, Y : cat. var. Y usually random (except in a case-control study), response; X can be random

More information

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). STAT 515 -- Chapter 13: Categorical Data Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). Many studies allow for more than 2 categories. Example

More information

Performance Evaluation and Comparison

Performance Evaluation and Comparison Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Ling 289 Contingency Table Statistics

Ling 289 Contingency Table Statistics Ling 289 Contingency Table Statistics Roger Levy and Christopher Manning This is a summary of the material that we ve covered on contingency tables. Contingency tables: introduction Odds ratios Counting,

More information

Chapter 2: Describing Contingency Tables - II

Chapter 2: Describing Contingency Tables - II : Describing Contingency Tables - II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

Factorial Treatment Structure: Part I. Lukas Meier, Seminar für Statistik

Factorial Treatment Structure: Part I. Lukas Meier, Seminar für Statistik Factorial Treatment Structure: Part I Lukas Meier, Seminar für Statistik Factorial Treatment Structure So far (in CRD), the treatments had no structure. So called factorial treatment structure exists if

More information

Chapter 11. Hypothesis Testing (II)

Chapter 11. Hypothesis Testing (II) Chapter 11. Hypothesis Testing (II) 11.1 Likelihood Ratio Tests one of the most popular ways of constructing tests when both null and alternative hypotheses are composite (i.e. not a single point). Let

More information

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently

More information

arxiv:math/ v1 [math.st] 23 May 2006

arxiv:math/ v1 [math.st] 23 May 2006 The Annals of Statistics 2006, Vol. 34, No. 1, 523 545 DOI: 10.1214/009053605000000822 c Institute of Mathematical Statistics, 2006 arxiv:math/0605615v1 [math.st] 23 May 2006 SEQUENTIAL IMPORTANCE SAMPLING

More information

Poisson Latent Feature Calculus for Generalized Indian Buffet Processes

Poisson Latent Feature Calculus for Generalized Indian Buffet Processes Poisson Latent Feature Calculus for Generalized Indian Buffet Processes Lancelot F. James (paper from arxiv [math.st], Dec 14) Discussion by: Piyush Rai January 23, 2015 Lancelot F. James () Poisson Latent

More information

Foundations of Statistical Inference

Foundations of Statistical Inference Foundations of Statistical Inference Julien Berestycki Department of Statistics University of Oxford MT 2015 Julien Berestycki (University of Oxford) SB2a MT 2015 1 / 16 Lecture 16 : Bayesian analysis

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

Lecture 8: Summary Measures

Lecture 8: Summary Measures Lecture 8: Summary Measures Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 8:

More information

j=1 π j = 1. Let X j be the number

j=1 π j = 1. Let X j be the number THE χ 2 TEST OF SIMPLE AND COMPOSITE HYPOTHESES 1. Multinomial distributions Suppose we have a multinomial (n,π 1,...,π k ) distribution, where π j is the probability of the jth of k possible outcomes

More information

Binomial and Poisson Probability Distributions

Binomial and Poisson Probability Distributions Binomial and Poisson Probability Distributions Esra Akdeniz March 3, 2016 Bernoulli Random Variable Any random variable whose only possible values are 0 or 1 is called a Bernoulli random variable. What

More information

Lecture 2: Linear and Mixed Models

Lecture 2: Linear and Mixed Models Lecture 2: Linear and Mixed Models Bruce Walsh lecture notes Introduction to Mixed Models SISG, Seattle 18 20 July 2018 1 Quick Review of the Major Points The general linear model can be written as y =

More information

Formalism of the Tersoff potential

Formalism of the Tersoff potential Originally written in December 000 Translated to English in June 014 Formalism of the Tersoff potential 1 The original version (PRB 38 p.990, PRB 37 p.6991) Potential energy Φ = 1 u ij i (1) u ij = f ij

More information

Relate Attributes and Counts

Relate Attributes and Counts Relate Attributes and Counts This procedure is designed to summarize data that classifies observations according to two categorical factors. The data may consist of either: 1. Two Attribute variables.

More information

Homework 1 Due: Thursday 2/5/2015. Instructions: Turn in your homework in class on Thursday 2/5/2015

Homework 1 Due: Thursday 2/5/2015. Instructions: Turn in your homework in class on Thursday 2/5/2015 10-704 Homework 1 Due: Thursday 2/5/2015 Instructions: Turn in your homework in class on Thursday 2/5/2015 1. Information Theory Basics and Inequalities C&T 2.47, 2.29 (a) A deck of n cards in order 1,

More information

Chapter 2: Describing Contingency Tables - I

Chapter 2: Describing Contingency Tables - I : Describing Contingency Tables - I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]

More information

Definition. A matrix is a rectangular array of numbers enclosed by brackets (plural: matrices).

Definition. A matrix is a rectangular array of numbers enclosed by brackets (plural: matrices). Matrices (general theory). Definition. A matrix is a rectangular array of numbers enclosed by brackets (plural: matrices). Examples. 1 2 1 1 0 2 A= 0 0 7 B= 0 1 3 4 5 0 Terminology and Notations. Each

More information

Combinations. April 12, 2006

Combinations. April 12, 2006 Combinations April 12, 2006 Combinations, April 12, 2006 Binomial Coecients Denition. The number of distinct subsets with j elements that can be chosen from a set with n elements is denoted by ( n j).

More information

Optimal exact tests for complex alternative hypotheses on cross tabulated data

Optimal exact tests for complex alternative hypotheses on cross tabulated data Optimal exact tests for complex alternative hypotheses on cross tabulated data Daniel Yekutieli Statistics and OR Tel Aviv University CDA course 29 July 2017 Yekutieli (TAU) Optimal exact tests for complex

More information

Statistics 135 Fall 2008 Final Exam

Statistics 135 Fall 2008 Final Exam Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information

Matrix Arithmetic. j=1

Matrix Arithmetic. j=1 An m n matrix is an array A = Matrix Arithmetic a 11 a 12 a 1n a 21 a 22 a 2n a m1 a m2 a mn of real numbers a ij An m n matrix has m rows and n columns a ij is the entry in the i-th row and j-th column

More information

Log-linear Models for Contingency Tables

Log-linear Models for Contingency Tables Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A

More information

The Jeffreys Prior. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) The Jeffreys Prior MATH / 13

The Jeffreys Prior. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) The Jeffreys Prior MATH / 13 The Jeffreys Prior Yingbo Li Clemson University MATH 9810 Yingbo Li (Clemson) The Jeffreys Prior MATH 9810 1 / 13 Sir Harold Jeffreys English mathematician, statistician, geophysicist, and astronomer His

More information

Lecture 13 and 14: Bayesian estimation theory

Lecture 13 and 14: Bayesian estimation theory 1 Lecture 13 and 14: Bayesian estimation theory Spring 2012 - EE 194 Networked estimation and control (Prof. Khan) March 26 2012 I. BAYESIAN ESTIMATORS Mother Nature conducts a random experiment that generates

More information

Investigation into the use of confidence indicators with calibration

Investigation into the use of confidence indicators with calibration WORKSHOP ON FRONTIERS IN BENCHMARKING TECHNIQUES AND THEIR APPLICATION TO OFFICIAL STATISTICS 7 8 APRIL 2005 Investigation into the use of confidence indicators with calibration Gerard Keogh and Dave Jennings

More information

Discrete Multivariate Statistics

Discrete Multivariate Statistics Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

More information

Introduction to Graphical Models

Introduction to Graphical Models Introduction to Graphical Models STA 345: Multivariate Analysis Department of Statistical Science Duke University, Durham, NC, USA Robert L. Wolpert 1 Conditional Dependence Two real-valued or vector-valued

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Lecture 3: Simulating social networks

Lecture 3: Simulating social networks Lecture 3: Simulating social networks Short Course on Social Interactions and Networks, CEMFI, May 26-28th, 2014 Bryan S. Graham 28 May 2014 Consider a network with adjacency matrix D = d and corresponding

More information

Yu Xie, Institute for Social Research, 426 Thompson Street, University of Michigan, Ann

Yu Xie, Institute for Social Research, 426 Thompson Street, University of Michigan, Ann Association Model, Page 1 Yu Xie, Institute for Social Research, 426 Thompson Street, University of Michigan, Ann Arbor, MI 48106. Email: yuxie@umich.edu. Tel: (734)936-0039. Fax: (734)998-7415. Association

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Correspondence Analysis

Correspondence Analysis Correspondence Analysis Q: when independence of a 2-way contingency table is rejected, how to know where the dependence is coming from? The interaction terms in a GLM contain dependence information; however,

More information

Some statistical applications of constrained Monte Carlo

Some statistical applications of constrained Monte Carlo 1 Some statistical applications of constrained Monte Carlo JULIAN BESAG Department of Mathematical Sciences, University of Bath, England Department of Statistics, University of Washington, Seattle, USA

More information

Machine Learning, Fall 2012 Homework 2

Machine Learning, Fall 2012 Homework 2 0-60 Machine Learning, Fall 202 Homework 2 Instructors: Tom Mitchell, Ziv Bar-Joseph TA in charge: Selen Uguroglu email: sugurogl@cs.cmu.edu SOLUTIONS Naive Bayes, 20 points Problem. Basic concepts, 0

More information

Lecture 1: Introduction to Sublinear Algorithms

Lecture 1: Introduction to Sublinear Algorithms CSE 522: Sublinear (and Streaming) Algorithms Spring 2014 Lecture 1: Introduction to Sublinear Algorithms March 31, 2014 Lecturer: Paul Beame Scribe: Paul Beame Too much data, too little time, space for

More information

EA = I 3 = E = i=1, i k

EA = I 3 = E = i=1, i k MTH5 Spring 7 HW Assignment : Sec.., # (a) and (c), 5,, 8; Sec.., #, 5; Sec.., #7 (a), 8; Sec.., # (a), 5 The due date for this assignment is //7. Sec.., # (a) and (c). Use the proof of Theorem. to obtain

More information

Brief introduction to Markov Chain Monte Carlo

Brief introduction to Markov Chain Monte Carlo Brief introduction to Department of Probability and Mathematical Statistics seminar Stochastic modeling in economics and finance November 7, 2011 Brief introduction to Content 1 and motivation Classical

More information

STAT 526 Spring Final Exam. Thursday May 5, 2011

STAT 526 Spring Final Exam. Thursday May 5, 2011 STAT 526 Spring 2011 Final Exam Thursday May 5, 2011 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will

More information

Psych Jan. 5, 2005

Psych Jan. 5, 2005 Psych 124 1 Wee 1: Introductory Notes on Variables and Probability Distributions (1/5/05) (Reading: Aron & Aron, Chaps. 1, 14, and this Handout.) All handouts are available outside Mija s office. Lecture

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

Vector, Matrix, and Tensor Derivatives

Vector, Matrix, and Tensor Derivatives Vector, Matrix, and Tensor Derivatives Erik Learned-Miller The purpose of this document is to help you learn to take derivatives of vectors, matrices, and higher order tensors (arrays with three dimensions

More information

Multivariate distance Fall

Multivariate distance Fall Multivariate distance 2017 Fall Contents Euclidean Distance Definitions Standardization Population Distance Population mean and variance Definitions proportions presence-absence data Multivariate distance

More information

Chapter 4 - MATRIX ALGEBRA. ... a 2j... a 2n. a i1 a i2... a ij... a in

Chapter 4 - MATRIX ALGEBRA. ... a 2j... a 2n. a i1 a i2... a ij... a in Chapter 4 - MATRIX ALGEBRA 4.1. Matrix Operations A a 11 a 12... a 1j... a 1n a 21. a 22.... a 2j... a 2n. a i1 a i2... a ij... a in... a m1 a m2... a mj... a mn The entry in the ith row and the jth column

More information

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories

More information

Chapter 12. Analysis of variance

Chapter 12. Analysis of variance Serik Sagitov, Chalmers and GU, January 9, 016 Chapter 1. Analysis of variance Chapter 11: I = samples independent samples paired samples Chapter 1: I 3 samples of equal size J one-way layout two-way layout

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G Distribution free hypothesis tests 1. Classical and distribution-free

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Data Mining 2018 Bayesian Networks (1)

Data Mining 2018 Bayesian Networks (1) Data Mining 2018 Bayesian Networks (1) Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Data Mining 1 / 49 Do you like noodles? Do you like noodles? Race Gender Yes No Black Male 10

More information

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University

More information

BSc (Hons) in Computer Games Development. vi Calculate the components a, b and c of a non-zero vector that is orthogonal to

BSc (Hons) in Computer Games Development. vi Calculate the components a, b and c of a non-zero vector that is orthogonal to 1 APPLIED MATHEMATICS INSTRUCTIONS Full marks will be awarded for the correct solutions to ANY FIVE QUESTIONS. This paper will be marked out of a TOTAL MAXIMUM MARK OF 100. Credit will be given for clearly

More information

Information Theory and Statistics Lecture 3: Stationary ergodic processes

Information Theory and Statistics Lecture 3: Stationary ergodic processes Information Theory and Statistics Lecture 3: Stationary ergodic processes Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Measurable space Definition (measurable space) Measurable space

More information

Statistics. Statistics

Statistics. Statistics The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,

More information

MATH Notebook 3 Spring 2018

MATH Notebook 3 Spring 2018 MATH448001 Notebook 3 Spring 2018 prepared by Professor Jenny Baglivo c Copyright 2010 2018 by Jenny A. Baglivo. All Rights Reserved. 3 MATH448001 Notebook 3 3 3.1 One Way Layout........................................

More information

Linear Algebra I Lecture 8

Linear Algebra I Lecture 8 Linear Algebra I Lecture 8 Xi Chen 1 1 University of Alberta January 25, 2019 Outline 1 2 Gauss-Jordan Elimination Given a system of linear equations f 1 (x 1, x 2,..., x n ) = 0 f 2 (x 1, x 2,..., x n

More information

Overview of Course. Nevin L. Zhang (HKUST) Bayesian Networks Fall / 58

Overview of Course. Nevin L. Zhang (HKUST) Bayesian Networks Fall / 58 Overview of Course So far, we have studied The concept of Bayesian network Independence and Separation in Bayesian networks Inference in Bayesian networks The rest of the course: Data analysis using Bayesian

More information

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

More information

Write your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.

Write your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet. 2016 Booklet No. Test Code : PSA Forenoon Questions : 30 Time : 2 hours Write your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.

More information

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

More information

13.1 Categorical Data and the Multinomial Experiment

13.1 Categorical Data and the Multinomial Experiment Chapter 13 Categorical Data Analysis 13.1 Categorical Data and the Multinomial Experiment Recall Variable: (numerical) variable (i.e. # of students, temperature, height,). (non-numerical, categorical)

More information

ECE 8201: Low-dimensional Signal Models for High-dimensional Data Analysis

ECE 8201: Low-dimensional Signal Models for High-dimensional Data Analysis ECE 8201: Low-dimensional Signal Models for High-dimensional Data Analysis Lecture 7: Matrix completion Yuejie Chi The Ohio State University Page 1 Reference Guaranteed Minimum-Rank Solutions of Linear

More information

Multi-Level Test of Independence for 2 X 2 Contingency Table using Cochran and Mantel Haenszel Statistics

Multi-Level Test of Independence for 2 X 2 Contingency Table using Cochran and Mantel Haenszel Statistics IJISET - International Journal of Innovative Science, Engineering & Technology, Vol. Issue 8, August 015. ISSN 348 7968 Multi-Level Test of Independence for X Contingency Table using Cochran and Mantel

More information

BALANCING GAUSSIAN VECTORS. 1. Introduction

BALANCING GAUSSIAN VECTORS. 1. Introduction BALANCING GAUSSIAN VECTORS KEVIN P. COSTELLO Abstract. Let x 1,... x n be independent normally distributed vectors on R d. We determine the distribution function of the minimum norm of the 2 n vectors

More information

Tests for Two Correlated Proportions in a Matched Case- Control Design

Tests for Two Correlated Proportions in a Matched Case- Control Design Chapter 155 Tests for Two Correlated Proportions in a Matched Case- Control Design Introduction A 2-by-M case-control study investigates a risk factor relevant to the development of a disease. A population

More information

Learning Bayesian network : Given structure and completely observed data

Learning Bayesian network : Given structure and completely observed data Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution

More information

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes Lecture Notes 7 Random Processes Definition IID Processes Bernoulli Process Binomial Counting Process Interarrival Time Process Markov Processes Markov Chains Classification of States Steady State Probabilities

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Poisson Approximation for Two Scan Statistics with Rates of Convergence

Poisson Approximation for Two Scan Statistics with Rates of Convergence Poisson Approximation for Two Scan Statistics with Rates of Convergence Xiao Fang (Joint work with David Siegmund) National University of Singapore May 28, 2015 Outline The first scan statistic The second

More information

STAT 705: Analysis of Contingency Tables

STAT 705: Analysis of Contingency Tables STAT 705: Analysis of Contingency Tables Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Analysis of Contingency Tables 1 / 45 Outline of Part I: models and parameters Basic

More information