Deciphering Math Notation. Billy Skorupski Associate Professor, School of Education


Agenda
- General overview of data, variables
- Greek and Roman characters in math and statistics
- Parameters vs. statistics
- Common operators and how they work
- Particular focus on Summation and Product operators, and the associated use of SUBSCRIPTS
- Miscellaneous statistical symbols and terminology
- A few common symbols from Set Theory (to be discussed in Probability)

An Example Data Set
i is an indexing variable: i = 1, 2, ..., N, with N = 9.
Let's say X contains measurements for a numeric variable.
Let's say Y indicates designations for a categorical variable.

i    X    Y
1    15   1
2    27   1
3    32   1
4    11   1
5    23   2
6    21   2
7    9    2
8    44   3
9    26   3

An Example Data Set
X and Y are column vectors of length 9. Together, X and Y make a 9 x 2 matrix. In most cases, your data will have N rows, one for every subject, and one column per variable.

Repeated Measures
X has been observed on 3 occasions (e.g., at Time1, Time2, Time3). We call this wide format. Long format would need 27 rows: N people x 3 observations per person.

i    Y    X1   X2   X3
1    1    12   15   25
2    1    17   27   32
3    1    30   32   27
4    1    11   11   25
5    2    13   23   33
6    2    15   21   29
7    2    11   9    11
8    3    39   44   46
9    3    25   26   32

Repeated Measures: long format (only first 3 subjects)
Y is repeated 3 times for each subject. T indicates which observation of X the row holds. The 9 X values are the first 3 rows of X from the previous slide.

i    Y    T    X
1    1    1    12
1    1    2    15
1    1    3    25
2    1    1    17
2    1    2    27
2    1    3    32
3    1    1    30
3    1    2    32
3    1    3    27
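The step from wide to long format can be sketched in a few lines of plain Python. This is only an illustration, not part of the original slides: the function name `wide_to_long` and the tuple layout are assumptions, and the numeric variable is assumed to be called X. The values are copied from the wide-format table.

```python
# Wide format: one row per subject, laid out as (i, Y, X1, X2, X3).
# Values copied from the wide-format table; the tuple layout is an assumption.
wide_data = [
    (1, 1, 12, 15, 25),
    (2, 1, 17, 27, 32),
    (3, 1, 30, 32, 27),
    (4, 1, 11, 11, 25),
    (5, 2, 13, 23, 33),
    (6, 2, 15, 21, 29),
    (7, 2, 11, 9, 11),
    (8, 3, 39, 44, 46),
    (9, 3, 25, 26, 32),
]


def wide_to_long(rows):
    """Return one (i, Y, T, X) row per subject per occasion."""
    long_rows = []
    for i, y, *measurements in rows:
        # T counts the occasions 1, 2, 3; Y is repeated on every row.
        for t, x in enumerate(measurements, start=1):
            long_rows.append((i, y, t, x))
    return long_rows


long_data = wide_to_long(wide_data)
print(len(long_data))   # N people x 3 observations per person
for row in long_data[:3]:
    print(row)
```

Each subject contributes three rows, so the 9 wide rows become 27 long rows, with i and Y repeated on each of them.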

Greek and Roman letters
The purpose of most (all?) data analysis is to make an inference about PARAMETERS that exist as part of the POPULATION. We can't directly observe them, so we make educated guesses by collecting a SAMPLE of data and calculating STATISTICS.

Greek and Roman letters
Parameters are almost always indicated as Greek letters. Corresponding statistics (parameter estimates) are indicated in one of two ways:
1. Using the Roman letter that corresponds to the Greek: $\beta \rightarrow b$
2. Using a "hat" over the Greek letter: $\hat{\beta}$

Greek and Roman letters
So, Greek letters are used to indicate population parameters: fixed constants out there in the world (things we are trying to estimate). Parameter estimates come from samples (that's the job of inferential statistics) and are indicated by Roman letters or Greek letters with hats.

Greek and Roman letters
Articles using such symbols will either adopt standard practice (e.g., use $\beta_0, \beta_1, \ldots, \beta_p$ as population regression coefficients), or they will establish the notation to be used in the paper. For example, if more than one regression model is presented, one model may use $\beta_0, \beta_1, \ldots, \beta_p$ as coefficients, the next may use a different Greek letter with the same subscripts $0, 1, \ldots, p$, the next yet another, and so on.

Greek and Roman letters
Check out the 1st table in the handout.

Operators
Check out the 2nd and 3rd tables. Most symbols are quite familiar, but $\Sigma$ and $\Pi$ as operators can be confusing at first...
$\Sigma$ (a Greek upper case "S") is for Summation (add them up).
$\Pi$ (a Greek upper case "P") is for Product (multiply them).
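As a rough illustration of the two operators, the sketch below applies them to the numeric variable from the example data set (assumed here to be called X). It uses Python's built-in `sum` and `math.prod` (available from Python 3.8). The product is taken over only the first three values, purely to keep the number small.

```python
import math

# Numeric variable from the example data set (assumed name: X).
X = [15, 27, 32, 11, 23, 21, 9, 44, 26]

# Summation operator: add up all N values.
total = sum(X)

# Product operator: multiply values together (first three only, for readability).
product_first_three = math.prod(X[:3])

print(total)
print(product_first_three)
```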

Subscripts
Subscripts are variables that index other variables. For example, the variable i in our example data set has no meaning other than the serial position of the subjects in the data set.
When you see $\sum_{i=1}^{N} X_i$ it means: add up the variables that appear to the right. The i = 1 at the bottom of $\Sigma$ and the N at the top are instructions: i will be an indexing variable that starts at 1 and goes to N.

Subscripts
Often, if the instructions are to add up all N of the values, the summation will be presented in a shorter form without subscripts: $\sum_{i=1}^{N} X_i = \sum^{N} X_i$ or $\sum^{N} X$.
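The role of the limits can be made concrete with a small sketch. The helper `summation` is a hypothetical name introduced for illustration. It takes 1-based start and stop indices so that it reads like the notation, and the values are the numeric variable from the example data set (assumed to be called X).

```python
# Numeric variable from the example data set (assumed name: X).
X = [15, 27, 32, 11, 23, 21, 9, 44, 26]
N = len(X)


def summation(values, start, stop):
    """Add values[i] for i = start, ..., stop, using 1-based indices as in the notation."""
    # Python lists are 0-based, so subject i lives at position i - 1.
    return sum(values[i - 1] for i in range(start, stop + 1))


full_sum = summation(X, 1, N)      # i runs from 1 to N
partial_sum = summation(X, 2, 4)   # i runs from 2 to 4 only
shorthand_sum = sum(X)             # the "add up all N values" shorthand

print(full_sum, partial_sum, shorthand_sum)
```

The shorthand gives the same result as the fully indexed sum only because both cover all N values. Once the limits select a subset, the subscripts are needed.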

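Another Example
Population and Sample Variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \qquad s^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}$$

$\mu$ and $\bar{X}$ have no subscripts... why? Each X value has a subscript to indicate each individual observation.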
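A minimal sketch of both formulas, assuming the nine values of the example variable (called X here). For the population formula the nine values are treated as if they were the whole population, which is an assumption made only for illustration. The mean is computed once and reused for every i, which is one way to see why it carries no subscript.

```python
import statistics

# Numeric variable from the example data set (assumed name: X).
X = [15, 27, 32, 11, 23, 21, 9, 44, 26]
N = len(X)

# One mean for the whole set of values: it does not change with i.
mean_x = sum(X) / N

# Sum of squared deviations: each X_i is compared with the same mean.
sum_sq_dev = sum((x_i - mean_x) ** 2 for x_i in X)

population_variance = sum_sq_dev / N        # divide by N
sample_variance = sum_sq_dev / (N - 1)      # divide by N - 1

# Cross-check against the standard library.
print(population_variance, statistics.pvariance(X))
print(sample_variance, statistics.variance(X))
```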

ANOVA Example
Let's say we've conducted an experiment after randomly assigning participants to one of three treatment conditions. For each subject in each group, we measure the dependent variable, X. Each person's score can be notated as $X_{ij}$ (or sometimes X[i,j]), the score for person i in group j. i goes from 1 to $n_j$, while j goes from 1 to M, the number of groups (M = 3, in this case).

A one-way ANOVA table

Source     SS (Sum of Squares)                                          df                                   MS              F               p
Between    $\sum_{j=1}^{M} n_j (\bar{X}_j - \bar{X})^2$                 $M - 1$                              $SS_B / df_B$   $MS_B / MS_W$   Sig.
Within     $\sum_{j=1}^{M} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$     $N - M = \sum_{j=1}^{M} (n_j - 1)$   $SS_W / df_W$
Total      $\sum_{i=1}^{N} (X_i - \bar{X})^2$                           $N - 1$

MS = Mean Square, which is just another name for variance.
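The entries of the table can be sketched numerically. The code below is an illustration under stated assumptions: it reuses the example data set, treats the categorical variable Y as the treatment group and the numeric variable (called X here) as the dependent variable, and all variable names are hypothetical. The slides do not say the ANOVA example uses these data. The p-value is omitted because it needs the F distribution, which is not in the standard library.

```python
# Example data set: X is the score, Y is the group label (assumed roles).
X = [15, 27, 32, 11, 23, 21, 9, 44, 26]
Y = [1, 1, 1, 1, 2, 2, 2, 3, 3]

# Collect the scores of each group j.
groups = {}
for score, group in zip(X, Y):
    groups.setdefault(group, []).append(score)

N = len(X)          # total number of subjects
M = len(groups)     # number of groups
grand_mean = sum(X) / N
group_means = {j: sum(scores) / len(scores) for j, scores in groups.items()}

# Between: n_j times the squared distance of each group mean from the grand mean.
ss_between = sum(len(scores) * (group_means[j] - grand_mean) ** 2
                 for j, scores in groups.items())

# Within: squared distance of each score from its own group mean (double sum over j and i).
ss_within = sum((score - group_means[j]) ** 2
                for j, scores in groups.items() for score in scores)

# Total: squared distance of each score from the grand mean.
ss_total = sum((score - grand_mean) ** 2 for score in X)

df_between = M - 1
df_within = N - M
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(ss_between, ss_within, ss_total)
print(df_between, df_within, f_ratio)
```

The Between and Within sums of squares add up to the Total, and the two expressions for the Within degrees of freedom agree.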

One more (trickier) example
Let's say I am describing the population Variance-Covariance matrix, $\boldsymbol{\Sigma}$, among P variables (P = 5). The elements are referred to as $\sigma_{ij}$. What if I want to add up just the elements in the lower triangle?

Population Variance-Covariance Matrix, $\boldsymbol{\Sigma}$

Sum down the rows, across the columns
Say i are the rows, and j are the columns:

$$\text{Sum of lower triangle} = \sum_{i=1}^{P} \sum_{j \le i} \sigma_{ij}$$
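A small sketch of the double sum, using a made-up 5 x 5 symmetric matrix (the entries min(i, j) are hypothetical, chosen only so the arithmetic is easy to follow). Whether the diagonal counts as part of the lower triangle depends on whether the inner index runs over j <= i or j < i, so both versions are shown; which one the slide intends is an assumption.

```python
P = 5

# Hypothetical symmetric matrix: element (i, j) is min(i, j), with 1-based i and j.
sigma = [[min(i, j) for j in range(1, P + 1)] for i in range(1, P + 1)]

# Outer index i walks down the rows; inner index j walks across the columns up to i.
lower_with_diagonal = sum(sigma[i - 1][j - 1]
                          for i in range(1, P + 1)
                          for j in range(1, i + 1))      # j <= i

# Same double sum, but stopping one column short of the diagonal.
lower_strict = sum(sigma[i - 1][j - 1]
                   for i in range(1, P + 1)
                   for j in range(1, i))                 # j < i

diagonal = sum(sigma[i - 1][i - 1] for i in range(1, P + 1))

print(lower_with_diagonal, lower_strict, diagonal)
```

The key point is that the upper limit of the inner sum depends on the current value of the outer index i.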

Set Theory
The notation presented in the final table on Set Theory will be very useful for various probability statements. This notation will also sometimes appear in Summation and Product notation when creating subsets of members for aggregating data.

Thanks! Any Questions, Discussion?