Self-study Notes Module 7 (Optional Material)

Dichotomous Dependent Variable

A dichotomous dependent variable has just two possible values. The two values represent two mutually exclusive categories of behavior, such as agreeing or disagreeing with a particular statement, passing or failing a test, or showing or not showing some behavioral characteristic. The psychologist selects one of the two categories (e.g., agreeing, passing, showing a symptom) to represent the outcome of interest. The proportion of people in the study population who have the outcome of interest is represented by the Greek letter \(\pi\). (The symbol \(\pi\) is used here to represent a population proportion and should not be confused with the irrational number 3.14... that is frequently used in mathematics.)

One-group Design

A population proportion can be estimated from a random sample of size \(n\). The sample proportion

\[ \hat{\pi} = \sum_{i=1}^{n} y_i / n \tag{7.1} \]

is an estimate of \(\pi\), where \(y_i\) = 0 or 1. Participants who have the outcome of interest (e.g., agreeing, passing, showing a symptom) are assigned a y-score of 1, and participants who do not have the outcome of interest (e.g., not agreeing, failing, not showing a symptom) are assigned a y-score of 0. Note that a sample proportion is simply the number of participants who have the trait of interest (denoted as \(f\)) divided by the sample size, so that Equation 7.1 also can be written as \(\hat{\pi} = f/n\).

The estimated standard error of \(\hat{\pi}\) is

\[ SE_{\hat{\pi}} = \sqrt{\hat{\pi}(1 - \hat{\pi})/n} \tag{7.2} \]

From Equation 7.2, it is clear that increasing the sample size will decrease the value of the standard error. Small values of \(SE_{\hat{\pi}}\) indicate that the sample proportion (Equation 7.1) is likely to be similar to the population proportion. The estimated standard error by itself may not be an interesting value to interpret, but \(SE_{\hat{\pi}}\) can be combined with \(\hat{\pi}\) to compute a confidence interval for \(\pi\), which does have an important interpretation.
An approximate \(100(1 - \alpha)\%\) confidence interval for \(\pi\) is

\[ \hat{\pi} \pm z_{\alpha/2} SE_{\hat{\pi}} \tag{7.3} \]

where \(z_{\alpha/2}\) is a two-sided critical z-value.
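The steps in Equations 7.1 to 7.3 can be sketched in a few lines of Python. This is only an illustration and not part of the original notes: the function name `wald_ci` and the counts used in the call (120 of 300) are assumptions chosen for the example, and the critical value is taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(f, n, conf=0.95):
    """Large-sample confidence interval for one proportion (hypothetical helper)."""
    p_hat = f / n                                  # Equation 7.1: f / n
    se = sqrt(p_hat * (1 - p_hat) / n)             # Equation 7.2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # two-sided critical z-value
    return p_hat - z * se, p_hat + z * se          # Equation 7.3


# Illustrative counts: 120 of 300 participants have the outcome of interest.
lower, upper = wald_ci(120, 300)
print(f"{lower:.3f} to {upper:.3f}")  # prints "0.345 to 0.455"
```

Raising `n` while holding the sample proportion fixed narrows the interval, which is the effect of sample size on the standard error described above.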

Example 7.1. About 16,000 students at CSULB purchase a meal plan. The Director of Food Services wants to know the proportion of these 16,000 students who are satisfied with the quality of the food. A random sample of 300 students was obtained. All 300 students were contacted by email or phone and asked if they were satisfied or dissatisfied with the quality of the food. In the sample of 300 students, 120 said they were satisfied. A 95% confidence interval for \(\pi\), the proportion of all 16,000 students who would have said they are satisfied with the food quality, is computed below.

\[
\begin{aligned}
\hat{\pi} &= 120/300 = 0.40 \\
SE_{\hat{\pi}} &= \sqrt{0.4(0.6)/300} = 0.0283 \\
\text{upper 95\% limit} &= 0.40 + 1.96(0.0283) = 0.455 \\
\text{lower 95\% limit} &= 0.40 - 1.96(0.0283) = 0.345
\end{aligned}
\]

The Director can be 95% confident that between 34.5% and 45.5% of the 16,000 students are satisfied with the food quality. Note that a proportion multiplied by 100 can be interpreted as a percentage.

Two-group Experiment

In a two-group experiment, \(\pi_1\) is the proportion of all people in the study population who would have the specified outcome assuming they had all received Treatment 1, and \(\pi_2\) is the proportion of all people in the study population who would have the specified outcome assuming they had all received Treatment 2. A measure of effect size in a two-group experiment is \(\pi_1 - \pi_2\).

In a two-group experiment, \(\pi_1\) is estimated from a sample of size \(n_1\) in group 1 and \(\pi_2\) is estimated from a sample of size \(n_2\) in group 2. The following sample proportion in group j is an estimate of \(\pi_j\)

\[ \hat{\pi}_j = \sum_{i=1}^{n_j} y_{ij} / n_j \tag{7.4} \]

where \(y_{ij} = 1\) if participant i in group j has the outcome of interest; otherwise \(y_{ij} = 0\). Note that the numerator of Equation 7.4 is the number of participants in group j who have the specified outcome. This frequency count is denoted as \(f_j\), so that \(\hat{\pi}_j = f_j/n_j\).

Confidence Interval for \(\pi_1 - \pi_2\)

An approximate \(100(1 - \alpha)\%\) confidence interval for \(\pi_1 - \pi_2\) is

\[ \hat{\pi}_1 - \hat{\pi}_2 \pm z_{\alpha/2} SE_{\hat{\pi}_1 - \hat{\pi}_2} \tag{7.5} \]

where \(SE_{\hat{\pi}_1 - \hat{\pi}_2} = \sqrt{\dfrac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \dfrac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}}\).
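Equation 7.5 can be sketched the same way. The function name `two_group_ci` and the counts in the call (57 of 100 in group 1, 15 of 100 in group 2) are illustrative assumptions; only the formula itself comes from the text.

```python
from math import sqrt
from statistics import NormalDist


def two_group_ci(f1, n1, f2, n2, conf=0.95):
    """Large-sample confidence interval for pi_1 - pi_2 (hypothetical helper)."""
    p1, p2 = f1 / n1, f2 / n2                            # Equation 7.4 for each group
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # SE of the difference
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)         # two-sided critical z-value
    diff = p1 - p2
    return diff - z * se, diff + z * se                  # Equation 7.5


# Illustrative counts: 57 of 100 in group 1 and 15 of 100 in group 2.
lower, upper = two_group_ci(57, 100, 15, 100)
print(f"[{lower:.3f}, {upper:.3f}]")  # prints "[0.300, 0.540]"
```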

Example 7.2. A random sample of 200 participants was obtained from a volunteer pool of about 3,000 undergraduate students, and the sample was randomized into two groups of equal size. All participants were asked to sit quietly in a small room for two hours. The participants in group 1 also were told that the experiment would examine the effects of stimulus deprivation and were asked to sign a consent form indicating a possibility of psychological harm. Participants in group 2 were simply told that they were a control group. All participants were shown a button to push if they needed to exit the room before the end of the experiment. The button for group 1 participants was labeled "panic button". The button for group 2 participants had no label. In group 1, \(f_1 = 57\) participants pressed the button. In group 2, \(f_2 = 15\) participants pressed the button. The 95% confidence interval for \(\pi_1 - \pi_2\) was [0.300, 0.540], which indicates that the proportion of the 3,000 students who would have pressed the button under the fear instructions would be 0.300 to 0.540 greater than if they had received the neutral instructions.

Hypothesis Testing

In a two-group study, the null hypothesis H0: \(\pi_1 = \pi_2\) may be tested using the following test statistic

\[ X^2 = \frac{n\left[\,\lvert f_1(n_2 - f_2) - f_2(n_1 - f_1)\rvert - n/2\,\right]^2}{n_1 n_2 (f_1 + f_2)(n - f_1 - f_2)} \tag{7.6} \]

where \(n = n_1 + n_2\). The test of H0: \(\pi_1 = \pi_2\) based on Equation 7.6 is called a chi-square test of independence. If \(X^2 > z^2_{\alpha/2}\), then H0 is rejected. Alternatively, statistical packages will report the p-value associated with the obtained \(X^2\) value, and H0 is rejected if the p-value is small (e.g., less than .05). If H0 is rejected, then the sample estimate of \(\pi_1 - \pi_2\) determines which alternative hypothesis to accept. For instance, if H0 is rejected and \(\hat{\pi}_1 - \hat{\pi}_2 > 0\), then \(\pi_1 - \pi_2 > 0\) is accepted.
If the chi-square and p-value results are reported in a research report, APA guidelines now require psychologists to supplement those results with a confidence interval for a measure of effect size, such as \(\pi_1 - \pi_2\).

One-factor Experiments

In a one-factor experiment with k levels of the independent variable, the null hypothesis of equal population proportions is H0: \(\pi_1 = \pi_2 = \dots = \pi_k\). A chi-square test of independence (a more general version of Equation 7.6) may be used to test H0, but this test suffers from the same problems as the test for equal population means described previously for the one-way ANOVA, because we know that H0 will be false in virtually any real study. Thus, an experiment that leads to the conclusion that H0 should be rejected does not provide the type of information that can be used to support a theory or advance knowledge.

Unlike the test of H0: \(\pi_1 = \pi_2 = \dots = \pi_k\), tests of all pairwise comparisons using Equation 7.6 with a Bonferroni adjusted \(\alpha\), or confidence intervals for all pairwise comparisons using Equation 7.5 with a Bonferroni adjusted \(\alpha\), provide useful information. The confidence interval for a linear contrast of population proportions described below will also provide useful information.

An approximate \(100(1 - \alpha)\%\) confidence interval for \(\sum_{j=1}^{k} c_j \pi_j\) is

\[ \sum_{j=1}^{k} c_j \hat{\pi}_j \pm z_{\alpha/2} \sqrt{\sum_{j=1}^{k} \frac{c_j^2\, \hat{\pi}_j(1 - \hat{\pi}_j)}{n_j}} \tag{7.7} \]

where \(\sqrt{\sum_{j=1}^{k} c_j^2\, \hat{\pi}_j(1 - \hat{\pi}_j)/n_j}\) is the estimated standard error of the linear contrast of sample proportions. Bonferroni simultaneous confidence intervals are obtained by replacing \(\alpha\) with \(\alpha^* = \alpha/v\) in Equation 7.7, where v is the number of confidence intervals to be examined.

Example 7.3. A random sample of 180 participants was obtained from a volunteer pool of about 4,500 undergraduate students at UC Davis. The 180 participants were randomized into three groups of equal size. All 180 participants were shown a video clip that showed a man robbing a jewelry store. All participants were then shown a photo lineup of three men, one of whom was the robber. In the lineup shown to group 1, both innocent men looked very different from the robber. In the lineup shown to group 2, one innocent man looked similar to the robber and one looked very different from the robber. In the lineup shown to group 3, both innocent men looked similar to the robber. The proportion of participants who correctly identified the robber in the 3-photo lineup was 0.433, 0.400, and [...] in groups 1, 2, and 3, respectively. The psychologist has developed a theory that will be supported if \(\pi_1\) is similar to \(\pi_2\) and \(\pi_3 > (\pi_1 + \pi_2)/2\). Larger values of \(\pi_3 - (\pi_1 + \pi_2)/2\) provide greater support for the theory. Bonferroni 95% simultaneous confidence intervals are shown below.

Contrast                             Estimate   SE      Lower Limit   Upper Limit
\(\pi_1 - \pi_2\)                    [...]      [...]   [...]         [...]
\(\pi_3 - (\pi_1 + \pi_2)/2\)        [...]      [...]   [...]         [...]

Let \(\pi_j\) represent the proportion of all 4,500 undergraduates who would have correctly identified the robber under condition j. The psychologist can be 95% confident that \(\pi_3\) is [...] to [...] larger than \((\pi_1 + \pi_2)/2\) and that \(\pi_1\) is at most [...] smaller or at most [...] larger than \(\pi_2\). These results provide support for the psychologist's theory, but a larger sample will be needed to more accurately determine the similarity of \(\pi_1\) and \(\pi_2\).
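The linear-contrast interval of Equation 7.7, including the Bonferroni replacement of \(\alpha\) by \(\alpha/v\), can be sketched as follows. The function name `contrast_ci`, its `num_intervals` argument (the v of the text), and the four-group counts in the call are hypothetical and chosen only to exercise the formula.

```python
from math import sqrt
from statistics import NormalDist


def contrast_ci(freqs, ns, coefs, conf=0.95, num_intervals=1):
    """CI for sum(c_j * pi_j); num_intervals > 1 applies a Bonferroni adjustment."""
    alpha_star = (1 - conf) / num_intervals           # alpha* = alpha / v
    z = NormalDist().inv_cdf(1 - alpha_star / 2)      # two-sided critical z-value
    props = [f / n for f, n in zip(freqs, ns)]        # sample proportion per group
    estimate = sum(c * p for c, p in zip(coefs, props))
    se = sqrt(sum(c ** 2 * p * (1 - p) / n
                  for c, p, n in zip(coefs, props, ns)))
    return estimate, estimate - z * se, estimate + z * se   # Equation 7.7


# Hypothetical counts for four groups of 500, with an interaction-type contrast.
est, lo, hi = contrast_ci([42, 61, 73, 94], [500] * 4, [1, -1, -1, 1])
print(f"{est:.3f} [{lo:.3f}, {hi:.3f}]")

# The same contrast as one of v = 2 simultaneous intervals (wider interval).
_, lo_b, hi_b = contrast_ci([42, 61, 73, 94], [500] * 4, [1, -1, -1, 1],
                            num_intervals=2)
```

Passing coefficients such as `[1, -1]` with two groups reproduces the two-group interval of Equation 7.5, so a single helper covers pairwise comparisons and more general contrasts.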
Two-factor Experiments

In a 2 × 2 factorial experiment, the population proportions under each of the four treatment combinations are shown below.

                 Factor B
Factor A     b1              b2
a1           \(\pi_{11}\)    \(\pi_{12}\)
a2           \(\pi_{21}\)    \(\pi_{22}\)

The main effects, interaction effect, and simple main effects, defined in terms of population proportions, are shown below along with the contrast coefficients. Equation 7.7 may be used to obtain confidence intervals for any of these effects.

Effect      c1    c2    c3    c4    Contrast
A:          ½     ½     -½    -½    \((\pi_{11} + \pi_{12})/2 - (\pi_{21} + \pi_{22})/2\)
B:          ½     -½    ½     -½    \((\pi_{11} + \pi_{21})/2 - (\pi_{12} + \pi_{22})/2\)
AB:         1     -1    -1    1     \(\pi_{11} - \pi_{12} - \pi_{21} + \pi_{22}\)
A at b1:    1     0     -1    0     \(\pi_{11} - \pi_{21}\)
A at b2:    0     1     0     -1    \(\pi_{12} - \pi_{22}\)
B at a1:    1     -1    0     0     \(\pi_{11} - \pi_{12}\)
B at a2:    0     0     1     -1    \(\pi_{21} - \pi_{22}\)

Example 7.4. A mail order company has about 400,000 customers. A new web page that includes customer ratings of each product has been developed but is more expensive to maintain than the current version. The company wants to compare the effectiveness of the two web pages before deciding which web page to use. In addition to comparing the two web pages, the company also wants to assess the effectiveness of a $20 coupon. A random sample of 2,000 customers was obtained and randomized into four groups of equal size. All 2,000 sample customers were contacted by email. Customers in groups 1 and 2 were given a link to the new web site that included product ratings, and customers in groups 3 and 4 were given a link to the currently used web page. Customers in groups 1 and 3 were given a $20 coupon, and customers in groups 2 and 4 did not receive the coupon. The number of sample customers who made a purchase within the next 60 days was determined for each of the four groups, and the data are given below.

Group 1            Group 2            Group 3            Group 4
\(f_{11} = 42\)    \(f_{12} = 61\)    \(f_{21} = 73\)    \(f_{22} = 94\)

The 95% confidence interval for the interaction effect was [-0.055, 0.064]. This interaction effect was determined to be small, and main effects were examined next. The 95% confidence interval for the main effect of web page was [-0.094, -0.034]. Because this interval is entirely negative, it indicates that the proportion of the 400,000 customers who would make a purchase from the new web site would be 0.034 to 0.094 smaller than with the old web site.
The 95% confidence interval for the main effect of the $20 coupon was [-0.070, -0.010]. Because this interval is entirely negative, it indicates that the proportion of the 400,000 customers who would make a purchase after receiving a $20 coupon would be 0.010 to 0.070 smaller than without the coupon.

Non-experimental Designs

The confidence interval for \(\pi_1 - \pi_2\) (Equation 7.5), the confidence interval for \(\sum_{j=1}^{k} c_j \pi_j\) (Equation 7.7), and the chi-square test of independence also can be applied to non-experimental designs where participants are classified into groups according to some preexisting characteristic, such as freshman/sophomore, democrat/republican, and male/female, rather than being randomly assigned to treatment conditions. In non-experimental designs, an observed relation between the independent variable and the dichotomous dependent variable cannot be interpreted as a causal relation because the relation may be due to one or more unmeasured variables (i.e., confounding variables) that are related to both the dependent variable and the independent variable.

The interpretation of the population proportions is not the same in experimental and non-experimental designs. Specifically, in a non-experimental design \(\pi_j\) is the proportion of all people in the study subpopulation who belong to category j (e.g., male, democrat, freshman) and have the specified outcome.

Within-subjects Experiments

The analysis of within-subjects designs is more complicated with a dichotomous dependent variable than with a quantitative dependent variable. The simplest case, where each participant is measured under two treatment conditions, will be considered here. With two dichotomous measurements (coded 1 or 2, with 1 indicating the presence of the outcome) there are four possible response patterns in a 2-treatment study, as shown below:

Treatment 1:    1               1               2               2
Treatment 2:    1               2               1               2
                \(\pi_{11}\)    \(\pi_{12}\)    \(\pi_{21}\)    \(\pi_{22}\)

where \(\pi_{ij}\) is the proportion of people in the study population who would have an i response (i = 1 or 2) for Treatment 1 and a j response (j = 1 or 2) for Treatment 2. For example, \(\pi_{21}\) is the proportion of people in the study population who would not have exhibited the outcome under Treatment 1 (i = 2) but would have exhibited the outcome under Treatment 2 (j = 1). The two parameters of primary interest are \(\pi_1 = \pi_{11} + \pi_{12}\) and \(\pi_2 = \pi_{11} + \pi_{21}\), where \(\pi_1\) is the proportion of people in the study population who would have a measurement 1 response of 1 and \(\pi_2\) is the proportion of people in the study population who would have a measurement 2 response of 1.
Assuming no carryover effect, \(\pi_1 - \pi_2\) has the same interpretation in a two-level within-subjects experiment as in a two-group between-subjects experiment. In a random sample of size n, the estimates of the population proportions are \(\hat{\pi}_{ij} = f_{ij}/n\), \(\hat{\pi}_1 = \hat{\pi}_{11} + \hat{\pi}_{12}\), and \(\hat{\pi}_2 = \hat{\pi}_{11} + \hat{\pi}_{21}\). An approximate \(100(1 - \alpha)\%\) confidence interval for \(\pi_1 - \pi_2\) is

\[ \hat{\pi}_1 - \hat{\pi}_2 \pm z_{\alpha/2} SE_{\hat{\pi}_1 - \hat{\pi}_2} \tag{7.8} \]

where \(SE_{\hat{\pi}_1 - \hat{\pi}_2} = \sqrt{[\hat{\pi}_{21} + \hat{\pi}_{12} - (\hat{\pi}_{21} - \hat{\pi}_{12})^2]/n}\) is the estimated standard error of \(\hat{\pi}_1 - \hat{\pi}_2\) in a within-subjects design. Note that \(\hat{\pi}_1 - \hat{\pi}_2 = (\hat{\pi}_{11} + \hat{\pi}_{12}) - (\hat{\pi}_{11} + \hat{\pi}_{21}) = \hat{\pi}_{12} - \hat{\pi}_{21}\), so that Equation 7.8 may be computed using only the estimates of \(\pi_{12}\) and \(\pi_{21}\).

SPSS does not compute Equation 7.8, but it does provide a test of H0: \(\pi_1 = \pi_2\) called the McNemar test. The McNemar test rejects H0 if the test statistic \((f_{12} - f_{21})^2/(f_{12} + f_{21})\) is greater than \(z^2_{\alpha/2}\) (or, alternatively, if the p-value for the test statistic is less than \(\alpha\)).

Recall that in a within-subjects experiment with a quantitative dependent variable, the correlation between the measurements determines the magnitude of the standard error, with larger correlations resulting in a smaller standard error. It is not obvious from the standard error in Equation 7.8 how the association between the two dichotomous measures affects the value of the standard error. After some tedious algebra, it can be shown that \(SE_{\hat{\pi}_1 - \hat{\pi}_2} = \sqrt{[\hat{\pi}_{21} + \hat{\pi}_{12} - (\hat{\pi}_{21} - \hat{\pi}_{12})^2]/n}\) can be re-expressed as

\[ SE_{\hat{\pi}_1 - \hat{\pi}_2} = \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n} - \frac{2\hat{\rho}_{12}\sqrt{\hat{\pi}_1(1 - \hat{\pi}_1)\hat{\pi}_2(1 - \hat{\pi}_2)}}{n}} \tag{7.9} \]

where \(\hat{\rho}_{12}\) is the sample Pearson correlation between the two dichotomous measurements. When a Pearson correlation is computed from two dichotomous variables it is usually referred to as a phi coefficient. It is clear from Equation 7.9 that the value of \(SE_{\hat{\pi}_1 - \hat{\pi}_2}\) in a within-subjects experiment is determined by the magnitude of the correlation between the two dichotomous measurements. In a typical within-subjects experiment, the correlation between the two dichotomous measures is positive, and consequently \(SE_{\hat{\pi}_1 - \hat{\pi}_2}\) will usually be smaller for a within-subjects experiment than for a corresponding between-subjects experiment.

Example 7.5. A random sample of 40 participants was obtained from a volunteer pool of undergraduate students.
One by one, each participant was brought into a room with 15 other students (who were actually working with the experimenter and were acting out rehearsed roles). The group discussed the pros and cons of providing military aid to a particular country. As rehearsed, none of the arguments given by the 15 students were very convincing. The discussion leader asked for a show of hands of those in favor of providing military aid and, as rehearsed, 14 of the 15 students raised their hands. Before leaving the meeting, everyone was asked to write their vote (yes or no) on a piece of paper and place it in a box as they left the room. Each participant was measured twice: their public vote and their private vote. The sample data are shown below.

Public vote:     Y       Y     N     N
Private vote:    Y       N     Y     N
Frequency:       [...]   26    4     [...]

The value of \(\pi_1 - \pi_2\) can be thought of as the effect of social pressure. The 95% confidence interval for \(\pi_1 - \pi_2\) was

\[ (26/40 - 4/40) \pm 1.96\sqrt{[.65 + .1 - (.65 - .1)^2]/40} = [.343, .757], \]

indicating that the proportion of all college students in the study population who would vote with the majority would be .343 to .757 greater when voting publicly than privately.

Logistic Regression Model

The multiple linear regression model in Module 5 is appropriate when the response variable is quantitative. If the response variable is dichotomous, the following logistic regression model can be used

\[ y_i = \frac{\theta_i}{1 + \theta_i} + e_i \tag{7.10} \]

where \(y_i\) is a dichotomous response variable with values 0 and 1, \(\theta_i = \exp(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_q x_{qi})\) is the odds of person i having the outcome (so that \(\beta_0 + \beta_1 x_{1i} + \dots + \beta_q x_{qi}\) is the log-odds), \(e_i\) is the prediction error for person i, and \(\theta_i/(1 + \theta_i)\) is the probability of person i having the outcome. The slope coefficient \(\beta_j\) describes the change in the log-odds of the outcome associated with a 1-point increase in \(x_j\), controlling for all other predictor variables in the model. Logistic regression programs will also report an estimate of \(\exp(\beta_j)\), which describes the multiplicative change in the odds of the outcome associated with a 1-point increase in \(x_j\), controlling for all other predictor variables in the model. The value \(100[\exp(\beta_j) - 1]\%\) describes the percent change in the odds of the outcome associated with a 1-point increase in \(x_j\), controlling for all other predictor variables in the model.

To better understand the interpretation of a slope coefficient, it is necessary to understand the definition of odds. The odds of some outcome is defined as the probability of the outcome divided by 1 minus the probability of the outcome. For example, if the probability of the outcome is .8, then the odds of the outcome is .8/(1 - .8) = 4. Probabilities are usually more easily interpreted than odds.
The estimated probability of the outcome for participant i is \(\hat{\theta}_i/(1 + \hat{\theta}_i)\), where \(\hat{\theta}_i = \exp(\hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \dots + \hat{\beta}_q x_{qi})\). It is often informative to examine the estimated probability of the outcome for low, moderate, and high values of one predictor variable with the values of all other predictor variables set at their mean value.
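The relations among probabilities, odds, slope coefficients, and estimated probabilities can be made concrete with a short sketch. Everything named here is hypothetical: the helper functions, the intercept of -4.0, and the slopes 0.04 and 1.1 are invented for illustration and are not estimates from any data set in these notes.

```python
from math import exp


def odds(p):
    """Odds of an outcome that has probability p."""
    return p / (1 - p)


def percent_change_in_odds(beta):
    """100[exp(beta) - 1]: percent change in odds per 1-point increase in x."""
    return 100 * (exp(beta) - 1)


def predicted_probability(intercept, slopes, xs):
    """theta / (1 + theta), with theta = exp(b0 + b1*x1 + ... + bq*xq)."""
    theta = exp(intercept + sum(b * x for b, x in zip(slopes, xs)))
    return theta / (1 + theta)


print(f"{odds(0.8):.2f}")                    # odds for a probability of .8
print(f"{percent_change_in_odds(0.04):.2f}%")

# Hypothetical coefficients: probability at a low, moderate, and high value
# of the first predictor, holding the second predictor fixed at 3.2.
for x1 in (0, 12, 24):
    p = predicted_probability(-4.0, [0.04, 1.1], [x1, 3.2])
    print(x1, f"{p:.3f}")
```

Evaluating the probability over a small grid of predictor values, as in the loop, is the kind of summary the paragraph above recommends when odds are hard to interpret directly.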

The parameter estimates \((\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_q)\) and their standard errors do not have simple formulas. These estimates and their standard errors can be obtained using logistic regression computer programs. These programs will compute the following approximate \(100(1 - \alpha)\%\) confidence interval for \(\beta_j\)

\[ \hat{\beta}_j \pm z_{\alpha/2} SE_{\hat{\beta}_j} \tag{7.11} \]

and an approximate \(100(1 - \alpha)\%\) confidence interval for \(\exp(\beta_j)\) is obtained by exponentiating the lower and upper limits of Equation 7.11.

Example 7.5. A random sample of 200 graduating psychology majors seeking employment was obtained, and all participants agreed to report their employment status six months after graduation. The number of months of work experience (\(x_1\)) and GPA in psychology courses (\(x_2\)) were determined for all 200 participants at the time of graduation. Employment status (employed = 1, not employed = 0) six months after graduation was determined for all 200 participants. 95% confidence intervals for \(\beta_j\) and \(100[\exp(\beta_j) - 1]\%\) are given below.

                   \(\hat{\beta}_j\)   95% Lower Limit   95% Upper Limit   \(100[\exp(\hat{\beta}_j) - 1]\%\)   95% Lower Limit   95% Upper Limit
Work Experience    [...]               [...]             [...]             [...]%                               0.90%             7.14%
Psychology GPA     [...]               [...]             [...]             [...]%                               6.93%             17.23%

The psychologist can be 95% confident that a 1-month increase in work experience, controlling for psychology GPA, is associated with a 0.90% to 7.14% increase in the odds of being employed. The psychologist can be 95% confident that a 1-point increase in psychology GPA, controlling for work experience, is associated with a 6.93% to 17.23% increase in the odds of being employed.

The estimated probabilities of employment for \(x_1\) = 0, 12, and 24 months of work experience, assuming a psychology GPA of \(x_2\) = 3.20, are given below.

Months of Work Experience    Estimated Probability of Employment
0                            [...]
12                           [...]
24                           [...]

The estimated probabilities of employment for psychology GPAs of \(x_2\) = 2.6, 3.2, and 3.8, assuming 12 months of work experience, are given below.
Psychology GPA    Estimated Probability of Employment
2.6               [...]
3.2               [...]
3.8               [...]

The above estimated probabilities of employment for different values of work experience and psychology GPA provide a useful description of how strongly work experience and psychology GPA are related to the probability of employment.

Sample Size Requirements

The sample size requirement to estimate \(\pi\) with desired confidence and precision is approximately

\[ n = 4[\tilde{\pi}(1 - \tilde{\pi})](z_{\alpha/2}/w)^2 \tag{7.12} \]

where w is the desired width of the \(100(1 - \alpha)\%\) confidence interval and \(\tilde{\pi}\) is a planning value for \(\pi\). In situations where the psychologist has no idea about the likely value of \(\pi\), the planning value can be set to .5, which maximizes the term in square brackets and gives a sample size requirement that is larger than needed to obtain the desired width. The psychologist will often have a range of plausible values for \(\pi\), and using the value within the plausible range that is closest to .5 will give a conservatively large sample size requirement.

Example 7.6. A psychologist is working with a public policy group to help design an advertisement that will persuade voters to support a ¼ cent sales tax increase. The sales tax will be used to fund a new psychological support program for adolescents who have entered the criminal justice system as first-time offenders. Before spending 2 million dollars to air the advertisement on TV, the psychologist wants to assess its persuasiveness using a random sample of registered voters. The psychologist decided to set \(\tilde{\pi} = .5\) and wants a 95% confidence interval for \(\pi\) to have a width of .15. The required sample size is \(n = 4[.5(1 - .5)](1.96/0.15)^2 = 170.7 \approx 171\).

The sample size requirement per group to estimate \(\pi_1 - \pi_2\) in a two-group design with desired confidence and precision is approximately

\[ n_j = 4[\tilde{\pi}_1(1 - \tilde{\pi}_1) + \tilde{\pi}_2(1 - \tilde{\pi}_2)](z_{\alpha/2}/w)^2 \tag{7.13} \]

where \(\tilde{\pi}_j\) is a planning value for \(\pi_j\).

Example 7.7. Thousands of people are currently serving prison terms because they confessed to crimes they did not commit. A psychologist is trying to understand why people make false confessions and is planning a study to determine if college students can be pressured into confessing to a minor crime they did not commit.
Participants will be randomly sampled from a volunteer pool and then randomized into two groups of equal size, with group 1 serving as a control condition. Using information about false confessions from the literature, the psychologist sets \(\tilde{\pi}_1 = .05\) and \(\tilde{\pi}_2 = .15\). The psychologist would like to obtain a 95% confidence interval for \(\pi_1 - \pi_2\) that has a width of 0.1. The sample size requirement per group is approximately \(n_j = 4[.05(.95) + .15(.85)](1.96/0.1)^2 = 268.9 \approx 269\).
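The two planning formulas above (Equations 7.12 and 7.13) lend themselves to a small sketch. The function names are hypothetical, and rounding the result up to the next whole participant is an assumption of this sketch rather than something the equations themselves state.

```python
from math import ceil
from statistics import NormalDist


def z_crit(conf):
    """Two-sided critical z-value for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def n_one_group(p_plan, width, conf=0.95):
    """Equation 7.12: sample size to estimate one proportion."""
    return ceil(4 * p_plan * (1 - p_plan) * (z_crit(conf) / width) ** 2)


def n_per_group_two_groups(p1_plan, p2_plan, width, conf=0.95):
    """Equation 7.13: per-group sample size to estimate pi_1 - pi_2."""
    variance_sum = p1_plan * (1 - p1_plan) + p2_plan * (1 - p2_plan)
    return ceil(4 * variance_sum * (z_crit(conf) / width) ** 2)


print(n_one_group(0.5, 0.15))                    # prints 171
print(n_per_group_two_groups(0.05, 0.15, 0.10))  # prints 269
```

Because the required sample size is proportional to \(1/w^2\), halving the desired width roughly quadruples the number of participants needed, which is worth checking before committing to a very narrow interval.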

The sample size requirement per group to estimate \(\sum_{j=1}^{k} c_j \pi_j\) with desired confidence and precision is approximately

\[ n_j = 4\Big[\sum_{j=1}^{k} c_j^2\, \tilde{\pi}_j(1 - \tilde{\pi}_j)\Big](z_{\alpha/2}/w)^2. \tag{7.14} \]

Example 7.8. A 2 × 2 factorial experiment is planned in which college student participants will indicate if they would or would not seriously consider purchasing a new type of smart phone. The participants will be randomized into four groups and each will be given a new smart phone to try for 30 days. The smart phones will be standard size or oversized (Factor A) and will have one of two different user interfaces (Factor B). From preliminary marketing research, the planning values for \(\pi_{11}\), \(\pi_{12}\), \(\pi_{21}\), and \(\pi_{22}\) were set to .1, .2, .2, and .3, respectively. The psychologist wants the 95% confidence intervals for each main effect to have a width of about 0.1. Applying Equation 7.14, with \(c_j^2 = 1/4\) for each group, gives the following approximate sample size per group: \(n_j = 4[.1(.9)/4 + .2(.8)/4 + .2(.8)/4 + .3(.7)/4](1.96/0.1)^2 = 238.2 \approx 239\).

The sample size requirement to estimate \(\pi_1 - \pi_2\) in a within-subjects design with desired confidence and precision is approximately

\[ n = 4\Big[\tilde{\pi}_1(1 - \tilde{\pi}_1) + \tilde{\pi}_2(1 - \tilde{\pi}_2) - 2\tilde{\rho}_{12}\sqrt{\tilde{\pi}_1\tilde{\pi}_2(1 - \tilde{\pi}_1)(1 - \tilde{\pi}_2)}\Big](z_{\alpha/2}/w)^2 \tag{7.15} \]

where \(\tilde{\rho}_{12}\) is a planning value of the phi coefficient. Setting \(\tilde{\rho}_{12}\) equal to the smallest value within a range of plausible values suggested by prior research or expert opinion will give a conservatively large sample size requirement.

Example 7.9. A study is planned in which community college students are given $20 and are then asked if they would like to play two different games. In the first game, they must bet $10 in a coin flip where they will either win another $12 or lose their $10. In the second game, they must bet $5 in a coin flip where they will either win another $6 or lose their $5. Each participant can choose to play both games, only the $10 game, only the $5 game, or neither game.
Based on results from a pilot study, the psychologist sets \(\tilde{\pi}_1 = .35\) (the expected proportion of students who will play the $5 game), \(\tilde{\pi}_2 = .25\) (the expected proportion of students who will play the $10 game), and \(\tilde{\rho}_{12} = .6\). The sample size required to estimate \(\pi_1 - \pi_2\) with 95% confidence and a width of 0.1 is \(n = 4[.25(.75) + .35(.65) - 2(.6)\sqrt{(.25)(.35)(.75)(.65)}](1.96/0.1)^2 = 256.9 \approx 257\).

Assumptions

The confidence intervals (Equations 7.3, 7.5, 7.7, 7.8), the chi-square test of independence (Equation 7.6), and the McNemar test assume random sampling and independence among participants. These tests and confidence intervals, which are given in most statistics texts, are called large-sample methods because the coverage probability of the confidence interval is guaranteed to be close to \(1 - \alpha\), and the directional error probability of the statistical tests is guaranteed to be close to \(\alpha/2\), only in large samples.

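In recent years, simple alternatives to Equations 7.3, 7.5, 7.7, and 7.8 have been developed that perform well in small samples. An alternative to Equation 7.3 is the following Agresti-Coull confidence interval

\[ \tilde{\pi} \pm z_{\alpha/2} SE_{\tilde{\pi}} \tag{7.3-alt} \]

where \(SE_{\tilde{\pi}} = \sqrt{\tilde{\pi}(1 - \tilde{\pi})/(n + 4)}\) and \(\tilde{\pi} = (f + 2)/(n + 4)\).

An alternative to Equation 7.5 is the following Agresti-Caffo confidence interval

\[ \tilde{\pi}_1 - \tilde{\pi}_2 \pm z_{\alpha/2} SE_{\tilde{\pi}_1 - \tilde{\pi}_2} \tag{7.5-alt} \]

where \(SE_{\tilde{\pi}_1 - \tilde{\pi}_2} = \sqrt{\dfrac{\tilde{\pi}_1(1 - \tilde{\pi}_1)}{n_1 + 2} + \dfrac{\tilde{\pi}_2(1 - \tilde{\pi}_2)}{n_2 + 2}}\) and \(\tilde{\pi}_j = (f_j + 1)/(n_j + 2)\).

An alternative to Equation 7.7 is the following Price-Bonett confidence interval

\[ \sum_{j=1}^{k} c_j \tilde{\pi}_j \pm z_{\alpha/2} SE_{\sum c_j \tilde{\pi}_j} \tag{7.7-alt} \]

where \(SE_{\sum c_j \tilde{\pi}_j} = \sqrt{\sum_{j=1}^{k} \dfrac{c_j^2\, \tilde{\pi}_j(1 - \tilde{\pi}_j)}{n_j + 4/m}}\), \(\tilde{\pi}_j = (f_j + 2/m)/(n_j + 4/m)\), and m is the number of nonzero \(c_j\) values.

An alternative to Equation 7.8 is the following Bonett-Price confidence interval

\[ \tilde{\pi}_1 - \tilde{\pi}_2 \pm z_{\alpha/2} SE_{\tilde{\pi}_1 - \tilde{\pi}_2} \tag{7.8-alt} \]

where \(SE_{\tilde{\pi}_1 - \tilde{\pi}_2} = \sqrt{[\tilde{\pi}_{21} + \tilde{\pi}_{12} - (\tilde{\pi}_{21} - \tilde{\pi}_{12})^2]/(n + 2)}\) and \(\tilde{\pi}_{ij} = (f_{ij} + 1)/(n + 2)\).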
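Two of the adjusted intervals above can be sketched to show how little they differ computationally from the large-sample versions. The function names, the `adjusted` flag, and the counts in the calls are hypothetical; with `adjusted=False` the paired function falls back to the unadjusted within-subjects interval of Equation 7.8 so the two can be compared.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull_ci(f, n, conf=0.95):
    """Equation 7.3-alt: add 2 to the frequency count and 4 to the sample size."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = (f + 2) / (n + 4)
    se = sqrt(p * (1 - p) / (n + 4))
    return p - z * se, p + z * se


def paired_ci(f12, f21, n, conf=0.95, adjusted=True):
    """Equation 7.8-alt if adjusted, otherwise the large-sample Equation 7.8."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    if adjusted:
        p12, p21, denom = (f12 + 1) / (n + 2), (f21 + 1) / (n + 2), n + 2
    else:
        p12, p21, denom = f12 / n, f21 / n, n
    diff = p12 - p21                                  # equals pi_1 - pi_2 estimate
    se = sqrt((p21 + p12 - (p21 - p12) ** 2) / denom)
    return diff - z * se, diff + z * se


# Illustrative counts: 26 and 4 discordant responses among 40 participants.
wald = paired_ci(26, 4, 40, adjusted=False)
alt = paired_ci(26, 4, 40)
print(f"large-sample: [{wald[0]:.3f}, {wald[1]:.3f}]")
print(f"adjusted:     [{alt[0]:.3f}, {alt[1]:.3f}]")

ac = agresti_coull_ci(120, 300)
print(f"Agresti-Coull: [{ac[0]:.3f}, {ac[1]:.3f}]")
```

With these counts the adjusted paired interval is pulled slightly toward zero relative to the large-sample interval, while for a sample as large as 300 the Agresti-Coull limits are nearly the same as the unadjusted ones.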


More information

2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 When and why do we use logistic regression? Binary Multinomial Theory behind logistic regression Assessing the model Assessing predictors

More information

16.400/453J Human Factors Engineering. Design of Experiments II

16.400/453J Human Factors Engineering. Design of Experiments II J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential

More information

Statistics for IT Managers

Statistics for IT Managers Statistics for IT Managers 95-796, Fall 2012 Module 2: Hypothesis Testing and Statistical Inference (5 lectures) Reading: Statistics for Business and Economics, Ch. 5-7 Confidence intervals Given the sample

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

Algebra 1. Statistics and the Number System Day 3

Algebra 1. Statistics and the Number System Day 3 Algebra 1 Statistics and the Number System Day 3 MAFS.912. N-RN.1.2 Which expression is equivalent to 5 m A. m 1 5 B. m 5 C. m 1 5 D. m 5 A MAFS.912. N-RN.1.2 Which expression is equivalent to 5 3 g A.

More information

Inferences About Two Proportions

Inferences About Two Proportions Inferences About Two Proportions Quantitative Methods II Plan for Today Sampling two populations Confidence intervals for differences of two proportions Testing the difference of proportions Examples 1

More information

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments. Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a

More information

PROBABILITY.

PROBABILITY. PROBABILITY PROBABILITY(Basic Terminology) Random Experiment: If in each trial of an experiment conducted under identical conditions, the outcome is not unique, but may be any one of the possible outcomes,

More information

ANOVA - analysis of variance - used to compare the means of several populations.

ANOVA - analysis of variance - used to compare the means of several populations. 12.1 One-Way Analysis of Variance ANOVA - analysis of variance - used to compare the means of several populations. Assumptions for One-Way ANOVA: 1. Independent samples are taken using a randomized design.

More information

15: CHI SQUARED TESTS

15: CHI SQUARED TESTS 15: CHI SQUARED ESS MULIPLE CHOICE QUESIONS In the following multiple choice questions, please circle the correct answer. 1. Which statistical technique is appropriate when we describe a single population

More information

Sociology 593 Exam 2 Answer Key March 28, 2002

Sociology 593 Exam 2 Answer Key March 28, 2002 Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

FSA Algebra I End-of-Course Review Packet

FSA Algebra I End-of-Course Review Packet FSA Algebra I End-of-Course Review Packet Table of Contents MAFS.912.N-RN.1.2 EOC Practice... 3 MAFS.912.N-RN.2.3 EOC Practice... 5 MAFS.912.N-RN.1.1 EOC Practice... 8 MAFS.912.S-ID.1.1 EOC Practice...

More information

Exam III Review Math-132 (Sections 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 8.1, 8.2, 8.3)

Exam III Review Math-132 (Sections 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 8.1, 8.2, 8.3) 1 Exam III Review Math-132 (Sections 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 8.1, 8.2, 8.3) On this exam, questions may come from any of the following topic areas: - Union and intersection of sets - Complement of

More information

Psych 230. Psychological Measurement and Statistics

Psych 230. Psychological Measurement and Statistics Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State

More information

Investigating Models with Two or Three Categories

Investigating Models with Two or Three Categories Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might

More information

QUESTION ONE Let 7C = Total Cost MC = Marginal Cost AC = Average Cost

QUESTION ONE Let 7C = Total Cost MC = Marginal Cost AC = Average Cost ANSWER QUESTION ONE Let 7C = Total Cost MC = Marginal Cost AC = Average Cost Q = Number of units AC = 7C MC = Q d7c d7c 7C Q Derivation of average cost with respect to quantity is different from marginal

More information

FSA Algebra I End-of-Course Review Packet

FSA Algebra I End-of-Course Review Packet FSA Algebra I End-of-Course Review Packet Table of Contents MAFS.912.N-RN.1.2 EOC Practice... 3 MAFS.912.N-RN.2.3 EOC Practice... 5 MAFS.912.N-RN.1.1 EOC Practice... 8 MAFS.912.S-ID.1.1 EOC Practice...

More information

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests Chapter 59 Two Correlated Proportions on- Inferiority, Superiority, and Equivalence Tests Introduction This chapter documents three closely related procedures: non-inferiority tests, superiority (by a

More information

Tests for Two Correlated Proportions in a Matched Case- Control Design

Tests for Two Correlated Proportions in a Matched Case- Control Design Chapter 155 Tests for Two Correlated Proportions in a Matched Case- Control Design Introduction A 2-by-M case-control study investigates a risk factor relevant to the development of a disease. A population

More information

Chapter 9. Hypothesis testing. 9.1 Introduction

Chapter 9. Hypothesis testing. 9.1 Introduction Chapter 9 Hypothesis testing 9.1 Introduction Confidence intervals are one of the two most common types of statistical inference. Use them when our goal is to estimate a population parameter. The second

More information

Module 1. Study Populations

Module 1. Study Populations Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In social and behavioral research, a study population usually consists of a specific

More information

Hypothesis testing: Steps

Hypothesis testing: Steps Review for Exam 2 Hypothesis testing: Steps Repeated-Measures ANOVA 1. Determine appropriate test and hypotheses 2. Use distribution table to find critical statistic value(s) representing rejection region

More information

Chapter # classifications of unlikely, likely, or very likely to describe possible buying of a product?

Chapter # classifications of unlikely, likely, or very likely to describe possible buying of a product? A. Attribute data B. Numerical data C. Quantitative data D. Sample data E. Qualitative data F. Statistic G. Parameter Chapter #1 Match the following descriptions with the best term or classification given

More information

q3_3 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

q3_3 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. q3_3 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) In 2007, the number of wins had a mean of 81.79 with a standard

More information

Chapter 3: Examining Relationships

Chapter 3: Examining Relationships Chapter 3 Review Chapter 3: Examining Relationships 1. A study is conducted to determine if one can predict the yield of a crop based on the amount of yearly rainfall. The response variable in this study

More information

1 of 6 7/16/2009 6:31 AM Virtual Laboratories > 11. Bernoulli Trials > 1 2 3 4 5 6 1. Introduction Basic Theory The Bernoulli trials process, named after James Bernoulli, is one of the simplest yet most

More information

DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 2003 MOCK EXAMINATIONS IOP 201-Q (INDUSTRIAL PSYCHOLOGICAL RESEARCH)

DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 2003 MOCK EXAMINATIONS IOP 201-Q (INDUSTRIAL PSYCHOLOGICAL RESEARCH) DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 003 MOCK EXAMINATIONS IOP 01-Q (INDUSTRIAL PSYCHOLOGICAL RESEARCH) Time: hours READ THE INSTRUCTIONS BELOW VERY CAREFULLY. Do not open this question paper until

More information

Unit 1, Activity 1, Rational Number Line Cards - Student 1 Grade 8 Mathematics

Unit 1, Activity 1, Rational Number Line Cards - Student 1 Grade 8 Mathematics Unit, Activity, Rational Number Line Cards - Student Grade 8 Mathematics Blackline Masters, Mathematics, Grade 8 Page - Unit, Activity, Rational Number Line Cards - Student Blackline Masters, Mathematics,

More information

The Purpose of Hypothesis Testing

The Purpose of Hypothesis Testing Section 8 1A:! An Introduction to Hypothesis Testing The Purpose of Hypothesis Testing See s Candy states that a box of it s candy weighs 16 oz. They do not mean that every single box weights exactly 16

More information

Marquette University Executive MBA Program Statistics Review Class Notes Summer 2018

Marquette University Executive MBA Program Statistics Review Class Notes Summer 2018 Marquette University Executive MBA Program Statistics Review Class Notes Summer 2018 Chapter One: Data and Statistics Statistics A collection of procedures and principles

More information

Basic Concepts of Probability. Section 3.1 Basic Concepts of Probability. Probability Experiments. Chapter 3 Probability

Basic Concepts of Probability. Section 3.1 Basic Concepts of Probability. Probability Experiments. Chapter 3 Probability Chapter 3 Probability 3.1 Basic Concepts of Probability 3.2 Conditional Probability and the Multiplication Rule 3.3 The Addition Rule 3.4 Additional Topics in Probability and Counting Section 3.1 Basic

More information

[ z = 1.48 ; accept H 0 ]

[ z = 1.48 ; accept H 0 ] CH 13 TESTING OF HYPOTHESIS EXAMPLES Example 13.1 Indicate the type of errors committed in the following cases: (i) H 0 : µ = 500; H 1 : µ 500. H 0 is rejected while H 0 is true (ii) H 0 : µ = 500; H 1

More information

Multiple Linear Regression

Multiple Linear Regression 1. Purpose To Model Dependent Variables Multiple Linear Regression Purpose of multiple and simple regression is the same, to model a DV using one or more predictors (IVs) and perhaps also to obtain a prediction

More information

Using SPSS for One Way Analysis of Variance

Using SPSS for One Way Analysis of Variance Using SPSS for One Way Analysis of Variance This tutorial will show you how to use SPSS version 12 to perform a one-way, between- subjects analysis of variance and related post-hoc tests. This tutorial

More information

Factorial Independent Samples ANOVA

Factorial Independent Samples ANOVA Factorial Independent Samples ANOVA Liljenquist, Zhong and Galinsky (2010) found that people were more charitable when they were in a clean smelling room than in a neutral smelling room. Based on that

More information

Regression Models REVISED TEACHING SUGGESTIONS ALTERNATIVE EXAMPLES

Regression Models REVISED TEACHING SUGGESTIONS ALTERNATIVE EXAMPLES M04_REND6289_10_IM_C04.QXD 5/7/08 2:49 PM Page 46 4 C H A P T E R Regression Models TEACHING SUGGESTIONS Teaching Suggestion 4.1: Which Is the Independent Variable? We find that students are often confused

More information

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs) The One-Way Independent-Samples ANOVA (For Between-Subjects Designs) Computations for the ANOVA In computing the terms required for the F-statistic, we won t explicitly compute any sample variances or

More information

CHAPTER 9, 10. Similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities:

CHAPTER 9, 10. Similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities: CHAPTER 9, 10 Hypothesis Testing Similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities: The person is guilty. The person is innocent. To

More information

MAE Probability and Statistical Methods for Engineers - Spring 2016 Final Exam, June 8

MAE Probability and Statistical Methods for Engineers - Spring 2016 Final Exam, June 8 MAE 18 - Probability and Statistical Methods for Engineers - Spring 16 Final Exam, June 8 Instructions (i) One (two-sided) cheat sheet, book tables, and a calculator with no communication capabilities

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Simple Linear Regression: One Qualitative IV

Simple Linear Regression: One Qualitative IV Simple Linear Regression: One Qualitative IV 1. Purpose As noted before regression is used both to explain and predict variation in DVs, and adding to the equation categorical variables extends regression

More information

DSST Principles of Statistics

DSST Principles of Statistics DSST Principles of Statistics Time 10 Minutes 98 Questions Each incomplete statement is followed by four suggested completions. Select the one that is best in each case. 1. Which of the following variables

More information

Section 4.2 Basic Concepts of Probability

Section 4.2 Basic Concepts of Probability Section 4.2 Basic Concepts of Probability 2012 Pearson Education, Inc. All rights reserved. 1 of 88 Section 4.2 Objectives Identify the sample space of a probability experiment Identify simple events Use

More information

COVENANT UNIVERSITY NIGERIA TUTORIAL KIT OMEGA SEMESTER PROGRAMME: ECONOMICS

COVENANT UNIVERSITY NIGERIA TUTORIAL KIT OMEGA SEMESTER PROGRAMME: ECONOMICS COVENANT UNIVERSITY NIGERIA TUTORIAL KIT OMEGA SEMESTER PROGRAMME: ECONOMICS COURSE: CBS 221 DISCLAIMER The contents of this document are intended for practice and leaning purposes at the undergraduate

More information

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval

More information

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences Random Variables Christopher Adolph Department of Political Science and Center for Statistics and the Social Sciences University

More information

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Math 1332 Exam Review Name SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Find the cardinal number for the set. 1) {8, 10, 12,..., 66} 1) Are the sets

More information

Do not copy, post, or distribute

Do not copy, post, or distribute 14 CORRELATION ANALYSIS AND LINEAR REGRESSION Assessing the Covariability of Two Quantitative Properties 14.0 LEARNING OBJECTIVES In this chapter, we discuss two related techniques for assessing a possible

More information

Answer Key. 9.1 Scatter Plots and Linear Correlation. Chapter 9 Regression and Correlation. CK-12 Advanced Probability and Statistics Concepts 1

Answer Key. 9.1 Scatter Plots and Linear Correlation. Chapter 9 Regression and Correlation. CK-12 Advanced Probability and Statistics Concepts 1 9.1 Scatter Plots and Linear Correlation Answers 1. A high school psychologist wants to conduct a survey to answer the question: Is there a relationship between a student s athletic ability and his/her

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

Appendix A. Review of Basic Mathematical Operations. 22Introduction

Appendix A. Review of Basic Mathematical Operations. 22Introduction Appendix A Review of Basic Mathematical Operations I never did very well in math I could never seem to persuade the teacher that I hadn t meant my answers literally. Introduction Calvin Trillin Many of

More information

INTERVAL ESTIMATION AND HYPOTHESES TESTING

INTERVAL ESTIMATION AND HYPOTHESES TESTING INTERVAL ESTIMATION AND HYPOTHESES TESTING 1. IDEA An interval rather than a point estimate is often of interest. Confidence intervals are thus important in empirical work. To construct interval estimates,

More information

Regression With a Categorical Independent Variable

Regression With a Categorical Independent Variable Regression With a Independent Variable Lecture 10 November 5, 2008 ERSH 8320 Lecture #10-11/5/2008 Slide 1 of 54 Today s Lecture Today s Lecture Chapter 11: Regression with a single categorical independent

More information

Chapter 6 Continuous Probability Distributions

Chapter 6 Continuous Probability Distributions Continuous Probability Distributions Learning Objectives 1. Understand the difference between how probabilities are computed for discrete and continuous random variables. 2. Know how to compute probability

More information

Eco 391, J. Sandford, spring 2013 April 5, Midterm 3 4/5/2013

Eco 391, J. Sandford, spring 2013 April 5, Midterm 3 4/5/2013 Midterm 3 4/5/2013 Instructions: You may use a calculator, and one sheet of notes. You will never be penalized for showing work, but if what is asked for can be computed directly, points awarded will depend

More information

13.1 Categorical Data and the Multinomial Experiment

13.1 Categorical Data and the Multinomial Experiment Chapter 13 Categorical Data Analysis 13.1 Categorical Data and the Multinomial Experiment Recall Variable: (numerical) variable (i.e. # of students, temperature, height,). (non-numerical, categorical)

More information

Hypothesis testing: Steps

Hypothesis testing: Steps Review for Exam 2 Hypothesis testing: Steps Exam 2 Review 1. Determine appropriate test and hypotheses 2. Use distribution table to find critical statistic value(s) representing rejection region 3. Compute

More information

Final Exam Review STAT 212

Final Exam Review STAT 212 Final Exam Review STAT 212 1/ A market researcher randomly selects 100 homeowners under 60 years of age and 200 homeowners over 60 years of age. What sampling technique was used? A. Systematic B. Convenience

More information

Sampling Distributions: Central Limit Theorem

Sampling Distributions: Central Limit Theorem Review for Exam 2 Sampling Distributions: Central Limit Theorem Conceptually, we can break up the theorem into three parts: 1. The mean (µ M ) of a population of sample means (M) is equal to the mean (µ)

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Chapter 2 Modeling with Linear Functions

Chapter 2 Modeling with Linear Functions Chapter Modeling with Linear Functions Homework.1 1. a. b. c. In 199, t = 1. Notice on the scattergram, when t = 1, p is approximately 4.5. Therefore, 4.5% of deaths due to car accidents in 199 were due

More information

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution

More information

Basic Statistics Exercises 66

Basic Statistics Exercises 66 Basic Statistics Exercises 66 42. Suppose we are interested in predicting a person's height from the person's length of stride (distance between footprints). The following data is recorded for a random

More information

Section 6.2 Hypothesis Testing

Section 6.2 Hypothesis Testing Section 6.2 Hypothesis Testing GIVEN: an unknown parameter, and two mutually exclusive statements H 0 and H 1 about. The Statistician must decide either to accept H 0 or to accept H 1. This kind of problem

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information

Harvard University. Rigorous Research in Engineering Education

Harvard University. Rigorous Research in Engineering Education Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected

More information

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014 LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers

More information

Chapter 5: Normal Probability Distributions

Chapter 5: Normal Probability Distributions Probability and Statistics Mrs. Leahy Chapter 5: Normal Probability Distributions 5.1 Introduction to Normal Distributions and the Standard Normal Distribution What is a Normal Distribution and a Normal

More information

Lecture 7: Hypothesis Testing and ANOVA

Lecture 7: Hypothesis Testing and ANOVA Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis

More information

Solving Equations by Adding and Subtracting

Solving Equations by Adding and Subtracting SECTION 2.1 Solving Equations by Adding and Subtracting 2.1 OBJECTIVES 1. Determine whether a given number is a solution for an equation 2. Use the addition property to solve equations 3. Determine whether

More information

CHAPTER 4. > 0, where β

CHAPTER 4. > 0, where β CHAPTER 4 SOLUTIONS TO PROBLEMS 4. (i) and (iii) generally cause the t statistics not to have a t distribution under H. Homoskedasticity is one of the CLM assumptions. An important omitted variable violates

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Chapter 9. Correlation and Regression

Chapter 9. Correlation and Regression Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in

More information

*Karle Laska s Sections: There is no class tomorrow and Friday! Have a good weekend! Scores will be posted in Compass early Friday morning

*Karle Laska s Sections: There is no class tomorrow and Friday! Have a good weekend! Scores will be posted in Compass early Friday morning STATISTICS 100 EXAM 3 Spring 2016 PRINT NAME (Last name) (First name) *NETID CIRCLE SECTION: Laska MWF L1 Laska Tues/Thurs L2 Robin Tu Write answers in appropriate blanks. When no blanks are provided CIRCLE

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information