SALES AND MARKETING Department MATHEMATICS. 2nd Semester. Bivariate statistics. Tutorials and exercises


Online document: http://jff-dut-tc.weebly.com, section DUT Maths S2.
IUT de Saint-Etienne, Département TC, J.F. Ferraris. Math S2 Stat2Var TEx Rev2018.

Exercise 1. (Tutorial for lesson page 5)
Are people's behaviour in relation to tobacco and their gender related, at a 10% significance level? Here are the results of a survey of a sample of 51 men and 66 women:

G: variable "gender" (Gm: men, Gw: women)
B: variable "behaviour in relation to tobacco" (Bn: never smoked, Bs: smoke, Bss: stopped smoking)

Observed frequencies:
         Gm    Gw
Bn       12    23
Bs       31    26
Bss       8    17

Theoretical frequencies according to H0:
         Gm    Gw
Bn
Bs
Bss

Detailed Chi-squares and total:
         Gm    Gw
Bn
Bs
Bss

1) Place the subtotals and the general total in the first table, and do the same in the second one.
2) Fill in the second table (6 central theoretical values) using proportional calculations.
3) Table #3: calculate the six Chi-square values, then add them to get the value χ²calc.
4) Test writing:
   Null hypothesis:
   Observed χ² (value of the variable χ² between the observed and the theoretical samples): χ²calc =
   Rejection area:
     Significance level: α =
     Number of degrees of freedom (dof): (r-1)(k-1) =
     Limit value of χ² before rejection: χ²lim =
   Comparison and decision:

Exercise 2.
Two candidates compete in a presidential election: NS and FH. In a small town, there are 500 voters: 100 are retired, 50 are unemployed and 350 are employees. The vote results there are:

voters \ candidates    FH    NS    blank/abstention
unemployed             24    16    10
employees             122   148    80
retired                36    27    37

1) Decide, at a 1% significance level, whether or not people's opinion depends on their social group.
2) What can we say if we do not include blank votes and abstentions?
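The steps of Exercise 1 (subtotals, theoretical frequencies by proportionality, partial Chi-squares, total) can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original sheet: the function name chi_square_independence is hypothetical, and the limit value χ²lim still has to be read from a Chi-square table.

```python
# Observed counts from Exercise 1: rows Bn, Bs, Bss; columns Gm, Gw.
observed = [[12, 23], [31, 26], [8, 17]]


def chi_square_independence(table):
    """Return (theoretical table, chi-square total, degrees of freedom).

    Hypothetical helper written for this illustration.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Under H0 (independence): row total * column total / general total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # Sum of the partial chi-squares (observed - theoretical)^2 / theoretical.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, dof


expected, chi2_calc, dof = chi_square_independence(observed)
print(f"chi2_calc = {chi2_calc:.3f}, dof = {dof}")
```

The same function can be applied to the 3-by-3 table of Exercise 2 by changing the observed list.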

Exercise 3.
The table shows attendance in two stores A and B: how many people made at least one purchase. These clients have been sorted by age group (10 to 15 years old, and so on).

age \ store     A     B
10-15          46    24
15-20          29    35
20-40          14    17
> 40           12    18

1) Say, at a 5% significance level, whether the chosen store depends on the age of a client.
2) Which age group contributes most to the previous result? Explain.
3) Give the meaning of the 5% significance level for your first answer.
4) According to your Chi² table, can you be more accurate about the risk taken in this statement (your first answer)?

Exercise 4.
In a survey, 100 people were asked about their age and their cinema attendance. We name X the variable "age" and Y the variable "number of annual cinema shows". The survey result is the following table of counts (number of responses):

Y \ X       [15; 25[   [25; 50[   ≥ 50
none            4          6       13
1 to 11        10         16       15
12 to 23       13          8        4
≥ 24            6          3        2

1) Using a χ² independence test, at a 2% significance level, decide whether or not there is a link between age and the level of cinema attendance.
2) Using the table on your formula sheet, discuss the level of confidence you can assign to the assertion: they are dependent.
3) Identify the largest partial χ² values and give the meaning of these high values.

Exercise 5. (Tutorial for lesson page 6)
Let's take a close look at how a company's turnover evolves over time.

turnover (M€)    Q1    Q2    Q3    Q4
2009             28    45    49    36
2010             30    44    48    40
2011             28    46    52    37
2012             31    42    54    39

Despite large seasonal variations, due to the nature of its activity, is it possible to identify an overall trend over several years?

Let's calculate and display the 5-by-5 moving means (do it as a group job: divide the set of calculations with your neighbours and share your results). The results table has columns X and Y, with one row per window of five quarters: 1-5, 2-6, 3-7, and so on.

calculations:

Exercise 6. (Tutorial for lesson page 7)
Let's return to one of the examples introduced on page 3 of the lesson document: the effect of the amount of fertilizer on the harvest.

plot #    fertilizer X (kg·ha⁻¹)    harvest Y (q·ha⁻¹)
1                150                      46
2                 80                      37
3                120                      46
4                220                      51
5                100                      43

1) For each half-cloud, determine the coordinates of the mean point.
2) Determine the expression of Mayer's line (G1G2).
3) On a graph, plot the points of the initial table and draw this line.

Exercise 7.
Determine the expression of Mayer's line for the case given in exercise 5.

Exercise 8. (Tutorial for lesson page 8)
Calculate or display on your calculator: the means and standard deviations; the covariance.
1) Using the data of exercise 6 (fertilizer/harvest).
2) Using the data of exercise 4 (age/number of cinema shows): choose 60 as the average age for the class "50 and more"; choose 36 as the average number of shows for the class "24 and more".

Exercise 9. (Tutorial for lesson page 9)
Let's consider the following time series: a company's annual advertising expenses.

X: year           2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013
Y: expense (k€)     41    60    55    66    87    61    90    95    82   120   125   118

The corresponding scatter plot is represented:

[Figure: scatter plot of the series; label "year 1: 2006"]

Determine the expression of the Y on X fitting line, using the least squares method; then draw it.

Exercise 10.
500 people who have passed their driving licence exam are sorted in the table below. They are distributed with respect to the number X of times they took the exam before passing it and the number Y of hours of driving lessons before their first attempt.

[Table not reproduced in this transcription]

1) Define a marginal frequency. Then give an example from the table.
2) Briefly describe how to enter the data set in your calculator.
3) Calculate the covariance of the pair (X, Y) and give a concrete comment about this value.
4) Among those who took between 15 and 25 hours of driving lessons, what is the proportion of those who passed their exam on the third attempt?
5) Among those who passed their exam on the third attempt, what is the proportion of those who took between 15 and 25 hours of driving lessons?

Exercise 11.
A sales agent wishes to analyse his (or her) activity and efficiency. For each appointment with a prospect, the length (X, in minutes) of the product presentation and the quantity sold (Y) were noted. The twelve values inside the table are the numbers of appointments that correspond to each pair (X, Y).

[Table not reproduced in this transcription]

1) Give the meaning of the frequency "8" found inside the table.
2) Calculate, manually, the average time spent per appointment.
3) Give the covariance of the pair (X, Y).

Exercise 12.
The following table gives the sales price (€) of a piece of equipment and the number of items sold, for 4 years.

year rank                 1     2     3     4
sales price (€) X       300   210   270   375
# of sold items Y       198   240   222   160

1) Build the scatter plot in an orthogonal coordinate system. The axes must intersect at the point (210, 160); scales: 1 cm for €15 on the x-axis, 1 cm for 10 items on the y-axis.

2) Determine the coordinates of G, the mean point of the cloud.
3) a. Determine the expression of the Y on X fitting line, using the least squares method. The coefficients will be expressed with 6 significant figures.
   b. Draw this regression line on the graph.
4) Which year saw the highest turnover? For what amount?
Going further:
5) Now we assume that, each year, the number of sold items y and the sales price x are related this way: y = -0.498x + 349. We denote by S(x) the turnover achieved by selling y items at €x each.
   a. Express S(x) with respect to x.
   b. Find the variations of the function S defined on [210 ; 375].
   c. Deduce the sales price we would have to set for a fifth year if we want a maximum turnover. How many items will be sold (round to one unit)? For what turnover?

Exercise 13.
A survey aims to compare people's expenses on high-tech equipment with their income. Each column of the table T below represents, in a given French region, the average monthly income of people (X) and the average monthly expense (Y) on high-tech equipment.

region           A      B      C      D      E      F
income X (€)   1550   1620   1770   1850   1930   2000
expense Y (€)    57     61     66     73     76     82

1) Calculate the covariance and then the linear correlation coefficient of the pair (X, Y). Give an interpretation of both parameters.
2) a. Give, by means of your calculator, the expression of the Y on X regression line.
   b. Obtain the expression of Mayer's line of the series, from the table T.
   c. The two lines differ slightly. Find the income for which they both give the same expense. What makes this common point special, inside the point cloud?

Exercise 14. (Tutorial for lesson page 12)
Data about the fuel consumption of a motorcycle have been collected (consumption: Y, in L/100 km; speed: X, in km/h):

X     10     20     30     40     50     60     70     80     90
Y   15.2   11.6    9.3    7.8      7    6.6    6.9      8    9.6

The scatter plot clearly shows that a linear regression would be inappropriate to describe how the consumption evolves with respect to the speed. Thus, we will propose a change of variable.
1) Let's define the variable T by: T = (X - 60)². Complete the following table:

T
Y   15.2   11.6    9.3    7.8      7    6.6    6.9      8    9.6

2) Perform a linear regression of Y on T.
3) Deduce the expression of the regression curve for the initial scatter plot.
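The change of variable in Exercise 14 can be sketched as follows: compute T = (X - 60)², fit the least squares line of Y on T, then substitute T back to obtain a curve in X. This is a minimal sketch rather than the sheet's own solution; the helper names least_squares_line and predicted_consumption are hypothetical, and the slope is computed as Cov(T, Y) / V(T).

```python
# Data from Exercise 14: speed X (km/h) and consumption Y (L/100 km).
speed = [10, 20, 30, 40, 50, 60, 70, 80, 90]
consumption = [15.2, 11.6, 9.3, 7.8, 7, 6.6, 6.9, 8, 9.6]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Change of variable: T = (X - 60)^2.
t_values = [(x - 60) ** 2 for x in speed]
a, b = least_squares_line(t_values, consumption)


def predicted_consumption(x):
    """Regression curve in the original variable: y = a (x - 60)^2 + b."""
    return a * (x - 60) ** 2 + b


print(f"Y = {a:.5f} T + {b:.3f}")
```

The regression is linear in T but quadratic in X, which is why the fitted curve can follow the U shape of the original scatter plot.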

Exercise 15. (quadratic fitting)
A company recorded its profits Y with respect to X, the quantity produced and sold:

X (tons)    2    3    5    7   11
Y (k€)     38   55   72   69   24
T

1) Using your calculator, give the linear correlation coefficient between X and Y. Comment.
2) Let's define the variable T = -(X - 6)².
   a. Complete the table.
   b. Calculate Cov(T, Y) and then the linear correlation coefficient between both variables.
   c. Is a linear fitting of Y on T appropriate?
   d. Determine the expression of the Y on T fitting line, using the least squares method.
   e. Deduce an expression of the regression of Y on X.

Exercise 16. (quadratic fitting)
A market study was conducted on a new type of product. The table below gives, for several proposed sales prices, the number of people willing to pay that price.

unit price (€) X        2    3    4    5    6    7
number of people Y     66   47   34   25   18   14

1) Calculate the covariance of the variables X and Y, then comment on its sign.
2) We set T = X(X - 20).
   a. Calculate the linear correlation coefficient between the variables T and Y.
   b. Comment on its value.
   c. Determine the expression of the Y on T fitting line, using the least squares method.
   d. Deduce an expanded expression of the regression of Y with respect to X.
3) Here we examine the expected turnover (unit selling price × number of sales), if the numbers of responses obtained in the survey are considered to be the numbers of units sold.
   a. Calculate the turnovers that can be extracted from the initial table.
   b. Calculate, for the same values of X, the turnovers CA' that can be obtained from the formula found in question 2)d.
   c. What unit selling price should we set to reach the best turnover?

Exercise 17. (inverse fitting)
A perfumery, analysing its turnover, relates the quantities sold (Y) to the prices (X) of various perfume brands and models. The results are gathered in the following table:

X, bottle's price (€)     15    25    30    40    45    60    75    90
Y, # of sold bottles     202   117   107    82    78    60    55    48

Answer the questions beginning with "calculate" by using your calculator's results.
1) a. Calculate the covariance of X and Y; comment on its sign.
   b. Calculate the linear correlation coefficient of X and Y; comment on its value.
2) In order to have a more precise idea of how X and Y are related, we set the change of variable: T = 850 / X.
   a. After calculating the list of values of T, in a third list (calculator), justify that the linear correlation between T and Y is excellent.
   b. Give the expression of the Y on T regression line, according to the least squares method.
   c. What is the least squares criterion?
   d. Deduce from question 2)b a modelled expression of Y with respect to X.
   e. According to this model, how many bottles priced at €150 would the perfumery expect to sell?
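To see why the change of variable in Exercise 17 helps, one can compare the linear correlation coefficient of (X, Y) with that of (T, Y). The sketch below is illustrative only: it assumes T = 850 / X, and the function name correlation is hypothetical. It uses the usual definition r = Cov(X, Y) / (σ_X σ_Y).

```python
from math import sqrt

# Data from Exercise 17: bottle price X (euros) and number of bottles sold Y.
price = [15, 25, 30, 40, 45, 60, 75, 90]
bottles = [202, 117, 107, 82, 78, 60, 55, 48]


def correlation(xs, ys):
    """Linear correlation coefficient r = Cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sigma_x * sigma_y)


# Change of variable assumed here: T = 850 / X.
t_values = [850 / x for x in price]

r_xy = correlation(price, bottles)
r_ty = correlation(t_values, bottles)
print(f"r(X, Y) = {r_xy:.4f}")
print(f"r(T, Y) = {r_ty:.4f}")
```

A coefficient closer to 1 in absolute value for (T, Y) than for (X, Y) indicates that a straight line fits Y against T better than Y against X.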

Exercise 18. (Tutorial for lesson page 13)
Calculate the point estimates in the given situations.
1) Using exercise 9, give an estimate of the expense in 2015.
2) Using exercise 6, give an estimate of the quantity of fertilizer that would yield a harvest of 60 q/ha.
3) Using exercise 14, give an estimate of the fuel consumption when the speed is 100 km/h.

Exercise 19. (Tutorial for lesson page 13)
Let's return to exercise 9. We want to estimate the expense for the year 2015 by a 95% confidence interval.
1) a. Get the values of Y' from the values of X and the expression of the fitting line;
   b. Get the values of Z by dividing Y by Y';
   c. Then give the mean and standard deviation of Z.
2) Give the point estimate of the expense in 2015.
3) Give the coefficient u corresponding to the confidence level.
4) Then give the confidence interval.

Exercise 20. (Tutorial for lesson page 13)
Using exercise 6, estimate the harvest obtained with 300 kg/ha of fertilizer by a 99% confidence interval.
1) a. Get the values of Y' from the values of X and the expression of the fitting line;
   b. Get the values of Z by dividing Y by Y';
   c. Then give the mean and standard deviation of Z.
2) Give a point estimate of the harvest.
3) Give the coefficient u corresponding to the confidence level.
4) Then give the confidence interval.

Exercise 21. (Tutorial for lesson page 13)
For each person in a sample, a survey noted the age class (X) and the visual acuity (Y, 1/10 = 0.1):

Y \ X    [5; 35[   [35; 45[   [45; 55[   [55; 65[
0.3         1          5         10         20
0.6         8         12         25         18
0.9        55         30         14          6

Estimate the visual acuity of an 80-year-old person by a 99% confidence interval.

Exercise 22.
In a country, two variables are compared: the consumer force index and the turnover of its car industry:

consumer force (index) X            3.26   3.85   3.44   3.08   3.6
car industry turnover (G€) Y        9.3    9.56   9.36   9.24   9.47

1) Give the expression of the Y on X Mayer's line.
2) By means of a point estimate, give a value of the consumer force that would correspond to a car industry turnover of G€ 10.
3) Is a strong correlation between two variables a sign of a cause and effect relationship between them?

Exercise 23. (least squares + confidence interval)
Monthly revenues of a commercial website are listed below, from January to December 2015 (in k€):

3   5   4   8   10   9   13   12   17   18   18   21

1) In a few words, describe the least squares method.
2) Using the overall trend of the monthly revenue, give the 95% confidence interval of the predictable revenue in December 2016 (number the months from 1 for January 2015).
3) Give the probability that, in December 2016, the revenue would be less than k€ 29.23.
4) Build the scatter plot (scale: 2 cm for one month), draw the regression line and finally represent the confidence interval.
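The first steps listed in Exercises 19 and 20 (fitted values, ratios Z, point estimate) can be sketched on the data of Exercise 23. This is a hedged illustration, not the lesson's official procedure: the helper least_squares_line is hypothetical, and reading the ratio as Z = Y / Y' (observed value divided by fitted value) is an assumption. The interval formula itself is given in the lesson, not on this sheet, so it is not computed here.

```python
from math import sqrt

# Data from Exercise 23: monthly revenue (k euros), months 1..12 of 2015.
revenue = [3, 5, 4, 8, 10, 9, 13, 12, 17, 18, 18, 21]
months = list(range(1, len(revenue) + 1))


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares_line(months, revenue)

# Point estimate for December 2016, i.e. month number 24.
point_estimate = slope * 24 + intercept

# Fitted values Y' and ratios Z = Y / Y' (assumed reading of the sheet).
fitted = [slope * x + intercept for x in months]
ratios = [y / y_fit for y, y_fit in zip(revenue, fitted)]
mean_z = sum(ratios) / len(ratios)
std_z = sqrt(sum((z - mean_z) ** 2 for z in ratios) / len(ratios))

print(f"y' = {slope:.4f} x + {intercept:.4f}")
print(f"point estimate for month 24: {point_estimate:.2f} k euros")
print(f"mean of Z = {mean_z:.4f}, standard deviation of Z = {std_z:.4f}")
```

The standard deviation above is the population form (division by n); a calculator may also offer the sample form (division by n - 1), so check which one the lesson expects.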

Exercise 24. (Mayer + confidence interval)
The table lists eight of a country's major cities. The variable X gives, in thousands, the number of city residents; the variable Y gives, in thousands, the number of students in this city.

city     X     Y
A      850    58
B      623    37
C      587    38
D      360    20
E      312    16
F      275    15
G      262    12
H      244    12

1) Build the scatter plot from this data series.
2) Give the coordinates of the mean point of the cloud.
3) a. Using Mayer's method, determine manually the expression of the Y on X regression line.
   b. Draw this line. Does G belong to it?
   c. State "Mayer's principle".
4) We will use here another fitting line, whose expression is: y' = 0.07x - 6.
   a. With this line, give the 95% confidence interval of the predictable number of students in a town that has two million inhabitants.
   b. What can we say about the chances that the number of students would exceed 155,000 in such a town?

Exercise 25. (logarithmic fitting + confidence interval)
The service life of some identical office equipment has been studied. In the following table, t_i represents the duration of use, expressed in thousands of hours, and R(t_i) the proportion of equipment still in use at time t_i (e.g. after 1,000 hours, t_i = 1, 90% of the equipment is still in use: R(t_i) = 0.90).

t_i       1      2      3      4      5      6      7      8      9
R(t_i)   0.9   0.66   0.53    0.4   0.32   0.25   0.19   0.14    0.1

1) We set y_i = ln[R(t_i)], where ln is the natural logarithm. Fill in the following table, then build the scatter plot of the points M_i(t_i, y_i) in an orthogonal coordinate system.

t_i       1      2      3      4      5      6      7      8      9
y_i

2) Could a linear fitting be relevant for the previous scatter plot? Calculate the linear correlation coefficient between T and Y.
3) Using the least squares method, determine an expression of the Y on T regression line. Deduce from this expression that there are two positive real numbers k and λ such that: R(t) = k·e^(-λt).
4) In this question, we take k = 1.174 and λ = 0.266.
   a. Determine the predictable proportion of equipment still in use after 10,000 hours.
   b. After how long is exactly 50% of the equipment still in use?
5) Give a 99% confidence interval of the proportion of equipment still in use after 10,000 hours of service.

Exercise 26.
100 children have been classified by age (X) and height (Y):

X \ Y      [95 ; 105[   [105 ; 125[   [125 ; 135[
[3 ; 5[        15            10             0
[5 ; 7[         8            32             5
[7 ; 9[         2            13            15

1) Enter this table in your calculator.
2) Give the means and standard deviations of X and Y, and calculate their covariance.
3) Calculate their linear correlation coefficient. Comment on this value.
4) Nevertheless, does the table allow us to see some trend?
5) Assuming that the relationship between age and height is linear until the age of 12, give the 95% confidence interval of the height of a 12-year-old child.
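The logarithmic fitting of Exercise 25 can be sketched as follows: take y = ln R(t), fit the least squares line y = a·t + b, then read off k = e^b and λ = -a so that R(t) = k·e^(-λt). This is a minimal sketch under those assumptions; the helper names least_squares_line and modelled_rate are hypothetical. With the table's data, the fit should land close to the values k = 1.174 and λ = 0.266 quoted in question 4.

```python
from math import exp, log

# Data from Exercise 25: t in thousands of hours, R(t) = rate still in use.
t_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
rates = [0.9, 0.66, 0.53, 0.4, 0.32, 0.25, 0.19, 0.14, 0.1]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Change of variable: y_i = ln R(t_i).
y_values = [log(r) for r in rates]
a, b = least_squares_line(t_values, y_values)

# y = a t + b  <=>  R(t) = e^b * e^(a t), hence k = e^b and lambda = -a.
k = exp(b)
lam = -a


def modelled_rate(t):
    """Exponential model R(t) = k * exp(-lambda * t)."""
    return k * exp(-lam * t)


print(f"k = {k:.3f}, lambda = {lam:.3f}")
```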

IUT TC MATHEMATICS. FORMULA SHEET FOR BIVARIATE STATISTICS