Multivariate Statistical Analysis

Similar documents
Syllabus for CH-3300 Introduction to Physical Chemistry

ECE Digital Signal Processing Spring Semester 2010 Classroom: 219 Riggs Hall Meeting Times: T,TH 3:30-4:45 pm

GREAT IDEAS IN PHYSICS

CHEM 102 Fall 2012 GENERAL CHEMISTRY

GEOL 443 SYLLABUS. Igneous and Metamorphic Petrology, Spring 2013 Tuesday & Thursday 8:00 a.m. 9:15 a.m., PLS Date Subject Reading

Multivariable Calculus

CALIFORNIA STATE UNIVERSITY, East Bay Department of Chemistry. Chemistry 1615 Survey of Basic Chemistry for Healthier Living Fall Quarter, 2014

Prerequisite: one year of high school chemistry and MATH 1314

MTH 163, Sections 40 & 41 Precalculus I FALL 2015

Chemistry 401: Modern Inorganic Chemistry (3 credits) Fall 2017

Chemistry 103: Basic General Chemistry (4.0 Credits) Fall Semester Prerequisites: Placement or concurrent enrollment in DEVM F105 or higher

Physics 321 Introduction to Modern Physics

Physics 343: Modern Physics Autumn 2015

Fall 2017 CHE 275 Organic Chemistry I

MATH 135 PRE-CALCULUS: ELEMENTARY FUNCTIONS COURSE SYLLABUS FALL 2012

GEOL 103: Dynamic Earth

Physics Fundamentals of Astronomy

MATH 251 Ordinary and Partial Differential Equations Summer Semester 2017 Syllabus

GTECH 380/722 Analytical and Computer Cartography Hunter College, CUNY Department of Geography

Reid State Technical College

Intermediate/College Algebra, MATH 083/163, E8X

Quinsigamond Community College School of Math and Science

GEOLOGY 101 Introductory Geology Lab Hunter North 1021 Times, days and instructors vary with section

CENTRAL TEXAS COLLEGE SYLLABUS FOR MATH 2318 Linear Algebra. Semester Hours Credit: 3

Reid State Technical College

MATH-0965: INTERMEDIATE ALGEBRA

AAE 553 Elasticity in Aerospace Engineering

Textbooks, supplies and other Resources TITLE: CHEMISTRY: A MOLECULAR APPROACH EDITION:4 TH EDITION

Physics Fundamentals of Astronomy

HEAT AND THERMODYNAMICS PHY 522 Fall, 2010

GEO 448 Plate Tectonics Fall 2014 Syllabus

Important Dates. Non-instructional days. No classes. College offices closed.

CHEM General Chemistry II Course (Lecture) Syllabus Spring 2010 [T, Th]

CHEM 30A: Introductory General Chemistry Fall 2017, Laney College. Welcome to Chem 30A!

Advanced Engineering Mathematics Course Number: Math Spring, 2016

CHE 251 Contemporary Organic Chemistry

GEO 401 Physical Geology (Fall 2010) Unique Numbers Class: JGB 2.324; MWF 9:00-10:00 Labs: JGB 2.310; time according to your unique number

Course: Math 111 Pre-Calculus Summer 2016

Introductory Geosciences I: Geology 1121 Honors Earth s Internal Processes Georgia State University Fall Semester 2009

Advanced Mechanics PHY 504 Fall Semester, 2016 (Revised 08/24/16) The Course will meet MWF in CP183, 9:00-9:50 a.m., beginning August 24, 2016.

GEOG 671: WEATHER FORECASTING Tuesday- Thursday 11:10am - 12:30pm, Kingsbury 134N

Math 060/070 PAL: Elementary and Intermediate Algebra Spring 2016

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS

MA113 Calculus III Syllabus, Fall 2017

Course Syllabus Chemistry 111 Introductory Chemistry I

Introduction to Engineering Analysis - ENGR1100 Course Description and Syllabus Tuesday / Friday Sections. Spring '13.

STATISTICAL AND THERMAL PHYSICS

University of Houston-Clear Lake PHYS Modern Physics (Summer 2015) Syllabus 3:00-5:50pm Bayou 3324

INTRODUCTION TO NUCLEAR AND PARTICLE PHYSICS Physics 4/56301 SPRING 2016 INSTRUCTOR:

Exam policies Learning Disabilities Academic Honesty Class Attendance Getting Help

CHEM 25: Organic Chemistry I (2009FA-CHEM )

Math 410 Linear Algebra Summer Session American River College

JEFFERSON COLLEGE COURSE SYLLABUS MTH 128 INTERMEDIATE ALGEBRA. 3 Credit Hours. Prepared by: Beverly Meyers September 20 12

Chemistry 330 Fall 2015 Organic Chemistry I

Honors Algebra II / Trigonometry

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014

ENGR 3130: DYNAMICS University of Detroit Mercy Term I,

Instructor Dr. Tomislav Pintauer Mellon Hall Office Hours: 1-2 pm on Thursdays and Fridays, and by appointment.

Physics 103 Astronomy Syllabus and Schedule Fall 2016

SYLLABUS FOR [FALL/SPRING] SEMESTER, 201x

Prerequisite: Math 40 or Math 41B with a minimum grade of C, or the equivalent.

CHEM 4725/8725 Organometallic Chemistry. Spring 2016

University of Alaska Fairbanks Chemistry 103: Basic General Chemistry Course Syllabus

PHYS F212X FE1+FE2+FE3

Division of Arts & Sciences Department of Mathematics Course Syllabus

CHEM 333 Spring 2016 Organic Chemistry I California State University Northridge

Chemistry Syllabus Fall Term 2017

NEW RIVER COMMUNITY COLLEGE DUBLIN, VIRGINIA COURSE PLAN

CHEMISTRY 100 : CHEMISTRY and MAN

Astronomy 001 Online SP16 Syllabus (Section 8187)

ELECTROMAGNETIC THEORY

Biophysical Chemistry CHEM348 and CHEM348L

CHEM 115: Preparation for Chemistry

This course is based on notes from a variety of textbooks, National Weather Service Manuals, and online modules (e.g.,

Times/Room Friday 9:00 pm 3:00 pm Room B225 (lecture and laboratory) Course Semester Credit Total Course hours (lecture & lab)

CENTRAL TEXAS COLLEGE SYLLABUS FOR MATH 2415 CALCULUS III. Semester Hours Credit: 4

CENTRAL TEXAS COLLEGE-FORT RILEY SYLLABUS FOR DSMA 0301 DEVELOPMENTAL MATHEMATICS II SEMESTER HOURS CREDIT: 3 FALL 2014 SYLLABUS (08/11/14 10/05/14)

Syllabus, Math 343 Linear Algebra. Summer 2005

Intermediate Algebra

Historical Geology, GEOL 1120 (final version) Spring 2009

Modern Physics (PHY 371)

PHYS 1112: Introductory Physics-Electricity and Magnetism, Optics, Modern Physics

Important Dates. Non-instructional days. No classes. College offices closed.

MATH-0955: BEGINNING ALGEBRA

GIST 4302/5302: Spatial Analysis and Modeling

PHYSICS 206, Spring 2019

Physics Fundamentals of Astronomy

CHEM 121: Chemical Biology

West Virginia University Department of Civil and Environmental Engineering Course Syllabus

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS CONCEPTS OF CHEMISTRY CHEM 1310

Chemistry 4715/8715 Physical Inorganic Chemistry Fall :20 pm 1:10 pm MWF 121 Smith. Kent Mann; 668B Kolthoff; ;

CHE 717, Chemical Reaction Engineering, Fall 2016 COURSE SYLLABUS

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK NATURAL SCIENCES DEPARTMENT. SCC105: Introduction to Chemistry Fall I 2014

Course Content (visit for details)

Physics Observational Methods in Astronomy

Math 102,104: Trigonome and Precalculus Spring

REQUIRED TEXTBOOK/MATERIALS

BIOLOGY 3 INTRODUCTION TO BIOLOGY Tentative Lecture and Laboratory Schedule Spring 2016 WEEK DATE LECTURE TOPIC TEXT LAB EXERCISE (M, W)

Transcription:

Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Syllabus and Lecture 1 for Applied Multivariate Analysis

Outline Course Description 1 Course Description 2

What s this course about Applied Multivariate Analysis Multivariate data through experimentation and observation occur quite often in engineering, business, social sciences, as well as biological and physical sciences. This is a course in applied multivariate data analysis. It will cover descriptive and graphical methods for continuous multivariate data, the multivariate normal, multivariate tests of means, covariances and equality of distributions, univariate and multivariate regression and their comparisons, multivariate analysis of variance, covariance structure models, and discrimination and classification. Furthermore it should be emphasized that this course and hence the chosen text, is designed around the application of multivariate techniques to continuous data, time allowing we will endeavor to discuss methods of discrete multivariate analysis from prepared class notes.

Software Course Description Students will learn how to use statistical software to facilitate the understanding of the foundations of multivariate analysis. Statistical packages will include R, MatLab, SAS.

Prerequisites: Students taking MthSc 807 are expected to have had a course in regression or intermediate data analysis (perhaps 805) and exposure to some statistical software although it is not required. The equivalence of MthSc 403/603 (Intro to Statistical Theory) can be considered preparatory for those students interested in multivariate data analysis. Prerequisites are a working knowledge of general linear models, statistical inference concerning these types of models, and hypothesis testing, and elementary matrix operations.

Attendance Policy: All classes should be attended, but, if you are ill stay at home. I will accept e-mail or phone messages to that effect. Note that this does not exempt you from turning in homework/projects on time nor taking quizzes at their proposed times. Legitimate excuses must be offered with respect to the day(s) missed. Attendance will be monitored. It is to the instructors discretion whether an excuse is legitimate or not. Accordingly, the university s policy on religious holidays will be acknowledged and honored.

Tardy Professor Policy: If the instructor is more than 15 minutes late for any class you may leave.

Examination Policy: There will be two 50 minutes in class closed book examinations and a closed book final examination consisting of short answer questions requiring no more than a calculator. No makeup examinations will be given. Any student who misses an examination without a legitimate excuse,ie, a documented medical excuse, will receive a score of zero for that exam. A student with a legitimate excuse, will receive a final score based on all other class work. More than one missed exam with require withdrawal from the course and/or the receipt of a failing final grade.

Homework and/or Take Home Projects: There will also be several homework sets and/or take home projects assigned from the text as well as from material covered during class. It is imperative that each student be completely comfortable with these assigned problems and projects. Homework is a critical part of the course. It will include both theoretical and applied work. Late homework will not be accepted (although you may turn in assignments early).

Homework Constraints Homework should be detailed enough to adequately demonstrate your solution. You may discuss the homework problems with other students, however, the final work you turn in must be your own. Don t copy other students solutions. Questions regarding them can be asked during office hours and via email. All homework assignments must be sent electronically. This requires that solutions to homework sets and take home projects be typed. No hand written assignments will be accepted. All data sets will be available electronically to facilitate this being done. Homework sets will be due two class periods after they have been assigned.

Grading Policy: The two regular exams (20% each)(midterms if you wish) and a final exam (10%) which will count as 50% of the final grade, homework sets 10%, several multivariate exploratory data cases 20%, and a final multivariate data analysis project 20%. The final exam will cover the more important topics covered during the semester and will consist of several short answer questions requiring no more than a calculator. Grading Scale: A 100-90 B 89-80 C 79-70 D 69-60 F 59 -

Final Multivariate Data Analysis project The purpose of this project is to have the student conduct a complete and thorough analysis of a multivariate data set. Data sets can be selected from journal articles or other referenced sources. The analysis should be completely novel from the approach taken in the source. The final project should include: 1 A one page description of the data set, which should contain at least 50 observations and 5 variables, including the source. 2 A one page proposed analysis, methods, and rational for the methods used. 3 The results of the analysis, including a printouts and graphics. Graphics should be incorporated into results. Source code should be included in the appendices along with a reference page sourcing the problem and the data. 4 Data from the text book will not be allowed, although simulation of data similar to those in the text will be allowed.

Final Multivariate Data Analysis project Reports should be neatly typed, well-organized and attractive. Graphical displays (either computer-generated or hand-drawn) are encouraged. Generally, graphs are more effective if they are incorporated into the text, rather than hidden at the end of the report. You may also use a computer package to aid in the data analysis. If you do so, the results should be discussed in the text of your report, and the computer output itself may be included in an appendix. A typed rough draft of the final report will be due approximately 2 weeks before the final report is due (last day of class). The project is worth 100 points. Grades will be based on: Appropriate and correct procedures 50 pts; Well-written and attractive presentation 20 pts; Grammar, spelling and punctuation 20 pts; Complexity 10 pts.

Cell phones Cell phones should be unseen and unheard during class. You are not the exception to this rule. Students who fail to comply will be asked to leave the classroom. Use of them during class is prohibited. Please turn your phone off and store it away when you enter the classroom. It should not be left on your desk during class.

Academic Integrity: As members of the Clemson University community, we have inherited Thomas Green Clemson s vision of this institution as a high seminary of learning. Fundamental to this vision is a mutual commitment to truthfulness, honor, and responsibility, without which we cannot earn the trust and respect of others. Furthermore, we recognize that academic dishonesty detracts from the value of a Clemson degree. Therefore, we shall not tolerate lying, cheating, or stealing in any form. When in the opinion of a faculty member, there is evidence that a student has committed an act of academic dishonesty, the faculty member shall make a formal written charge of academic dishonesty including a description of the misconduct, to the Dean of the Graduate School. At the same time, the faculty member may, but is not required to, inform each involved student privately of the nature of the alleged charge.

Disability Access Statement Students with disabilities who need accommodations should make an appointment with me to discuss specific needs within the first two weeks of classes. Students should present a Faculty Accommodation Letter(FAL) from Student Disabilities Services when we meet. Student Disability Services is located in G-20 Redfern (www.clemson.edu/sds/). Please be aware that accommodations are not retroactive and new FAL must be presented each semester.

Lecture Topics Aug 24-31 3.1-3.12 Characterization of Multivariate Data Sep 2-7 4 Multivariate Normal Distribution Sep 9-16 5.1-5.3 Test on Mean Vectors Sep 16 5.4-5.9 Comparing Two mean vectors Sep 19-21 6.1-6.2 Multivariate Analysis of Variance Sep 23-26 6.5-6.7 Two-Way Classification Sep 28-30 6.8-6.11 Profile Analysis & Growth Curves Oct 3-5 7 Tests on Covariance Matrices Oct 7-14 8 Discriminant Analysis Oct 17 Fall Break Oct 19-24 9 Classification Analysis Oct 26-Nov 4 10 Multivariate Regression Nov 7-14 12 Principal Component Analysis Nov 16-21 13 Factor Analysis Nov 23-25 Thanksgiving Break Nov 28-Dec 9 14 Cluster Analysis Dec 9 14 (Last Day of Classes) Dec 16 th Friday Instructor: 3:00 C. L. Williams, pm - 5:30 Ph.D. pmmthscfinal 807 Examination

Aims of the next couple of meetings To explain what we mean by multivariate methods To provide an overview of the material we will cover To introduce the notation, to start thinking about linear combinations To provide some ideas on how to conduct a multivariate exploratory data analysis (eda)

Univariate Measures of Center and Variability I. Mean and Variances of Univariate Data-Common measures of center and variability y = n i=1 Generally, y will never be equal to µ; by this we mean that the probability is zero that a sample will ever arise in which y is exactly equal to µ. However, y is considered a good estimator for µ because E (y) = µ and var(y) = σ 2 n, where σ2 is the variance of y. In other words, y is an unbiased estimator of µ and has a smaller variance than a single observation y. The variance σ 2 is defined shortly. n y i

The notation E (y) indicates the mean of all possible values of y; that is, conceptually, every possible sample is obtained from the population, the mean of each is found, and the average of all these sample means is calculated. If every y in the population is multiplied by a constant a, the expected value is also multiplied by a: E (ay) = ae (y) = aµ.

The variance of the population is defined as var(y) = σ 2 = E (y µ) 2. This is the average squared deviation from the mean and is thus an indication of the extent to which the values of y are spread or scattered. It can be shown that σ 2 = E ( y 2) µ 2. The sample variance is defined as s 2 = n (y i y) 2 i=1 n 1 = n i=1 y 2 i ny 2 n 1

The sample variance s 2 is generally never equal to the population variance σ 2 (the probability of such an occurrence is zero), but it is an unbiased estimator for σ 2 ; that is, E ( s 2) = σ 2. Again the notation E ( s 2) indicates the mean of all possible sample variances. The square root of either the population variance or sample variance is called the standard deviation. If each y is multiplied by a constant a, the population variance is multiplied by a 2, that is, var(ay) = a 2 σ 2.

Diagnostic I I. If Z N(0,1), what is the distribution of Z 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic I I. If Z N(0,1), what is the distribution of Z 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic I I. If Z N(0,1), what is the distribution of Z 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic I I. If Z N(0,1), what is the distribution of Z 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic I I. If Z N(0,1), what is the distribution of Z 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic I I. If Z N(0,1), what is the distribution of Z 2? (c) χ 2 1

Diagnostic I I. If Z N(0,1), what is the distribution of Z 2? (c) χ 2 1

Diagnostic II II. If Z 1,Z 2,...,Z p iid N(0,1), what is the distribution of Z 2 1 + Z2 2 +... + Z2 p? (a) N(0,1 p ) (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic II II. If Z 1,Z 2,...,Z p iid N(0,1), what is the distribution of Z 2 1 + Z2 2 +... + Z2 p? (a) N(0,1 p ) (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic II II. If Z 1,Z 2,...,Z p iid N(0,1), what is the distribution of Z 2 1 + Z2 2 +... + Z2 p? (a) N(0,1 p ) (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic II II. If Z 1,Z 2,...,Z p iid N(0,1), what is the distribution of Z 2 1 + Z2 2 +... + Z2 p? (a) N(0,1 p ) (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic II II. If Z 1,Z 2,...,Z p iid N(0,1), what is the distribution of Z 2 1 + Z2 2 +... + Z2 p? (a) N(0,1 p ) (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic II II. If Z 1,Z 2,...,Z p iid N(0,1), what is the distribution of Z 2 1 + Z2 2 +... + Z2 p? (d) χ 2 p

Diagnostic II II. If Z 1,Z 2,...,Z p iid N(0,1), what is the distribution of Z 2 1 + Z2 2 +... + Z2 p? (d) χ 2 p

Diagnostic III III. What is: r = N (x i x)(y i ȳ)? i=1 (a) Variance (b) Covariance (c) Correlation (d) Standard deviation

Diagnostic III III. What is: r = N (x i x)(y i ȳ)? i=1 (a) Variance (b) Covariance (c) Correlation (d) Standard deviation

Diagnostic III III. What is: r = N (x i x)(y i ȳ)? i=1 (a) Variance (b) Covariance (c) Correlation (d) Standard deviation

Diagnostic III III. What is: r = N (x i x)(y i ȳ)? i=1 (a) Variance (b) Covariance (c) Correlation (d) Standard deviation

Diagnostic III III. What is: r = N (x i x)(y i ȳ)? i=1 (a) Variance (b) Covariance (c) Correlation (d) Standard deviation

Diagnostic III III. What is: r = N (x i x)(y i ȳ)? i=1 (b) Covariance

Diagnostic IV IV. What is the sample estimate of known as? (a) Variance (b) Covariance (c) Correlation (d) Standard deviation Cov(X,Y) Var(X) Var(Y ) also

Diagnostic IV IV. What is the sample estimate of known as? (a) Variance (b) Covariance (c) Correlation (d) Standard deviation Cov(X,Y) Var(X) Var(Y ) also

Diagnostic IV IV. What is the sample estimate of known as? (a) Variance (b) Covariance (c) Correlation (d) Standard deviation Cov(X,Y) Var(X) Var(Y ) also

Diagnostic IV IV. What is the sample estimate of known as? (a) Variance (b) Covariance (c) Correlation (d) Standard deviation Cov(X,Y) Var(X) Var(Y ) also

Diagnostic IV IV. What is the sample estimate of known as? (a) Variance (b) Covariance (c) Correlation (d) Standard deviation Cov(X,Y) Var(X) Var(Y ) also

Diagnostic IV IV. What is the sample estimate of known as? (c) Correlation Cov(X,Y) Var(X) Var(Y ) also

Diagnostic IV IV. What is the sample estimate of known as? (c) Correlation Cov(X,Y) Var(X) Var(Y ) also

Diagnostic V V. If x 1 is the mean of group 1, and x 2 is the mean of group 2, what test do we use to find whether µ 1 = µ 2? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic V V. If x 1 is the mean of group 1, and x 2 is the mean of group 2, what test do we use to find whether µ 1 = µ 2? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic V V. If x 1 is the mean of group 1, and x 2 is the mean of group 2, what test do we use to find whether µ 1 = µ 2? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic V V. If x 1 is the mean of group 1, and x 2 is the mean of group 2, what test do we use to find whether µ 1 = µ 2? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic V V. If x 1 is the mean of group 1, and x 2 is the mean of group 2, what test do we use to find whether µ 1 = µ 2? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic V V. If x 1 is the mean of group 1, and x 2 is the mean of group 2, what test do we use to find whether µ 1 = µ 2? (b) t-test

Diagnostic V V. If x 1 is the mean of group 1, and x 2 is the mean of group 2, what test do we use to find whether µ 1 = µ 2? (b) t-test

Diagnostic VI VI. What test do you use if you have more than 2 groups? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic VI VI. What test do you use if you have more than 2 groups? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic VI VI. What test do you use if you have more than 2 groups? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic VI VI. What test do you use if you have more than 2 groups? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic VI VI. What test do you use if you have more than 2 groups? (a) Z-test (b) t-test (c) F-test (d) ANOVA

Diagnostic VI VI. What test do you use if you have more than 2 groups? (d) ANOVA

Diagnostic VI VI. What test do you use if you have more than 2 groups? (d) ANOVA

Diagnostic VII last but not least VII. If X 1 χ 2 ν 1 and X 2 χ 2 ν 2, what is the distribution of X 1 /X 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic VII last but not least VII. If X 1 χ 2 ν 1 and X 2 χ 2 ν 2, what is the distribution of X 1 /X 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic VII last but not least VII. If X 1 χ 2 ν 1 and X 2 χ 2 ν 2, what is the distribution of X 1 /X 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic VII last but not least VII. If X 1 χ 2 ν 1 and X 2 χ 2 ν 2, what is the distribution of X 1 /X 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic VII last but not least VII. If X 1 χ 2 ν 1 and X 2 χ 2 ν 2, what is the distribution of X 1 /X 2? (a) F (b) t (c) χ 2 1 (d) χ 2 p

Diagnostic VII last but not least VII. If X 1 χ 2 ν 1 and X 2 χ 2 ν 2, what is the distribution of X 1 /X 2? (a) F

Diagnostic VII last but not least VII. If X 1 χ 2 ν 1 and X 2 χ 2 ν 2, what is the distribution of X 1 /X 2? (a) F

Summarising the data Let s start with a gentle introduction to some matrix terminology. The values observed on the i th individual can be written as the vector y T i = (y i1,y i2,...,y ip ) If all the individual vectors are written as the rows of a table, the result is called the data matrix and is usually denoted by Y. The element y ij of this matrix contains the value of variable y j observed on individual i.

And now, lets fix our ideas by thinking about some real data. To describe a multivariate data set it would help to be able to look at it, i.e. to have a pictorial representation. First, let us assume that all the variables are purely numerical. When there are just two such variables, a scatter diagram is a familiar idea for representing the data. The n individuals are represented by n points on a graph; the coordinates of point i are given by (y i1,y i2 ).

Life Lines Course Description

Points and Dimensions A natural extension to a two-dimensional scatter diagram (scatterplot) where there are two variables in which a relationship is sought is the extension to p-dimensions. Let y represent a random vector of p variables measured on a sampling unit (subject or object). If there are n individuals in the sample, the n denoted by y 1,y 2,...,y n, where each of the y i s is a vector of length p and y i1,y i2,...,y ip coordinate axes are taken to correspond to the variables so that the i th point is y i1 units along the first axis, y i2 units along the second,... and y ip units along the p th axis. Generally, this resulting scatterplot will result in a distinct pattern of variability, but should also reflect any similarities and well as dissimilarities among the n observations. Clustering may also be observed.

What is Applied Multivariate Analysis? Multivariate random variables Uppercase boldface letters are used for matrices of random variables or constants, lowercase boldface letters represent vectors of random variables or constants, and univariate random variables or constants are usually represented by lowercase nonbolded letters.

Representing Data The Data Matrix Y = y 11 y 12 y 1j y 1p y 21 y 22 y 2j y 2p.... y i1 y i2 y ij y ip.... y n1 y n2 y nj y np

y i = y i1 y i2 y i3. y ip

Mean Vectors y = 1 n n y i = i=1 y 1 y 2 y 3. y p

Let s look at some data!