Exploratory Data Analysis. CERENA Instituto Superior Técnico

1 Exploratory Data Analysis CERENA Instituto Superior Técnico

2 Some questions for which we use Exploratory Data Analysis:
1. What is a typical value?
2. What is the uncertainty around a typical value?
3. What is a representative value?
4. Is a given histogram spatially representative of the reservoir?
EDA should answer the basic, relevant questions related to the characterization of the reservoir.

3 Exploratory Data Analysis vs. Classical Analysis
Classical analysis: Problem => Data => Model => Analysis => Conclusions
EDA: Problem => Data => Analysis => Model => Conclusions

4 In classical analysis, data collection is followed by the imposition of a model (normality, linearity, etc.), and all the analysis that follows refers to the estimation and testing of the model parameters. In EDA, data collection is not followed by the imposition of a model; the analysis aims at choosing and inferring the best model for the data we have. The EDA approach gives priority to the available data, which suggest the most appropriate models to be fitted to them.

5 Example: global statistics, local statistics, representativity.

6 N=99 N=98

7 N=96 N=95

8 CMRP-Instituto Superior Técnico N=92 N=89

9 [Figure: variance and relative decrease of the variance, var(n)/var(n=99) (%), plotted against the number of samples]

10 Estimation of global statistics Statistics have to be robust and representative. Ex: the variance (N=98) of 667 ppm² is more representative of the study area than the variance (N=99) of 1000 ppm²: 30% of the latter value is due to a single sample and to the area it represents. In these situations, when samples are removed to obtain more robust estimators, one should take into account the area these samples represent. Ex: the variance (N=89) of 212 is not representative of the central area, where the highest values occur.

11 Local statistics However, the spatial (local) characterization of permeability must necessarily use all the values.

12 Exploratory Data Analysis: Univariate Description Raw data: Northing, Easting, Depth, k (md) Mining the data: visualize, describe, analyze

13 Univariate Description Histograms Class Limit Abs. Freq. Rel. Freq. Cum.Freq Box-plot Representation

14 Univariate Description

15 Symmetrical histogram with moderate tails; symmetrical histogram with short tails; symmetrical histogram with long tails; bimodal symmetrical histogram

16 Histogram resulting from the mixture of two normal distributions; histogram with tail to the right; histogram with tail to the left; symmetrical histogram with outlier

17 Estimation of a Histogram Most usual estimator: equal weight to all samples, which can lead to a biased estimator. When a given sample has the same weight as all the others, regardless of the area it represents, the histogram, mean, variance, etc. are biased. Solution: disaggregated histogram

18 Univariate Description Measures of Center
Mean m (arithmetic mean): $m = \frac{1}{n} \sum_{i=1}^{n} z_i$
Median M: the z value corresponding to a cumulative percentage of 50% of the total values

19 Univariate Description Measures of Location
Quartiles: Q1 is the z value corresponding to a cumulative percentage of 25%; Q3 is the z value corresponding to a cumulative percentage of 75%
Minimum: min is the z value corresponding to a cumulative percentage of 0%
Maximum: max is the z value corresponding to a cumulative percentage of 100%
Quantiles: q(p) is the z value corresponding to a cumulative percentage of 100p %

20 Univariate Description [Box plot with labels max, Q2, M, Q1, min]

21 Univariate Description Measures of Spread
Variance: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (z_i - m)^2$
Interquartile range: IQR = Q3 - Q1
[Figure: f(z) against z, with Q1 and Q2 marked]

22 Univariate Description Sensitivity to extreme values
High sensitivity: variance, coefficient of variation, mean
Low sensitivity: median, Q1, Q3
[Figures: each statistic plotted against the number of samples]
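The contrast between the two groups of statistics can be checked numerically. The sketch below is only an illustration: the ten values and the single extreme value of 200 are hypothetical, not the data behind the slides, and the quartiles use the "inclusive" interpolation rule of Python's `statistics` module, which is one of several conventions.

```python
import statistics

def summary(values):
    """Return mean, population variance, median and interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical sample values (assumption, not taken from the slides).
base = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.5, 14.5, 13.5, 15.5]
with_outlier = base + [200.0]  # the same sample plus one extreme value

before = summary(base)
after = summary(with_outlier)
for name in before:
    print(f"{name:9s} {before[name]:10.2f} -> {after[name]:10.2f}")
```

With these numbers the mean and the variance change strongly when the extreme value is added, while the median and the IQR barely move.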

23 Univariate Description Extreme values and local uncertainty [Figure panels: high local mean and low local variance; high local mean and high local variance]

24 Sample weights for the calculation of Disaggregated Spatial Statistics Weights proportional to the area of influence of the samples
Influence polygons: the weights p(x1), p(x2) of samples Z(x1), Z(x2) are directly proportional to the area of each polygon
Regular polygons: the weights are inversely proportional to the number of samples contained in each polygon
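The regular-polygon option can be sketched with square cells: each sample receives a weight inversely proportional to the number of samples sharing its cell, and the weights are normalised to sum to one. The function names, the cell size and the five samples below are assumptions made for the illustration; the slides do not specify an implementation.

```python
from collections import Counter

def cell_declustering_weights(coords, cell_size):
    """Weights inversely proportional to the sample count of each square cell."""
    cells = [(int(x // cell_size), int(y // cell_size)) for x, y in coords]
    counts = Counter(cells)
    raw = [1.0 / counts[cell] for cell in cells]
    total = sum(raw)
    return [w / total for w in raw]  # normalised so the weights sum to 1

def weighted_mean(values, weights):
    """Mean of the values using the given normalised weights."""
    return sum(w * z for w, z in zip(weights, values))

# Hypothetical data: three clustered high values and two isolated low values.
coords = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (12.0, 3.0), (4.0, 15.0)]
values = [90.0, 100.0, 110.0, 10.0, 20.0]

weights = cell_declustering_weights(coords, cell_size=5.0)
naive_mean = sum(values) / len(values)
declustered_mean = weighted_mean(values, weights)
print(f"equal-weight mean: {naive_mean:.2f}, weighted mean: {declustered_mean:.2f}")
```

The clustered samples share one cell, so together they count as much as one isolated sample, and the weighted mean is pulled away from the preferentially sampled high values. The result depends on the chosen cell size, which is a free parameter in this sketch.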

25 Univariate Description Distribution Models [Figure: distribution model with labels y, m, ln, 2] Pros: simplicity of representation (2 parameters) and analysis. Cons: too simplistic a representation of important details of the histogram.

26 Bivariate Description x y z (%) K(md) Relation between Porosity and Permeability

27 Bivariate Description Bi-plots (14.4,13.2)

28

29

30 Bivariate Description Bi-Histograms % v 2 % Conditional Histograms v 1

31 Bivariate Description Quantile-Quantile Plots q-q plot: two marginal distributions can be compared by plotting their quantiles against one another. [Table of Cu and Sn quantiles: q(0) = 0 and 10, q(0.1) = 0 and 75, q(0.2), ..., q(1); figure: Sn quantiles against Cu quantiles] If the q-q plot appears as a straight line, the two marginal distributions have the same shape.
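A minimal sketch of how the q-q pairs can be computed is given below. The two samples are hypothetical (the second is a linear rescaling of the first, so both have the same shape), and the "inclusive" quantile rule of Python's `statistics` module is an assumed convention.

```python
import statistics

def qq_pairs(sample_a, sample_b, n_quantiles=10):
    """Return the pairs (q_a(p), q_b(p)) for p = 1/n, 2/n, ..., (n-1)/n."""
    qa = statistics.quantiles(sample_a, n=n_quantiles, method="inclusive")
    qb = statistics.quantiles(sample_b, n=n_quantiles, method="inclusive")
    return list(zip(qa, qb))

# Hypothetical samples: b is a rescaled and shifted copy of a.
a = [float(v) for v in range(1, 51)]
b = [3.0 * v + 10.0 for v in a]

pairs = qq_pairs(a, b)

# On a straight line the slope between consecutive q-q points is constant.
slopes = [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(pairs, pairs[1:])]
print(pairs[:3])
print(slopes)
```

Here every slope equals 3 up to rounding, so the points fall on a straight line. For two samples with different shapes the slopes would vary along the plot.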

32 Bivariate Description Regression Methods to summarize and visualize the behavior between two variables. Linear regression y=ax+b summarizes the behaviour between the two variables

33 Bivariate Description The regression model should have a qualitative relationship with the physical phenomenon under study. A polynomial regression can reproduce the particularities of sample data rather than the details of the relationship between the two variables and the physical phenomenon.

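34 Bivariate Description Measures of correlation between the 2 variables
Covariance between variables $x_i$ and $y_i$: $\mathrm{cov}(x, y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - m_x)(y_i - m_y) = \frac{1}{N} \sum_{i=1}^{N} x_i y_i - m_x m_y$
Mean of $x_i$: $m_x = \frac{1}{N} \sum_{i=1}^{N} x_i$
Mean of $y_i$: $m_y = \frac{1}{N} \sum_{i=1}^{N} y_i$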

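35 Bivariate Description Measures of correlation between the 2 variables
Correlation Coefficient (Pearson): $\rho = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$
Standard deviations of the two variables $x_i$ and $y_i$: $\sigma_x^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - m_x)^2$, $\sigma_y^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - m_y)^2$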

36 Bivariate Description Correlation Coefficient [Scatter plots of V2 against V1: positive correlation coefficient, negative correlation coefficient, null correlation coefficient]

37 Bivariate Description Correlation Coefficient The Correlation Coefficient is extremely sensitive to points which are located far from the main cloud.

38 Bivariate Description Correlation Coefficient Add just one pair of values: (200,200)

39 Bivariate Description Correlation Coefficient The correlation coefficient measures the linear dependence between two variables
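Both remarks about the correlation coefficient (its sensitivity to a point far from the main cloud, and the fact that it only measures linear dependence) can be illustrated with a small sketch. The function `pearson` follows the covariance and standard deviation definitions with the 1/N convention. The point clouds are hypothetical; only the added pair (200, 200) comes from the slides.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient: cov(x, y) / (sigma_x * sigma_y)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

# Hypothetical cloud with no linear relation between x and y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 1.0, 4.0, 2.0, 3.0, 1.0, 5.0, 2.0, 4.0, 3.0]
r_cloud = pearson(xs, ys)

# Add just one pair of values far from the main cloud: (200, 200).
r_with_far_point = pearson(xs + [200.0], ys + [200.0])

# Exact but non-linear dependence: y = x**2 on a symmetric range.
xq = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
yq = [x ** 2 for x in xq]
r_quadratic = pearson(xq, yq)

print(f"cloud: {r_cloud:.3f}, with (200, 200): {r_with_far_point:.4f}")
print(f"y = x^2: {r_quadratic:.3f}")
```

A single distant pair takes the coefficient from 0 to almost 1, while the quadratic relation gives a coefficient of 0 even though y is fully determined by x.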

40 Univariate and Bi-variate statistics Should reflect the most relevant geological patterns. Should explain the main relationships between data. Should always follow the principle of maximum parsimony. The use of statistics to summarize the behavior of key variables must be balanced against its drawbacks: overly condensed information, sensitivity to extreme values, and limited description in the bivariate case.

41 Spatial Description Definition of Lithotypes

42 Spatial Description Data spatial representation

43 Spatial Description

44 Spatial Description Moving window statistics [Figure: local mean m and variance σ² computed in each window]

45 Spatial Description Moving window statistics [Figures: local means and local variances by window]
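Moving window statistics can be sketched on a small regular grid: the local mean and variance are computed inside each square window. The grid values, the window size and the step below are hypothetical choices made for the illustration, not taken from the slides.

```python
import statistics

def moving_window_stats(grid, size, step):
    """Local mean and population variance in square windows of a 2D grid.

    Returns a list of ((row, col), mean, variance), where (row, col) is the
    top-left corner of each window.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    results = []
    for r in range(0, n_rows - size + 1, step):
        for c in range(0, n_cols - size + 1, step):
            window = [grid[i][j]
                      for i in range(r, r + size)
                      for j in range(c, c + size)]
            results.append(((r, c),
                            statistics.fmean(window),
                            statistics.pvariance(window)))
    return results

# Hypothetical 4 x 4 grid with one high and variable corner.
grid = [
    [1.0, 2.0, 10.0, 30.0],
    [2.0, 1.0, 30.0, 10.0],
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
]

window_stats = moving_window_stats(grid, size=2, step=2)
for corner, mean, variance in window_stats:
    print(corner, f"mean={mean:.2f}", f"variance={variance:.2f}")
```

A step smaller than the window size gives overlapping windows and a smoother picture of how the local mean and variance change across the area.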

46 Introduction to GeoMS- Geostatistical Modelling Software

47 Geostatistic Software for Windows 2000, NT CMRP/IST Exploratory data analysis Spatial continuity analysis Modelling of variograms Kriging (SK, OK, KED,..) Co-Kriging Stochastic Simulation (DSS, SGS, SIS) Multi-phase classification Simulated Annealing Visualization Data transformation

48 Probability Distributions

49 Random variable (RV): Z
Distribution function (cdf): $F(z) = F_Z(z) = \mathrm{Prob}\{Z \le z\}$
Density function (pdf): $f(z) = f_Z(z) = F'(z) = \lim_{dz \to 0} \frac{F(z + dz) - F(z)}{dz}$

50 Discrete probability function A discrete probability function, p(x), is a function that satisfies the following properties:
1. The probability of x taking a specific value is p(x).
2. p(x) is non-negative for all real x.
3. The sum of p(x) over all possible values of x is 1, i.e. $\sum_{j=1}^{N} p(x_j) = 1$, where j indexes all the possible values of x and $p_j = p(x_j)$ is the probability of $x_j$. Consequently, 0 <= p(x) <= 1.

51 Continuous probability functions A continuous probability function, f(x), is a function that satisfies the following properties:
1. The probability of x being between two points a and b is $\mathrm{Prob}\{a \le x \le b\} = \int_{a}^{b} f(x)\,dx$.
2. f(x) is non-negative for all real x.
3. The integral of the probability function is one: $\int_{-\infty}^{+\infty} f(x)\,dx = 1$.

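52 Density function, pdf: f(z)
$f(z) \ge 0$
$\int_{-\infty}^{+\infty} f(z)\,dz = 1$
$f(z) = F'(z) = \lim_{dz \to 0} \frac{F(z + dz) - F(z)}{dz}$
$F(z) = \mathrm{Prob}\{Z \le z\} = \int_{-\infty}^{z} f(x)\,dx$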

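53 Distribution function, cdf: $F(z) = F_Z(z) = \mathrm{Prob}\{Z \le z\}$
F(z) is non-decreasing
$F(z) \in [0, 1]$, $F(-\infty) = 0$, $F(+\infty) = 1$
$\mathrm{Prob}\{Z \in (a, b]\} = \mathrm{Prob}\{Z \le b\} - \mathrm{Prob}\{Z \le a\} = F(b) - F(a)$
[Figure: cdf curve with F(a) and F(b) marked at a and b]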

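54 Normal Distribution
$g(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - m}{\sigma} \right)^2}$
$\mathrm{Prob}\{X < x\} = F(x) = G\left( \frac{x - m}{\sigma} \right)$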

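55 Standard Normal Distribution
$X \sim N(m, \sigma)$, $Y = (X - m)/\sigma \sim N(0, 1)$, $X = m + \sigma Y$
$g(x) = \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2}$
$\mathrm{Prob}\{X < x\} = \mathrm{Prob}\{m + \sigma Y \le x\} = \mathrm{Prob}\{Y \le (x - m)/\sigma\} = G\left( \frac{x - m}{\sigma} \right)$, where $G(y) = \mathrm{Prob}\{Y \le y\}$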
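The standardization step can be written as a short sketch: any normal probability is reduced to the standard normal cdf G, evaluated here with the error function from Python's `math` module. The function names and the parameter values m = 15 and sigma = 4 are hypothetical choices for the illustration.

```python
import math

def std_normal_cdf(y):
    """G(y) = Prob{Y <= y} for Y ~ N(0, 1), using the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def normal_cdf(x, m, sigma):
    """Prob{X <= x} for X ~ N(m, sigma), standardizing to Y = (X - m) / sigma."""
    return std_normal_cdf((x - m) / sigma)

def normal_interval_prob(a, b, m, sigma):
    """Prob{a < X <= b} = F(b) - F(a)."""
    return normal_cdf(b, m, sigma) - normal_cdf(a, m, sigma)

# Hypothetical parameters (mean and standard deviation).
m, sigma = 15.0, 4.0

p_below = normal_cdf(m + 1.96 * sigma, m, sigma)
p_one_sigma = normal_interval_prob(m - sigma, m + sigma, m, sigma)
print(f"Prob{{X <= m + 1.96 sigma}} = {p_below:.4f}")
print(f"Prob{{m - sigma < X <= m + sigma}} = {p_one_sigma:.4f}")
```

The results do not depend on the chosen m and sigma, because both probabilities are functions of the standardized variable only.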

56 Uniform Distribution $f(x) = \frac{1}{B - A}$ for $A \le x \le B$
Standard Uniform Distribution $f(x) = 1$ for $0 \le x \le 1$

57 Exponential Distribution $f(x) = \frac{1}{\beta} \, e^{-x/\beta}$, $x \ge 0$, $\beta > 0$
Standard Exponential Distribution $f(x) = e^{-x}$, $x \ge 0$

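58 A variable X has a lognormal distribution if Y = ln(X) is normally distributed.
$X > 0$, $X \sim \mathrm{logN}(m, \sigma)$, $Y = \ln(X) \sim N(m_Y, \sigma_Y)$
$f(x) = \frac{1}{x} \, g_Y(\ln x)$, where $g_Y$ is the normal density of Y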

59 Central Limit Theorem Theorem: the sum of a large number of independent, identically distributed standardized random variables tends to be normally distributed. n RVs $Z_i$, identically distributed (not necessarily normal) with zero mean: $Y = \sum_{i=1}^{n} Z_i \to$ Normal, when $n \to \infty$

60 Central Limit Theorem Corollary: the product of a large number of independent, identically distributed positive random variables tends to have a log-normal distribution. n positive RVs $Z_i$, identically distributed (not necessarily log-normal): $X = \prod_{i=1}^{n} Z_i$, $Y = \ln X = \sum_{i=1}^{n} \ln Z_i \to$ Normal, when $n \to \infty$
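The theorem and its corollary can be explored with a small simulation. This is a hedged sketch, not a proof: the uniform distributions, the number of terms (30), the number of trials and the seed are all assumptions. The check is that the share of values within one standard deviation of the mean approaches the normal value of about 0.683, both for the sums and for the logarithms of the products.

```python
import math
import random
import statistics

def fraction_within_one_sd(samples):
    """Share of samples lying within one standard deviation of their mean."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    inside = sum(1 for s in samples if abs(s - mean) <= sd)
    return inside / len(samples)

rng = random.Random(0)  # fixed seed so the run is repeatable
n_terms, n_trials = 30, 10000

# Sums of zero-mean uniform variables on [-0.5, 0.5].
sums = [sum(rng.uniform(-0.5, 0.5) for _ in range(n_terms))
        for _ in range(n_trials)]

# Products of positive uniform variables on [0.5, 1.5];
# the logarithm of a product is a sum of logarithms.
log_products = [
    math.log(math.prod(rng.uniform(0.5, 1.5) for _ in range(n_terms)))
    for _ in range(n_trials)
]

frac_sums = fraction_within_one_sd(sums)
frac_logs = fraction_within_one_sd(log_products)
print(f"sums: {frac_sums:.3f}, log of products: {frac_logs:.3f}")
```

The factors of the product are kept strictly positive so that the logarithm is defined.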

61 Bi-variate Distributions Bi-Histograms % v 2 % Conditional Histograms v 1

62 Conditional Probability
$E\{X \mid Y = y\}$: mean value of X in the class Y = y.
$E\{Y \mid X = x\}$: mean value of Y in the class X = x.

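63 Conditional Probability
$f(x \mid y) = \mathrm{Prob}\{X = x; Y = y\} / \mathrm{Prob}\{Y = y\} = f(x, y) / f(y)$
$\mathrm{Prob}\{A \mid B\} = \mathrm{Prob}\{A \text{ and } B\} / \mathrm{Prob}\{B\}$: probability of A conditioned on the occurrence of event B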

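64 Bayes Relation
$\mathrm{Prob}\{A \cap B\} = \mathrm{Prob}\{A \mid B\} \cdot \mathrm{Prob}\{B\}$
$\mathrm{Prob}\{A \cap B\} = \mathrm{Prob}\{B \mid A\} \cdot \mathrm{Prob}\{A\}$
$\mathrm{Prob}\{A \mid B\} = \left( \mathrm{Prob}\{B \mid A\} / \mathrm{Prob}\{B\} \right) \cdot \mathrm{Prob}\{A\}$
$\mathrm{Prob}\{B \mid A\}$ is the likelihood of B given A; dividing it by $\mathrm{Prob}\{B\}$ gives the factor that updates $\mathrm{Prob}\{A\}$.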

65 Bayes thinking - 1 (Savage S. 2005) There is a 98% effective diagnostic test for SRHD: 98% of people who are infected display positive results; 98% of people who are not infected display negative results. Someone has just tested positive for SRHD in a routine test. What is the chance he is actually infected? 1% of the population is SRHD+ and 99% of the population is SRHD-. The gut response is a 98% chance of being infected; Bayesian analysis says it is only one out of three.

66 Bayes Thinking - 1 A positive test can occur in two ways: true positive (98% of 1%) or false positive (2% of 99%).
False negative: 2% of 1% = 0.02%
True positive: 98% of 1% = 0.98%
True negative: 98% of 99% = 97.02%
False positive: 2% of 99% = 1.98%
Real question: what is the chance of hitting a green area, given that you know you have hit a hatched area? Clearly about one third: 0.98 / (0.98 + 1.98) ≈ 0.33.
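The same calculation can be written as a short function. The names `prevalence`, `sensitivity` and `specificity` are labels introduced here for the 1%, 98% and 98% figures of the example; they do not appear in the slides.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """Prob{infected | positive test} from the Bayes relation."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)

# Figures of the SRHD example: 1% infected, 98% effective test.
p_infected = posterior_given_positive(prevalence=0.01,
                                      sensitivity=0.98,
                                      specificity=0.98)
print(f"Prob{{infected | positive}} = {p_infected:.3f}")
```

Raising the prevalence in the call shows how strongly the answer depends on the share of the population that is infected, which is the point the gut response misses.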


More information

Some Concepts of Probability (Review) Volker Tresp Summer 2018

Some Concepts of Probability (Review) Volker Tresp Summer 2018 Some Concepts of Probability (Review) Volker Tresp Summer 2018 1 Definition There are different way to define what a probability stands for Mathematically, the most rigorous definition is based on Kolmogorov

More information

14: Correlation. Introduction Scatter Plot The Correlational Coefficient Hypothesis Test Assumptions An Additional Example

14: Correlation. Introduction Scatter Plot The Correlational Coefficient Hypothesis Test Assumptions An Additional Example 14: Correlation Introduction Scatter Plot The Correlational Coefficient Hypothesis Test Assumptions An Additional Example Introduction Correlation quantifies the extent to which two quantitative variables,

More information

Is there still room for new developments in geostatistics?

Is there still room for new developments in geostatistics? Is there still room for new developments in geostatistics? Jean-Paul Chilès MINES ParisTech, Fontainebleau, France, IAMG 34th IGC, Brisbane, 8 August 2012 Matheron: books and monographs 1962-1963: Treatise

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 1

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 1 CS434a/541a: Pattern Recognition Prof. Olga Veksler Lecture 1 1 Outline of the lecture Syllabus Introduction to Pattern Recognition Review of Probability/Statistics 2 Syllabus Prerequisite Analysis of

More information

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued Chapter 3 sections Chapter 3 - continued 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions

More information

Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland

Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland EnviroInfo 2004 (Geneva) Sh@ring EnviroInfo 2004 Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland Mikhail Kanevski 1, Michel Maignan 1

More information

Statistics for Economists Lectures 6 & 7. Asrat Temesgen Stockholm University

Statistics for Economists Lectures 6 & 7. Asrat Temesgen Stockholm University Statistics for Economists Lectures 6 & 7 Asrat Temesgen Stockholm University 1 Chapter 4- Bivariate Distributions 41 Distributions of two random variables Definition 41-1: Let X and Y be two random variables

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Lecture 3. Probability - Part 2. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. October 19, 2016

Lecture 3. Probability - Part 2. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. October 19, 2016 Lecture 3 Probability - Part 2 Luigi Freda ALCOR Lab DIAG University of Rome La Sapienza October 19, 2016 Luigi Freda ( La Sapienza University) Lecture 3 October 19, 2016 1 / 46 Outline 1 Common Continuous

More information

Multivariate Random Variable

Multivariate Random Variable Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate

More information

Continuous r.v. s: cdf s, Expected Values

Continuous r.v. s: cdf s, Expected Values Continuous r.v. s: cdf s, Expected Values Engineering Statistics Section 4.2 Josh Engwer TTU 29 February 2016 Josh Engwer (TTU) Continuous r.v. s: cdf s, Expected Values 29 February 2016 1 / 17 PART I

More information

EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY

EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, 2013 MODULE 5 : Further probability and inference Time allowed: One and a half hours Candidates should answer THREE questions.

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Modeling and Performance Analysis with Discrete-Event Simulation

Modeling and Performance Analysis with Discrete-Event Simulation Simulation Modeling and Performance Analysis with Discrete-Event Simulation Chapter 9 Input Modeling Contents Data Collection Identifying the Distribution with Data Parameter Estimation Goodness-of-Fit

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Basic Statistical Tools

Basic Statistical Tools Structural Health Monitoring Using Statistical Pattern Recognition Basic Statistical Tools Presented by Charles R. Farrar, Ph.D., P.E. Los Alamos Dynamics Structural Dynamics and Mechanical Vibration Consultants

More information

Estimation of direction of increase of gold mineralisation using pair-copulas

Estimation of direction of increase of gold mineralisation using pair-copulas 22nd International Congress on Modelling and Simulation, Hobart, Tasmania, Australia, 3 to 8 December 2017 mssanz.org.au/modsim2017 Estimation of direction of increase of gold mineralisation using pair-copulas

More information

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr. Simulation Discrete-Event System Simulation Chapter 8 Input Modeling Purpose & Overview Input models provide the driving force for a simulation model. The quality of the output is no better than the quality

More information

Moments. Raw moment: February 25, 2014 Normalized / Standardized moment:

Moments. Raw moment: February 25, 2014 Normalized / Standardized moment: Moments Lecture 10: Central Limit Theorem and CDFs Sta230 / Mth 230 Colin Rundel Raw moment: Central moment: µ n = EX n ) µ n = E[X µ) 2 ] February 25, 2014 Normalized / Standardized moment: µ n σ n Sta230

More information

Continuous Random Variables. and Probability Distributions. Continuous Random Variables and Probability Distributions ( ) ( ) Chapter 4 4.

Continuous Random Variables. and Probability Distributions. Continuous Random Variables and Probability Distributions ( ) ( ) Chapter 4 4. UCLA STAT 11 A Applied Probability & Statistics for Engineers Instructor: Ivo Dinov, Asst. Prof. In Statistics and Neurology Teaching Assistant: Christopher Barr University of California, Los Angeles,

More information

Review of probability

Review of probability Review of probability Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts definition of probability random variables

More information

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Probability theory and statistical analysis: a review Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Concepts assumed known Histograms, mean, median, spread, quantiles Probability,

More information

ECE Lecture #10 Overview

ECE Lecture #10 Overview ECE 450 - Lecture #0 Overview Introduction to Random Vectors CDF, PDF Mean Vector, Covariance Matrix Jointly Gaussian RV s: vector form of pdf Introduction to Random (or Stochastic) Processes Definitions

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 89 Part II

More information

Review of Probability. CS1538: Introduction to Simulations

Review of Probability. CS1538: Introduction to Simulations Review of Probability CS1538: Introduction to Simulations Probability and Statistics in Simulation Why do we need probability and statistics in simulation? Needed to validate the simulation model Needed

More information

Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Application of Variance Homogeneity Tests Under Violation of Normality Assumption Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com

More information

STATISTICS ANCILLARY SYLLABUS. (W.E.F. the session ) Semester Paper Code Marks Credits Topic

STATISTICS ANCILLARY SYLLABUS. (W.E.F. the session ) Semester Paper Code Marks Credits Topic STATISTICS ANCILLARY SYLLABUS (W.E.F. the session 2014-15) Semester Paper Code Marks Credits Topic 1 ST21012T 70 4 Descriptive Statistics 1 & Probability Theory 1 ST21012P 30 1 Practical- Using Minitab

More information

Test Problems for Probability Theory ,

Test Problems for Probability Theory , 1 Test Problems for Probability Theory 01-06-16, 010-1-14 1. Write down the following probability density functions and compute their moment generating functions. (a) Binomial distribution with mean 30

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

Expectation, Variance and Standard Deviation for Continuous Random Variables Class 6, Jeremy Orloff and Jonathan Bloom

Expectation, Variance and Standard Deviation for Continuous Random Variables Class 6, Jeremy Orloff and Jonathan Bloom Expectation, Variance and Standard Deviation for Continuous Random Variables Class 6, 8.5 Jeremy Orloff and Jonathan Bloom Learning Goals. Be able to compute and interpret expectation, variance, and standard

More information

Contents 1. Contents

Contents 1. Contents Contents 1 Contents 6 Distributions of Functions of Random Variables 2 6.1 Transformation of Discrete r.v.s............. 3 6.2 Method of Distribution Functions............. 6 6.3 Method of Transformations................

More information

Brownian Motion. An Undergraduate Introduction to Financial Mathematics. J. Robert Buchanan. J. Robert Buchanan Brownian Motion

Brownian Motion. An Undergraduate Introduction to Financial Mathematics. J. Robert Buchanan. J. Robert Buchanan Brownian Motion Brownian Motion An Undergraduate Introduction to Financial Mathematics J. Robert Buchanan 2010 Background We have already seen that the limiting behavior of a discrete random walk yields a derivation of

More information

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna

More information

Chapter 7: Theoretical Probability Distributions Variable - Measured/Categorized characteristic

Chapter 7: Theoretical Probability Distributions Variable - Measured/Categorized characteristic BSTT523: Pagano & Gavreau, Chapter 7 1 Chapter 7: Theoretical Probability Distributions Variable - Measured/Categorized characteristic Random Variable (R.V.) X Assumes values (x) by chance Discrete R.V.

More information

Exam C Solutions Spring 2005

Exam C Solutions Spring 2005 Exam C Solutions Spring 005 Question # The CDF is F( x) = 4 ( + x) Observation (x) F(x) compare to: Maximum difference 0. 0.58 0, 0. 0.58 0.7 0.880 0., 0.4 0.680 0.9 0.93 0.4, 0.6 0.53. 0.949 0.6, 0.8

More information