Chapter 2. 1. What Are the Types of Data Sets?


1. Types of data sets

Record
  - Relational records
  - Data matrix: numerical matrix
  - Document data: text documents represented as term-frequency vectors
  - Transaction data
Graph and network
  - World Wide Web
  - Social or information networks
  - Molecular structures
Ordered
  - Video data: sequence of images
  - Temporal data: time series
  - Sequential data: transaction sequences
  - Genetic sequence data
Spatial, image and multimedia
  - Spatial data: maps
  - Image data
  - Video data

>>> 2. What are the important characteristics of structured data?

  - Dimensionality: curse of dimensionality
  - Sparsity: only presence counts
  - Resolution: patterns depend on the scale
  - Distribution: centrality and dispersion

3. Data objects

Data sets are made up of data objects. A data object represents an entity.
Examples:
  - Sales database: customers, store items, sales
  - Medical database: patients, treatments
  - University database: students, professors, courses
>>> Data objects are also called samples, examples, instances, data points, objects, or tuples.
>>> Data objects are described by attributes.
>>> Database rows correspond to data objects; columns correspond to attributes.
>>> Attribute (or dimension, feature, variable): a data field representing a characteristic or feature of a data object, e.g., customer_ID, name, address.

4. Discrete vs. continuous attributes

Discrete attribute
  - Has only a finite or countably infinite set of values
  - E.g., zip codes, profession, or the set of words in a collection of documents
  - Sometimes represented as integer variables
  - Note: binary attributes are a special case of discrete attributes
Continuous attribute
  - Has real numbers as attribute values
  - E.g., temperature, height, or weight
  - In practice, real values can only be measured and represented using a finite number of digits
  - Continuous attributes are typically represented as floating-point variables

>>> 5. Attribute types

Nominal: categories, states, or names of things
  - Hair_color = {black, blond, brown, grey, red, white}
  - Marital status, occupation, ID numbers, zip codes
Binary: nominal attribute with only 2 states (0 and 1)
  - Symmetric binary: both outcomes equally important, e.g., gender
  - Asymmetric binary: outcomes not equally important, e.g., medical test (positive vs. negative)
  - Convention: assign 1 to the most important outcome (e.g., HIV positive)
Ordinal
  - Values have a meaningful order (ranking), but the magnitude between successive values is not known
  - Size = {small, medium, large}, grades, army rankings
Numeric: quantity (integer or real-valued)
  - Ratio: inherent zero-point, e.g., temperature in Kelvin, length, counts, monetary quantities
  - Interval: no true zero-point; measured on a scale of equal-sized units; values have order, e.g., temperature in C or F, calendar dates

6. Basic statistical descriptions of data

Motivation: to better understand the data: central tendency, variation and spread.
Data dispersion characteristics: median, max, min, quantiles, outliers, variance, etc.
Numerical dimensions correspond to sorted intervals.
Dispersion analysis on computed measures corresponds to the transformed cube.

>>> 6. Graphic displays of basic statistical descriptions

  - Boxplot: graphic display of the five-number summary
  - Histogram: the x-axis shows values, the y-axis represents frequencies
  - Quantile plot: each value $x_i$ is paired with $f_i$, indicating that approximately $100 f_i\%$ of the data are $\le x_i$
  - Quantile-quantile (q-q) plot: graphs the quantiles of one univariate distribution against the corresponding quantiles of another
  - Scatter plot: each pair of values is a pair of coordinates and is plotted as a point in the plane

7. Boxplot analysis

Five-number summary of a distribution: Minimum, Q1, Median, Q3, Maximum.
  - Data is represented with a box
  - The ends of the box are at the first and third quartiles, i.e., the height of the box is the IQR
  - The median is marked by a line within the box
  - Whiskers: two lines outside the box extended to Minimum and Maximum
  - Outliers: points beyond a specified outlier threshold, plotted individually

8. Histogram analysis

Histogram: graph display of tabulated frequencies, shown as bars.
  - It shows what proportion of cases fall into each of several categories
  - >>> It differs from a bar chart in that the area of the bar denotes the value, not the height as in bar charts; this is a crucial distinction when the categories are not of uniform width
  - The categories are usually specified as non-overlapping intervals of some variable
  - The categories (bars) must be adjacent

Why histograms often tell more than boxplots

  - Two histograms may have the same boxplot representation, that is, the same values for min, Q1, median, Q3, max
  - But they can have rather different data distributions
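The point that two data sets can share one boxplot can be checked numerically. The sketch below is only an illustration: the two samples are made up, and it assumes the "inclusive" quartile convention of Python's `statistics.quantiles` (other conventions can give slightly different Q1 and Q3). The 1.5 x IQR outlier threshold is one common choice, not the only one.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sample with at least two values."""
    xs = sorted(values)
    # "inclusive" interpolates linearly between order statistics;
    # this quartile convention is an assumption of the sketch.
    q1, med, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return xs[0], q1, med, q3, xs[-1]


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [x for x in values if x < low or x > high]


# Hypothetical samples: evenly spread versus clustered around 2 and 6.
even = [0, 1, 2, 3, 4, 5, 6, 7, 8]
clustered = [0, 2, 2, 2, 4, 6, 6, 6, 8]

print(five_number_summary(even))
print(five_number_summary(clustered))
print(iqr_outliers([1, 2, 3, 4, 100]))
```

Both samples produce the same five numbers, so their boxplots coincide, while a histogram would show one flat distribution and one with two peaks.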

10. Quantile plot

  - Displays all of the data (allowing the user to assess both the overall behavior and unusual occurrences)
  - For data $x_i$ sorted in increasing order, $f_i$ indicates that approximately $100 f_i\%$ of the data are below or equal to the value $x_i$

Quantile-quantile (Q-Q) plot

  - Graphs the quantiles of one univariate distribution against the corresponding quantiles of another
  - View: is there a shift in going from one distribution to another? Yes.
  - We need to label the dark plotted points as Q1, Median, Q3; that would help in understanding this graph.
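A minimal sketch of how the points of both plots could be computed. The text does not state how $f_i$ is obtained; the formula $f_i = (i - 0.5)/n$ used here is a common convention and is an assumption of this example, as are the function names and the sample values. The Q-Q helper only handles two samples of equal size, where the sorted values can be paired directly.

```python
def quantile_plot_points(values):
    """Pair each sorted value x_i with f_i = (i - 0.5) / n (assumed convention)."""
    xs = sorted(values)
    n = len(xs)
    return [((i - 0.5) / n, x) for i, x in enumerate(xs, start=1)]


def qq_points(sample_a, sample_b):
    """Pair the i-th smallest value of one sample with the i-th smallest of the other."""
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch only handles samples of equal size")
    return list(zip(sorted(sample_a), sorted(sample_b)))


# Hypothetical data.
print(quantile_plot_points([30, 10, 20, 40]))
print(qq_points([3, 1, 2], [13, 11, 12]))
```

In the second call every point lies above the line y = x, which is how a shift from one distribution to the other shows up in a Q-Q plot.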

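12. Scatter plot

  - Provides a first look at bivariate data to see clusters of points, outliers, etc.
  - Each pair of values is treated as a pair of coordinates and plotted as a point in the plane

Determining positively and negatively correlated data
  (Figure: scatter plots labeled positive and negative.) In the fragment shown, the left half is positively correlated and the right half is negatively correlated.
Uncorrelated data
  (Figure: scatter plots of uncorrelated data.)

13. Properties of the normal distribution curve ($\mu$: mean, $\sigma$: standard deviation)

  - From $\mu - \sigma$ to $\mu + \sigma$: contains about 68% of the measurements
  - From $\mu - 2\sigma$ to $\mu + 2\sigma$: contains about 95% of them
  - From $\mu - 3\sigma$ to $\mu + 3\sigma$: contains about 99.7% of them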
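The three percentages for the normal curve can be reproduced from the error function, since for a normal variable the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)). The function name below is illustrative.

```python
import math


def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(prob_within(k) * 100, 1))
```

The result does not depend on the particular values of the mean and standard deviation, which is why the rule can be stated once for every normal distribution.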

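14. Measuring the central tendency

Mean (algebraic measure), sample vs. population:
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \mu = \frac{\sum x}{N} \]
Note: $n$ is the sample size and $N$ is the population size.

Weighted arithmetic mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]

Trimmed mean: the mean after chopping extreme values.

Median
  - Middle value if there is an odd number of values, or the average of the middle two values otherwise
  - Estimated by interpolation (for grouped data):
\[ \text{median} = L_1 + \left( \frac{n/2 - \left(\sum \text{freq}\right)_l}{\text{freq}_{\text{median}}} \right) \times \text{width} \]
    where $L_1$ is the lower boundary of the median interval, $(\sum \text{freq})_l$ is the sum of the frequencies of all intervals below it, $\text{freq}_{\text{median}}$ is the frequency of the median interval, and width is the interval width.

Mode
  - Value that occurs most frequently in the data
  - Unimodal, bimodal, trimodal
  - Empirical formula: $\text{mean} - \text{mode} \approx 3 \times (\text{mean} - \text{median})$

Symmetric vs. skewed data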
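The measures of central tendency in section 14 can be tried on a small sample. This is a sketch under stated assumptions: the data values, weights, and grouped-data figures are made up for illustration, and the helper names are not from the text. The trimmed mean here drops a fixed count `k` of values from each end, which is one of several ways to define the trimming.

```python
import statistics


def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def trimmed_mean(values, k):
    """Mean after dropping the k smallest and the k largest values."""
    xs = sorted(values)
    if 2 * k >= len(xs):
        raise ValueError("k is too large for this sample")
    return statistics.mean(xs[k:len(xs) - k])


def grouped_median(lower, width, n, freq_below, freq_median):
    """Interpolated median: L1 + ((n/2 - (sum freq)_l) / freq_median) * width."""
    return lower + (n / 2 - freq_below) / freq_median * width


# Hypothetical sample with one extreme value (110).
data = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

print(statistics.mean(data))       # arithmetic mean
print(statistics.median(data))     # average of the two middle values
print(statistics.multimode(data))  # every most-frequent value
print(trimmed_mean(data, 1))       # mean without the smallest and largest value
print(weighted_mean([1, 2, 3], [1, 1, 2]))
print(grouped_median(lower=20, width=10, n=100, freq_below=40, freq_median=25))
```

On this sample the trimmed mean and the median sit below the plain mean, because the single large value pulls the mean upward.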

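15. Measuring the dispersion of data

Quartiles, outliers and boxplots
  - Quartiles: Q1 (25th percentile), Q3 (75th percentile)
  - Inter-quartile range: IQR = Q3 - Q1
  - Five-number summary: min, Q1, median, Q3, max
  - Boxplot: the ends of the box are the quartiles; the median is marked; add whiskers, and plot outliers individually
  - Outlier: usually, a value more than 1.5 x IQR above Q3 or below Q1

Variance and standard deviation (sample: $s$, population: $\sigma$)
  - Variance (algebraic, scalable computation):
\[ s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right] \]
\[ \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2 \]
  - The standard deviation $s$ (or $\sigma$) is the square root of the variance $s^2$ (or $\sigma^2$)

16. Proximity refers to a similarity or dissimilarity

Similarity
  - Numerical measure of how alike two data objects are
  - Value is higher when objects are more alike
  - Often falls in the range [0, 1]
Dissimilarity (e.g., distance)
  - Numerical measure of how different two data objects are
  - Lower when objects are more alike
  - Minimum dissimilarity is often 0
  - Upper limit varies

17. Data matrix and dissimilarity matrix

Data matrix
  - $n$ data points with $p$ dimensions
  - Two modes
\[ \begin{bmatrix} x_{11} & \cdots & x_{1f} & \cdots & x_{1p} \\ \vdots & & \vdots & & \vdots \\ x_{i1} & \cdots & x_{if} & \cdots & x_{ip} \\ \vdots & & \vdots & & \vdots \\ x_{n1} & \cdots & x_{nf} & \cdots & x_{np} \end{bmatrix} \]
Dissimilarity matrix
  - $n$ data points, but registers only the distance
  - A triangular matrix
  - Single mode
\[ \begin{bmatrix} 0 & & & & \\ d(2,1) & 0 & & & \\ d(3,1) & d(3,2) & 0 & & \\ \vdots & \vdots & \vdots & \ddots & \\ d(n,1) & d(n,2) & \cdots & \cdots & 0 \end{bmatrix} \]

Note: what follows is formulas and worked problems, so please pay close attention.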
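The two equivalent forms of the sample variance in section 15 can be compared numerically. The sketch below uses a made-up sample and illustrative function names. The second form needs only running sums, which is what makes it computable in one pass; in floating point it can lose precision when the mean is large relative to the spread, so it is shown here for comparison rather than as a recommendation.

```python
import math
import statistics


def sample_variance(xs):
    """Definition form: squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_one_pass(xs):
    """Sum form: (sum of x^2 - (sum of x)^2 / n) / (n - 1), from running sums only."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in xs:
        n += 1
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)


def population_variance(xs):
    """Population form: mean of x^2 minus the squared mean."""
    n = len(xs)
    mu = sum(xs) / n
    return sum(x * x for x in xs) / n - mu * mu


# Hypothetical sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_variance(data), sample_variance_one_pass(data))
print(population_variance(data), math.sqrt(population_variance(data)))
print(statistics.variance(data), statistics.pvariance(data))
```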

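Proximity measure for nominal attributes

A nominal attribute can take 2 or more states, e.g., red, blue, green (a generalization of a binary attribute).
  - Method 1: simple matching, where $m$ is the number of matches and $p$ is the total number of variables:
\[ d(i, j) = \frac{p - m}{p} \]
  - Method 2: use a large number of binary attributes, creating a new binary attribute for each of the $M$ nominal states

Proximity measure for binary attributes

A contingency table for binary data, where $q$ counts attributes equal to 1 for both objects, $r$ those equal to 1 for $i$ and 0 for $j$, $s$ those equal to 0 for $i$ and 1 for $j$, and $t$ those equal to 0 for both:

                    Object j
                    1        0        sum
    Object i   1    q        r        q + r
               0    s        t        s + t
               sum  q + s    r + t    p

  - Distance measure for symmetric binary variables:
\[ d(i, j) = \frac{r + s}{q + r + s + t} \]
  - Distance measure for asymmetric binary variables:
\[ d(i, j) = \frac{r + s}{q + r + s} \]
  - Jaccard coefficient (similarity measure for asymmetric binary variables):
\[ \text{sim}_{\text{Jaccard}}(i, j) = \frac{q}{q + r + s} \]
  - Note: the Jaccard coefficient is the same as "coherence":
\[ \text{coherence}(i, j) = \frac{\text{sup}(i, j)}{\text{sup}(i) + \text{sup}(j) - \text{sup}(i, j)} = \frac{q}{(q + r) + (q + s) - q} \]

>>> 18. Determine the dissimilarity between binary variables

    Name   Gender   Fever   Cough   Test-1   Test-2   Test-3   Test-4
    Jack   M        Y       N       P        N        N        N
    Mary   F        Y       N       P        N        P        N
    Jim    M        Y       P       N        N        N        N

  - Gender is a symmetric attribute
  - The remaining attributes are asymmetric binary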
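The table above can be worked through in code. This sketch assumes the usual encoding (Y and P become 1, N becomes 0) and uses only the six asymmetric attributes, leaving Gender out. Here q counts attributes that are 1 for both people, r those that are 1 only for the first, s those that are 1 only for the second, and t those that are 0 for both; the asymmetric distance is (r + s) / (q + r + s). The function and variable names are illustrative.

```python
def binary_counts(a, b):
    """Return (q, r, s, t) for two equal-length 0/1 vectors."""
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    t = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    return q, r, s, t


def asymmetric_dissimilarity(a, b):
    """(r + s) / (q + r + s): matches on 0 are ignored."""
    q, r, s, _ = binary_counts(a, b)
    if q + r + s == 0:
        raise ValueError("undefined when neither object has any attribute set to 1")
    return (r + s) / (q + r + s)


def jaccard_similarity(a, b):
    """q / (q + r + s), which equals 1 minus the asymmetric dissimilarity."""
    return 1 - asymmetric_dissimilarity(a, b)


# Fever, Cough, Test-1, Test-2, Test-3, Test-4 with Y/P -> 1 and N -> 0.
jack = [1, 0, 1, 0, 0, 0]
mary = [1, 0, 1, 0, 1, 0]
jim = [1, 1, 0, 0, 0, 0]

print(round(asymmetric_dissimilarity(jack, mary), 2))
print(round(asymmetric_dissimilarity(jack, jim), 2))
print(round(asymmetric_dissimilarity(jim, mary), 2))
```

The smallest value is between Jack and Mary, so under this measure they are the most alike of the three pairs.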

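Let the values Y and P be 1, and the value N be 0. Using the asymmetric attributes only:
\[ d(\text{Jack}, \text{Mary}) = \frac{0 + 1}{2 + 0 + 1} = 0.33 \]
\[ d(\text{Jack}, \text{Jim}) = \frac{1 + 1}{1 + 1 + 1} = 0.67 \]
\[ d(\text{Jim}, \text{Mary}) = \frac{1 + 2}{1 + 1 + 2} = 0.75 \]

Standardizing numeric data

Z-score:
\[ z = \frac{x - \mu}{\sigma} \]
  - $x$: raw score to be standardized, $\mu$: mean of the population, $\sigma$: standard deviation
  - Negative when the raw score is below the mean, positive when above

An alternative way: calculate the mean absolute deviation
\[ s_f = \frac{1}{n}\left(|x_{1f} - m_f| + |x_{2f} - m_f| + \cdots + |x_{nf} - m_f|\right) \]
where
\[ m_f = \frac{1}{n}\left(x_{1f} + x_{2f} + \cdots + x_{nf}\right) \]
and the standardized measure is
\[ z_{if} = \frac{x_{if} - m_f}{s_f} \]
>> Using the mean absolute deviation is more robust than using the standard deviation.

Distance on numeric data: Minkowski distance

Minkowski distance: a popular distance measure
\[ d(i, j) = \sqrt[h]{|x_{i1} - x_{j1}|^h + |x_{i2} - x_{j2}|^h + \cdots + |x_{ip} - x_{jp}|^h} \]
where $i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ and $j = (x_{j1}, x_{j2}, \ldots, x_{jp})$ are two $p$-dimensional data objects, and $h$ is the order (the distance so defined is also called the L-$h$ norm).

Properties
  - $d(i, j) > 0$ if $i \ne j$, and $d(i, i) = 0$ (positive definiteness)
  - $d(i, j) = d(j, i)$ (symmetry)
  - $d(i, j) \le d(i, k) + d(k, j)$ (triangle inequality)
A distance that satisfies these properties is a metric.

Special cases of Minkowski distance

$h = 1$: Manhattan (city block, $L_1$ norm) distance. E.g., the Hamming distance: the number of bits that are different between two binary vectors.
\[ d(i, j) = |x_{i1} - x_{j1}| + |x_{i2} - x_{j2}| + \cdots + |x_{ip} - x_{jp}| \]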

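$h = 2$: Euclidean ($L_2$ norm) distance
\[ d(i, j) = \sqrt{|x_{i1} - x_{j1}|^2 + |x_{i2} - x_{j2}|^2 + \cdots + |x_{ip} - x_{jp}|^2} \]

$h \to \infty$: supremum ($L_{\max}$ norm, $L_\infty$ norm) distance. This is the maximum difference between any component (attribute) of the vectors.
\[ d(i, j) = \lim_{h \to \infty}\left(\sum_{f=1}^{p} |x_{if} - x_{jf}|^h\right)^{1/h} = \max_{f} |x_{if} - x_{jf}| \]

19. Calculate the data matrix and dissimilarity matrix (Minkowski distances)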
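A minimal sketch of the calculation named in the heading above. The transcription does not include the data points for this exercise, so the four 2-dimensional points below are hypothetical, as are the function names. The matrix is built in lower-triangular form, with one row per object and a zero on the diagonal.

```python
import math


def minkowski(x, y, h):
    """Minkowski distance of order h; h = math.inf gives the supremum distance."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if h == math.inf:
        return max(diffs)
    return sum(d ** h for d in diffs) ** (1 / h)


def dissimilarity_matrix(points, h):
    """Lower-triangular matrix: row i holds d(i, 0), ..., d(i, i - 1), then 0."""
    matrix = []
    for i, p in enumerate(points):
        row = [minkowski(p, points[j], h) for j in range(i)]
        row.append(0.0)
        matrix.append(row)
    return matrix


# Hypothetical data matrix: 4 objects, 2 attributes.
points = [(1, 2), (3, 5), (2, 0), (4, 5)]

for name, h in (("Manhattan", 1), ("Euclidean", 2), ("supremum", math.inf)):
    print(name)
    for row in dissimilarity_matrix(points, h):
        print([round(d, 2) for d in row])
```

For the first two points the three distances are 5, about 3.61, and 3, which illustrates that the value shrinks toward the largest single-attribute difference as h grows.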

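Ordinal variables

  - Order is important, e.g., rank
  - An ordinal variable can be discrete or continuous
  - Can be treated like interval-scaled variables:
      - replace $x_{if}$ by its rank $r_{if} \in \{1, \ldots, M_f\}$
      - map the range of each variable onto [0, 1] by replacing the $i$-th object in the $f$-th variable by
\[ z_{if} = \frac{r_{if} - 1}{M_f - 1} \]
      - compute the dissimilarity using methods for interval-scaled variables

Attributes of mixed type

  - A database may contain all attribute types: nominal, symmetric binary, asymmetric binary, numeric, ordinal
  - One may use a weighted formula to combine their effects:
\[ d(i, j) = \frac{\sum_{f=1}^{p} \delta_{ij}^{(f)} d_{ij}^{(f)}}{\sum_{f=1}^{p} \delta_{ij}^{(f)}} \]
    Here the indicator $\delta_{ij}^{(f)}$ is 0 when attribute $f$ is missing for object $i$ or $j$, or when $f$ is asymmetric binary and $x_{if} = x_{jf} = 0$; otherwise it is 1.
  - $f$ is binary or nominal: $d_{ij}^{(f)} = 0$ if $x_{if} = x_{jf}$, or $d_{ij}^{(f)} = 1$ otherwise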

  - $f$ is numeric: use the normalized distance
  - $f$ is ordinal: compute the ranks $r_{if}$ and
\[ z_{if} = \frac{r_{if} - 1}{M_f - 1} \]
    then treat $z_{if}$ as interval-scaled

Cosine similarity

  - A document can be represented by thousands of attributes, each recording the frequency of a particular word (such as a keyword) or phrase in the document
  - Other vector objects: gene features in micro-arrays, ...
  - Applications: information retrieval, biologic taxonomy, gene feature mapping, ...
  - Cosine measure: if $d_1$ and $d_2$ are two vectors (e.g., term-frequency vectors), then
\[ \cos(d_1, d_2) = \frac{d_1 \cdot d_2}{\|d_1\| \, \|d_2\|} \]
    where $\cdot$ indicates the vector dot product and $\|d\|$ is the length of vector $d$

20. Find the similarity between documents 1 and 2.

d1 = (5, 0, 3, 0, 2, 0, 0, 2, 0, 0)
d2 = (3, 0, 2, 0, 1, 1, 0, 1, 0, 1)

d1 . d2 = 5*3 + 0*0 + 3*2 + 0*0 + 2*1 + 0*1 + 0*0 + 2*1 + 0*0 + 0*1 = 25
||d1|| = (5*5 + 0*0 + 3*3 + 0*0 + 2*2 + 0*0 + 0*0 + 2*2 + 0*0 + 0*0)^0.5 = (42)^0.5 = 6.481
||d2|| = (3*3 + 0*0 + 2*2 + 0*0 + 1*1 + 1*1 + 0*0 + 1*1 + 0*0 + 1*1)^0.5 = (17)^0.5 = 4.12
cos(d1, d2) = 25 / (6.481 * 4.12) = 0.94
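The cosine calculation above can be reproduced in a few lines. The two term-frequency vectors are the ones from the worked example; the function name is illustrative, and the zero-vector check is an addition of this sketch, since the measure is undefined when either vector has length 0.

```python
import math


def cosine_similarity(d1, d2):
    """Dot product of d1 and d2 divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(d1, d2))
    len1 = math.sqrt(sum(a * a for a in d1))
    len2 = math.sqrt(sum(b * b for b in d2))
    if len1 == 0 or len2 == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (len1 * len2)


d1 = (5, 0, 3, 0, 2, 0, 0, 2, 0, 0)
d2 = (3, 0, 2, 0, 1, 1, 0, 1, 0, 1)

print(round(cosine_similarity(d1, d2), 2))
```

Because only the angle between the vectors matters, scaling either document vector (for example, doubling every term count) leaves the result unchanged.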

is the score of the 1 st student, x

is the score of the 1 st student, x 8 Chapter Collectg, Dsplayg, ad Aalyzg your Data. Descrptve Statstcs Sectos explaed how to choose a sample, how to collect ad orgaze data from the sample, ad how to dsplay your data. I ths secto, you wll

More information

CHAPTER VI Statistical Analysis of Experimental Data

CHAPTER VI Statistical Analysis of Experimental Data Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca

More information

MEASURES OF DISPERSION

MEASURES OF DISPERSION MEASURES OF DISPERSION Measure of Cetral Tedecy: Measures of Cetral Tedecy ad Dsperso ) Mathematcal Average: a) Arthmetc mea (A.M.) b) Geometrc mea (G.M.) c) Harmoc mea (H.M.) ) Averages of Posto: a) Meda

More information

Lesson 3. Group and individual indexes. Design and Data Analysis in Psychology I English group (A) School of Psychology Dpt. Experimental Psychology

Lesson 3. Group and individual indexes. Design and Data Analysis in Psychology I English group (A) School of Psychology Dpt. Experimental Psychology 17/03/015 School of Psychology Dpt. Expermetal Psychology Desg ad Data Aalyss Psychology I Eglsh group (A) Salvador Chacó Moscoso Susaa Saduvete Chaves Mlagrosa Sáchez Martí Lesso 3 Group ad dvdual dexes

More information

Section l h l Stem=Tens. 8l Leaf=Ones. 8h l 03. 9h 58

Section l h l Stem=Tens. 8l Leaf=Ones. 8h l 03. 9h 58 Secto.. 6l 34 6h 667899 7l 44 7h Stem=Tes 8l 344 Leaf=Oes 8h 5557899 9l 3 9h 58 Ths dsplay brgs out the gap the data: There are o scores the hgh 7's. 6. a. beams cylders 9 5 8 88533 6 6 98877643 7 488

More information

Handout #1. Title: Foundations of Econometrics. POPULATION vs. SAMPLE

Handout #1. Title: Foundations of Econometrics. POPULATION vs. SAMPLE Hadout #1 Ttle: Foudatos of Ecoometrcs Course: Eco 367 Fall/015 Istructor: Dr. I-Mg Chu POPULATION vs. SAMPLE From the Bureau of Labor web ste (http://www.bls.gov), we ca fd the uemploymet rate for each

More information

Mean is only appropriate for interval or ratio scales, not ordinal or nominal.

Mean is only appropriate for interval or ratio scales, not ordinal or nominal. Mea Same as ordary average Sum all the data values ad dvde by the sample sze. x = ( x + x +... + x Usg summato otato, we wrte ths as x = x = x = = ) x Mea s oly approprate for terval or rato scales, ot

More information

2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen.

2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen. .5 x 54.5 a. x 7. 786 7 b. The raked observatos are: 7.4, 7.5, 7.7, 7.8, 7.9, 8.0, 8.. Sce the sample sze 7 s odd, the meda s the (+)/ 4 th raked observato, or meda 7.8 c. The cosumer would more lkely

More information

Descriptive Statistics

Descriptive Statistics Page Techcal Math II Descrptve Statstcs Descrptve Statstcs Descrptve statstcs s the body of methods used to represet ad summarze sets of data. A descrpto of how a set of measuremets (for eample, people

More information

= 1. UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Parameters and Statistics. Measures of Centrality

= 1. UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Parameters and Statistics. Measures of Centrality UCLA STAT Itroducto to Statstcal Methods for the Lfe ad Health Sceces Istructor: Ivo Dov, Asst. Prof. of Statstcs ad Neurology Teachg Assstats: Fred Phoa, Krste Johso, Mg Zheg & Matlda Hseh Uversty of

More information

Measures of Dispersion

Measures of Dispersion Chapter 8 Measures of Dsperso Defto of Measures of Dsperso (page 31) A measure of dsperso s a descrptve summary measure that helps us characterze the data set terms of how vared the observatos are from

More information

Summary tables and charts

Summary tables and charts Data Aalyss Summary tables ad charts. Orgazg umercal data: Hstograms ad frequecy tables I ths lecture, we wll study descrptve statstcs. By descrptve statstcs, we refer to methods volvg the collecto, presetato,

More information

Lecture Notes Types of economic variables

Lecture Notes Types of economic variables Lecture Notes 3 1. Types of ecoomc varables () Cotuous varable takes o a cotuum the sample space, such as all pots o a le or all real umbers Example: GDP, Polluto cocetrato, etc. () Dscrete varables fte

More information

Statistics Descriptive

Statistics Descriptive Statstcs Descrptve Ma aspects of descrbg a data set (a) Summarzazto ad descrpto of the data (1) Presetato of tables ad graphs (2) Scag the graphed data for ay uusual observatos wch seem to stck far out

More information

f f... f 1 n n (ii) Median : It is the value of the middle-most observation(s).

f f... f 1 n n (ii) Median : It is the value of the middle-most observation(s). CHAPTER STATISTICS Pots to Remember :. Facts or fgures, collected wth a defte pupose, are called Data.. Statstcs s the area of study dealg wth the collecto, presetato, aalyss ad terpretato of data.. The

More information

C. Statistics. X = n geometric the n th root of the product of numerical data ln X GM = or ln GM = X 2. X n X 1

C. Statistics. X = n geometric the n th root of the product of numerical data ln X GM = or ln GM = X 2. X n X 1 C. Statstcs a. Descrbe the stages the desg of a clcal tral, takg to accout the: research questos ad hypothess, lterature revew, statstcal advce, choce of study protocol, ethcal ssues, data collecto ad

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

STATISTICS 13. Lecture 5 Apr 7, 2010

STATISTICS 13. Lecture 5 Apr 7, 2010 STATISTICS 13 Leture 5 Apr 7, 010 Revew Shape of the data -Bell shaped -Skewed -Bmodal Measures of eter Arthmet Mea Meda Mode Effets of outlers ad skewess Measures of Varablt A quattatve measure that desrbes

More information

Simple Linear Regression

Simple Linear Regression Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uversty Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal

More information

Lecture 1 Review of Fundamental Statistical Concepts

Lecture 1 Review of Fundamental Statistical Concepts Lecture Revew of Fudametal Statstcal Cocepts Measures of Cetral Tedecy ad Dsperso A word about otato for ths class: Idvduals a populato are desgated, where the dex rages from to N, ad N s the total umber

More information

Median as a Weighted Arithmetic Mean of All Sample Observations

Median as a Weighted Arithmetic Mean of All Sample Observations Meda as a Weghted Arthmetc Mea of All Sample Observatos SK Mshra Dept. of Ecoomcs NEHU, Shllog (Ida). Itroducto: Iumerably may textbooks Statstcs explctly meto that oe of the weakesses (or propertes) of

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

Module 7. Lecture 7: Statistical parameter estimation

Module 7. Lecture 7: Statistical parameter estimation Lecture 7: Statstcal parameter estmato Parameter Estmato Methods of Parameter Estmato 1) Method of Matchg Pots ) Method of Momets 3) Mamum Lkelhood method Populato Parameter Sample Parameter Ubased estmato

More information

Lecture 3. Sampling, sampling distributions, and parameter estimation

Lecture 3. Sampling, sampling distributions, and parameter estimation Lecture 3 Samplg, samplg dstrbutos, ad parameter estmato Samplg Defto Populato s defed as the collecto of all the possble observatos of terest. The collecto of observatos we take from the populato s called

More information

Lecture 3 Probability review (cont d)

Lecture 3 Probability review (cont d) STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections ENGI 441 Jot Probablty Dstrbutos Page 7-01 Jot Probablty Dstrbutos [Navd sectos.5 ad.6; Devore sectos 5.1-5.] The jot probablty mass fucto of two dscrete radom quattes, s, P ad p x y x y The margal probablty

More information

STA 105-M BASIC STATISTICS (This is a multiple choice paper.)

STA 105-M BASIC STATISTICS (This is a multiple choice paper.) DCDM BUSINESS SCHOOL September Mock Eamatos STA 0-M BASIC STATISTICS (Ths s a multple choce paper.) Tme: hours 0 mutes INSTRUCTIONS TO CANDIDATES Do ot ope ths questo paper utl you have bee told to do

More information

Measures of Central Tendency

Measures of Central Tendency Chapter 6 Measures of Cetral Tedecy Defto of a Summary Measure (page 185) A summary measure s a sgle value that we compute from a collecto of measuremets order to descrbe oe of the collecto s partcular

More information

The variance and standard deviation from ungrouped data

The variance and standard deviation from ungrouped data BIOL 443 مقاييس التغير (التشتت ( (dsperso). Measures of Varato Just as measures of cetral tedecy locate the ceter of data, measures of varato measure ts spread. Whe the varato s small, ths meas that the

More information

hp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations

hp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations HP 30S Statstcs Averages ad Stadard Devatos Average ad Stadard Devato Practce Fdg Averages ad Stadard Devatos HP 30S Statstcs Averages ad Stadard Devatos Average ad stadard devato The HP 30S provdes several

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uverst Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal

More information

Chapter 13 Student Lecture Notes 13-1

Chapter 13 Student Lecture Notes 13-1 Chapter 3 Studet Lecture Notes 3- Basc Busess Statstcs (9 th Edto) Chapter 3 Smple Lear Regresso 4 Pretce-Hall, Ic. Chap 3- Chapter Topcs Types of Regresso Models Determg the Smple Lear Regresso Equato

More information

Lecture 8: Linear Regression

Lecture 8: Linear Regression Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE

More information

Continuous Distributions

Continuous Distributions 7//3 Cotuous Dstrbutos Radom Varables of the Cotuous Type Desty Curve Percet Desty fucto, f (x) A smooth curve that ft the dstrbuto 3 4 5 6 7 8 9 Test scores Desty Curve Percet Probablty Desty Fucto, f

More information

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best Error Aalyss Preamble Wheever a measuremet s made, the result followg from that measuremet s always subject to ucertaty The ucertaty ca be reduced by makg several measuremets of the same quatty or by mprovg

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

X ε ) = 0, or equivalently, lim

X ε ) = 0, or equivalently, lim Revew for the prevous lecture Cocepts: order statstcs Theorems: Dstrbutos of order statstcs Examples: How to get the dstrbuto of order statstcs Chapter 5 Propertes of a Radom Sample Secto 55 Covergece

More information

Evaluation of uncertainty in measurements

Evaluation of uncertainty in measurements Evaluato of ucertaty measuremets Laboratory of Physcs I Faculty of Physcs Warsaw Uversty of Techology Warszawa, 05 Itroducto The am of the measuremet s to determe the measured value. Thus, the measuremet

More information

UNIT 1 MEASURES OF CENTRAL TENDENCY

UNIT 1 MEASURES OF CENTRAL TENDENCY UIT MEASURES OF CETRAL TEDECY Measures o Cetral Tedecy Structure Itroducto Objectves Measures o Cetral Tedecy 3 Armetc Mea 4 Weghted Mea 5 Meda 6 Mode 7 Geometrc Mea 8 Harmoc Mea 9 Partto Values Quartles

More information

Chapter 8. Inferences about More Than Two Population Central Values

Chapter 8. Inferences about More Than Two Population Central Values Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha

More information

1 Onto functions and bijections Applications to Counting

1 Onto functions and bijections Applications to Counting 1 Oto fuctos ad bectos Applcatos to Coutg Now we move o to a ew topc. Defto 1.1 (Surecto. A fucto f : A B s sad to be surectve or oto f for each b B there s some a A so that f(a B. What are examples of

More information

Previous lecture. Lecture 8. Learning outcomes of this lecture. Today. Statistical test and Scales of measurement. Correlation

Previous lecture. Lecture 8. Learning outcomes of this lecture. Today. Statistical test and Scales of measurement. Correlation Lecture 8 Emprcal Research Methods I434 Quattatve Data aalss II Relatos Prevous lecture Idea behd hpothess testg Is the dfferece betwee two samples a reflecto of the dfferece of two dfferet populatos or

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

Chapter -2 Simple Random Sampling

Chapter -2 Simple Random Sampling Chapter - Smple Radom Samplg Smple radom samplg (SRS) s a method of selecto of a sample comprsg of umber of samplg uts out of the populato havg umber of samplg uts such that every samplg ut has a equal

More information

Utts and Heckard. Why Study Statistics? Why Study Statistics? American Heritage College Dictionary, 3rd Ed.

Utts and Heckard. Why Study Statistics? Why Study Statistics? American Heritage College Dictionary, 3rd Ed. Amerca Hertage College Dctoar, 3rd Ed. 1. (used wth sgular verb) The mathematcs of the collecto, orgazato, ad terpretato of umercal data, esp. the aalss of populato characterstcs b ferece from samplg.

More information

Simple Linear Regression

Simple Linear Regression Statstcal Methods I (EST 75) Page 139 Smple Lear Regresso Smple regresso applcatos are used to ft a model descrbg a lear relatoshp betwee two varables. The aspects of least squares regresso ad correlato

More information

Chapter 1 Data and Statistics

Chapter 1 Data and Statistics Chapter Data ad Statstcs Motvato: the followg kds of statemets ewspaper ad magaze appear very frequetly, Sales of ew homes are accrug at a rate of 7000 homes per year. The uemploymet rate has dropped to

More information

Chapter 4 Multiple Random Variables

Chapter 4 Multiple Random Variables Revew o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for Chapter 4-5 Notes: Although all deftos ad theorems troduced our lectures ad ths ote are mportat ad you should be famlar wth, but I put those

More information

Functions of Random Variables

Functions of Random Variables Fuctos of Radom Varables Chapter Fve Fuctos of Radom Varables 5. Itroducto A geeral egeerg aalyss model s show Fg. 5.. The model output (respose) cotas the performaces of a system or product, such as weght,

More information

Centroids & Moments of Inertia of Beam Sections

Centroids & Moments of Inertia of Beam Sections RCH 614 Note Set 8 S017ab Cetrods & Momets of erta of Beam Sectos Notato: b C d d d Fz h c Jo L O Q Q = ame for area = ame for a (base) wdth = desgato for chael secto = ame for cetrod = calculus smbol

More information

Outline. Point Pattern Analysis Part I. Revisit IRP/CSR

Outline. Point Pattern Analysis Part I. Revisit IRP/CSR Pot Patter Aalyss Part I Outle Revst IRP/CSR, frst- ad secod order effects What s pot patter aalyss (PPA)? Desty-based pot patter measures Dstace-based pot patter measures Revst IRP/CSR Equal probablty:

More information

n -dimensional vectors follow naturally from the one

n -dimensional vectors follow naturally from the one B. Vectors ad sets B. Vectors Ecoomsts study ecoomc pheomea by buldg hghly stylzed models. Uderstadg ad makg use of almost all such models requres a hgh comfort level wth some key mathematcal sklls. I

More information

Machine Learning. Topic 4: Measuring Distance

Machine Learning. Topic 4: Measuring Distance Mache Learg Topc 4: Measurg Dstace Bra Pardo Mache Learg: EECS 349 Fall 2009 Wh measure dstace? Clusterg requres dstace measures. Local methods requre a measure of localt Search eges requre a measure of

More information

Chapter -2 Simple Random Sampling

Chapter -2 Simple Random Sampling Chapter - Smple Radom Samplg Smple radom samplg (SRS) s a method of selecto of a sample comprsg of umber of samplg uts out of the populato havg umber of samplg uts such that every samplg ut has a equal

More information

MA/CSSE 473 Day 27. Dynamic programming

MA/CSSE 473 Day 27. Dynamic programming MA/CSSE 473 Day 7 Dyamc Programmg Bomal Coeffcets Warshall's algorthm (Optmal BSTs) Studet questos? Dyamc programmg Used for problems wth recursve solutos ad overlappg subproblems Typcally, we save (memoze)

More information

LECTURE - 4 SIMPLE RANDOM SAMPLING DR. SHALABH DEPARTMENT OF MATHEMATICS AND STATISTICS INDIAN INSTITUTE OF TECHNOLOGY KANPUR

LECTURE - 4 SIMPLE RANDOM SAMPLING DR. SHALABH DEPARTMENT OF MATHEMATICS AND STATISTICS INDIAN INSTITUTE OF TECHNOLOGY KANPUR amplg Theory MODULE II LECTURE - 4 IMPLE RADOM AMPLIG DR. HALABH DEPARTMET OF MATHEMATIC AD TATITIC IDIA ITITUTE OF TECHOLOGY KAPUR Estmato of populato mea ad populato varace Oe of the ma objectves after

More information

Analysis of Variance with Weibull Data

Analysis of Variance with Weibull Data Aalyss of Varace wth Webull Data Lahaa Watthaacheewaul Abstract I statstcal data aalyss by aalyss of varace, the usual basc assumptos are that the model s addtve ad the errors are radomly, depedetly, ad

More information

GOALS The Samples Why Sample the Population? What is a Probability Sample? Four Most Commonly Used Probability Sampling Methods

GOALS The Samples Why Sample the Population? What is a Probability Sample? Four Most Commonly Used Probability Sampling Methods GOLS. Epla why a sample s the oly feasble way to lear about a populato.. Descrbe methods to select a sample. 3. Defe ad costruct a samplg dstrbuto of the sample mea. 4. Epla the cetral lmt theorem. 5.

More information

Chapter Two. An Introduction to Regression ( )

Chapter Two. An Introduction to Regression ( ) ubject: A Itroducto to Regresso Frst tage Chapter Two A Itroducto to Regresso (018-019) 1 pg. ubject: A Itroducto to Regresso Frst tage A Itroducto to Regresso Regresso aalss s a statstcal tool for the

More information

Chapter 3 Sampling For Proportions and Percentages

Chapter 3 Sampling For Proportions and Percentages Chapter 3 Samplg For Proportos ad Percetages I may stuatos, the characterstc uder study o whch the observatos are collected are qualtatve ature For example, the resposes of customers may marketg surveys

More information

ENGI 3423 Simple Linear Regression Page 12-01

ENGI 3423 Simple Linear Regression Page 12-01 ENGI 343 mple Lear Regresso Page - mple Lear Regresso ometmes a expermet s set up where the expermeter has cotrol over the values of oe or more varables X ad measures the resultg values of aother varable

More information

BASICS ON DISTRIBUTIONS

BASICS ON DISTRIBUTIONS BASICS ON DISTRIBUTIONS Hstograms Cosder a epermet whch dfferet outcomes are possble (e. Dce tossg). The probablty of all the outcomes ca be represeted a hstogram Dstrbutos Probabltes are descrbed wth

More information

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades STAT 101 Dr. Kar Lock Morga 11/20/12 Exam 2 Grades Multple Regresso SECTIONS 9.2, 10.1, 10.2 Multple explaatory varables (10.1) Parttog varablty R 2, ANOVA (9.2) Codtos resdual plot (10.2) Trasformatos

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statstc ad Radom Samples A parameter s a umber that descrbes the populato. It s a fxed umber, but practce we do ot kow ts value. A statstc s a fucto of the sample data,.e., t s a quatty whose

Statistical Methods for Geography

Peter A. Rogerson. Statistical Methods for Geography: A Student's Guide, Fourth Edition. 2 DESCRIPTIVE STATISTICS. Learning Objectives: Types of Data

Statistics MINITAB - Lab 5

Statistics 10010. PART I: The Correlation Coefficient. Quite often in statistics we are presented with data that suggests that a linear relationship exists between two variables. For example the plot below is of

Transforms that are commonly used are separable

Examples: two-dimensional DFT, DCT, DST, Hadamard. We can then use 1-D transforms in computing the 2-D separable transforms: take the 1-D transform of the rows

Arithmetic Mean Suppose there is only a finite number N of items in the system of interest. Then the population arithmetic mean is

Topic: Probability Theory. Module: Descriptive Statistics. Measures of Location. Descriptive statistics are measures of location and shape that pertain to probability distributions. The primary measures of location are the arithmetic

Machine Learning. knowledge acquisition skill refinement. Relation between machine learning and data mining. P. Berka, /18

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. (Mitchell, 1997) Things learn when they change their behavior in a way that

1. BLAST (Karlin Altschul) Statistics

Pairwise sequence alignment, global and local. Multiple sequence alignment. Substitution matrices. Database searching: global, local, BLAST. Sequence statistics. Evolutionary tree reconstruction. Gene finding. Protein structure prediction. RNA structure

Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

Let there be two correlated random variables X and Y. A sample of size n is drawn from a population by simple random sampling without replacement. The observed

STATISTICAL INFERENCE

(STATISTICS) STATISTICAL INFERENCE. COMPLEMENTARY COURSE, B.Sc. MATHEMATICS, III SEMESTER ( Admission). UNIVERSITY OF CALICUT, SCHOOL OF DISTANCE EDUCATION, CALICUT UNIVERSITY P.O., MALAPPURAM, KERALA, INDIA -

Objectives of Multiple Regression

Establish the linear equation that best predicts values of a dependent variable Y using more than one explanatory variable from a large set of potential predictors {x1, x2, ..., xk}. Find that subset of

Third handout: On the Gini Index

Corrado Gini, an Italian statistician, proposed (, 9, 96) to measure absolute inequality via the mean difference, which is defined as ( / ), where refers to the total number of individuals in society. Assume that. The

Statistics Descriptive and Inferential Statistics. Instructor: Daisuke Nagakura

Instructor: Daisuke Nagakura (agakura@z7.keo.jp). Today's topic: Today, I talk about two categories of statistical analyses, descriptive statistics and inferential statistics, and

A Study of the Reproducibility of Measurements with HUR Leg Extension/Curl Research Line

HUR Technical Report 000--9, version .05 / Frank Borg (borgbros@ett.f). An important property of measurements is that the results should

ESS Line Fitting

ESS 5 014. 17. Line Fitting. A very common problem in data analysis is looking for relationships between different parameters and fitting lines or surfaces to data. The simplest example is fitting a straight line and we will discuss that here

3D Geometry for Computer Graphics. Lesson 2: PCA & SVD

Last week: eigendecomposition. We want to learn how the matrix A works. If we look at arbitrary vectors, it doesn't tell us much.

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes

Econometrics, ECON, San Francisco State University. Michael Bar, Spring 5. Name: Instructions. This is closed book, closed notes exam. No calculators of any kind are allowed.

Chapter 11 Systematic Sampling

The systematic sampling technique is operationally more convenient than the simple random sampling. It also ensures at the same time that each unit has equal probability of inclusion in the sample. In this method of

Analysis of System Performance IN2072 Chapter 5 Analysis of Non Markov Systems

Chair for Network Architectures and Services, Prof. Carle, Department of Computer Science, TU München. Dr. Alexander Klein, Prof. Dr.-Ing. Georg Carle

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs

SPRING 005. University of Washington. Acknowledgement: The instructor wishes to thank Rachel Kleit, Assistant Professor,

ANÁLISE DOS DADOS. Daniela Barreiro Claro

Outline: Data types, Graphical Analysis, Proximity measures. Types of Data Sets: Record, Ordered, Relational records, Video data: sequence of

Module 7: Probability and Statistics

Lecture 4: Goodness of fit tests. Introduction. In the previous two lectures, the concepts, steps and applications of hypotheses testing were discussed. Hypotheses testing may be used to

(Monte Carlo) Resampling Technique in Validity Testing and Reliability Testing

International Journal of Computer Applications (0975 8887). Adi Setiawan, Department of Mathematics, Faculty of Science and Mathematics, Satya Wacana Christian University

Centers of Gravity - Centroids

ARCH Note Set 9. S205ab. Notation: C Fz L O Q Q t tw = name for area = designation for channel section = name for centroid = force component in the z direction = name for length = name for reference origin

means the first term, a2 means the term, etc. Infinite Sequences: follow the same pattern forever.

9.4 Sequences and Series, Pre Calculus. Learning Targets: 1. Write the terms of an explicitly defined sequence. 2. Write the terms of a recursively defined sequence. 3. Determine whether a sequence is arithmetic,

Investigating Cellular Automata

Researcher: Taylor Dupuy. Advisor: Aaron Wootton. Semester: Fall 4. An Overview of Cellular Automata: Cellular Automata are simple computer programs that generate rows of black and white

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn:

Chapter 3 3- Business Statistics: A First Course, Fifth Edition. Chapter 2: Correlation and Simple Linear Regression. Business Statistics: A First Course, 5e 29 Prentice-Hall, Inc. Chap 2- Learning Objectives. In this chapter, you learn:

Point Estimation: definition of estimators

Point estimator: any function W(X1, ..., Xn) of a data sample. The exercise of point estimation is to use particular functions of the data in order to estimate certain unknown population parameters.

Lecture 1: Introduction to Regression

An Example: Explaining State Homicide Rates. What kinds of variables might we use to explain/predict state homicide rates? Let's consider just one predictor for now: poverty. Ignore omitted variables,

Overcoming Limitations of Sampling for Aggregation Queries

CIS 6930 Approximate Query Processing, Paper Presentation, Spring 2004. Instructor: Dr Alin Dobra. Authors: Surajit Chaudhuri, Gautam Das, Mayur Datar, Rajeev Motwani, and Vivek

Chapter 4 Multiple Random Variables

Review for the previous lecture: Theorems and Examples: How to obtain the pmf (pdf) of U = g1(X, Y) and V = g2(X, Y). Chapter 4.4 Hierarchical Models and Mixture Distributions. Examples:

Dimensionality reduction Feature selection

CS 750 Machine Learning, Lecture 3. Milos Hauskrecht, mlos@cs.ptt.edu, 539 Sennott Square. Dimensionality reduction. Motivation. Classification problem example: We have an input data

MULTIDIMENSIONAL HETEROGENEOUS VARIABLE PREDICTION BASED ON EXPERTS STATEMENTS. Gennadiy Lbov, Maxim Gerasimov

International Book Series "Information Science and Computing" 97. Abstract: In the works [ ] we proposed an approach of forming a consensus

Lecture 9: Tolerant Testing

Daniel Kane. Scribe: Sankeerth Rao. April 4, 07. Abstract: In this lecture we prove a quasi-linear lower bound on the number of samples needed to do tolerant testing for L distance. Tolerant Testing: We have

Regression. What is a Model? 1. Often Describe Relationship between Variables 2. Types - Deterministic Models (no randomness) - Probabilistic Models (with randomness). EPI 809/Spring 2008. Deterministic Models 1. Hypothesize

Wendy Korn, Moon Chang (IBM) ACM SIGARCH Computer Architecture News Vol. 35, No. 1, March 2007

CPU 2006 Sensitivity to Memory Page Sizes. Memory usage: 1. Minimum and Maximum memory used 2. Sensitivity to page sizes 3. 4K,

Estimation of Stress- Strength Reliability model using finite mixture of exponential distributions

International Journal of Computational Engineering Research, Vol, 0 Issue,. K. Sandhya, T. S. Umamaheswari, Department of Mathematics, Lal Bhadur
