Probability and MLE.
- Lesley Harrington
- 5 years ago
Transcription
1 Probability and MLE
2 (Brief) intro to probability
3 Basic notations
Random variable: refers to an element / event whose status is unknown, e.g. A = "it will rain tomorrow".
Domain (usually denoted by $\Omega$): the set of values a random variable can take:
- A = "The stock market will go up this year": Binary
- A = "Number of Steelers wins in 2015": Discrete
- A = "% change in Google stock in 2015": Continuous
4 Axioms of probability (Kolmogorov's axioms)
A variety of useful facts can be derived from just three axioms:
1. $0 \le P(A) \le 1$
2. $P(\text{true}) = 1$, $P(\text{false}) = 0$
3. $P(A \lor B) = P(A) + P(B) - P(A \land B)$
There have been several other attempts to provide a foundation for probability theory. Kolmogorov's axioms are the most widely used.
5 Priors
Degree of belief in an event in the absence of any other information.
P(rain tomorrow) = 0.2
P(no rain tomorrow) = 0.8
[Figure: diagram with regions labeled "No rain" and "Rain"]
6 Conditional probability
$P(A = 1 \mid B = 1)$: the fraction of cases where A is true if B is true.
[Figure: two diagrams, labeled $P(A) = 0.2$ and $P(A \mid B) = 0.5$]
7 Conditional probability
In some cases, given knowledge of one or more random variables, we can improve upon our prior belief of another random variable. For example:
- p(slept in movie) = 0.5
- p(slept in movie | liked movie) = 1/4
- p(didn't sleep in movie | liked movie) = 3/4
[Table with columns "Slept" and "Liked"]
8 Joint distributions
The probability that a set of random variables will take a specific value is their joint distribution.
Notation: $P(A \land B)$ or $P(A, B)$
Example: P(liked movie, slept)
If we assume independence then $P(A, B) = P(A)P(B)$. However, in many cases such an assumption may be too strong (more later in the class).
9 Joint distribution (cont.)
Evaluation of classes:
Size  Time  Eval
30    R     2
70    R     1
12    S     2
8     S     3
56    R     1
24    S     2
10    S     3
23    R     3
9     R     2
45    R     1
P(class size > 20) = 0.6
P(summer) = 0.4
P(class size > 20, summer) = ?
10 Joint distribution (cont.)
Counting rows of the table above: P(class size > 20, summer) = 0.1
11 Joint distribution (cont.)
From the same table: P(class size > 20) = 0.6, P(eval = 1) = 0.3, P(class size > 20, eval = 1) = 0.3
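The probabilities on these slides can be reproduced by counting rows of the class-evaluation table. The sketch below assumes that the time code "S" marks summer classes (this matches P(summer) = 0.4); the helper name `prob` is hypothetical.

```python
# Rows of the class-evaluation table: (size, time, eval).
rows = [
    (30, "R", 2), (70, "R", 1), (12, "S", 2), (8, "S", 3), (56, "R", 1),
    (24, "S", 2), (10, "S", 3), (23, "R", 3), (9, "R", 2), (45, "R", 1),
]


def prob(predicate):
    """Fraction of table rows for which predicate(size, time, ev) is true."""
    return sum(1 for row in rows if predicate(*row)) / len(rows)


p_large = prob(lambda size, time, ev: size > 20)
p_summer = prob(lambda size, time, ev: time == "S")  # assumption: "S" = summer
p_large_summer = prob(lambda size, time, ev: size > 20 and time == "S")
p_eval1 = prob(lambda size, time, ev: ev == 1)
p_large_eval1 = prob(lambda size, time, ev: size > 20 and ev == 1)

print("marginals:", p_large, p_summer, p_eval1)
print("joints:", p_large_summer, p_large_eval1)
# Under independence each joint would equal the product of its marginals.
print("products:", p_large * p_summer, p_large * p_eval1)
```

In this table neither joint matches the product of its marginals (0.1 against 0.24, and 0.3 against 0.18), which illustrates why the independence assumption can be too strong.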
13 Chain rule
The joint distribution can be specified in terms of conditional probability:
$P(A, B) = P(A \mid B)\,P(B)$
Together with Bayes rule (which is actually derived from it) this is one of the most powerful rules in probabilistic reasoning.
14 Bayes rule
One of the most important rules for this class. Derived from the chain rule:
$P(A, B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$
Thus,
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
Thomas Bayes was an English clergyman who set out his theory of probability in 1764.
15 Bayes rule (cont.)
Often it would be useful to derive the rule a bit further:
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(B \mid A)\,P(A)}{\sum_A P(B \mid A)\,P(A)}$
This results from: $P(B) = \sum_A P(B, A)$
[Figure: diagrams of A and B showing the regions $P(B, A=1)$ and $P(B, A=0)$]
16 Recall: Your first consulting job
A billionaire from the suburbs of Seattle asks you a question:
- He says: I have a coin. If I flip it, what's the probability it will fall with the head up?
- You say: Please flip it a few times.
- You say: The probability is 3/5, because that is the frequency of heads in all flips.
- He says: But can I put money on this estimate?
- You say: Ummm. Maybe not. Not enough flips (less than the sample complexity).
17 What about prior knowledge?
Billionaire says: Wait, I know that the coin is "close" to 50-50. What can you do for me now?
You say: I can learn it the Bayesian way.
Rather than estimating a single $\theta$, we obtain a distribution over possible values of $\theta$.
[Figure: distribution over $\theta$, panels "Before data" and "After data"]
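18 Bayesian Learning
Use Bayes rule:
$P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}$
Or equivalently:
$P(\theta \mid D) \propto P(D \mid \theta)\,P(\theta)$, i.e. posterior $\propto$ likelihood $\times$ prior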
19 AIDS test (Bayes rule)
Data:
- Approximately 0.1% are infected
- Test detects all infections
- Test reports positive for 1% of healthy people
Probability of having AIDS if the test is positive: only 9%!
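The 9% figure follows from Bayes rule with the denominator expanded by summing over both health states. A minimal sketch; the function and parameter names are hypothetical, and the numbers are the ones given on the slide.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(infected | positive test) via Bayes rule.

    P(positive) is expanded as a sum over the two cases
    'infected' and 'healthy'.
    """
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive


p = posterior_given_positive(
    prior=0.001,               # approximately 0.1% are infected
    sensitivity=1.0,           # test detects all infections
    false_positive_rate=0.01,  # positive for 1% of healthy people
)
print(f"P(infected | positive) = {p:.3f}")
```

The result is small because healthy people vastly outnumber infected ones, so even a 1% false-positive rate produces far more false positives than true positives.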
21 Prior distribution
From where do we get the prior?
- Represents expert knowledge (philosophical approach)
- Simple posterior form (engineer's approach)
Uninformative priors:
- Uniform distribution
Conjugate priors:
- Closed-form representation of posterior
- $P(\theta)$ and $P(\theta \mid D)$ have the same algebraic form as a function of $\theta$
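22 Conjugate Prior
$P(\theta)$ and $P(\theta \mid D)$ have the same form as a function of $\theta$.
Eg. 1 Coin flip problem
Likelihood given Bernoulli model: $P(D \mid \theta) = \theta^{\alpha_H}(1-\theta)^{\alpha_T}$, where $\alpha_H$ and $\alpha_T$ are the observed numbers of heads and tails.
If the prior is a Beta distribution, $P(\theta) \propto \theta^{\beta_H - 1}(1-\theta)^{\beta_T - 1}$, i.e. $\theta \sim \mathrm{Beta}(\beta_H, \beta_T)$,
then the posterior is a Beta distribution: $\theta \mid D \sim \mathrm{Beta}(\beta_H + \alpha_H,\ \beta_T + \alpha_T)$.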
23 Beta distribution
More concentrated as the values of $\beta_H$, $\beta_T$ increase.
24 Beta conjugate prior
As $n = \alpha_H + \alpha_T$ increases, i.e. as we get more samples, the effect of the prior is "washed out".
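The "washing out" of the prior can be seen numerically. The sketch below uses the standard Beta-Bernoulli update (posterior parameters are prior parameters plus observed counts) and the standard mean a / (a + b) of a Beta(a, b) distribution. The prior Beta(50, 50) and the 60% head frequency are hypothetical values chosen for illustration.

```python
def beta_posterior(beta_h, beta_t, alpha_h, alpha_t):
    """Posterior Beta parameters after alpha_h heads and alpha_t tails,
    starting from a Beta(beta_h, beta_t) prior."""
    return beta_h + alpha_h, beta_t + alpha_t


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


prior = (50, 50)  # hypothetical prior, strongly centred on 0.5
means = []
for n in (5, 50, 5000):
    heads = 3 * n // 5  # hypothetical data: 60% of the flips are heads
    a, b = beta_posterior(*prior, heads, n - heads)
    means.append(beta_mean(a, b))
    print(f"n = {n:5d}  posterior mean = {means[-1]:.3f}")
```

With few flips the posterior mean stays close to the prior's 0.5; with many flips it moves toward the observed frequency 0.6.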
25 Conjugate Prior
$P(\theta)$ and $P(\theta \mid D)$ have the same form.
Eg. 2 Dice roll problem (6 outcomes instead of 2)
Likelihood is ~ Multinomial($\theta = \{\theta_1, \theta_2, \dots, \theta_k\}$)
If the prior is a Dirichlet distribution, then the posterior is a Dirichlet distribution.
For the Multinomial, the conjugate prior is the Dirichlet distribution.
26 Posterior Distribution
The approach seen so far is what is known as a Bayesian approach.
Prior information is encoded as a distribution over possible values of the parameter.
Using Bayes rule, you get an updated posterior distribution over parameters, which you provide with a flourish to the billionaire.
But the billionaire is not impressed:
- Distribution? I just asked for one number: is it 3/5, 1/2, what is it?
- How do we go from a distribution over parameters to a single estimate of the true parameters?
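27 Maximum A Posteriori Estimation
Choose the $\theta$ that maximizes the posterior probability: $\hat{\theta}_{MAP} = \arg\max_\theta P(\theta \mid D)$
MAP estimate of the probability of head, with $\alpha_H$, $\alpha_T$ the observed head and tail counts and a $\mathrm{Beta}(\beta_H, \beta_T)$ prior:
$\hat{\theta}_{MAP} = \frac{\alpha_H + \beta_H - 1}{\alpha_H + \beta_H + \alpha_T + \beta_T - 2}$
This is the mode of the Beta posterior distribution.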
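As a rough illustration, the MAP estimate for the coin can be computed as the mode of the Beta posterior, using the standard mode (a - 1) / (a + b - 2) of a Beta(a, b) distribution with a, b > 1. The function names and the prior values below are hypothetical; the data (3 heads, 2 tails) echo the billionaire example.

```python
def mle(heads, tails):
    """Maximum likelihood estimate of P(head): the observed frequency."""
    return heads / (heads + tails)


def map_estimate(heads, tails, beta_h, beta_t):
    """Mode of the Beta(beta_h + heads, beta_t + tails) posterior."""
    a, b = beta_h + heads, beta_t + tails
    if a <= 1 or b <= 1:
        raise ValueError("mode formula needs both posterior parameters > 1")
    return (a - 1) / (a + b - 2)


heads, tails = 3, 2
print("MLE:", mle(heads, tails))
print("MAP, uniform Beta(1, 1) prior:", map_estimate(heads, tails, 1, 1))
print("MAP, Beta(50, 50) prior:", map_estimate(heads, tails, 50, 50))
```

With a uniform Beta(1, 1) prior the MAP estimate coincides with the MLE; a prior concentrated near 50-50 pulls the estimate toward 1/2.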
28 Density estimation
29 Density Estimation
A Density Estimator learns a mapping from a set of attributes to a probability.
[Diagram: input data for a variable or a set of variables → Density Estimator → Probability]
30 Density estimation
Estimate the distribution (or conditional distribution) of a random variable.
Types of variables:
- Binary: coin flip, alarm
- Discrete: dice, car model year
- Continuous: height, weight, temp., ...
31 When do we need to estimate densities?
Density estimators are critical ingredients in several of the ML algorithms we will discuss.
In some cases these are combined with other inference types for more involved algorithms (e.g. EM), while in others they are part of a more general process (learning in BNs and HMMs).
32 Density estimation
- Binary and discrete variables: Easy: just count!
- Continuous variables: Harder (but just a bit): fit a model
33 Learning a density estimator for discrete variables
$\hat{P}(x_i = u) = \frac{\#\text{records in which } x_i = u}{\text{total number of records}}$
A trivial learning algorithm! But why is this true?
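The counting estimator can be sketched in a few lines. The function name `estimate_pmf` and the sample data are hypothetical.

```python
from collections import Counter


def estimate_pmf(records):
    """Estimate P(x = u) for each observed value u as
    (#records in which x = u) / (total number of records)."""
    counts = Counter(records)
    total = len(records)
    return {value: count / total for value, count in counts.items()}


flips = ["H", "T", "H", "H", "T"]  # hypothetical observations
pmf = estimate_pmf(flips)
print(pmf)
```

Values that never occur in the data get no entry at all, that is, an implicit probability of zero, which is one reason small samples are a problem for this estimator.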
34 Maximum Likelihood Principle
We can define the likelihood of the data given the model as follows:
$\hat{P}(\text{dataset} \mid M) = \hat{P}(x_1 \land x_2 \land \dots \land x_n \mid M) = \prod_{k=1}^{n} \hat{P}(x_k \mid M)$
M is our model (usually a collection of parameters). For example, M is:
- The probability of "head" for a coin flip
- The probabilities of observing 1, 2, 3, 4 and 5 for a dice
- etc.
35 Maximum Likelihood Principle
Our goal is to determine the values for the parameters in M. We can do this by maximizing the probability of generating the observed samples.
For example, let $\Theta$ be the probabilities for a coin flip. Then
$L(x_1, \dots, x_n \mid \Theta) = p(x_1 \mid \Theta) \cdots p(x_n \mid \Theta)$
The observations (different flips) are assumed to be independent.
For such a coin flip with $P(H) = q$, the best assignment for $q$ is
$\hat{q} = \arg\max_q L = \#H / \#\text{samples}$
Why?
36 Maximum Likelihood Principle: Binary variables
For a binary random variable A with $P(A = 1) = q$: $\hat{q} = \#1 / \#\text{samples}$. Why?
Data likelihood: $P(D \mid M) = q^{n_1}(1-q)^{n_2}$, where $n_1$ and $n_2$ are the numbers of observations with A = 1 and A = 0.
We would like to find: $\arg\max_q\ q^{n_1}(1-q)^{n_2}$
(Omitting terms that do not depend on q.)
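37 Maximum Likelihood Principle
Data likelihood: $P(D \mid M) = q^{n_1}(1-q)^{n_2}$
We would like to find: $\arg\max_q\ q^{n_1}(1-q)^{n_2}$
Setting the derivative with respect to $q$ to zero (for $0 < q < 1$):
$\frac{\partial}{\partial q}\, q^{n_1}(1-q)^{n_2} = n_1 q^{n_1-1}(1-q)^{n_2} - n_2\, q^{n_1}(1-q)^{n_2-1} = 0$
$\Rightarrow\ q^{n_1-1}(1-q)^{n_2-1}\bigl(n_1(1-q) - n_2\, q\bigr) = 0$
$\Rightarrow\ n_1(1-q) - n_2\, q = 0$
$\Rightarrow\ n_1 = (n_1 + n_2)\, q$
$\Rightarrow\ q = \frac{n_1}{n_1 + n_2}$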
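The closed-form result q = n1 / (n1 + n2) can be sanity-checked numerically by evaluating the likelihood on a grid of candidate values. This is only a brute-force check under hypothetical counts, not a replacement for the derivation.

```python
def likelihood(q, n1, n2):
    """Data likelihood q^n1 * (1 - q)^n2 for n1 ones and n2 zeros."""
    return q ** n1 * (1.0 - q) ** n2


n1, n2 = 3, 2  # hypothetical counts: three 1s and two 0s
grid = [i / 1000 for i in range(1, 1000)]  # candidates strictly inside (0, 1)
q_best = max(grid, key=lambda q: likelihood(q, n1, n2))

print("grid argmax:", q_best)
print("closed form:", n1 / (n1 + n2))
```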
38 Log Probabilities
When working with products, probabilities of entire datasets often get too small. A possible solution is to use the log of probabilities, often termed the "log likelihood":
$\log \hat{P}(\text{dataset} \mid M) = \log \prod_{k=1}^{n} \hat{P}(x_k \mid M) = \sum_{k=1}^{n} \log \hat{P}(x_k \mid M)$
Maximizing this likelihood function is the same as maximizing $P(\text{dataset} \mid M)$, because the logarithm is monotonically increasing.
[Figure: log values between 0 and 1]
In some cases moving to log space would also make computation easier (for example, removing the exponents).
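The numerical problem with raw products shows up quickly in floating point. A small sketch with hypothetical data (5000 flips) and a hypothetical parameter value q = 0.6:

```python
import math

q = 0.6                          # hypothetical P(x = 1)
flips = [1] * 3000 + [0] * 2000  # hypothetical dataset

# Direct product of per-sample probabilities: underflows to exactly 0.0.
likelihood = 1.0
for x in flips:
    likelihood *= q if x == 1 else 1.0 - q

# Sum of per-sample log probabilities: stays well within float range.
log_likelihood = sum(math.log(q if x == 1 else 1.0 - q) for x in flips)

print("product:", likelihood)
print("log likelihood:", log_likelihood)
```

Once the product has underflowed, every candidate parameter looks equally bad, whereas the log likelihoods of different candidates can still be compared.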
39 How much do grad students sleep?
Let's try to estimate the distribution of the time students spend sleeping (outside class).
40 Possible statistics
X = sleep time
Mean of X: E{X} = 7.03
Variance of X: Var{X} = E{(X - E{X})^2}
[Figure: histogram "Sleep", frequency against hours]
41 Covariance: Sleep vs. GPA
Covariance of X1, X2: Covariance{X1, X2} = E{(X1 - E{X1})(X2 - E{X2})}
[Figure: scatter plot "Sleep / GPA", GPA against sleep hours]
42 Statistical Models
Statistical models attempt to characterize properties of the population of interest.
For example, we might believe that repeated measurements follow a normal (Gaussian) distribution with some mean $\mu$ and variance $\sigma^2$, $x \sim N(\mu, \sigma^2)$, where
$p(x \mid \Theta) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
and $\Theta = (\mu, \sigma^2)$ defines the parameters (mean and variance) of the model.
43 The Parameters of Our Model
A statistical model is a collection of distributions; the parameters specify individual distributions: $x \sim N(\mu, \sigma^2)$
We need to adjust the parameters so that the resulting distribution fits the data well.
45 Computing the parameters of our model
Let's assume a Gaussian distribution for our sleep data. How do we compute the parameters of the model?
[Figure: histogram "Sleep", frequency against hours]
46 Maximum Likelihood Principle
We can fit statistical models by maximizing the probability of generating the observed samples:
$L(x_1, \dots, x_n \mid \Theta) = p(x_1 \mid \Theta) \cdots p(x_n \mid \Theta)$
(the samples are assumed to be independent)
In the Gaussian case we simply set the mean and the variance to the sample mean and the sample variance:
$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $\quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{\mu})^2$
Why?
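A minimal sketch of the Gaussian fit. The sleep data below are hypothetical (the lecture's data set is not reproduced in the transcription), and the function name is an assumption. Note that the maximum likelihood variance divides by n, not by n - 1.

```python
def gaussian_mle(samples):
    """Maximum likelihood estimates (mean, variance) for a Gaussian model."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n  # divides by n, not n - 1
    return mu, var


sleep_hours = [6.5, 7.0, 8.0, 7.5, 6.0, 7.0]  # hypothetical measurements
mu_hat, var_hat = gaussian_mle(sleep_hours)
print(f"mean = {mu_hat:.2f} hours, variance = {var_hat:.3f}")
```

These estimates are what one obtains by setting the derivatives of the Gaussian log likelihood with respect to the mean and the variance to zero, which is the same recipe used for the coin flip.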
47 Density estimation
- Binary and discrete variables: Easy: just count!
- Continuous variables: Harder (but just a bit): fit a model
But what if we only have very few samples?
48 MLE vs. MAP
- Maximum likelihood estimation (MLE): choose the value that maximizes the probability of the observed data
- Maximum a posteriori (MAP) estimation: choose the value that is most probable given the observed data and prior belief
49 Important points
- Random variables
- Chain rule
- Bayes rule
- Joint distribution, independence, conditional independence
- MLE
50 Assume we performed n coin flips and used the outcome to learn the probability of heads, defined as q. In the questions below assume that 0 < q < 1 unless stated otherwise.
1. We have performed an additional coin flip and learned a new probability for heads, q1, based on the n+1 observations. The following holds:
a. q1 = q
b. q1 ≠ q
c. it depends on q and the value of the new observation
2. We have performed two additional coin flips and learned a new probability for heads, q1, based on the n+2 observations. The following holds:
a. q1 = q
b. q1 ≠ q
c. it depends on q and the values of the new observations
3. Now assume that 0.6 < q < 1. Similar to (2) we have performed two additional coin flips and learned a new probability for heads, q1, based on the n+2 observations. The following holds:
a. q1 = q
b. q1 ≠ q
c. it depends on q and the values of the new observations
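One way to explore these questions is to enumerate the possible updated estimates exactly. If the first n flips contain h heads (so q = h/n) and the m extra flips contain s heads, the new estimate is q1 = (h + s)/(n + m), which equals q exactly when s = m·q. The sketch below, with hypothetical function names and example counts, checks a few cases with exact fractions; it illustrates the reasoning rather than proving it.

```python
from fractions import Fraction


def updated_estimates(heads, n, extra_flips):
    """Possible new MLEs after extra_flips more flips,
    keyed by the number of heads among the new flips."""
    return {
        s: Fraction(heads + s, n + extra_flips)
        for s in range(extra_flips + 1)
    }


def can_stay_equal(heads, n, extra_flips):
    """True if some outcome of the extra flips leaves the estimate unchanged."""
    q = Fraction(heads, n)
    return any(q1 == q for q1 in updated_estimates(heads, n, extra_flips).values())


print(can_stay_equal(5, 10, 1))  # q = 1/2, one extra flip
print(can_stay_equal(5, 10, 2))  # q = 1/2, two extra flips
print(can_stay_equal(7, 10, 2))  # q = 0.7, two extra flips
```

Because s must be a whole number of heads, s = m·q is impossible for one extra flip when 0 < q < 1, possible for two extra flips only when q = 1/2, and impossible again once q is restricted to 0.6 < q < 1.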
51 Probability Density Function
Discrete distributions:
Continuous:
Cumulative Distribution Function (CDF): F(a)
[Figure: density f(x) plotted against x, with the point a marked]
52 Cumulative Distribution Functions
Total probability:
Probability Density Function (PDF):
Properties of F(x):
53 Expectations
Mean/Expected Value:
Variance:
In general:
54 Multivariate
Joint for (x, y):
Marginal:
Conditionals:
Chain rule:
55 Bayes Rule
Standard form:
Replacing the bottom:
56 Binomial
Distribution:
Mean/Var:
57 Uniform
Anything is equally likely in the region [a, b].
Distribution:
Mean/Var:
58 Gaussian (Normal)
If I look at the height of women in country xx, it will look approximately Gaussian.
Small random noise errors look Gaussian/Normal.
Distribution:
Mean/Var:
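For reference, the textbook forms of these three distributions are collected below. The notation (n trials with success probability p for the binomial, endpoints a and b for the uniform, mean μ and variance σ² for the Gaussian) is assumed here rather than taken from the slides.

```latex
\begin{align*}
\text{Binomial:}\quad
  & P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k},
  & \mathbb{E}[X] &= np,
  & \mathrm{Var}[X] &= np(1-p) \\
\text{Uniform on } [a,b]\text{:}\quad
  & f(x) = \frac{1}{b-a} \ \text{ for } a \le x \le b,
  & \mathbb{E}[X] &= \frac{a+b}{2},
  & \mathrm{Var}[X] &= \frac{(b-a)^{2}}{12} \\
\text{Gaussian:}\quad
  & f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}},
  & \mathbb{E}[X] &= \mu,
  & \mathrm{Var}[X] &= \sigma^{2}
\end{align*}
```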
59 Why Do People Use Gaussians
Central Limit Theorem (loosely):
- The sum of a large number of IID random variables (with finite variance) is approximately Gaussian
60 Multivariate Gaussians
Distribution for vector x
PDF:
61 Multivariate Gaussians
$\mathrm{cov}(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} (x_{1,i} - \mu_1)(x_{2,i} - \mu_2)$
62 Covariance examples
[Figure: three scatter plots. Anti-correlated: covariance -9.2. Correlated: covariance value not recoverable. Independent (almost): covariance 0.6]
63 Sum of Gaussians
The sum of two Gaussians is a Gaussian:
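Reading "sum of two Gaussians" as the sum of two independent Gaussian random variables (an assumption, since the slide's formula is not preserved), the standard result is that means and variances add:

```latex
\[
X \sim \mathcal{N}(\mu_1, \sigma_1^{2}),\quad
Y \sim \mathcal{N}(\mu_2, \sigma_2^{2}),\quad
X \perp Y
\;\Longrightarrow\;
X + Y \sim \mathcal{N}\!\left(\mu_1 + \mu_2,\ \sigma_1^{2} + \sigma_2^{2}\right)
\]
```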
CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS 8.1 Radom Samplig The basic idea of the statistical iferece is that we are allowed to draw ifereces or coclusios about a populatio based
More informationSolutions: Homework 3
Solutios: Homework 3 Suppose that the radom variables Y,...,Y satisfy Y i = x i + " i : i =,..., IID where x,...,x R are fixed values ad ",...," Normal(0, )with R + kow. Fid ˆ = MLE( ). IND Solutio: Observe
More informationMathematical Statistics - MS
Paper Specific Istructios. The examiatio is of hours duratio. There are a total of 60 questios carryig 00 marks. The etire paper is divided ito three sectios, A, B ad C. All sectios are compulsory. Questios
More informationGoodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)
Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................
More informationLecture 7: Properties of Random Samples
Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ
More informationUnderstanding Samples
1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationProperties of Joints Chris Piech CS109, Stanford University
Properties of Joits Chris Piech CS09, Staford Uiversity Titaic Probability 7% of passegers were from the Ottoma Empire Biometric Keystroes Altruism? Scores for a stadardized test that studets i Polad
More informationEcon 325: Introduction to Empirical Economics
Eco 35: Itroductio to Empirical Ecoomics Lecture 3 Discrete Radom Variables ad Probability Distributios Copyright 010 Pearso Educatio, Ic. Publishig as Pretice Hall Ch. 4-1 4.1 Itroductio to Probability
More information1.010 Uncertainty in Engineering Fall 2008
MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval
More informationPattern Classification
Patter Classificatio All materials i these slides were tae from Patter Classificatio (d ed) by R. O. Duda, P. E. Hart ad D. G. Stor, Joh Wiley & Sos, 000 with the permissio of the authors ad the publisher
More informationLecture 19: Convergence
Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationChapter 8: Estimating with Confidence
Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig
More informationSTATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:
Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Commets:. So far we have estimates of the parameters! 0 ad!, but have o idea how good these estimates are. Assumptio: E(Y x)! 0 +! x (liear coditioal
More informationLecture 1 Probability and Statistics
Wikipedia: Lecture 1 Probability ad Statistics Bejami Disraeli, British statesma ad literary figure (1804 1881): There are three kids of lies: lies, damed lies, ad statistics. popularized i US by Mark
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationf(x i ; ) L(x; p) = i=1 To estimate the value of that maximizes L or equivalently ln L we will set =0, for i =1, 2,...,m p x i (1 p) 1 x i i=1
Parameter Estimatio Samples from a probability distributio F () are: [,,..., ] T.Theprobabilitydistributio has a parameter vector [,,..., m ] T. Estimator: Statistic used to estimate ukow. Estimate: Observed
More informationStat 421-SP2012 Interval Estimation Section
Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible
More information7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals
7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses
More informationProbability 2 - Notes 10. Lemma. If X is a random variable and g(x) 0 for all x in the support of f X, then P(g(X) 1) E[g(X)].
Probability 2 - Notes 0 Some Useful Iequalities. Lemma. If X is a radom variable ad g(x 0 for all x i the support of f X, the P(g(X E[g(X]. Proof. (cotiuous case P(g(X Corollaries x:g(x f X (xdx x:g(x
More informationStatistics 511 Additional Materials
Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability
More information