Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University
|
|
- Archibald Boyd
- 5 years ago
- Views:
Transcription
1 Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University
2 Why uncertainty? Why should data mining care about uncertainty? We are living in the world where everything is possible! The world Modeling is full uncertainty of uncertainty is a The data necessary that describe component the of world express almost the uncertainty all data analysis The way we seek model structures contains uncertainty
3 Formalizing the uncertainty Probability the most widely used tools for modeling the uncertainty with theoretical backbone Widespread application and acceptance Other tools Fuzzy logic Rough set
4 Two views of probability Frequentist (objective, based only on data) The probability of an event is defined as the limiting proportion of times that the event would occur in repetitions of essentially identical situations Bayesian (subjective, with a belief) Bayesian statistics explicitly characterize all forms of uncertainty in data analysis: Uncertainty about any parameters to be estimated from data Uncertainty as to which among the set of model structures are the best of closest to the ground-truth Uncertainty in any forecast to be made
5 Models and data probability MODEL DATA Statistical inference Probability specifies how the concerned properties of the observed data can be generated from the models Statistical inference makes statement about the concerned properties of the population.
6 Estimation Usually, there are numerous models that can generate the observed data points, how to determine which model to use? Statistical inference: 1. Assume a model family (or form). 2. Estimate the parameters to determine the specific model That s it!
7 Properties of estimators Bias The difference between the expected value of the estimator and the true value of parameter Unbiased estimators show no systematic departure from the true value Variance It measures the random, data-driven component of error in our estimation procedure Mean squared error The mean of the squared difference between the estimated value and the true value of a parameter Consistency The difference between the estimated value and true value approaches 0 as the sample size increases
8 Maximum likelihood estimation (MLE) Likelihood Likelihood describes the probability of the observed data generated from the model conditioned on the given parameter. Probability of the observed rather than the unseen Examples are iid The MLE The value for the parameter for which the data has the highest probability of having arisen. MLE is to find the most likely model that generates the data
9 Maximum likelihood estimation (MLE)
10 Maximum likelihood estimation (MLE) Remarks on MLE If g(.) is one-to-one function, then is an MLE of when is an MLE of MLE is a point estimate, which only cares about the best To capture some uncertainty, one can compute the confidence interval which specifying the region containing the true value Use normal approximation based on central limits theorem Use Boostrapping
11 Bayesian estimation Bayesian statistics treats the data as known and the parameters as random variables A prior distribution is associated with the parameter to express our prior belief of where the true parameter may be Analysis of D leads to modification of this distribution to take into account the empirical data, yielding posterior probability. Distribution instead of one value Maximum a posteriori (MAP) Selecting the mode of this posterior distribution as the MAP estimation
12 Bayesian estimation Belief vs. evidence What is probable value of θ that generating these evidence Fixed. If model is assumed Belief on where the value should be Could vary a lot! Different prior may lead to different distribution, thus a good balance should be made when choosing the prior
13 Bayesian estimation How to select the prior distribution? Ideal: coding the domain knowledge into the prior Based on the own experience or belief Cons: too subjective. Varies from person to person. Reference prior Less subjective (e.g., Jeffrey s prior) Conjugate prior where The resulted posterior distribution in the same family as the prior distribution. (e.g., Normal distribution)
14 Hypothesis testing Hypothesis testing is used to test whether the observed data support some idea about the value of a parameter. Key assumption: A random sample has been drawn from some distribution and the aim of the testing is to make probability statement about a parameter of that distribution General steps: Null hypothesis (H 0 ) vs. alternative hypothesis (H 1 ) Determine the distribution of some chosen statistics related to the nature of the problem Determine a reject region or critical region If the observed value falling into the reject region, accept H 1 otherwise H 0
15 Widely-used hypothesis test Goodness-of-fit: use the test to compare an observed distribution with a hypothesized distribution Pairwise t-test Chi-square test Distribution-free test: no distribution is assumed from which the sample is drawn Sign test rank sum test Wilcoxon test,
16 Sampling Sometimes, we need to conduct data mining over a sample of data points instead of the entire database Why sampling? Major issue is efficiency! How to create a qualified sample? Try to maintain the distribution information of the original data set Try to avoid using systematic sampling (i.e., avoid injected certain characteristics of the sampling method into the sampled data)
17 Widely-used sampling methods Simple random sampling Sampling each data point with equal probability (without replacement) Bootstrap sampling Sampling each data point with equal probability (with replacement) Stratified sampling Split data into nonoverlapping strata, a sample is drawn separately from within each stratum Cluster sampling If data are naturally grouped into clusters, sample clusters instead of data points and then use simple random sampling or all data points in the sampled clusters.
18 Let s move to Chapter 5
STATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationParameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn
Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation
More informationBayesian Regression (1/31/13)
STA613/CBB540: Statistical methods in computational biology Bayesian Regression (1/31/13) Lecturer: Barbara Engelhardt Scribe: Amanda Lea 1 Bayesian Paradigm Bayesian methods ask: given that I have observed
More informationRidge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation
Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking
More informationMachine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013
Bayesian Methods Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2013 1 What about prior n Billionaire says: Wait, I know that the thumbtack is close to 50-50. What can you
More informationParametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012
Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood
More informationEstimation of reliability parameters from Experimental data (Parte 2) Prof. Enrico Zio
Estimation of reliability parameters from Experimental data (Parte 2) This lecture Life test (t 1,t 2,...,t n ) Estimate θ of f T t θ For example: λ of f T (t)= λe - λt Classical approach (frequentist
More informationA3. Statistical Inference Hypothesis Testing for General Population Parameters
Appendix / A3. Statistical Inference / General Parameters- A3. Statistical Inference Hypothesis Testing for General Population Parameters POPULATION H 0 : θ = θ 0 θ is a generic parameter of interest (e.g.,
More informationModule 22: Bayesian Methods Lecture 9 A: Default prior selection
Module 22: Bayesian Methods Lecture 9 A: Default prior selection Peter Hoff Departments of Statistics and Biostatistics University of Washington Outline Jeffreys prior Unit information priors Empirical
More informationTwo examples of the use of fuzzy set theory in statistics. Glen Meeden University of Minnesota.
Two examples of the use of fuzzy set theory in statistics Glen Meeden University of Minnesota http://www.stat.umn.edu/~glen/talks 1 Fuzzy set theory Fuzzy set theory was introduced by Zadeh in (1965) as
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationStatistics 135: Fall 2004 Final Exam
Name: SID#: Statistics 135: Fall 2004 Final Exam There are 10 problems and the number of points for each is shown in parentheses. There is a normal table at the end. Show your work. 1. The designer of
More informationFrequentist Statistics and Hypothesis Testing Spring
Frequentist Statistics and Hypothesis Testing 18.05 Spring 2018 http://xkcd.com/539/ Agenda Introduction to the frequentist way of life. What is a statistic? NHST ingredients; rejection regions Simple
More informationCONTENTS OF DAY 2. II. Why Random Sampling is Important 10 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 4 Problems with small populations 9 II. Why Random Sampling is Important 10 A myth,
More informationMachine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables?
Linear Regression Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2014 1 What about continuous variables? n Billionaire says: If I am measuring a continuous variable, what
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationCOS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION
COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION SEAN GERRISH AND CHONG WANG 1. WAYS OF ORGANIZING MODELS In probabilistic modeling, there are several ways of organizing models:
More informationBios 6649: Clinical Trials - Statistical Design and Monitoring
Bios 6649: Clinical Trials - Statistical Design and Monitoring Spring Semester 2015 John M. Kittelson Department of Biostatistics & nformatics Colorado School of Public Health University of Colorado Denver
More informationDensity Estimation: ML, MAP, Bayesian estimation
Density Estimation: ML, MAP, Bayesian estimation CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Maximum-Likelihood Estimation Maximum
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationPrinciples of Statistical Inference
Principles of Statistical Inference Nancy Reid and David Cox August 30, 2013 Introduction Statistics needs a healthy interplay between theory and applications theory meaning Foundations, rather than theoretical
More informationPrinciples of Statistical Inference
Principles of Statistical Inference Nancy Reid and David Cox August 30, 2013 Introduction Statistics needs a healthy interplay between theory and applications theory meaning Foundations, rather than theoretical
More informationBayesian Regression Linear and Logistic Regression
When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we
More informationBayesian Inference and MCMC
Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the
More informationPerformance Evaluation and Comparison
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation
More information3 Joint Distributions 71
2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random
More informationLecture 5: Clustering, Linear Regression
Lecture 5: Clustering, Linear Regression Reading: Chapter 10, Sections 3.1-3.2 STATS 202: Data mining and analysis October 4, 2017 1 / 22 .0.0 5 5 1.0 7 5 X2 X2 7 1.5 1.0 0.5 3 1 2 Hierarchical clustering
More informationBayesian inference. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. April 10, 2017
Bayesian inference Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 10, 2017 1 / 22 Outline for today A genetic example Bayes theorem Examples Priors Posterior summaries
More informationIntroduction to Bayesian Learning. Machine Learning Fall 2018
Introduction to Bayesian Learning Machine Learning Fall 2018 1 What we have seen so far What does it mean to learn? Mistake-driven learning Learning by counting (and bounding) number of mistakes PAC learnability
More informationData Analysis and Uncertainty Part 2: Estimation
Data Analysis and Uncertainty Part 2: Estimation Instructor: Sargur N. University at Buffalo The State University of New York srihari@cedar.buffalo.edu 1 Topics in Estimation 1. Estimation 2. Desirable
More informationHypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006
Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)
More informationMachine Learning CSE546 Sham Kakade University of Washington. Oct 4, What about continuous variables?
Linear Regression Machine Learning CSE546 Sham Kakade University of Washington Oct 4, 2016 1 What about continuous variables? Billionaire says: If I am measuring a continuous variable, what can you do
More informationy Xw 2 2 y Xw λ w 2 2
CS 189 Introduction to Machine Learning Spring 2018 Note 4 1 MLE and MAP for Regression (Part I) So far, we ve explored two approaches of the regression framework, Ordinary Least Squares and Ridge Regression:
More informationCOS513 LECTURE 8 STATISTICAL CONCEPTS
COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions
More informationSome slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2
Logistics CSE 446: Point Estimation Winter 2012 PS2 out shortly Dan Weld Some slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2 Last Time Random variables, distributions Marginal, joint & conditional
More informationLecture 5: Sampling Methods
Lecture 5: Sampling Methods What is sampling? Is the process of selecting part of a larger group of participants with the intent of generalizing the results from the smaller group, called the sample, to
More informationSTAT 135 Lab 5 Bootstrapping and Hypothesis Testing
STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,
More informationReview. December 4 th, Review
December 4 th, 2017 Att. Final exam: Course evaluation Friday, 12/14/2018, 10:30am 12:30pm Gore Hall 115 Overview Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 6: Statistics and Sampling Distributions Chapter
More informationDS-GA 1002 Lecture notes 11 Fall Bayesian statistics
DS-GA 100 Lecture notes 11 Fall 016 Bayesian statistics In the frequentist paradigm we model the data as realizations from a distribution that depends on deterministic parameters. In contrast, in Bayesian
More informationLecture 5: Clustering, Linear Regression
Lecture 5: Clustering, Linear Regression Reading: Chapter 10, Sections 3.1-3.2 STATS 202: Data mining and analysis October 4, 2017 1 / 22 Hierarchical clustering Most algorithms for hierarchical clustering
More informationCOMP90051 Statistical Machine Learning
COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Trevor Cohn 2. Statistical Schools Adapted from slides by Ben Rubinstein Statistical Schools of Thought Remainder of lecture is to provide
More informationEXAMINATION: QUANTITATIVE EMPIRICAL METHODS. Yale University. Department of Political Science
EXAMINATION: QUANTITATIVE EMPIRICAL METHODS Yale University Department of Political Science January 2014 You have seven hours (and fifteen minutes) to complete the exam. You can use the points assigned
More informationProbability and Statistics
CHAPTER 5: PARAMETER ESTIMATION 5-0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 5: PARAMETER
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty
More informationTUTORIAL 8 SOLUTIONS #
TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level
More informationDS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling
DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling Due: Tuesday, May 10, 2016, at 6pm (Submit via NYU Classes) Instructions: Your answers to the questions below, including
More informationLectures 5 & 6: Hypothesis Testing
Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across
More informationPoint Estimation. Vibhav Gogate The University of Texas at Dallas
Point Estimation Vibhav Gogate The University of Texas at Dallas Some slides courtesy of Carlos Guestrin, Chris Bishop, Dan Weld and Luke Zettlemoyer. Basics: Expectation and Variance Binary Variables
More informationConfidence Distribution
Confidence Distribution Xie and Singh (2013): Confidence distribution, the frequentist distribution estimator of a parameter: A Review Céline Cunen, 15/09/2014 Outline of Article Introduction The concept
More informationFoundations of Statistical Inference
Foundations of Statistical Inference Julien Berestycki Department of Statistics University of Oxford MT 2016 Julien Berestycki (University of Oxford) SB2a MT 2016 1 / 20 Lecture 6 : Bayesian Inference
More informationNew Bayesian methods for model comparison
Back to the future New Bayesian methods for model comparison Murray Aitkin murray.aitkin@unimelb.edu.au Department of Mathematics and Statistics The University of Melbourne Australia Bayesian Model Comparison
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationBrandon C. Kelly (Harvard Smithsonian Center for Astrophysics)
Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming
More informationLecture 3: More on regularization. Bayesian vs maximum likelihood learning
Lecture 3: More on regularization. Bayesian vs maximum likelihood learning L2 and L1 regularization for linear estimators A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting
More informationUnobservable Parameter. Observed Random Sample. Calculate Posterior. Choosing Prior. Conjugate prior. population proportion, p prior:
Pi Priors Unobservable Parameter population proportion, p prior: π ( p) Conjugate prior π ( p) ~ Beta( a, b) same PDF family exponential family only Posterior π ( p y) ~ Beta( a + y, b + n y) Observed
More informationComputer Intensive Methods in Mathematical Statistics
Computer Intensive Methods in Mathematical Statistics Department of mathematics jimmyol@kth.se Lecture 13 Introduction to bootstrap 5 May 2014 Computer Intensive Methods (1) Plan of today s lecture 1 Last
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationUniversität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Hypothesis testing. Anna Wegloop Niels Landwehr/Tobias Scheffer
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Hypothesis testing Anna Wegloop iels Landwehr/Tobias Scheffer Why do a statistical test? input computer model output Outlook ull-hypothesis
More informationLecture 5: Clustering, Linear Regression
Lecture 5: Clustering, Linear Regression Reading: Chapter 10, Sections 3.1-2 STATS 202: Data mining and analysis Sergio Bacallado September 19, 2018 1 / 23 Announcements Starting next week, Julia Fukuyama
More informationTheory of Maximum Likelihood Estimation. Konstantin Kashin
Gov 2001 Section 5: Theory of Maximum Likelihood Estimation Konstantin Kashin February 28, 2013 Outline Introduction Likelihood Examples of MLE Variance of MLE Asymptotic Properties What is Statistical
More informationLecture 4: Types of errors. Bayesian regression models. Logistic regression
Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture
More informationBayesian Statistics. Debdeep Pati Florida State University. February 11, 2016
Bayesian Statistics Debdeep Pati Florida State University February 11, 2016 Historical Background Historical Background Historical Background Brief History of Bayesian Statistics 1764-1838: called probability
More informationChapter 4 HOMEWORK ASSIGNMENTS. 4.1 Homework #1
Chapter 4 HOMEWORK ASSIGNMENTS These homeworks may be modified as the semester progresses. It is your responsibility to keep up to date with the correctly assigned homeworks. There may be some errors in
More informationSTAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01
STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 Nasser Sadeghkhani a.sadeghkhani@queensu.ca There are two main schools to statistical inference: 1-frequentist
More informationHypothesis Testing with the Bootstrap. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods
Hypothesis Testing with the Bootstrap Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Bootstrap Hypothesis Testing A bootstrap hypothesis test starts with a test statistic
More informationPreliminary Statistics Lecture 5: Hypothesis Testing (Outline)
1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 5: Hypothesis Testing (Outline) Gujarati D. Basic Econometrics, Appendix A.8 Barrow M. Statistics
More informationStatistical Inference
Statistical Inference Classical and Bayesian Methods Revision Class for Midterm Exam AMS-UCSC Th Feb 9, 2012 Winter 2012. Session 1 (Revision Class) AMS-132/206 Th Feb 9, 2012 1 / 23 Topics Topics We will
More informationGaussians Linear Regression Bias-Variance Tradeoff
Readings listed in class website Gaussians Linear Regression Bias-Variance Tradeoff Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University January 22 nd, 2007 Maximum Likelihood Estimation
More informationVisual interpretation with normal approximation
Visual interpretation with normal approximation H 0 is true: H 1 is true: p =0.06 25 33 Reject H 0 α =0.05 (Type I error rate) Fail to reject H 0 β =0.6468 (Type II error rate) 30 Accept H 1 Visual interpretation
More informationBayesian Inference. Chapter 1. Introduction and basic concepts
Bayesian Inference Chapter 1. Introduction and basic concepts M. Concepción Ausín Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationRelated Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM
Lecture 9 SEM, Statistical Modeling, AI, and Data Mining I. Terminology of SEM Related Concepts: Causal Modeling Path Analysis Structural Equation Modeling Latent variables (Factors measurable, but thru
More informationIntroduction into Bayesian statistics
Introduction into Bayesian statistics Maxim Kochurov EF MSU November 15, 2016 Maxim Kochurov Introduction into Bayesian statistics EF MSU 1 / 7 Content 1 Framework Notations 2 Difference Bayesians vs Frequentists
More informationBTRY 4830/6830: Quantitative Genomics and Genetics
BTRY 4830/6830: Quantitative Genomics and Genetics Lecture 23: Alternative tests in GWAS / (Brief) Introduction to Bayesian Inference Jason Mezey jgm45@cornell.edu Nov. 13, 2014 (Th) 8:40-9:55 Announcements
More informationWhy Try Bayesian Methods? (Lecture 5)
Why Try Bayesian Methods? (Lecture 5) Tom Loredo Dept. of Astronomy, Cornell University http://www.astro.cornell.edu/staff/loredo/bayes/ p.1/28 Today s Lecture Problems you avoid Ambiguity in what is random
More informationAccouncements. You should turn in a PDF and a python file(s) Figure for problem 9 should be in the PDF
Accouncements You should turn in a PDF and a python file(s) Figure for problem 9 should be in the PDF Please do not zip these files and submit (unless there are >5 files) 1 Bayesian Methods Machine Learning
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More information2. A Basic Statistical Toolbox
. A Basic Statistical Toolbo Statistics is a mathematical science pertaining to the collection, analysis, interpretation, and presentation of data. Wikipedia definition Mathematical statistics: concerned
More informationSubject CS1 Actuarial Statistics 1 Core Principles
Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationStatistics notes. A clear statistical framework formulates the logic of what we are doing and why. It allows us to make precise statements.
Statistics notes Introductory comments These notes provide a summary or cheat sheet covering some basic statistical recipes and methods. These will be discussed in more detail in the lectures! What is
More informationBehavioral Data Mining. Lecture 2
Behavioral Data Mining Lecture 2 Autonomy Corp Bayes Theorem Bayes Theorem P(A B) = probability of A given that B is true. P(A B) = P(B A)P(A) P(B) In practice we are most interested in dealing with events
More informationInterval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean
Interval estimation October 3, 2018 STAT 151 Class 7 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0
More informationORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing
ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing Robert Vanderbei Fall 2014 Slides last edited on November 24, 2014 http://www.princeton.edu/ rvdb Coin Tossing Example Consider two coins.
More informationPMR Learning as Inference
Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning
More informationOne-parameter models
One-parameter models Patrick Breheny January 22 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/17 Introduction Binomial data is not the only example in which Bayesian solutions can be worked
More informationMachine Learning
Machine Learning 10-701 Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 13, 2011 Today: The Big Picture Overfitting Review: probability Readings: Decision trees, overfiting
More informationStatistical Methods in Particle Physics Lecture 1: Bayesian methods
Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan
More informationStatistical Inference: Maximum Likelihood and Bayesian Approaches
Statistical Inference: Maximum Likelihood and Bayesian Approaches Surya Tokdar From model to inference So a statistical analysis begins by setting up a model {f (x θ) : θ Θ} for data X. Next we observe
More informationProblem Selected Scores
Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected
More informationExam 2 Practice Questions, 18.05, Spring 2014
Exam 2 Practice Questions, 18.05, Spring 2014 Note: This is a set of practice problems for exam 2. The actual exam will be much shorter. Within each section we ve arranged the problems roughly in order
More informationSimulation. Alberto Ceselli MSc in Computer Science Univ. of Milan. Part 4 - Statistical Analysis of Simulated Data
Simulation Alberto Ceselli MSc in Computer Science Univ. of Milan Part 4 - Statistical Analysis of Simulated Data A. Ceselli Simulation P.4 Analysis of Sim. data 1 / 15 Statistical analysis of simulated
More informationBayesian vs frequentist techniques for the analysis of binary outcome data
1 Bayesian vs frequentist techniques for the analysis of binary outcome data By M. Stapleton Abstract We compare Bayesian and frequentist techniques for analysing binary outcome data. Such data are commonly
More informationDecision theory. 1 We may also consider randomized decision rules, where δ maps observed data D to a probability distribution over
Point estimation Suppose we are interested in the value of a parameter θ, for example the unknown bias of a coin. We have already seen how one may use the Bayesian method to reason about θ; namely, we
More informationBayesian Methods. David S. Rosenberg. New York University. March 20, 2018
Bayesian Methods David S. Rosenberg New York University March 20, 2018 David S. Rosenberg (New York University) DS-GA 1003 / CSCI-GA 2567 March 20, 2018 1 / 38 Contents 1 Classical Statistics 2 Bayesian
More information(1) Introduction to Bayesian statistics
Spring, 2018 A motivating example Student 1 will write down a number and then flip a coin If the flip is heads, they will honestly tell student 2 if the number is even or odd If the flip is tails, they
More informationBasic of Probability Theory for Ph.D. students in Education, Social Sciences and Business (Shing On LEUNG and Hui Ping WU) (May 2015)
Basic of Probability Theory for Ph.D. students in Education, Social Sciences and Business (Shing On LEUNG and Hui Ping WU) (May 2015) This is a series of 3 talks respectively on: A. Probability Theory
More informationProbability and Estimation. Alan Moses
Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.
More informationReasoning with Uncertainty
Reasoning with Uncertainty Representing Uncertainty Manfred Huber 2005 1 Reasoning with Uncertainty The goal of reasoning is usually to: Determine the state of the world Determine what actions to take
More informationMAXIMUM LIKELIHOOD, SET ESTIMATION, MODEL CRITICISM
Eco517 Fall 2004 C. Sims MAXIMUM LIKELIHOOD, SET ESTIMATION, MODEL CRITICISM 1. SOMETHING WE SHOULD ALREADY HAVE MENTIONED A t n (µ, Σ) distribution converges, as n, to a N(µ, Σ). Consider the univariate
More information