Lecture 2: Introduction to Probability
- Colin Dawson
Transcription
1 Statistical Methods for Intelligent Information Processing (SMIIP) Lecture 2: Introduction to Probability Shuigeng Zhou School of Computer Science September 20, 2017
2 Outline Background and concepts Some discrete distributions Some continuous distributions Joint probability distribution Transformations of random variables Monte Carlo approximation Information theory Examples 2017/9/25 SMIIP 2
3 Background and Concepts
4 What is probability? "Probability theory is nothing but common sense reduced to calculation" (Pierre Laplace). There are two interpretations of probability. Frequentist interpretation (objectivists): probabilities represent long-run frequencies of events. Bayesian interpretation (subjectivists): probability is used to quantify our uncertainty about something.
5 German tank problem. During World War II, German tanks were numbered sequentially; assume 1, 2, 3, ..., N. Some of the numbers became known to the Allied forces when tanks were captured or records seized. Allied statisticians developed an estimation procedure to determine N. At the end of WWII, the serial-number estimate of German tank production turned out to be very close to the actual figure.
6 Sampling methods. Convenience sampling: obtain the easiest sample you can get (this is a bad idea).
7 Sampling methods. Random sampling: any method in which every member of the population has an equal chance of being selected.
8 Sampling methods. Stratified sampling: split the population into groups (strata) and sample from each group separately. The goal is for each stratum to be homogeneous (its members are very similar).
9 Sampling methods. Cluster sampling: randomly select a few clusters and sample all members of those clusters.
10 Sampling methods. Systematic sampling: set an order for the data, start from a random element, and then select every k-th member, with k = N/n, where N is the dataset size and n is the number of samples to be selected.
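The systematic sampling rule can be sketched in a few lines. This is only an illustration: the function name `systematic_sample`, the integer rounding of k, and the toy population of 100 items are assumptions, not part of the slides.

```python
import random

def systematic_sample(data, n, rng=random):
    """Select n items: random start, then every k-th item, with k = N // n."""
    N = len(data)
    k = N // n                    # sampling interval (integer part of N/n)
    start = rng.randrange(k)      # random starting offset in [0, k)
    return [data[start + i * k] for i in range(n)]

population = list(range(100))     # hypothetical ordered dataset, N = 100
sample = systematic_sample(population, n=10, rng=random.Random(0))
print(sample)
```

The sketch assumes n is at most N, so that the interval k is at least 1.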
11 Basic concepts (1). Event A and its probability p(A): $0 \le p(A) \le 1$. Discrete random variable X with state space χ and probability mass function (pmf) p(x). Probability of a union of two events A and B: $p(A \vee B) = p(A) + p(B) - p(A \wedge B)$. Joint probability, the probability of the joint event A and B: $p(A, B) = p(A \wedge B) = p(A)\,p(B \mid A) = p(B)\,p(A \mid B)$ (product rule). Conditional probability: $p(A \mid B) = p(A, B) / p(B)$ if $p(B) > 0$. Marginal distribution: $p(A) = \sum_b p(A, B = b) = \sum_b p(A \mid B = b)\,p(B = b)$ (sum rule).
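A small numeric check of the product and sum rules may help. The joint table below is a made-up example over two binary variables, and the helper names are hypothetical.

```python
# Hypothetical joint distribution p(A = a, B = b) over two binary variables.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def p_a(a):
    """Marginal p(A = a), obtained by summing the joint over b."""
    return sum(p for (ai, _), p in joint.items() if ai == a)

def p_b(b):
    """Marginal p(B = b), obtained by summing the joint over a."""
    return sum(p for (_, bi), p in joint.items() if bi == b)

def p_a_given_b(a, b):
    """Conditional p(A = a | B = b) = p(A = a, B = b) / p(B = b)."""
    return joint[(a, b)] / p_b(b)

# Product rule: p(A=1, B=1) = p(B=1) * p(A=1 | B=1)
product_rule = p_b(1) * p_a_given_b(1, 1)
# Sum rule: p(A=1) = sum_b p(A=1 | B=b) * p(B=b)
sum_rule = sum(p_a_given_b(1, b) * p_b(b) for b in (0, 1))
print(product_rule, sum_rule)
```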
12 Basic concepts (2). Continuous random variable X. Cumulative distribution function (cdf): $F(q) = p(X \le q)$. Probability density function (pdf) f(x): $p(a < X \le b) = \int_a^b f(x)\,dx$. Quantile: if F is the cdf of X and $F(x_\alpha) = \alpha$, then $x_\alpha$ is the α quantile of F. Mean, or expected value: $\mu = E[X] = \int x\,f(x)\,dx$; variance: $\sigma^2 = E[(X - \mu)^2]$.
13 Mode, median and range Median: the middle value in the dataset Mode: the value that occurs most often in the dataset Range: the difference between the largest and the smallest values
14 Descriptive variables
15 Descriptive statistics to measure the central tendency
16 Variance estimation. The variance measures dispersion: the scatter of the values about the mean.
17 Variance estimation. Population mean and variance: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$. Sample variance: taking n samples from the population, estimate the variance as $\sigma_y^2 = \frac{1}{n}\sum_{i=1}^{n} (y_i - \mu_y)^2$, with $\mu_y = \frac{1}{n}\sum_{i=1}^{n} y_i$. Sampling multiple times and computing the expected value of $\sigma_y^2$ gives $E[\sigma_y^2] = \frac{n-1}{n}\sigma^2$, so $\sigma^2 = \frac{n}{n-1}E[\sigma_y^2]$. In practice we sample only once and use that single sample's variance in place of $E[\sigma_y^2]$, so the sample variance is $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \mu_y)^2$.
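The bias of the divide-by-n estimator can be checked by simulation. The sketch below is an illustration under assumed settings (a synthetic Gaussian population, n = 5, sampling with replacement); none of these values come from the slides.

```python
import random

def biased_variance(y):
    """Variance with a 1/n factor (divide by the number of values)."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y) / len(y)

rng = random.Random(0)
population = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # synthetic data
sigma2 = biased_variance(population)       # population variance (1/N factor)

n, trials = 5, 20_000
avg_biased = 0.0
for _ in range(trials):
    y = [rng.choice(population) for _ in range(n)]  # draw with replacement
    avg_biased += biased_variance(y) / trials
avg_unbiased = avg_biased * n / (n - 1)    # apply the n/(n-1) correction

ratio_biased = avg_biased / sigma2         # expected to be near (n-1)/n
ratio_unbiased = avg_unbiased / sigma2     # expected to be near 1
print(ratio_biased, ratio_unbiased)
```

With n = 5 the first ratio should land near 0.8 and the second near 1, up to simulation noise.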
18 Independence and conditional independence. Unconditional independence; marginal independence; conditional independence.
19 Bayes rule
20 Some Common Discrete Distributions
21 The binomial and Bernoulli distributions. Binomial distribution: toss a coin n times; the distribution gives the probability of getting k heads. Bernoulli distribution: the special case of the binomial distribution in which the coin is tossed only once.
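As a sketch of the coin-tossing description, the standard binomial pmf is $\binom{n}{k}\theta^k(1-\theta)^{n-k}$, where θ (a symbol introduced here, not taken from the slide) is the probability of heads. The function names are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, theta):
    """Probability of exactly k heads in n tosses with head probability theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)

def bernoulli_pmf(x, theta):
    """Bernoulli pmf: the binomial with a single toss (n = 1)."""
    return binomial_pmf(x, 1, theta)

n, theta = 10, 0.5
pmf = [binomial_pmf(k, n, theta) for k in range(n + 1)]
print(pmf[5], sum(pmf))
```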
22 The binomial distribution
23 The multinomial and multinoulli distributions. Multinomial distribution: toss a K-sided die n times; $x = (x_1, x_2, \ldots, x_K)$ is a vector counting how many times each side appears. Multinoulli distribution: the special case of the multinomial distribution with n = 1.
24 Summary of the multinomial and related distributions
25 Application: DNA sequence motifs
26 The Poisson distribution. The Poisson distribution is often used as a model for counts of rare events like radioactive decay and traffic accidents.
27 The Poisson distribution. Considering a binomial distribution
28 Mean and variance of the Poisson distribution. Recall that a binomial distribution B(n, p) has mean np and variance np(1-p); writing λ = np, the variance is λ(1-p). The Poisson distribution approximates the binomial distribution when n approaches infinity and p is extremely small, so its mean is E[X] = np = λ and its variance is λ(1-p) ≈ λ. The mean and the variance of the Poisson distribution are therefore the same: λ.
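The limiting argument can be checked numerically. The sketch below compares a binomial with large n and small p against the standard Poisson pmf $e^{-\lambda}\lambda^k / k!$; the values λ = 3 and n = 10,000 are arbitrary choices for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

lam = 3.0
n = 10_000
p = lam / n                                   # small p with n * p = lam

# Largest pointwise difference between the two pmfs over k = 0..19.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
              for k in range(20))
binom_mean = n * p                            # equals lam
binom_var = n * p * (1 - p)                   # lam * (1 - p), close to lam
print(max_gap, binom_mean, binom_var)
```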
29 The Poisson distribution
30 Empirical distribution Here, A is a range
31 Discrete probability distributions
32 Some Common Continuous Distributions
33 Gaussian (normal) distribution --- Standard normal distribution CDF of the Gaussian is defined as
34 Why is the Gaussian distribution important? It is simple, with only two parameters, and easy to use. Many real-world phenomena have an approximately Gaussian distribution. According to the central limit theorem, sums of independent random variables have an approximately Gaussian distribution.
35 Student t distribution. The Gaussian distribution is sensitive to outliers; a more robust alternative is the Student t distribution. When ν = 1, it is known as the Cauchy or Lorentz distribution, which has heavy tails. When ν >> 5, it approaches a Gaussian distribution.
36 The Laplace distribution. Also called the double-sided exponential distribution.
37 pdf and log(pdf)
38 Effect of Outliers
39 The gamma distribution The gamma distribution is a flexible distribution for positive real valued random variables
40 The beta distribution The beta distribution has support over the interval [0, 1] and is defined as follows: Here B(p, q) is the beta function:
41 The beta distribution. a = b = 1: uniform distribution. a and b < 1: bimodal distribution with spikes at 0 and 1. a and b > 1: unimodal distribution.
42 Pareto distribution. The Pareto distribution is used to model the distribution of quantities that exhibit long tails, also called heavy tails. The Pareto pdf is defined as follows: This distribution has the following properties:
43 Pareto distribution
44 Continuous probability distributions
45 Joint Probability Distributions
46 Covariance. A joint probability distribution has the form $p(x_1, \ldots, x_D)$ for a set of D > 1 variables. The covariance between two rvs X and Y measures the degree to which X and Y are (linearly) related. For a d-dimensional random vector x, its covariance matrix is:
47 Correlation. The (Pearson) correlation coefficient between X and Y is defined as: For a d-dimensional random vector x, its correlation matrix is:
48 Correlation. The correlation coefficient measures the degree of linearity; it is not related to the slope of the regression line. The regression coefficient is: If X and Y are independent, meaning p(X, Y) = p(X) p(Y), then cov[X, Y] = 0, and hence corr[X, Y] = 0, so they are uncorrelated. However, the converse is not true: uncorrelated does not imply independent.
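A standard counterexample makes the last point concrete. In the sketch below, X is uniform on {-1, 0, 1} and Y = X²; this particular choice is an illustration and is not taken from the slides. Exact fractions are used so the comparison involves no rounding.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X**2, so Y is fully determined by X.
third = Fraction(1, 3)
joint = {(x, x * x): third for x in (-1, 0, 1)}

def expect(g):
    """Expectation of g(X, Y) under the joint distribution."""
    return sum(g(x, y) * pr for (x, y), pr in joint.items())

cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

p_x0 = sum(pr for (x, _), pr in joint.items() if x == 0)   # p(X = 0)
p_y0 = sum(pr for (_, y), pr in joint.items() if y == 0)   # p(Y = 0)
p_x0_y0 = joint[(0, 0)]                                    # p(X = 0, Y = 0)

print(cov)                       # zero covariance: uncorrelated
print(p_x0_y0, p_x0 * p_y0)      # joint differs from product: dependent
```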
49 Correlation
50 The multivariate Gaussian. The pdf of the multivariate Gaussian or multivariate normal (MVN) in D dimensions is: Here, $\mu = E[x] \in \mathbb{R}^D$ is the mean vector, and $\Sigma = \mathrm{cov}[x]$ is the $D \times D$ covariance matrix.
51 2D Gaussians
52 Multivariate Student t distribution. The pdf of the multivariate Student t distribution is: The distribution has the following properties:
53 Dirichlet distribution. The Dirichlet distribution is a multivariate generalization of the beta distribution, which has support over the probability simplex, defined by: The pdf is defined as follows: The distribution has these properties:
54 Transformations of Random Variables
55 Linear transformations. Suppose f() is a linear function: We have: If f() is a scalar-valued function, $f(x) = a^T x + b$, then:
56 General transformations. If X is a discrete rv, we can derive the pmf for y by simply summing up the probability mass for all the x's such that f(x) = y: If X is continuous:
57 Multivariate change of variables. Let f be a function that maps $\mathbb{R}^n$ to $\mathbb{R}^n$, and let y = f(x). Then its Jacobian matrix J is given by: If f is an invertible mapping, we can define the pdf of the transformed variables using the Jacobian of the inverse mapping $y \to x$:
58 Central limit theorem. Now consider N random variables with pdfs (not necessarily Gaussian) $p(x_i)$, each with mean μ and variance σ². We assume each variable is independent and identically distributed, or iid for short. Let $S_N = \sum_{i=1}^{N} X_i$ be the sum of the rvs. One can show that, as N increases, the distribution of this sum approaches a Gaussian with mean Nμ and variance Nσ².
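The central limit theorem can be illustrated by simulation: sums of iid non-Gaussian variables, once standardized by their mean Nμ and standard deviation σ√N, should behave roughly like a standard normal. The sketch below assumes Uniform(0, 1) summands and N = 30, both arbitrary choices.

```python
import math
import random

rng = random.Random(0)
N = 30                           # number of iid terms in each sum
mu = 0.5                         # mean of Uniform(0, 1)
sigma = math.sqrt(1 / 12)        # standard deviation of Uniform(0, 1)
trials = 20_000

# Standardized sums: (S_N - N * mu) / (sigma * sqrt(N)).
z = [
    (sum(rng.random() for _ in range(N)) - N * mu) / (sigma * math.sqrt(N))
    for _ in range(trials)
]

# For a standard normal, about 95% of the mass lies within +/- 1.96.
inside = sum(abs(v) <= 1.96 for v in z) / trials
print(inside)
```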
59 Central limit theorem
60 Monte Carlo Approximation
61 Monte Carlo approximation. In general, computing the distribution of a function of an rv using the change of variables formula can be difficult. One simple but powerful alternative is Monte Carlo approximation, as follows. First, we generate S samples from the distribution, call them $x_1, \ldots, x_S$, for example by Markov chain Monte Carlo (MCMC). Then, we can approximate the distribution of f(X) by using the empirical distribution of $\{f(x_s)\}_{s=1}^{S}$.
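The two steps can be sketched as follows. The choice of X ~ Uniform(-1, 1) and f(x) = x² is an assumption made for illustration, and the samples are drawn directly rather than by MCMC because this distribution is easy to sample.

```python
import random

rng = random.Random(0)
S = 100_000

# Step 1: draw S samples x_1, ..., x_S from the distribution of X.
xs = [rng.uniform(-1.0, 1.0) for _ in range(S)]

# Step 2: push each sample through f and use the empirical distribution.
ys = [x * x for x in xs]

mean_y = sum(ys) / S                       # approximates E[f(X)] = 1/3

def empirical_cdf(t):
    """Fraction of transformed samples that are <= t."""
    return sum(y <= t for y in ys) / S

print(mean_y, empirical_cdf(0.25))         # exact values are 1/3 and 1/2
```

Changing the function applied to the samples gives approximations of other quantities, such as variances or tail probabilities, from the same draws.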
62 Monte Carlo approximation By varying the function f(), we can approximate many quantities of interest, such as
63 Monte Carlo approximation
64 Some Concepts of Information Theory
65 Entropy. The entropy of a random variable X with distribution p: For binary random variables, we have: This is called the binary entropy function.
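As a sketch, the usual definition of entropy for a discrete distribution is $H(X) = -\sum_k p(X = k)\log_2 p(X = k)$, measured in bits. The function names below are hypothetical.

```python
from math import log2

def entropy(p):
    """Entropy in bits of a discrete distribution given as a list of probabilities."""
    return -sum(pk * log2(pk) for pk in p if pk > 0)   # 0 * log 0 treated as 0

def binary_entropy(theta):
    """Entropy of a binary variable with p(X = 1) = theta."""
    return entropy([theta, 1 - theta])

print(binary_entropy(0.5))    # a fair coin carries 1 bit
print(binary_entropy(0.9))    # a biased coin carries less than 1 bit
print(entropy([0.25] * 4))    # uniform over 4 states: 2 bits
```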
66 Entropy
67 KL divergence. The KL divergence is the average number of extra bits needed to encode the data when a code based on a distribution q is used instead of the true distribution p.
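A minimal sketch of the standard discrete definition, $KL(p \| q) = \sum_k p_k \log_2 (p_k / q_k)$, follows. The two distributions are made-up examples, and the code assumes q is positive wherever p is.

```python
from math import log2

def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q[k] > 0 wherever p[k] > 0."""
    return sum(pk * log2(pk / qk) for pk, qk in zip(p, q) if pk > 0)

p = [0.5, 0.5]     # hypothetical true distribution
q = [0.9, 0.1]     # hypothetical model distribution used for coding

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
kl_pp = kl_divergence(p, p)
print(kl_pq, kl_qp, kl_pp)
```

The two directions give different values, which shows that the KL divergence is not symmetric.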
68 Why mutual information? Often we want to learn something about a variable Y from another variable X. Correlation can measure the relationship between two variables, but it is defined only for real values. Furthermore, it cannot describe the independence between two variables well: independent implies uncorrelated, but uncorrelated does not imply independent.
69 Mutual information. For two rvs X and Y, the MI is defined as $I(X; Y) = \sum_x \sum_y p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} = H(X) - H(X \mid Y)$, where $H(X \mid Y)$ is the conditional entropy. We can show that $I(X; Y) \ge 0$, with equality iff p(X, Y) = p(X) p(Y). The MI between X and Y can be read as the reduction in uncertainty about X after observing Y.
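The standard discrete form of mutual information, $I(X; Y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x, y)}{p(x) p(y)}$, can be sketched directly. Both joint tables below are made-up examples, one dependent and one independent.

```python
from math import log2

def mutual_information(joint):
    """MI in bits for a joint pmf given as {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p         # marginal of X
        py[y] = py.get(y, 0.0) + p         # marginal of Y
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

dependent = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

mi_dep = mutual_information(dependent)
mi_ind = mutual_information(independent)
print(mi_dep, mi_ind)
```

The independent table gives zero, matching the equality condition p(X, Y) = p(X) p(Y).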
70 Pointwise mutual information. For two events (not random variables) x and y, the PMI is defined as: PMI measures the discrepancy between these events occurring together and what would be expected by chance. The MI of X and Y is just the expected value of the PMI.
71 Two Examples
72 Example 1: medical diagnosis (rare diseases). Prior: breast cancer p(y = 1), healthy p(y = 0). Test positive: p(x = 1 | y = 1), p(x = 1 | y = 0), p(x = 1). Test negative: p(x = 0 | y = 1), p(x = 0 | y = 0), p(x = 0). If Jenny tests positive for breast cancer, what is the probability that she really has breast cancer?
73 Example 1: medical diagnosis (rare diseases). Prior: breast cancer p(y = 1), healthy p(y = 0). Test positive: p(x = 1 | y = 1), p(x = 1 | y = 0), p(x = 1). Test negative: p(x = 0 | y = 1), p(x = 0 | y = 0), p(x = 0). If Jenny tests positive for breast cancer, what is the probability that she really has breast cancer, that is, p(y = 1 | x = 1)?
74 Example 1: medical diagnosis (rare diseases). Prior: breast cancer p(y = 1) = 0.004, healthy p(y = 0) = 0.996. Test positive: p(x = 1 | y = 1) = 0.8, p(x = 1 | y = 0) = 0.1, p(x = 1) = 0.004 × 0.8 + 0.996 × 0.1. Test negative: p(x = 0 | y = 1), p(x = 0 | y = 0), p(x = 0). If Jenny tests positive for breast cancer, what is the probability that she really has breast cancer, that is, p(y = 1 | x = 1)?
75 Example 1: medical diagnosis (rare diseases). Prior: breast cancer p(y = 1) = 0.004, healthy p(y = 0) = 0.996. Test positive: p(x = 1 | y = 1) = 0.8, p(x = 1 | y = 0) = 0.1, p(x = 1) = 0.004 × 0.8 + 0.996 × 0.1. Test negative: p(x = 0 | y = 1), p(x = 0 | y = 0), p(x = 0). A positive test should be interpreted carefully for rare diseases.
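Plugging the numbers from this example into Bayes rule, $p(y = 1 \mid x = 1) = p(x = 1 \mid y = 1)\,p(y = 1) / p(x = 1)$, gives the answer directly. The variable names in this sketch are illustrative.

```python
prior = 0.004            # p(y = 1): prevalence of the disease
sensitivity = 0.8        # p(x = 1 | y = 1): positive test given disease
false_positive = 0.1     # p(x = 1 | y = 0): positive test given healthy

# Sum rule: total probability of a positive test, p(x = 1).
evidence = prior * sensitivity + (1 - prior) * false_positive

# Bayes rule: p(y = 1 | x = 1).
posterior = prior * sensitivity / evidence
print(round(posterior, 3))   # 0.031
```

Even after a positive test, the probability of disease is only about 3%, because healthy people vastly outnumber sick ones and so produce most of the positive results.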
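76 Example 2: German tank problem. 1. Frequentist statistics. Sample k labels, the largest of which is m. The probability of this event is $P(\text{largest} = m) = \binom{m-1}{k-1} / \binom{N}{k}$, so $\mu = E[\text{largest}] = \sum_{m=k}^{N} m \binom{m-1}{k-1} / \binom{N}{k} = \frac{k(N+1)}{k+1}$. Solving for N gives $N = \mu(1 + k^{-1}) - 1$, and replacing μ by the observed maximum m gives the estimate $\hat{N} = m(1 + k^{-1}) - 1$. With k = 4 samples and m = 14, $\hat{N} = 14 \times 1.25 - 1 = 16.5$.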
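The frequentist estimate m(1 + 1/k) - 1 and the identity E[largest] = k(N + 1)/(k + 1) behind it can both be checked with a short sketch. The serial numbers in the list below are hypothetical, chosen only so that k = 4 and m = 14; the function names are also illustrative.

```python
from fractions import Fraction
from math import comb

def tank_estimate(serials):
    """Frequentist estimate of N: m * (1 + 1/k) - 1, with m the largest serial."""
    k = len(serials)
    m = max(serials)
    return m * (1 + 1 / k) - 1

def expected_max(N, k):
    """Exact E[largest] for k serials drawn without replacement from 1..N."""
    return sum(Fraction(m * comb(m - 1, k - 1), comb(N, k))
               for m in range(k, N + 1))

observed = [2, 6, 7, 14]            # hypothetical serials with k = 4, m = 14
print(tank_estimate(observed))      # 16.5
print(expected_max(20, 4))          # equals 4 * 21 / 5
```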
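77 German tank problem. 2. Bayesian statistics. The Bayesian approach considers the credibility (N = n | M = m, K = k), abbreviated (n | m, k), that the total number of enemy tanks N equals n, given that the number of observed tanks K equals k and the largest serial number M equals m. By the rule of conditional probability, $(n \mid m, k) = (m \mid n, k)\,(n \mid k) / (m \mid k)$. The probability that the largest serial number among k observed tanks equals m, given that the total number of tanks is known to be n:
78 German tank problem (Bayesian statistics)
79 German tank problem (Bayesian statistics)
80 German tank problem. Suppose an intelligence officer has spotted k = 4 tanks with known serial numbers, the largest of which is m = 14. Let the unknown total number of tanks be N.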
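A numerical sketch of the Bayesian calculation for k = 4 and m = 14 follows. It assumes a flat prior on N, which is an assumption of this sketch rather than something stated on the slides, and it truncates the support at a hypothetical upper limit `n_max`. Under that prior the posterior is proportional to the likelihood $\binom{m-1}{k-1} / \binom{n}{k}$ for n ≥ m.

```python
from math import comb

def tank_posterior(m, k, n_max=10_000):
    """Posterior over N on the grid m..n_max, assuming a flat prior on N.

    As a function of n, the likelihood C(m-1, k-1) / C(n, k) is
    proportional to 1 / C(n, k), so only that factor is needed.
    """
    weights = {n: 1.0 / comb(n, k) for n in range(m, n_max + 1)}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

posterior = tank_posterior(m=14, k=4)
post_mode = max(posterior, key=posterior.get)          # most probable N
post_mean = sum(n * p for n, p in posterior.items())   # posterior mean of N
print(post_mode, post_mean)
```

The most probable value is the observed maximum itself, while the posterior mean is noticeably larger, reflecting the long right tail of the posterior.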
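81 German tank problem. According to conventional Allied intelligence estimates, Germany was producing about 1,400 tanks per month between June 1940 and September 1942. Plugging the serial numbers of captured tanks into the formula gave an estimate of 246 tanks per month. After the war, German production records captured from the ministry of Albert Speer showed the actual number to be 245. Estimates for some specific months are as follows:
82 The end. Assignment: read Chapter 2 of the Murphy book.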
Introduction to Machine Learning
Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB
More informationIntroduction to Machine Learning
What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes
More information课内考试时间? 5/10 5/17 5/24 课内考试? 5/31 课内考试? 6/07 课程论文报告
课内考试时间? 5/10 5/17 5/24 课内考试? 5/31 课内考试? 6/07 课程论文报告 Testing Hypotheses and Assessing Goodness of Fit Generalized likelihood ratio tests The likelihood ratio test is optimal for simple vs. simple hypotheses.
More informationConditional expectation and prediction
Conditional expectation and prediction Conditional frequency functions and pdfs have properties of ordinary frequency and density functions. Hence, associated with a conditional distribution is a conditional
More informationGRE 精确 完整 数学预测机经 发布适用 2015 年 10 月考试
智课网 GRE 备考资料 GRE 精确 完整 数学预测机经 151015 发布适用 2015 年 10 月考试 20150920 1. n is an integer. : (-1)n(-1)n+2 : 1 A. is greater. B. is greater. C. The two quantities are equal D. The relationship cannot be determined
More informationLecture 1: Probability Fundamentals
Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability
More informationIntroduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative
More information2012 AP Calculus BC 模拟试卷
0 AP Calculus BC 模拟试卷 北京新东方罗勇 luoyong@df.cn 0-3- 说明 : 请严格按照实际考试时间进行模拟, 考试时间共 95 分钟 Multiple-Choice section A 部分 : 无计算器 B 部分 : 有计算器 Free-response section A 部分 : 有计算器 B 部分 : 无计算器 总计 45 题 /05 分钟 8 题,55 分钟
More informationOn the Quark model based on virtual spacetime and the origin of fractional charge
On the Quark model based on virtual spacetime and the origin of fractional charge Zhi Cheng No. 9 Bairong st. Baiyun District, Guangzhou, China. 510400. gzchengzhi@hotmail.com Abstract: The quark model
More informationProbability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014
Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More informationLecture 2. Random variables: discrete and continuous
Lecture 2 Random variables: discrete and continuous Random variables: discrete Probability theory is concerned with situations in which the outcomes occur randomly. Generically, such situations are called
More informationA Tutorial on Variational Bayes
A Tutorial on Variational Bayes Junhao Hua ( 华俊豪 ) Laboratory of Machine and Biological Intelligence, Department of Information Science & Electronic Engineering, ZheJiang University 204/3/27 Email: huajh7@gmail.com
More informationRecitation 2: Probability
Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions
More informationAlgorithms for Uncertainty Quantification
Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example
More informationChapter 5 continued. Chapter 5 sections
Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More information系统生物学. (Systems Biology) 马彬广
系统生物学 (Systems Biology) 马彬广 通用建模工具 ( 第十四讲 ) 梗概 (Synopsis) 通用建模工具 ( 数学计算软件 ) 专用建模工具 ( 细胞生化体系建模 ) 通用建模工具 主要是各种数学计算软件, 有些是商业软件, 有些是自由软件 商业软件, 主要介绍 : MatLab, Mathematica, Maple, 另有 MuPAD, 现已被 MatLab 收购 自由软件
More informationLecture 3. Probability - Part 2. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. October 19, 2016
Lecture 3 Probability - Part 2 Luigi Freda ALCOR Lab DIAG University of Rome La Sapienza October 19, 2016 Luigi Freda ( La Sapienza University) Lecture 3 October 19, 2016 1 / 46 Outline 1 Common Continuous
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationProbability and Estimation. Alan Moses
Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.
More informationSource mechanism solution
Source mechanism solution Contents Source mechanism solution 1 1. A general introduction 1 2. A step-by-step guide 1 Step-1: Prepare data files 1 Step-2: Start GeoTaos or GeoTaos_Map 2 Step-3: Convert
More informationChapter 2 Bayesian Decision Theory. Pattern Recognition Soochow, Fall Semester 1
Chapter 2 Bayesian Decision Theory Pattern Recognition Soochow, Fall Semester 1 Decision Theory Decision Make choice under uncertainty Pattern Recognition Pattern Category Given a test sample, its category
More informationReview of probability
Review of probability Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts definition of probability random variables
More informationLectures on Statistical Data Analysis
Lectures on Statistical Data Analysis London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk
More informationComputational Genomics
Computational Genomics http://www.cs.cmu.edu/~02710 Introduction to probability, statistics and algorithms (brief) intro to probability Basic notations Random variable - referring to an element / event
More informationIntroduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:
More informationProperties Measurement of H ZZ* 4l and Z 4l with ATLAS
Properties Measurement of H ZZ* 4l and Z 4l with ATLAS Haijun Yang (Shanghai Jiao Tong University) LHC mini-workshop Zhejiang University, HangZhou, China November 8-11, 2014 1 Outline o Discovery of the
More information0 0 = 1 0 = 0 1 = = 1 1 = 0 0 = 1
0 0 = 1 0 = 0 1 = 0 1 1 = 1 1 = 0 0 = 1 : = {0, 1} : 3 (,, ) = + (,, ) = + + (, ) = + (,,, ) = ( + )( + ) + ( + )( + ) + = + = = + + = + = ( + ) + = + ( + ) () = () ( + ) = + + = ( + )( + ) + = = + 0
More information三类调度问题的复合派遣算法及其在医疗运营管理中的应用
申请上海交通大学博士学位论文 三类调度问题的复合派遣算法及其在医疗运营管理中的应用 博士生 : 苏惠荞 导师 : 万国华教授 专业 : 管理科学与工程 研究方向 : 运作管理 学校代码 : 10248 上海交通大学安泰经济与管理学院 2017 年 6 月 Dissertation Submitted to Shanghai Jiao Tong University for the Degree of
More information通量数据质量控制的理论与方法 理加联合科技有限公司
通量数据质量控制的理论与方法 理加联合科技有限公司 通量变量 Rn = LE + H + G (W m -2 s -1 ) 净辐射 潜热 感热 地表热 通量 通量 通量 通量 Fc (mg m -2 s -1 ) 二氧化碳通量 τ [(kg m s -1 ) m -2 s -1 ] 动量通量 质量控制 1. 概率统计方法 2. 趋势法 3. 大气物理依据 4. 测定实地诊断 5. 仪器物理依据 '
More informationJoint Probability Distributions, Correlations
Joint Probability Distributions, Correlations What we learned so far Events: Working with events as sets: union, intersection, etc. Some events are simple: Head vs Tails, Cancer vs Healthy Some are more
More informationPROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS
PROBABILITY AND INFORMATION THEORY Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Probability space Rules of probability
More informationClass 26: review for final exam 18.05, Spring 2014
Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event
More informationProbability Distributions Columns (a) through (d)
Discrete Probability Distributions Columns (a) through (d) Probability Mass Distribution Description Notes Notation or Density Function --------------------(PMF or PDF)-------------------- (a) (b) (c)
More informationDEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY
DEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY OUTLINE 3.1 Why Probability? 3.2 Random Variables 3.3 Probability Distributions 3.4 Marginal Probability 3.5 Conditional Probability 3.6 The Chain
More informationThe dynamic N1-methyladenosine methylome in eukaryotic messenger RNA 报告人 : 沈胤
The dynamic N1-methyladenosine methylome in eukaryotic messenger RNA 报告人 : 沈胤 2016.12.26 研究背景 RNA 甲基化作为表观遗传学研究的重要内容之一, 是指发生在 RNA 分子上不同位置的甲基化修饰现象 RNA 甲基化在调控基因表达 剪接 RNA 编辑 RNA 稳定性 控制 mrna 寿命和降解等方面可能扮演重要角色
More informationBayesian Models in Machine Learning
Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of
More informationReview (Probability & Linear Algebra)
Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint
More informationJoint Probability Distributions, Correlations
Joint Probability Distributions, Correlations What we learned so far Events: Working with events as sets: union, intersection, etc. Some events are simple: Head vs Tails, Cancer vs Healthy Some are more
More informationPart IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationHANDBOOK OF APPLICABLE MATHEMATICS
HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume II: Probability Emlyn Lloyd University oflancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester - New York - Brisbane
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationA Few Special Distributions and Their Properties
A Few Special Distributions and Their Properties Econ 690 Purdue University Justin L. Tobias (Purdue) Distributional Catalog 1 / 20 Special Distributions and Their Associated Properties 1 Uniform Distribution
More informationChapter 22 Lecture. Essential University Physics Richard Wolfson 2 nd Edition. Electric Potential 電位 Pearson Education, Inc.
Chapter 22 Lecture Essential University Physics Richard Wolfson 2 nd Edition Electric Potential 電位 Slide 22-1 In this lecture you ll learn 簡介 The concept of electric potential difference 電位差 Including
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables
More informationCS 591, Lecture 2 Data Analytics: Theory and Applications Boston University
CS 591, Lecture 2 Data Analytics: Theory and Applications Boston University Charalampos E. Tsourakakis January 25rd, 2017 Probability Theory The theory of probability is a system for making better guesses.
More informationBayesian Regression Linear and Logistic Regression
When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we
More information能源化学工程专业培养方案. Undergraduate Program for Specialty in Energy Chemical Engineering 专业负责人 : 何平分管院长 : 廖其龙院学术委员会主任 : 李玉香
能源化学工程专业培养方案 Undergraduate Program for Specialty in Energy Chemical Engineering 专业负责人 : 何平分管院长 : 廖其龙院学术委员会主任 : 李玉香 Director of Specialty: He Ping Executive Dean: Liao Qilong Academic Committee Director:
More informationMachine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014
Machine Learning Probability Basics Basic definitions: Random variables, joint, conditional, marginal distribution, Bayes theorem & examples; Probability distributions: Binomial, Beta, Multinomial, Dirichlet,
More informationReview (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology
Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna
More informationLecture 1: August 28
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random
More informationPreliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com
1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com Gujarati D. Basic Econometrics, Appendix
More informationExpectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda
Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,
More informationProbability and Information Theory. Sargur N. Srihari
Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal
More informationLearning Objectives for Stat 225
Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:
More informationExpectation. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda