One-shot Learning of Poisson Distributions: Information Theory of Audic-Claverie Statistic for Analyzing cDNA Arrays
1 One-shot Learning of Poisson Distributions: Information Theory of Audic-Claverie Statistic for Analyzing cDNA Arrays
Peter Tiňo, School of Computer Science, University of Birmingham, UK
One-shot Learning of Poisson Distributions p.1/27
2 cDNA array analysis
Biologists analyze patterns of expression levels of selected genes in different tissues, possibly obtained under different conditions or treatment regimes.
Measurement of gene expression levels:
- via hybridization to microarrays
- by counting gene tags (signatures), using e.g. Serial Analysis of Gene Expression (SAGE) or Massively Parallel Signature Sequencing (MPSS) methodologies
3 SAGE
The SAGE procedure results in a library of short sequence tags, each representing an expressed gene.
Key assumption: every mRNA copy in the tissue has the same chance of ending up as a tag in the library. Selecting a specific tag from the pool of transcripts can be approximately considered as sampling with replacement.
Key step in many SAGE studies: identification of "interesting" genes, typically those that are differentially expressed under different conditions or treatments. This is done by comparing the number of specific tags found in two SAGE libraries corresponding to the different conditions or treatments.
4 The approach of Audic and Claverie
Audic and Claverie were among the first to systematically study the influence of random fluctuations and sampling size on the reliability of digital expression profile data.
Theirs is a popular approach in current biological research: 427 citations (ISI Web of Knowledge), over 100 of them in the past 3 years.
Typically, cDNA libraries contain a large number of different expressed genes, and observing a given cDNA qualifies as a rare event.
5 A-C approach
Consider a transcript representing a small fraction of the library and a large number $N$ of clones. The probability of observing $x$ tags of the same gene is well approximated by the Poisson distribution parametrized by $\lambda \ge 0$:
\[ P(X = x \mid \lambda) = e^{-\lambda}\,\frac{\lambda^{x}}{x!}. \]
The unknown parameter $\lambda$ signifies the number of transcripts of the given type (tag) per $N$ clones in the cDNA library.
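The rare-event argument can be checked numerically. The sketch below compares the exact binomial probability of drawing x copies of a tag in N independent draws with the Poisson probability at λ = N × (tag fraction). The library size, the tag fraction, and the function names are hypothetical values chosen only for illustration; they do not come from the slides.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of x successes in n independent draws with success probability p."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability e^{-lam} lam^x / x!, evaluated in log space for stability."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

# Hypothetical numbers, chosen only for illustration.
n_clones = 50_000      # library size N
tag_fraction = 1e-4    # fraction of the library taken up by one transcript
lam = n_clones * tag_fraction

# Largest absolute difference between the two pmfs over small counts.
max_gap = max(
    abs(binomial_pmf(x, n_clones, tag_fraction) - poisson_pmf(x, lam))
    for x in range(31)
)
print(f"lambda = {lam:.1f}, largest pmf gap for x in 0..30: {max_gap:.2e}")
```

With a small tag fraction and a large N the two distributions are nearly indistinguishable, which is what justifies working with a single Poisson parameter λ.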
6 A-C approach - cont'd
Null hypothesis of not differentially expressed genes: the tag count $x$ in one library comes from the same underlying Poisson distribution $P(\cdot \mid \lambda)$ as the tag count $y$ in the other library.
Each SAGE library represents a single (count) measurement only! From a purely statistical standpoint, resolving this issue is potentially quite problematic.
Key instrument of the A-C approach: the distribution $P(y \mid x)$ over tag counts $y$ in one library, informed by the tag count $x$ in the other library, under the null hypothesis that the tag counts are generated from the same but unknown Poisson distribution.
7 So what do we really want to do?
[Cartoon: two sources with unknown rates $\lambda_1$ and $\lambda_2$ are each "pressed once", producing the counts $x$ and $y$. Labels: "Are you crazy??" and $\lambda_1 = \lambda_2$.]
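8 P(y|x)
\[ P(y \mid x) = \int_0^\infty p(y, \lambda \mid x)\, d\lambda = \int_0^\infty P(y \mid \lambda, x)\, p(\lambda \mid x)\, d\lambda = \int_0^\infty P(y \mid \lambda)\, \frac{P(x \mid \lambda)\, p(\lambda)}{\int_0^\infty P(x \mid \lambda')\, p(\lambda')\, d\lambda'}\, d\lambda. \]
Imposing a flat prior $p(\lambda)$ over the Poisson parameter $\lambda$ results in
\[ P(y \mid x) = \frac{1}{y!}\, \frac{\int_0^\infty e^{-2\lambda} \lambda^{x+y}\, d\lambda}{\int_0^\infty e^{-\lambda} \lambda^{x}\, d\lambda}. \]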
9 A-C statistic
Since the Gamma distribution parametrized by $a, b > 0$ takes the form
\[ \mathrm{Gamma}(\lambda \mid a, b) = \frac{1}{\Gamma(a)}\, b^{a} \lambda^{a-1} e^{-b\lambda}, \]
where $\Gamma(a) = \int_0^\infty u^{a-1} e^{-u}\, du$ is the Gamma function, we have
\[ P(y \mid x) = \frac{1}{y!}\, \frac{\Gamma(x+y+1)}{2^{x+y+1}\, \Gamma(x+1)}, \]
which, since $x$ and $y$ are integers (i.e. $\Gamma(x) = (x-1)!$), can be rewritten as
\[ P(y \mid x) = \frac{(x+y)!}{2^{x+y+1}\, x!\, y!} = \frac{1}{2^{x+y+1}} \binom{x+y}{x}. \]
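A minimal sketch of how the closed form P(y|x) = C(x+y, x) / 2^(x+y+1) might be evaluated in practice. It works in log space through the log-Gamma function so that large counts do not overflow. The function names are illustrative and not taken from the slides.

```python
import math

def ac_log_pmf(y, x):
    """Natural log of the A-C statistic P(y|x) = C(x+y, x) / 2^(x+y+1)."""
    return (math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
            - (x + y + 1) * math.log(2.0))

def ac_pmf(y, x):
    """A-C statistic P(y|x) for non-negative integer counts x and y."""
    return math.exp(ac_log_pmf(y, x))

x = 10
# The probabilities over y should sum to one (the sum is truncated far in the tail).
total = sum(ac_pmf(y, x) for y in range(400))
print(f"sum over y of P(y|{x}) = {total:.12f}")
print(f"P(4|6) = {ac_pmf(4, 6):.6f}, exact value C(10,4)/2^11 = {math.comb(10, 4) / 2**11:.6f}")
```

The log-space form also makes the symmetry in x and y visible: swapping the two arguments only swaps the two subtracted log-factorials.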
10 A-C statistic - cont'd
$P(y \mid x)$ is symmetric, i.e. for $x, y \ge 0$, $P(y \mid x) = P(x \mid y)$. This is a desirable property: if the counts $x, y$ are related to two libraries of the same size, they should be interchangeable when analyzing whether they come from the same underlying process or not.
The A-C statistic can be used e.g. for principled inferences, construction of confidence intervals, statistical testing etc.
We ask:
1. How natural is the A-C statistic's representation of the underlying unknown Poisson distribution governing the tag counts?
2. Given that the observed tag count sample is very limited, how well can the Audic-Claverie approach work, i.e. how well does the A-C statistic capture the underlying Poisson distribution?
11 Poisson distribution vs A-C statistic
[Figure: two panels, x = 10 (a) and x = 30 (b); horizontal axis y; curves p(y|x) and Poisson(y|x).]
Graphs of the A-C statistic $P(y \mid x)$ (solid line) and the corresponding Poisson distribution $P(y \mid \lambda)$ at $\lambda = x$ (dashed line) for x = 10 (a) and x = 30 (b).
12 A-C statistic - mode structure
The A-C statistic and the underlying Poisson distribution are quite similar in their nature.
For any (integer) mean tag count $\lambda \ge 1$, the Poisson distribution $P(\cdot \mid \lambda)$ has two neighboring modes located at $\lambda$ and $\lambda - 1$, with $P(\lambda \mid \lambda) = P(\lambda - 1 \mid \lambda)$.
After observing a count $x$, the A-C statistic expects counts $y = x$ and $y = x - 1$ with the highest and equal probability. The other values of count $y$ are less probable.
13 A-C statistic - mode structure cont'd
Theorem 1. Let $x$, $y$ and $d$ be integers with ranges specified below. It holds:
1. $P(x \mid x) > P(x + d \mid x)$ for any $x \ge 0$ and $d \ge 1$.
2. For $x \ge 1$, $P(x \mid x) = P(x - 1 \mid x)$.
3. $P(x \mid x) > P(x - d \mid x)$ for any $x \ge 2$ and $2 \le d \le x$.
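The three claims of Theorem 1 can be spot-checked with exact rational arithmetic, which avoids any floating-point doubt about the equality in item 2. This is a finite check over a small range of counts, not a proof, and the function names are illustrative.

```python
from fractions import Fraction
import math

def ac_pmf_exact(y, x):
    """Exact A-C statistic P(y|x) = C(x+y, x) / 2^(x+y+1) as a rational number."""
    return Fraction(math.comb(x + y, x), 2 ** (x + y + 1))

def check_theorem_1(max_x=30, max_d=30):
    """Return True if all three items of Theorem 1 hold on the tested range."""
    for x in range(max_x + 1):
        peak = ac_pmf_exact(x, x)
        # Item 1: every count above x is strictly less probable than y = x.
        for d in range(1, max_d + 1):
            if not peak > ac_pmf_exact(x + d, x):
                return False
        # Item 2: y = x and y = x - 1 are exactly equally probable.
        if x >= 1 and peak != ac_pmf_exact(x - 1, x):
            return False
        # Item 3: counts two or more below x are strictly less probable.
        for d in range(2, x + 1):
            if not peak > ac_pmf_exact(x - d, x):
                return False
    return True

print("Theorem 1 holds on the tested range:", check_theorem_1())
```

The underlying reason is the ratio P(y+1|x) / P(y|x) = (x+y+1) / (2(y+1)), which exceeds 1 for y < x - 1, equals 1 at y = x - 1, and drops below 1 from y = x onwards.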
14 A-C statistic - mean and variance
Theorem 2. Consider a non-negative integer $x$ and the associated A-C statistic $P(y \mid x)$. Then it holds:
1. $E_{P(y|x)}[y] = x + 1$.
2. $\mathrm{Var}_{P(y|x)}[y] = E_{P(y|x)}\big[(y - E_{P(y|x)}[y])^2\big] = 2\, E_{P(y|x)}[y]$.
Note that for the Poisson distribution with $\lambda = x$, the mean and the variance are both equal to $x$.
The larger variance of $P(\cdot \mid x)$ is the result of Bayesian averaging of Poisson distributions under a flat prior.
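A short numerical check of the mean and variance, assuming the closed form P(y|x) = C(x+y, x) / 2^(x+y+1) and truncating the infinite sums far into the tail. The truncation point and the function names are choices made for this sketch.

```python
import math

def ac_pmf(y, x):
    """A-C statistic P(y|x) = C(x+y, x) / 2^(x+y+1), evaluated in log space."""
    log_p = (math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
             - (x + y + 1) * math.log(2.0))
    return math.exp(log_p)

def ac_mean_and_variance(x, y_max=2000):
    """Mean and variance of y under P(y|x), with the sums truncated at y_max."""
    mean = sum(y * ac_pmf(y, x) for y in range(y_max))
    second_moment = sum(y * y * ac_pmf(y, x) for y in range(y_max))
    return mean, second_moment - mean * mean

for x in (0, 5, 30):
    mean, variance = ac_mean_and_variance(x)
    print(f"x = {x:2d}: mean = {mean:.6f} (x + 1 = {x + 1}), "
          f"variance = {variance:.6f} (2(x + 1) = {2 * (x + 1)})")
```

Seen this way, the A-C statistic is a negative binomial distribution with x + 1 successes and success probability 1/2, which gives the same mean x + 1 and variance 2(x + 1).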
15 A-C statistic - Information Theory
Assume that there is some true underlying Poisson distribution $P(y \mid \lambda)$ over possible counts $y \ge 0$ with unknown mean $\lambda$.
In the same process, we first generate a count $x$ and then use the A-C statistic $P(y \mid x)$ to define a distribution over $y$, given the already observed count $x$.
We ask: how different, in terms of Kullback-Leibler (K-L) divergence, are the two distributions $P(y \mid x)$ and $P(y \mid \lambda)$ over $y$?
For the A-C statistic to work, one would like $P(y \mid x)$ to be sufficiently representative of the true unknown distribution $P(y \mid \lambda)$.
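16 Thought experiment 1
[Cartoon: the environment $P(\cdot \mid \lambda)$, with unknown $\lambda$, is "pressed once" and emits a count $x$, from which $P(\cdot \mid x)$ is formed. Speech bubbles: "Let's see how close you can get" and "What do you mean by 'close'?". Closeness is the "similarity" between $P(\cdot \mid \lambda)$ and $P(\cdot \mid x)$.]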
17 K-L divergence from P(y|λ) to P(y|x)
\[ D(\lambda, x) = D_{KL}[P(y \mid \lambda) \,\|\, P(y \mid x)] = \sum_{y=0}^{\infty} P(y \mid \lambda) \log \frac{P(y \mid \lambda)}{P(y \mid x)}. \]
We have
\[ D(\lambda, x) = -H[P(y \mid \lambda)] + \log x! + (\lambda + x + 1) \log 2 + F(\lambda, 0) - F(\lambda, x), \]
where for each integer $d \ge 0$,
\[ F(\lambda, d) = E_{P(y|\lambda)}[\log (y + d)!] = \sum_{y=0}^{\infty} P(y \mid \lambda) \log (y + d)! \]
and $H[P(y \mid \lambda)]$ is the entropy of $P(y \mid \lambda)$.
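The decomposition of D(λ, x) can be verified numerically by comparing it with the defining sum. The sketch below works in nats (natural logarithms) and takes the entropy with a negative sign, as the definition of the divergence requires. The truncation of the Poisson support and all function names are assumptions of this example.

```python
import math

LOG2 = math.log(2.0)

def poisson_log_pmf(y, lam):
    """Natural log of the Poisson probability P(y|lam)."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def ac_log_pmf(y, x):
    """Natural log of the A-C statistic P(y|x) = C(x+y, x) / 2^(x+y+1)."""
    return (math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
            - (x + y + 1) * LOG2)

def support(lam):
    """Truncated range of counts carrying essentially all Poisson(lam) mass."""
    return range(int(lam + 20.0 * math.sqrt(lam) + 50.0))

def kl_direct(lam, x):
    """D(lam, x) computed directly from the definition of the K-L divergence."""
    total = 0.0
    for y in support(lam):
        log_p = poisson_log_pmf(y, lam)
        total += math.exp(log_p) * (log_p - ac_log_pmf(y, x))
    return total

def entropy(lam):
    """Entropy H[P(y|lam)] in nats."""
    return -sum(math.exp(poisson_log_pmf(y, lam)) * poisson_log_pmf(y, lam)
                for y in support(lam))

def F(lam, d):
    """F(lam, d) = E[log (y + d)!] under P(y|lam)."""
    return sum(math.exp(poisson_log_pmf(y, lam)) * math.lgamma(y + d + 1)
               for y in support(lam))

def kl_decomposed(lam, x):
    """D(lam, x) = -H + log x! + (lam + x + 1) log 2 + F(lam, 0) - F(lam, x)."""
    return (-entropy(lam) + math.lgamma(x + 1) + (lam + x + 1) * LOG2
            + F(lam, 0) - F(lam, x))

for lam, x in [(10.0, 10), (10.0, 3), (2.5, 0)]:
    print(f"lam = {lam}, x = {x}: direct = {kl_direct(lam, x):.10f}, "
          f"decomposed = {kl_decomposed(lam, x):.10f}")
```

Both routes should agree to within rounding error, and the divergence is strictly positive because the A-C statistic is never itself a Poisson distribution.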
18 Minimum of D(λ, x)
One might intuitively expect $D(\lambda, x)$ to be minimal at $x = \lambda$: the conditioning count in the A-C statistic would then be the mean of the underlying Poisson distribution.
However, the mode of that Poisson distribution, $\lambda - 1$, is surrounded by enough probability mass to yield:
Theorem 3. For any integer $\lambda \ge 1$, it holds that $D(\lambda, \lambda) > D(\lambda, \lambda - 1)$. In other words,
\[ D_{KL}[P(y \mid \lambda) \,\|\, P(y \mid x = \lambda)] > D_{KL}[P(y \mid \lambda) \,\|\, P(y \mid x = \lambda - 1)]. \]
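Theorem 3 can be spot-checked by evaluating the divergence at x = λ and x = λ - 1 for a range of integer λ. This is a numerical illustration over a finite range, computed in nats, with a truncated Poisson support; the function names are illustrative.

```python
import math

def poisson_log_pmf(y, lam):
    """Natural log of the Poisson probability P(y|lam)."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def ac_log_pmf(y, x):
    """Natural log of the A-C statistic P(y|x) = C(x+y, x) / 2^(x+y+1)."""
    return (math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
            - (x + y + 1) * math.log(2.0))

def kl_divergence(lam, x):
    """D(lam, x) = sum over y of P(y|lam) log(P(y|lam) / P(y|x)), in nats."""
    y_max = int(lam + 20.0 * math.sqrt(lam) + 50.0)  # truncated support
    total = 0.0
    for y in range(y_max):
        log_p = poisson_log_pmf(y, lam)
        total += math.exp(log_p) * (log_p - ac_log_pmf(y, x))
    return total

# Gap D(lam, lam) - D(lam, lam - 1) for integer lam = 1, ..., 50.
gaps = {lam: kl_divergence(lam, lam) - kl_divergence(lam, lam - 1)
        for lam in range(1, 51)}
print("all gaps positive:", all(gap > 0.0 for gap in gaps.values()))
print(f"gap at lam = 1: {gaps[1]:.5f}, gap at lam = 50: {gaps[50]:.5f}")
```

The gap equals log(2λ) - E[log(y + λ)] under P(y|λ), which is positive by Jensen's inequality because E[y + λ] = 2λ and the logarithm is strictly concave.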
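19 Thought experiment 2
[Cartoon: the environment $P(\cdot \mid \lambda)$, with unknown $\lambda$, is "pressed many times"; each press emits a count $x$, from which $P(\cdot \mid x)$ is formed. Speech bubbles: "Repeat this many times..." and "... and see how close I get on average". Closeness is the "similarity" between $P(\cdot \mid \lambda)$ and $P(\cdot \mid x)$.]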
20 Expectation of D(λ, x) under sampling of x
Given an underlying Poisson distribution $P(x \mid \lambda)$, if we repeatedly generated a representative count $x$ from $P(x \mid \lambda)$, what would be the average divergence of the corresponding A-C statistic $P(y \mid x)$ from the truth $P(y \mid \lambda)$?
We are interested in the quantity
\[ E(\lambda) = E_{P(x|\lambda)}[D(\lambda, x)]. \qquad (1) \]
Up to terms of order $O(\lambda^{-1})$, the expected divergence of the A-C statistic $P(y \mid x)$ from the true underlying Poisson distribution $P(y \mid \lambda)$ is equal to $\frac{1}{2} \log 2$.
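21 Expectation of D(λ, x)
Theorem 4. Consider an underlying Poisson distribution $P(\cdot \mid \lambda)$ parametrized by some $\lambda > 0$. Then
\[ E(\lambda) = E_{P(x|\lambda)}\big[D_{KL}[P(y \mid \lambda) \,\|\, P(y \mid x)]\big] = \frac{1}{2} \log 2 + O\!\left(\frac{1}{\lambda}\right). \]
Sketch of proof:
\[ E(\lambda) = \lambda(\log \lambda - \log e + 2 \log 2) + \log 2 + F(\lambda, 0) - E_{P(x|\lambda)}[F(\lambda, x)], \]
\[ E_{P(x|\lambda)}[F(\lambda, x)] = F(2\lambda, 0), \]
since the sum of two independent Poisson counts with mean $\lambda$ is Poisson distributed with mean $2\lambda$, and
\[ F(\lambda, 0) = \lambda(\log \lambda - \log e) + \frac{1}{2} \log(2\pi e \lambda) + O(\lambda^{-1}). \]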
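The expected divergence can be evaluated in two ways: by brute force as a double sum over x and y, and through the proof-sketch formula E(λ) = λ(log λ - log e + 2 log 2) + log 2 + F(λ, 0) - F(2λ, 0). The sketch below does both in nats (so log e = 1) and converts the result to bits. The value λ = 20, the support truncation, and the function names are assumptions of this example.

```python
import math

LOG2 = math.log(2.0)

def poisson_log_pmf(y, lam):
    """Natural log of the Poisson probability P(y|lam)."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def ac_log_pmf(y, x):
    """Natural log of the A-C statistic P(y|x) = C(x+y, x) / 2^(x+y+1)."""
    return (math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
            - (x + y + 1) * LOG2)

def support(lam):
    """Truncated range of counts carrying essentially all Poisson(lam) mass."""
    return range(int(lam + 20.0 * math.sqrt(lam) + 50.0))

def F(lam, d):
    """F(lam, d) = E[log (y + d)!] under P(y|lam), in nats."""
    return sum(math.exp(poisson_log_pmf(y, lam)) * math.lgamma(y + d + 1)
               for y in support(lam))

def expected_divergence_direct(lam):
    """E(lam) as an explicit double sum over x and y (nats)."""
    counts = support(lam)
    log_p = [poisson_log_pmf(y, lam) for y in counts]
    p = [math.exp(value) for value in log_p]
    total = 0.0
    for x in counts:
        divergence = sum(p[y] * (log_p[y] - ac_log_pmf(y, x)) for y in counts)
        total += p[x] * divergence
    return total

def expected_divergence_sketch(lam):
    """E(lam) from the proof-sketch formula, using E_x[F(lam, x)] = F(2 lam, 0)."""
    return (lam * (math.log(lam) - 1.0 + 2.0 * LOG2) + LOG2
            + F(lam, 0) - F(2.0 * lam, 0))

lam = 20.0
direct_nats = expected_divergence_direct(lam)
sketch_nats = expected_divergence_sketch(lam)
direct_bits = direct_nats / LOG2
print(f"direct: {direct_nats:.8f} nats, proof-sketch formula: {sketch_nats:.8f} nats")
print(f"expected divergence at lam = {lam}: {direct_bits:.5f} bits (limit: 0.5)")
```

Agreement between the two routes also confirms the step E_x[F(λ, x)] = F(2λ, 0), which replaces a double sum by a single expectation under a Poisson distribution with mean 2λ.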
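22 Higher order expansion of E(λ)
Entropy expansion (in bits):
\[ H[P(y \mid \lambda)] = \frac{1}{2} \log_2(2\pi e \lambda) - \left(\frac{1}{12\lambda} + \frac{1}{24\lambda^2}\right) \log_2 e + O(\lambda^{-3}). \]
Expected divergence measured in bits:
\[ E(\lambda) = \frac{1}{2} - \left(\frac{1}{24\lambda} + \frac{1}{32\lambda^2}\right) \log_2 e + O(\lambda^{-3}). \]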
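A numerical sanity check of the expansions, as a sketch. The entropy expansion used here is the standard one for the Poisson distribution, written in nats: H = (1/2) ln(2πeλ) - 1/(12λ) - 1/(24λ²) + O(λ⁻³). The correction terms for the divergence, -1/(24λ) - 1/(32λ²) in nats, are obtained in this example by substituting that expansion into the proof-sketch formula for E(λ); they are a derived assumption to be compared against the numerics, not a quotation from the slides.

```python
import math

LOG2 = math.log(2.0)

def poisson_log_pmf(y, lam):
    """Natural log of the Poisson probability P(y|lam)."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def support(lam):
    """Truncated range of counts carrying essentially all Poisson(lam) mass."""
    return range(int(lam + 20.0 * math.sqrt(lam) + 50.0))

def poisson_entropy(lam):
    """Numerically summed entropy of P(y|lam), in nats."""
    return -sum(math.exp(poisson_log_pmf(y, lam)) * poisson_log_pmf(y, lam)
                for y in support(lam))

def entropy_expansion(lam):
    """Entropy expansion up to the 1/lam^2 term, in nats."""
    return (0.5 * math.log(2.0 * math.pi * math.e * lam)
            - 1.0 / (12.0 * lam) - 1.0 / (24.0 * lam**2))

def F0(lam):
    """F(lam, 0) = E[log y!] under P(y|lam), in nats."""
    return sum(math.exp(poisson_log_pmf(y, lam)) * math.lgamma(y + 1)
               for y in support(lam))

def expected_divergence_bits(lam):
    """E(lam) from the proof-sketch formula, converted from nats to bits."""
    nats = (lam * (math.log(lam) - 1.0 + 2.0 * LOG2) + LOG2
            + F0(lam) - F0(2.0 * lam))
    return nats / LOG2

def divergence_expansion_bits(lam):
    """Derived approximation 1/2 - (1/(24 lam) + 1/(32 lam^2)) log2(e), in bits."""
    return 0.5 - (1.0 / (24.0 * lam) + 1.0 / (32.0 * lam**2)) / LOG2

lam = 20.0
entropy_gap = abs(poisson_entropy(lam) - entropy_expansion(lam))
divergence_gap = abs(expected_divergence_bits(lam) - divergence_expansion_bits(lam))
print(f"entropy: numeric vs expansion differ by {entropy_gap:.2e} nats")
print(f"divergence: numeric vs expansion differ by {divergence_gap:.2e} bits")
```

Both gaps should shrink roughly like 1/λ³ as λ grows, which is the behaviour expected of an expansion truncated after the 1/λ² term.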
23 Analytical approximation of E(λ)
[Figure: E[D_KL] plotted against lambda; curves labelled "numer" (numerical evaluation) and "anal" (analytical approximation).]
24 Discussion
The Audic-Claverie method: a popular approach for detection of differentially expressed genes in the SAGE framework.
Main assumption: under the null hypothesis, the tag counts $x, y$ in two libraries come from the same but unknown Poisson distribution $P(\cdot \mid \lambda)$.
The problem: each SAGE library represents only a single measurement.
The Poisson distribution is rather "rigid": it is unimodal and parametrized by a single parameter $\lambda$ representing both its mean and variance.
Learning about $P(\cdot \mid \lambda)$ from a very limited sample (as one is effectively bound to do in the SAGE framework) is much less suspicious than one might naively expect.
25 Discussion - cont'd
We analyzed how close the A-C statistic $P(\cdot \mid x)$ is (in terms of K-L divergence) to the underlying Poisson distribution $P(\cdot \mid \lambda)$ of tag counts.
On average, the A-C statistic is never too far from the true underlying distribution. Up to terms of order $O(\lambda^{-3})$, on average, the A-C statistic is never further away from the truth $P(\cdot \mid \lambda)$ than half a bit of additional information.
Hence, the Audic-Claverie method can be expected to work well even though the SAGE libraries represent very sparse samples.
26 Discussion - cont'd
So far the Audic-Claverie methodology has been verified only empirically, through a series of specific Monte Carlo simulations. It has not been clear how general the apparently stable simulation findings were.
The A-C statistic is universally applicable in any situation where inferences about the underlying Poisson distribution must be made based on an extremely sparse sample.
In the Monte Carlo simulations the false alarm rate was small for genes associated with small tag counts and gradually increased for higher tag counts. These findings are consistent with our theoretically calculated divergence function $E(\lambda)$.
27 Thank you!
Further reading, full proofs etc.: P. Tiňo: Basic Properties and Information Theory of Audic-Claverie Statistic for Analyzing cDNA Arrays. BMC Bioinformatics, 10:310, Open access, or pxt/my.publ.html
Statistics notes Introductory comments These notes provide a summary or cheat sheet covering some basic statistical recipes and methods. These will be discussed in more detail in the lectures! What is
More informationApproximate Inference Part 1 of 2
Approximate Inference Part 1 of 2 Tom Minka Microsoft Research, Cambridge, UK Machine Learning Summer School 2009 http://mlg.eng.cam.ac.uk/mlss09/ Bayesian paradigm Consistent use of probability theory
More informationPATTERN RECOGNITION AND MACHINE LEARNING
PATTERN RECOGNITION AND MACHINE LEARNING Chapter 1. Introduction Shuai Huang April 21, 2014 Outline 1 What is Machine Learning? 2 Curve Fitting 3 Probability Theory 4 Model Selection 5 The curse of dimensionality
More informationComparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters
Journal of Modern Applied Statistical Methods Volume 13 Issue 1 Article 26 5-1-2014 Comparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters Yohei Kawasaki Tokyo University
More informationStatistical Models and Algorithms for Real-Time Anomaly Detection Using Multi-Modal Data
Statistical Models and Algorithms for Real-Time Anomaly Detection Using Multi-Modal Data Taposh Banerjee University of Texas at San Antonio Joint work with Gene Whipps (US Army Research Laboratory) Prudhvi
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationLecture 30. DATA 8 Summer Regression Inference
DATA 8 Summer 2018 Lecture 30 Regression Inference Slides created by John DeNero (denero@berkeley.edu) and Ani Adhikari (adhikari@berkeley.edu) Contributions by Fahad Kamran (fhdkmrn@berkeley.edu) and
More informationProbability and Statistics
CHAPTER 5: PARAMETER ESTIMATION 5-0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 5: PARAMETER
More informationHypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true
Hypothesis esting Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true Statistical Hypothesis: conjecture about a population parameter
More informationApproximate Inference Part 1 of 2
Approximate Inference Part 1 of 2 Tom Minka Microsoft Research, Cambridge, UK Machine Learning Summer School 2009 http://mlg.eng.cam.ac.uk/mlss09/ 1 Bayesian paradigm Consistent use of probability theory
More informationIntroduction to Machine Learning
Introduction to Machine Learning Generative Models Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1
More informationBayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida
Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:
More informationMachine Learning Linear Regression. Prof. Matteo Matteucci
Machine Learning Linear Regression Prof. Matteo Matteucci Outline 2 o Simple Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression o Multi Variate Regession Model Least Squares
More informationMachine Learning. Lecture 02.2: Basics of Information Theory. Nevin L. Zhang
Machine Learning Lecture 02.2: Basics of Information Theory Nevin L. Zhang lzhang@cse.ust.hk Department of Computer Science and Engineering The Hong Kong University of Science and Technology Nevin L. Zhang
More informationWidths. Center Fluctuations. Centers. Centers. Widths
Radial Basis Functions: a Bayesian treatment David Barber Bernhard Schottky Neural Computing Research Group Department of Applied Mathematics and Computer Science Aston University, Birmingham B4 7ET, U.K.
More information28 Bayesian Mixture Models for Gene Expression and Protein Profiles
28 Bayesian Mixture Models for Gene Expression and Protein Profiles Michele Guindani, Kim-Anh Do, Peter Müller and Jeff Morris M.D. Anderson Cancer Center 1 Introduction We review the use of semi-parametric
More informationMachine Learning Lecture Notes
Machine Learning Lecture Notes Predrag Radivojac January 25, 205 Basic Principles of Parameter Estimation In probabilistic modeling, we are typically presented with a set of observations and the objective
More informationConfidence Intervals for Population Mean
Confidence Intervals for Population Mean Reading: Sections 7.1, 7.2, 7.3 Learning Objectives: Students should be able to: Understand the meaning and purpose of confidence intervals Calculate a confidence
More informationIntroduction to Bioinformatics
CSCI8980: Applied Machine Learning in Computational Biology Introduction to Bioinformatics Rui Kuang Department of Computer Science and Engineering University of Minnesota kuang@cs.umn.edu History of Bioinformatics
More informationMultiple Sample Categorical Data
Multiple Sample Categorical Data paired and unpaired data, goodness-of-fit testing, testing for independence University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html
More informationProcess Mining in Non-Stationary Environments
and Machine Learning. Bruges Belgium), 25-27 April 2012, i6doc.com publ., ISBN 978-2-87419-049-0. Process Mining in Non-Stationary Environments Phil Weber, Peter Tiňo and Behzad Bordbar School of Computer
More information