Probability Distributions


2. Probability Distributions

In Chapter 1, we emphasized the central role played by probability theory in the solution of pattern recognition problems. We turn now to an exploration of some particular examples of probability distributions and their properties. As well as being of great interest in their own right, these distributions can form building blocks for more complex models and will be used extensively throughout the book. The distributions introduced in this chapter will also serve another important purpose, namely to provide us with the opportunity to discuss some key statistical concepts, such as Bayesian inference, in the context of simple models before we encounter them in more complex situations in later chapters.

One role for the distributions discussed in this chapter is to model the probability distribution p(x) of a random variable x, given a finite set x_1, ..., x_N of observations. This problem is known as density estimation. For the purposes of this chapter, we shall assume that the data points are independent and identically distributed. It should be emphasized that the problem of density estimation is fundamentally ill-posed, because there are infinitely many probability distributions that could have given rise to the observed finite data set. Indeed, any distribution p(x) that is nonzero at each of the data points x_1, ..., x_N is a potential candidate. The issue of choosing an appropriate distribution relates to the problem of model selection that has already been encountered in the context of polynomial curve fitting in Chapter 1 and that is a central issue in pattern recognition.

We begin by considering the binomial and multinomial distributions for discrete random variables and the Gaussian distribution for continuous random variables. These are specific examples of parametric distributions, so-called because they are governed by a small number of adaptive parameters, such as the mean and variance in the case of a Gaussian for example. To apply such models to the problem of density estimation, we need a procedure for determining suitable values for the parameters, given an observed data set. In a frequentist treatment, we choose specific values for the parameters by optimizing some criterion, such as the likelihood function. By contrast, in a Bayesian treatment we introduce prior distributions over the parameters and then use Bayes' theorem to compute the corresponding posterior distribution given the observed data.

We shall see that an important role is played by conjugate priors, which lead to posterior distributions having the same functional form as the prior, and which therefore lead to a greatly simplified Bayesian analysis. For example, the conjugate prior for the parameters of the multinomial distribution is called the Dirichlet distribution, while the conjugate prior for the mean of a Gaussian is another Gaussian. All of these distributions are examples of the exponential family of distributions, which possess a number of important properties, and which will be discussed in some detail.

One limitation of the parametric approach is that it assumes a specific functional form for the distribution, which may turn out to be inappropriate for a particular application. An alternative approach is given by nonparametric density estimation methods, in which the form of the distribution typically depends on the size of the data set. Such models still contain parameters, but these control the model complexity rather than the form of the distribution. We end this chapter by considering three nonparametric methods based respectively on histograms, nearest-neighbours, and kernels.

2.1. Binary Variables

We begin by considering a single binary random variable x ∈ {0, 1}. For example, x might describe the outcome of flipping a coin, with x = 1 representing 'heads' and x = 0 representing 'tails'. We can imagine that this is a damaged coin so that the probability of landing heads is not necessarily the same as that of landing tails. The probability of x = 1 will be denoted by the parameter µ, so that

p(x = 1 | µ) = µ    (2.1)

where 0 ≤ µ ≤ 1, from which it follows that p(x = 0 | µ) = 1 − µ. The probability distribution over x can therefore be written in the form

Bern(x | µ) = µ^x (1 − µ)^(1−x)    (2.2)

which is known as the Bernoulli distribution. It is easily verified (Exercise 2.1) that this distribution is normalized and that it has mean and variance given by

E[x] = µ    (2.3)

var[x] = µ(1 − µ).    (2.4)

Now suppose we have a data set D = {x_1, ..., x_N} of observed values of x. We can construct the likelihood function, which is a function of µ, on the assumption that the observations are drawn independently from p(x | µ), so that

p(D | µ) = ∏_{n=1}^{N} p(x_n | µ) = ∏_{n=1}^{N} µ^{x_n} (1 − µ)^{1−x_n}.    (2.5)

In a frequentist setting, we can estimate a value for µ by maximizing the likelihood function, or equivalently by maximizing the logarithm of the likelihood. In the case of the Bernoulli distribution, the log likelihood function is given by

ln p(D | µ) = ∑_{n=1}^{N} ln p(x_n | µ) = ∑_{n=1}^{N} {x_n ln µ + (1 − x_n) ln(1 − µ)}.    (2.6)

At this point, it is worth noting that the log likelihood function depends on the N observations x_n only through their sum ∑_n x_n. This sum provides an example of a sufficient statistic for the data under this distribution, and we shall study the important role of sufficient statistics in some detail in Section 2.4. If we set the derivative of ln p(D | µ) with respect to µ equal to zero, we obtain the maximum likelihood estimator

µ_ML = (1/N) ∑_{n=1}^{N} x_n    (2.7)

which is also known as the sample mean.

Jacob Bernoulli: Jacob Bernoulli, also known as Jacques or James Bernoulli, was a Swiss mathematician and was the first of many in the Bernoulli family to pursue a career in science and mathematics. Although compelled to study philosophy and theology against his will by his parents, he travelled extensively after graduating in order to meet with many of the leading scientists of his time, including Boyle and Hooke in England. When he returned to Switzerland, he taught mechanics and became Professor of Mathematics at Basel in 1687. Unfortunately, rivalry between Jacob and his younger brother Johann turned an initially productive collaboration into a bitter and public dispute. Jacob's most significant contributions to mathematics appeared in The Art of Conjecture, published in 1713, eight years after his death, which deals with topics in probability theory including what has become known as the Bernoulli distribution.
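To make (2.5)-(2.7) concrete, here is a minimal Python sketch (the data values are hypothetical, and NumPy is assumed to be available); it evaluates the log likelihood (2.6) and confirms that the maximum likelihood estimator (2.7) is just the sample mean:

```python
import numpy as np

# Hypothetical coin-flip data: 1 = heads, 0 = tails.
x = np.array([1, 0, 0, 1, 1, 0, 1, 1])
N = len(x)

def log_likelihood(mu):
    """Log likelihood (2.6) of the Bernoulli parameter mu."""
    return np.sum(x * np.log(mu) + (1 - x) * np.log(1 - mu))

mu_ml = x.mean()  # maximum likelihood estimator (2.7): the sample mean
print(mu_ml)      # 0.625

# The log likelihood is largest at the grid point nearest the sample mean.
grid = np.linspace(0.01, 0.99, 99)
print(grid[np.argmax([log_likelihood(mu) for mu in grid])])  # 0.62
```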

Figure 2.1 Histogram plot of the binomial distribution (2.9) as a function of m for N = 10 and µ = 0.25.

If we denote the number of observations of x = 1 (heads) within this data set by m, then we can write (2.7) in the form

µ_ML = m / N    (2.8)

so that the probability of landing heads is given, in this maximum likelihood framework, by the fraction of observations of heads in the data set.

Now suppose we flip a coin, say, 3 times and happen to observe 3 heads. Then N = m = 3 and µ_ML = 1. In this case, the maximum likelihood result would predict that all future observations should give heads. Common sense tells us that this is unreasonable, and in fact this is an extreme example of the over-fitting associated with maximum likelihood. We shall see shortly how to arrive at more sensible conclusions through the introduction of a prior distribution over µ.

We can also work out the distribution of the number m of observations of x = 1, given that the data set has size N. This is called the binomial distribution, and from (2.5) we see that it is proportional to µ^m (1 − µ)^(N−m). In order to obtain the normalization coefficient we note that out of N coin flips, we have to add up all of the possible ways of obtaining m heads (Exercise 2.3), so that the binomial distribution can be written

Bin(m | N, µ) = (N choose m) µ^m (1 − µ)^(N−m)    (2.9)

where

(N choose m) ≡ N! / ((N − m)! m!)    (2.10)

is the number of ways of choosing m objects out of a total of N identical objects. Figure 2.1 shows a plot of the binomial distribution for N = 10 and µ = 0.25.

The mean and variance of the binomial distribution can be found by using the result of Exercise 1.10, which shows that for independent events the mean of the sum is the sum of the means, and the variance of the sum is the sum of the variances.
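The binomial probabilities (2.9)-(2.10), and the mean and variance derived next, are easy to check numerically; a minimal sketch using only the Python standard library:

```python
from math import comb

def binomial_pmf(m, N, mu):
    """Bin(m | N, mu) from (2.9), with the coefficient (2.10)."""
    return comb(N, m) * mu**m * (1 - mu)**(N - m)

N, mu = 10, 0.25   # the values used in Figure 2.1
pmf = [binomial_pmf(m, N, mu) for m in range(N + 1)]

print(abs(sum(pmf) - 1) < 1e-12)                # True: (2.9) is normalized
mean = sum(m * p for m, p in enumerate(pmf))
var = sum((m - mean)**2 * p for m, p in enumerate(pmf))
print(mean, N * mu)                             # both approx. 2.5
print(var, N * mu * (1 - mu))                   # both approx. 1.875
```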

Because m = x_1 + ... + x_N, and for each observation the mean and variance are given by (2.3) and (2.4), respectively, we have

E[m] ≡ ∑_{m=0}^{N} m Bin(m | N, µ) = Nµ    (2.11)

var[m] ≡ ∑_{m=0}^{N} (m − E[m])^2 Bin(m | N, µ) = Nµ(1 − µ).    (2.12)

These results can also be proved directly using calculus (Exercise 2.4).

2.1.1 The beta distribution

We have seen in (2.8) that the maximum likelihood setting for the parameter µ in the Bernoulli distribution, and hence in the binomial distribution, is given by the fraction of the observations in the data set having x = 1. As we have already noted, this can give severely over-fitted results for small data sets. In order to develop a Bayesian treatment for this problem, we need to introduce a prior distribution p(µ) over the parameter µ. Here we consider a form of prior distribution that has a simple interpretation as well as some useful analytical properties. To motivate this prior, we note that the likelihood function takes the form of the product of factors of the form µ^x (1 − µ)^(1−x). If we choose a prior to be proportional to powers of µ and (1 − µ), then the posterior distribution, which is proportional to the product of the prior and the likelihood function, will have the same functional form as the prior. This property is called conjugacy, and we will see several examples of it later in this chapter. We therefore choose a prior, called the beta distribution, given by

Beta(µ | a, b) = [Γ(a + b) / (Γ(a) Γ(b))] µ^(a−1) (1 − µ)^(b−1)    (2.13)

where Γ(x) is the gamma function defined by (1.141), and the coefficient in (2.13) ensures that the beta distribution is normalized (Exercise 2.5), so that

∫_0^1 Beta(µ | a, b) dµ = 1.    (2.14)

The mean and variance of the beta distribution are given by (Exercise 2.6)

E[µ] = a / (a + b)    (2.15)

var[µ] = ab / ((a + b)^2 (a + b + 1)).    (2.16)

The parameters a and b are often called hyperparameters because they control the distribution of the parameter µ. Figure 2.2 shows plots of the beta distribution for various values of the hyperparameters.

The posterior distribution of µ is now obtained by multiplying the beta prior (2.13) by the binomial likelihood function (2.9) and normalizing. Keeping only the factors that depend on µ, we see that this posterior distribution has the form

p(µ | m, l, a, b) ∝ µ^(m+a−1) (1 − µ)^(l+b−1)    (2.17)

where l = N − m, and therefore corresponds to the number of 'tails' in the coin example.
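In code, the beta density (2.13) needs only the gamma function, and the update implied by (2.17) is pure parameter arithmetic; a sketch (the prior hyperparameters and counts below are hypothetical):

```python
from math import gamma

def beta_pdf(mu, a, b):
    """Beta(mu | a, b) from (2.13)."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * mu**(a - 1) * (1 - mu)**(b - 1)

a, b = 2.0, 2.0   # hypothetical prior hyperparameters
m, l = 7, 3       # hypothetical counts of x = 1 and x = 0

# By (2.17), prior times likelihood is itself a beta distribution with
# parameters (m + a, l + b); the binomial coefficient is omitted below
# because it does not depend on mu.
mu = 0.6
unnormalized = beta_pdf(mu, a, b) * mu**m * (1 - mu)**l
posterior = beta_pdf(mu, m + a, l + b)
print(unnormalized / posterior)  # constant in mu: same value for any mu
```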

Figure 2.2 Plots of the beta distribution Beta(µ | a, b) given by (2.13) as a function of µ for various values of the hyperparameters a and b: (a = 0.1, b = 0.1), (a = 1, b = 1), (a = 2, b = 3), and (a = 8, b = 4).

We see that (2.17) has the same functional dependence on µ as the prior distribution, reflecting the conjugacy properties of the prior with respect to the likelihood function. Indeed, it is simply another beta distribution, and its normalization coefficient can therefore be obtained by comparison with (2.13) to give

p(µ | m, l, a, b) = [Γ(m + a + l + b) / (Γ(m + a) Γ(l + b))] µ^(m+a−1) (1 − µ)^(l+b−1).    (2.18)

We see that the effect of observing a data set of m observations of x = 1 and l observations of x = 0 has been to increase the value of a by m, and the value of b by l, in going from the prior distribution to the posterior distribution. This allows us to provide a simple interpretation of the hyperparameters a and b in the prior as an effective number of observations of x = 1 and x = 0, respectively. Note that a and b need not be integers. Furthermore, the posterior distribution can act as the prior if we subsequently observe additional data.

Figure 2.3 Illustration of one step of sequential Bayesian inference, showing the prior, the likelihood function, and the posterior. The prior is given by a beta distribution with parameters a = 2, b = 2, and the likelihood function, given by (2.9) with N = m = 1, corresponds to a single observation of x = 1, so that the posterior is given by a beta distribution with parameters a = 3, b = 2.

To see this, we can imagine taking observations one at a time and after each observation updating the current posterior distribution by multiplying by the likelihood function for the new observation and then normalizing to obtain the new, revised posterior distribution. At each stage, the posterior is a beta distribution with some total number of (prior and actual) observed values for x = 1 and x = 0 given by the parameters a and b. Incorporation of an additional observation of x = 1 simply corresponds to incrementing the value of a by 1, whereas for an observation of x = 0 we increment the value of b by 1. Figure 2.3 illustrates one step in this process.

We see that this sequential approach to learning arises naturally when we adopt a Bayesian viewpoint. It is independent of the choice of prior and of the likelihood function and depends only on the assumption of i.i.d. data. Sequential methods make use of observations one at a time, or in small batches, and then discard them before the next observations are used. They can be used, for example, in real-time learning scenarios where a steady stream of data is arriving, and predictions must be made before all of the data is seen. Because they do not require the whole data set to be stored or loaded into memory, sequential methods are also useful for large data sets. Maximum likelihood methods can also be cast into a sequential framework (Section 2.3.5).

If our goal is to predict, as best we can, the outcome of the next trial, then we must evaluate the predictive distribution of x, given the observed data set D. From the sum and product rules of probability, this takes the form

p(x = 1 | D) = ∫_0^1 p(x = 1 | µ) p(µ | D) dµ = ∫_0^1 µ p(µ | D) dµ = E[µ | D].    (2.19)

Using the result (2.18) for the posterior distribution p(µ | D), together with the result (2.15) for the mean of the beta distribution, we obtain

p(x = 1 | D) = (m + a) / (m + a + l + b)    (2.20)

which has a simple interpretation as the total fraction of observations (both real observations and fictitious prior observations) that correspond to x = 1. Note that in the limit of an infinitely large data set m, l → ∞, the result (2.20) reduces to the maximum likelihood result (2.8). As we shall see, it is a very general property that the Bayesian and maximum likelihood results will agree in the limit of an infinitely large data set.
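The sequential update and the predictive distribution (2.20) together make a compact sketch; the observation stream below is hypothetical:

```python
def sequential_update(a, b, stream):
    """One-at-a-time Bayesian updating of a Beta(a, b) posterior over mu."""
    for x in stream:   # each observation x is 0 or 1
        if x == 1:
            a += 1     # one more (effective) observation of x = 1
        else:
            b += 1     # one more (effective) observation of x = 0
    return a, b

a, b = 2, 2                           # the prior of Figure 2.3
a, b = sequential_update(a, b, [1])   # a single observation of x = 1
print(a, b)                           # 3 2, the posterior of Figure 2.3

# Predictive probability of x = 1 on the next trial, from (2.20):
print(a / (a + b))                    # 0.6
```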

For a finite data set, the posterior mean for µ always lies between the prior mean and the maximum likelihood estimate for µ corresponding to the relative frequencies of events given by (2.7) (Exercise 2.7).

From Figure 2.2, we see that as the number of observations increases, so the posterior distribution becomes more sharply peaked. This can also be seen from the result (2.16) for the variance of the beta distribution, in which we see that the variance goes to zero for a → ∞ or b → ∞. In fact, we might wonder whether it is a general property of Bayesian learning that, as we observe more and more data, the uncertainty represented by the posterior distribution will steadily decrease.

To address this, we can take a frequentist view of Bayesian learning and show that, on average, such a property does indeed hold. Consider a general Bayesian inference problem for a parameter θ for which we have observed a data set D, described by the joint distribution p(θ, D). The following result (Exercise 2.8)

E_θ[θ] = E_D[E_θ[θ | D]]    (2.21)

where

E_θ[θ] ≡ ∫ p(θ) θ dθ    (2.22)

E_D[E_θ[θ | D]] ≡ ∫ { ∫ θ p(θ | D) dθ } p(D) dD    (2.23)

says that the posterior mean of θ, averaged over the distribution generating the data, is equal to the prior mean of θ. Similarly, we can show that

var_θ[θ] = E_D[var_θ[θ | D]] + var_D[E_θ[θ | D]].    (2.24)

The term on the left-hand side of (2.24) is the prior variance of θ. On the right-hand side, the first term is the average posterior variance of θ, and the second term measures the variance in the posterior mean of θ. Because this variance is a positive quantity, this result shows that, on average, the posterior variance of θ is smaller than the prior variance. The reduction in variance is greater if the variance in the posterior mean is greater. Note, however, that this result only holds on average, and that for a particular observed data set it is possible for the posterior variance to be larger than the prior variance.

2.2. Multinomial Variables

Binary variables can be used to describe quantities that can take one of two possible values. Often, however, we encounter discrete variables that can take on one of K possible mutually exclusive states. Although there are various alternative ways to express such variables, we shall see shortly that a particularly convenient representation is the 1-of-K scheme, in which the variable is represented by a K-dimensional vector x in which one of the elements x_k equals 1, and all remaining elements equal 0.

An interesting property of the nearest-neighbour (K = 1) classifier is that, in the limit N → ∞, the error rate is never more than twice the minimum achievable error rate of an optimal classifier, i.e., one that uses the true class distributions (Cover and Hart, 1967).

As discussed so far, both the K-nearest-neighbour method and the kernel density estimator require the entire training data set to be stored, leading to expensive computation if the data set is large. This effect can be offset, at the expense of some additional one-off computation, by constructing tree-based search structures to allow (approximate) near neighbours to be found efficiently without doing an exhaustive search of the data set. Nevertheless, these nonparametric methods are still severely limited. On the other hand, we have seen that simple parametric models are very restricted in terms of the forms of distribution that they can represent. We therefore need to find density models that are very flexible and yet for which the complexity of the models can be controlled independently of the size of the training set, and we shall see in subsequent chapters how to achieve this.

Exercises

2.1 ( ) www Verify that the Bernoulli distribution (2.2) satisfies the following properties

∑_{x=0}^{1} p(x | µ) = 1    (2.257)

E[x] = µ    (2.258)

var[x] = µ(1 − µ).    (2.259)

Show that the entropy H[x] of a Bernoulli distributed random binary variable x is given by

H[x] = −µ ln µ − (1 − µ) ln(1 − µ).    (2.260)

2.2 ( ) The form of the Bernoulli distribution given by (2.2) is not symmetric between the two values of x. In some situations, it will be more convenient to use an equivalent formulation for which x ∈ {−1, 1}, in which case the distribution can be written

p(x | µ) = ((1 − µ)/2)^((1−x)/2) ((1 + µ)/2)^((1+x)/2)    (2.261)

where µ ∈ [−1, 1]. Show that the distribution (2.261) is normalized, and evaluate its mean, variance, and entropy.

2.3 ( ) www In this exercise, we prove that the binomial distribution (2.9) is normalized. First use the definition (2.10) of the number of combinations of m identical objects chosen from a total of N to show that

(N choose m) + (N choose m−1) = (N+1 choose m).    (2.262)

Use this result to prove by induction the following result

(1 + x)^N = ∑_{m=0}^{N} (N choose m) x^m    (2.263)

which is known as the binomial theorem, and which is valid for all real values of x. Finally, show that the binomial distribution is normalized, so that

∑_{m=0}^{N} (N choose m) µ^m (1 − µ)^(N−m) = 1    (2.264)

which can be done by first pulling a factor (1 − µ)^N out of the summation and then making use of the binomial theorem.

2.4 ( ) Show that the mean of the binomial distribution is given by (2.11). To do this, differentiate both sides of the normalization condition (2.264) with respect to µ and then rearrange to obtain an expression for the mean of m. Similarly, by differentiating (2.264) twice with respect to µ and making use of the result (2.11) for the mean of the binomial distribution, prove the result (2.12) for the variance of the binomial.

2.5 ( ) www In this exercise, we prove that the beta distribution, given by (2.13), is correctly normalized, so that (2.14) holds. This is equivalent to showing that

∫_0^1 µ^(a−1) (1 − µ)^(b−1) dµ = Γ(a) Γ(b) / Γ(a + b).    (2.265)

From the definition (1.141) of the gamma function, we have

Γ(a) Γ(b) = ∫_0^∞ exp(−x) x^(a−1) dx ∫_0^∞ exp(−y) y^(b−1) dy.    (2.266)

Use this expression to prove (2.265) as follows. First bring the integral over y inside the integrand of the integral over x, next make the change of variable t = y + x where x is fixed, then interchange the order of the x and t integrations, and finally make the change of variable x = tµ where t is fixed.

2.6 ( ) Make use of the result (2.265) to show that the mean, variance, and mode of the beta distribution (2.13) are given respectively by

E[µ] = a / (a + b)    (2.267)

var[µ] = ab / ((a + b)^2 (a + b + 1))    (2.268)

mode[µ] = (a − 1) / (a + b − 2).    (2.269)
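As a quick numerical sanity check on (2.265) (not a substitute for the derivation), one can compare a quadrature of the left-hand side against the gamma-function expression; a sketch assuming SciPy is available:

```python
from math import gamma
from scipy.integrate import quad

a, b = 3.5, 2.0
integral, _ = quad(lambda mu: mu**(a - 1) * (1 - mu)**(b - 1), 0, 1)
exact = gamma(a) * gamma(b) / gamma(a + b)
print(abs(integral - exact) < 1e-10)  # True, consistent with (2.265)
```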

2.7 ( ) Consider a binomial random variable x given by (2.9), with prior distribution for µ given by the beta distribution (2.13), and suppose we have observed m occurrences of x = 1 and l occurrences of x = 0. Show that the posterior mean value of x lies between the prior mean and the maximum likelihood estimate for µ. To do this, show that the posterior mean can be written as λ times the prior mean plus (1 − λ) times the maximum likelihood estimate, where 0 ≤ λ ≤ 1. This illustrates the concept of the posterior distribution being a compromise between the prior distribution and the maximum likelihood solution.

2.8 ( ) Consider two variables x and y with joint distribution p(x, y). Prove the following two results

E[x] = E_y[E_x[x | y]]    (2.270)

var[x] = E_y[var_x[x | y]] + var_y[E_x[x | y]].    (2.271)

Here E_x[x | y] denotes the expectation of x under the conditional distribution p(x | y), with a similar notation for the conditional variance.

2.9 ( ) www In this exercise, we prove the normalization of the Dirichlet distribution (2.38) using induction. We have already shown in Exercise 2.5 that the beta distribution, which is a special case of the Dirichlet for M = 2, is normalized. We now assume that the Dirichlet distribution is normalized for M − 1 variables and prove that it is normalized for M variables. To do this, consider the Dirichlet distribution over M variables, and take account of the constraint ∑_{k=1}^{M} µ_k = 1 by eliminating µ_M, so that the Dirichlet is written

p_M(µ_1, ..., µ_{M−1}) = C_M ∏_{k=1}^{M−1} µ_k^(α_k − 1) (1 − ∑_{j=1}^{M−1} µ_j)^(α_M − 1)    (2.272)

and our goal is to find an expression for C_M. To do this, integrate over µ_{M−1}, taking care over the limits of integration, and then make a change of variable so that this integral has limits 0 and 1. By assuming the correct result for C_{M−1} and making use of (2.265), derive the expression for C_M.

2.10 ( ) Using the property Γ(x + 1) = xΓ(x) of the gamma function, derive the following results for the mean, variance, and covariance of the Dirichlet distribution given by (2.38), where α_0 is defined by (2.39):

E[µ_j] = α_j / α_0    (2.273)

var[µ_j] = α_j (α_0 − α_j) / (α_0^2 (α_0 + 1))    (2.274)

cov[µ_j, µ_l] = −α_j α_l / (α_0^2 (α_0 + 1)),   j ≠ l.    (2.275)
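The identities (2.270) and (2.271) of Exercise 2.8, and hence the results (2.21) and (2.24) of the main text, can be illustrated numerically for the beta-Bernoulli model, where the posterior moments are available in closed form via conjugacy; a Monte Carlo sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, N, reps = 2.0, 2.0, 5, 200_000

def beta_var(a, b):
    """Variance of Beta(a, b), from (2.16)."""
    return a * b / ((a + b)**2 * (a + b + 1))

theta = rng.beta(a, b, size=reps)   # parameter draws from the prior
m = rng.binomial(N, theta)          # one data set of size N per draw
ap, bp = a + m, b + (N - m)         # exact posterior parameters via conjugacy

post_mean = ap / (ap + bp)
# Prior mean and variance versus their decompositions over data sets:
print(a / (a + b), post_mean.mean())                    # both approx. 0.5
print(beta_var(a, b),
      beta_var(ap, bp).mean() + post_mean.var())        # both approx. 0.05
```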
