
Preamble to The Humble Gaussian Distribution
David MacKay

Gaussian Quiz

    H_1:   y_2 <-- y_1 --> y_3

1. Assuming that the variables $y_1, y_2, y_3$ in this belief network have a
   joint Gaussian distribution, which of the following matrices could be the
   covariance matrix?

        A          B          C          D
    8 3 9 3 0 9 3 0 3 9 3 3 9 3 3 0 3 3 8 0 3 9 0 3 9 9 3 3 9 3 3 9

2. Which of the matrices could be the inverse covariance matrix?

    H_2:   y_2 --> y_1 <-- y_3

3. Which of the matrices could be the covariance matrix of the second
   graphical model?

4. Which of the matrices could be the inverse covariance matrix of the second
   graphical model?

5. Let three variables $y_1, y_2, y_3$ have covariance matrix $K^{(3)}$ and
   inverse covariance matrix $[K^{(3)}]^{-1}$. Now focus on the variables
   $y_1$ and $y_2$. Which statements about their covariance matrix $K^{(2)}$
   and inverse covariance matrix $[K^{(2)}]^{-1}$ are true?

   (A) $K^{(2)}$ is the top-left $2 \times 2$ submatrix of $K^{(3)}$.
   (B) $[K^{(2)}]^{-1}$ is the top-left $2 \times 2$ submatrix of $[K^{(3)}]^{-1}$.

The Humble Gaussian Distribution

David J.C. MacKay
Cavendish Laboratory, Cambridge CB3 0HE, United Kingdom
June 2006. Draft.

Abstract

These are elementary notes on Gaussian distributions, aimed at people who are
about to learn about Gaussian processes. I emphasize the following points.
What happens to a covariance matrix and inverse covariance matrix when we omit
a variable.
What it means to have zeros in a covariance matrix.
What it means to have zeros in an inverse covariance matrix.
How probabilistic models expressed in terms of energies relate to Gaussians.
Why eigenvectors and eigenvalues don't have any fundamental status.

1 Introduction

Let's chat about a Gaussian distribution with zero mean, such as
\[
P(\mathbf{y}) = \frac{1}{Z} e^{-\frac{1}{2} \mathbf{y}^{\mathsf{T}} A \mathbf{y}},
\tag{1}
\]
where $A = K^{-1}$ is the inverse of the covariance matrix, $K$. I'm going to
emphasize dimensions throughout this note, because I think
dimension-consciousness enhances understanding. I'll write
\[
K = \begin{bmatrix}
K_{11} & K_{12} & K_{13} \\
K_{12} & K_{22} & K_{23} \\
K_{13} & K_{23} & K_{33}
\end{bmatrix}.
\tag{2}
\]
It's conventional to write the diagonal elements of $K$ as $\sigma_i^2$ and
the off-diagonal elements as $\sigma_{ij}$. For example,
\[
K = \begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} \\
\sigma_{12} & \sigma_2^2 & \sigma_{23} \\
\sigma_{13} & \sigma_{23} & \sigma_3^2
\end{bmatrix}.
\tag{3}
\]
This is a confusing convention, since it implies that $\sigma_{ij}$ has
different dimensions from $\sigma_i$, even if all axes $i$, $j$ have the same
dimensions! Another way of writing an off-diagonal coefficient is
\[
K_{ij} = \rho_{ij} \sigma_i \sigma_j,
\tag{4}
\]
where $\rho_{ij}$ is the correlation coefficient between $y_i$ and $y_j$. This
is a better notation, since it is dimensionally consistent in the way it uses
the letter $\sigma$. But I will stick with the notation $K_{ij}$.
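To make the notation concrete, here is a minimal numpy sketch (all numbers are
invented for illustration) that builds a covariance matrix from standard
deviations $\sigma_i$ and correlation coefficients $\rho_{ij}$ as in equation
(4), and computes the inverse covariance $A = K^{-1}$:

    import numpy as np

    # Invented standard deviations and correlation coefficients (illustration only).
    sigma = np.array([1.0, 2.0, 3.0])              # sigma_i
    rho = np.array([[1.0, 0.5, 0.2],
                    [0.5, 1.0, 0.3],
                    [0.2, 0.3, 1.0]])              # rho_ij, with rho_ii = 1

    # K_ij = rho_ij * sigma_i * sigma_j (equation (4)); the diagonal entries are sigma_i^2.
    K = rho * np.outer(sigma, sigma)

    A = np.linalg.inv(K)                           # A = K^{-1}, the inverse covariance
    print(K)
    print(A)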

The definition of the covariance matrix is
\[
K_{ij} = \langle y_i y_j \rangle,
\tag{5}
\]
so the dimensions of the element $K_{ij}$ are (dimensions of $y_i$) times
(dimensions of $y_j$).

1.1 Examples

Let's work through a few graphical models.

    H_1:   y_2 <-- y_1 --> y_3          H_2:   y_2 --> y_1 <-- y_3
               Example 1                           Example 2

1.1.1 Example 1

Maybe $y_1$ is the temperature outside some buildings (or rather, the
deviation of the outside temperature from its mean), $y_2$ is the temperature
deviation inside building 2, and $y_3$ is the temperature deviation inside
building 3. This graphical model says that if you know the outside temperature
$y_1$, then $y_2$ and $y_3$ are independent. Let's consider this generative
model:
\[
y_1 = \nu_1,
\tag{6}
\]
\[
y_2 = w_2 y_1 + \nu_2,
\tag{7}
\]
\[
y_3 = w_3 y_1 + \nu_3,
\tag{8}
\]
where $\{\nu_i\}$ are independent zero-mean normal variables with variances
$\{\sigma_i^2\}$. Then we can write down the entries in the covariance matrix,
starting with the diagonal entries:
\[
K_{22} = \langle y_2 y_2 \rangle
       = \langle (w_2\nu_1 + \nu_2)(w_2\nu_1 + \nu_2) \rangle
       = \langle w_2^2\nu_1^2 + 2 w_2 \nu_1 \nu_2 + \nu_2^2 \rangle
       = w_2^2\sigma_1^2 + \sigma_2^2,
\tag{9}
\]
\[
K_{11} = \sigma_1^2,
\tag{10}
\]
\[
K_{33} = w_3^2\sigma_1^2 + \sigma_3^2.
\tag{11}
\]
So we can fill in this much:
\[
K = \begin{bmatrix}
K_{11} & K_{12} & K_{13} \\
K_{12} & K_{22} & K_{23} \\
K_{13} & K_{23} & K_{33}
\end{bmatrix}
= \begin{bmatrix}
\sigma_1^2 & & \\
 & w_2^2\sigma_1^2 + \sigma_2^2 & \\
 & & w_3^2\sigma_1^2 + \sigma_3^2
\end{bmatrix}.
\tag{12}
\]
The off-diagonal terms are
\[
K_{12} = \langle y_1 y_2 \rangle
       = \langle \nu_1 (w_2\nu_1 + \nu_2) \rangle
       = w_2\sigma_1^2
\tag{13}
\]
(and similarly for $K_{13}$), and
\[
K_{23} = \langle y_2 y_3 \rangle
       = \langle (w_2\nu_1 + \nu_2)(w_3\nu_1 + \nu_3) \rangle
       = w_2 w_3 \sigma_1^2.
\tag{14}
\]
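Here is a small numpy sketch (the weights and noise variances are invented for
illustration) that samples the generative model (6)-(8) and compares the
empirical covariance with the entries just derived; the same matrix is
assembled symbolically in equation (15) just below.

    import numpy as np

    # A quick Monte Carlo check of equations (9)-(14), with invented parameter values.
    rng = np.random.default_rng(0)
    w2, w3 = 0.9, 1.4                 # assumed weights (illustration only)
    s1, s2, s3 = 1.0, 0.5, 0.7        # assumed noise standard deviations sigma_i

    N = 1_000_000
    nu1 = rng.normal(0.0, s1, N)
    nu2 = rng.normal(0.0, s2, N)
    nu3 = rng.normal(0.0, s3, N)

    y1 = nu1                          # equation (6)
    y2 = w2 * y1 + nu2                # equation (7)
    y3 = w3 * y1 + nu3                # equation (8)

    K_empirical = np.cov(np.vstack([y1, y2, y3]))
    K_predicted = np.array([
        [s1**2,     w2*s1**2,            w3*s1**2],
        [w2*s1**2,  w2**2*s1**2 + s2**2, w2*w3*s1**2],
        [w3*s1**2,  w2*w3*s1**2,         w3**2*s1**2 + s3**2],
    ])
    print(np.round(K_empirical, 3))
    print(np.round(K_predicted, 3))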

So the covariance matrix is
\[
K = \begin{bmatrix}
K_{11} & K_{12} & K_{13} \\
K_{12} & K_{22} & K_{23} \\
K_{13} & K_{23} & K_{33}
\end{bmatrix}
= \begin{bmatrix}
\sigma_1^2 & w_2\sigma_1^2 & w_3\sigma_1^2 \\
w_2\sigma_1^2 & w_2^2\sigma_1^2 + \sigma_2^2 & w_2 w_3\sigma_1^2 \\
w_3\sigma_1^2 & w_2 w_3\sigma_1^2 & w_3^2\sigma_1^2 + \sigma_3^2
\end{bmatrix}.
\tag{15}
\]
Now let's think about the inverse covariance matrix. One way to get to it is
to write down the joint distribution:
\[
P(y_1, y_2, y_3 \mid H_1) = P(y_1)\, P(y_2 \mid y_1)\, P(y_3 \mid y_1)
\tag{16}
\]
\[
= \frac{1}{Z_1} \exp\!\left(-\frac{y_1^2}{2\sigma_1^2}\right)
  \frac{1}{Z_2} \exp\!\left(-\frac{(y_2 - w_2 y_1)^2}{2\sigma_2^2}\right)
  \frac{1}{Z_3} \exp\!\left(-\frac{(y_3 - w_3 y_1)^2}{2\sigma_3^2}\right).
\tag{17}
\]
We can now collect all the terms in $y_i y_j$:
\[
P(y_1, y_2, y_3)
= \frac{1}{Z} \exp\!\left( -\frac{1}{2}\left[
  y_1^2\!\left(\frac{1}{\sigma_1^2} + \frac{w_2^2}{\sigma_2^2} + \frac{w_3^2}{\sigma_3^2}\right)
  + \frac{y_2^2}{\sigma_2^2} + \frac{y_3^2}{\sigma_3^2}
  - 2 y_1 y_2 \frac{w_2}{\sigma_2^2}
  - 2 y_1 y_3 \frac{w_3}{\sigma_3^2}
  \right] \right)
\]
\[
= \frac{1}{Z} \exp\!\left( -\frac{1}{2}
\begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix}
\begin{bmatrix}
\frac{1}{\sigma_1^2} + \frac{w_2^2}{\sigma_2^2} + \frac{w_3^2}{\sigma_3^2} & -\frac{w_2}{\sigma_2^2} & -\frac{w_3}{\sigma_3^2} \\
-\frac{w_2}{\sigma_2^2} & \frac{1}{\sigma_2^2} & 0 \\
-\frac{w_3}{\sigma_3^2} & 0 & \frac{1}{\sigma_3^2}
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
\right).
\]
So the inverse covariance matrix is
\[
K^{-1} = \begin{bmatrix}
\frac{1}{\sigma_1^2} + \frac{w_2^2}{\sigma_2^2} + \frac{w_3^2}{\sigma_3^2} & -\frac{w_2}{\sigma_2^2} & -\frac{w_3}{\sigma_3^2} \\
-\frac{w_2}{\sigma_2^2} & \frac{1}{\sigma_2^2} & 0 \\
-\frac{w_3}{\sigma_3^2} & 0 & \frac{1}{\sigma_3^2}
\end{bmatrix}.
\]
The first thing I'd like you to notice here is the zeroes: $[K^{-1}]_{23} = 0$.
The meaning of a zero in an inverse covariance matrix (at location $i,j$) is
"conditional on all the other variables, these two variables $i$ and $j$ are
independent".

Next, notice that whereas $y_1$ and $y_2$ were positively correlated (assuming
$w_2 > 0$), the coefficient $[K^{-1}]_{12}$ is negative. It's common for a
covariance matrix $K$ in which all the elements are non-negative to have an
inverse that includes some negative elements. So positive off-diagonal terms
in the covariance matrix always describe positive correlation; but the
off-diagonal terms in the inverse covariance matrix can't be interpreted that
way. The sign of an element $(i,j)$ in the inverse covariance matrix does not
tell you about the correlation between those two variables. For example,
remember that there is a zero at $[K^{-1}]_{23}$; but that doesn't mean that
$y_2$ and $y_3$ are uncorrelated. Thanks to their parent $y_1$, they are
correlated, with covariance $w_2 w_3 \sigma_1^2$. The off-diagonal entry
$[K^{-1}]_{ij}$ in an inverse covariance matrix indicates how $y_i$ and $y_j$
are correlated if we condition on all the other variables apart from those
two: if $[K^{-1}]_{ij} < 0$, they are positively correlated, conditioned on
the others; if $[K^{-1}]_{ij} > 0$, they are negatively correlated.
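A quick numerical illustration of these two points (the zero at $[K^{-1}]_{23}$
and the signs), using the same invented parameters as in the earlier sketch:

    import numpy as np

    # Numerical illustration of the zero pattern, using the invented parameters
    # w2, w3 and noise standard deviations s1, s2, s3.
    w2, w3 = 0.9, 1.4
    s1, s2, s3 = 1.0, 0.5, 0.7

    K = np.array([
        [s1**2,     w2*s1**2,            w3*s1**2],
        [w2*s1**2,  w2**2*s1**2 + s2**2, w2*w3*s1**2],
        [w3*s1**2,  w2*w3*s1**2,         w3**2*s1**2 + s3**2],
    ])
    A = np.linalg.inv(K)

    print(np.round(K, 3))   # all entries nonzero: y2 and y3 are (marginally) correlated
    print(np.round(A, 3))   # A[1, 2] is (numerically) zero: y2, y3 independent given y1;
                            # A[0, 1] and A[0, 2] are negative even though K is all-positive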

The inverse covariance matrix is great for reading out properties of
conditional distributions in which we condition on all the variables except
one. For example, look at $[K^{-1}]_{22} = 1/\sigma_2^2$: if we know $y_1$ and
$y_3$, then the probability distribution of $y_2$ is Gaussian with variance
$1/[K^{-1}]_{22} = \sigma_2^2$. That one was easy.

Now look at $[K^{-1}]_{11}$: if we know $y_2$ and $y_3$, then the probability
distribution of $y_1$ is Gaussian with variance $1/[K^{-1}]_{11}$, where
\[
[K^{-1}]_{11} = \frac{1}{\sigma_1^2} + \frac{w_2^2}{\sigma_2^2} + \frac{w_3^2}{\sigma_3^2}.
\tag{18}
\]
That's not so obvious, but it's familiar if you've applied Bayes' theorem to
Gaussians: when we do inference of a parent like $y_1$ given its children, the
inverse variances of the prior and the likelihoods add. Here, the parent
variable's inverse variance (also known as its precision) is the sum of the
precision contributed by the prior, $1/\sigma_1^2$, the precision contributed
by the measurement of $y_2$, $w_2^2/\sigma_2^2$, and the precision contributed
by the measurement of $y_3$, $w_3^2/\sigma_3^2$.

The off-diagonal entries in $K^{-1}$ tell us how the mean of the conditional
distribution of one variable given the others depends on the others. Let's
take variable $y_3$ conditioned on the other two, for example:
\[
P(y_3 \mid y_1, y_2, H_1) \propto P(y_1, y_2, y_3 \mid H_1)
\propto \frac{1}{Z} \exp\!\left( -\frac{1}{2}
\begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix}
\begin{bmatrix}
\frac{1}{\sigma_1^2} + \frac{w_2^2}{\sigma_2^2} + \frac{w_3^2}{\sigma_3^2} & -\frac{w_2}{\sigma_2^2} & -\frac{w_3}{\sigma_3^2} \\
-\frac{w_2}{\sigma_2^2} & \frac{1}{\sigma_2^2} & 0 \\
-\frac{w_3}{\sigma_3^2} & 0 & \frac{1}{\sigma_3^2}
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
\right).
\]
The terms involving only $y_1$ and $y_2$ are fixed and known and
uninteresting; what matters is everything that multiplies the interesting
term $y_3$. All the multipliers in the central matrix that touch only the
known variables aren't achieving anything; we can just ignore them (and
redefine the constant of proportionality), keeping only the entries of the
central matrix that involve $y_3$:
\[
P(y_3 \mid y_1, y_2, H_1) \propto \exp\!\left( -\frac{1}{2}
\begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix}
\begin{bmatrix}
0 & 0 & -\frac{w_3}{\sigma_3^2} \\
0 & 0 & 0 \\
-\frac{w_3}{\sigma_3^2} & 0 & \frac{1}{\sigma_3^2}
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
\right).
\]

That is,
\[
P(y_3 \mid y_1, y_2, H_1) \propto
\exp\!\left( -\frac{1}{2\sigma_3^2}
\left( y_3^2 - 2 y_3 \begin{bmatrix} w_3 & 0 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \right) \right).
\]
We obtain the mean by completing the square. (Completing the square:
$\tfrac{1}{2} a y^2 - b y = \tfrac{1}{2} a (y - b/a)^2 + \text{constant}$.)
\[
P(y_3 \mid y_1, y_2, H_1) \propto
\exp\!\left( -\frac{1}{2\sigma_3^2}
\left( y_3 - \begin{bmatrix} w_3 & 0 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \right)^2 \right).
\]
In this case, this all collapses down, of course, to
\[
P(y_3 \mid y_1, y_2, H_1) \propto
\exp\!\left( -\frac{1}{2\sigma_3^2} ( y_3 - w_3 y_1 )^2 \right),
\tag{19}
\]
as defined in the original generative model (8). In general, the off-diagonal
coefficients of $K^{-1}$ tell us the sensitivity of the mean of the
conditional distribution to the other variables:
\[
\mu_{y_3 \mid y_1, y_2}
= - \frac{ [K^{-1}]_{31}\, y_1 + [K^{-1}]_{32}\, y_2 }{ [K^{-1}]_{33} }.
\tag{20}
\]
So the conditional mean of $y_3$ is a linear function of the known variables,
and the off-diagonal entries in $K^{-1}$ tell us the coefficients in that
linear function.
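Here is a numerical check of these conditional-distribution rules (same
invented parameters as before; the known values $y_1, y_2$ are arbitrary):

    import numpy as np

    # Conditioning y3 on (y1, y2): variance = 1 / [K^{-1}]_33, and the conditional
    # mean is given by equation (20).
    w2, w3 = 0.9, 1.4
    s1, s2, s3 = 1.0, 0.5, 0.7

    K = np.array([
        [s1**2,     w2*s1**2,            w3*s1**2],
        [w2*s1**2,  w2**2*s1**2 + s2**2, w2*w3*s1**2],
        [w3*s1**2,  w2*w3*s1**2,         w3**2*s1**2 + s3**2],
    ])
    A = np.linalg.inv(K)                     # A = K^{-1}

    y1, y2 = 0.3, -1.2                       # arbitrary known values (illustration only)

    cond_var  = 1.0 / A[2, 2]                # should equal sigma_3^2
    cond_mean = -(A[2, 0] * y1 + A[2, 1] * y2) / A[2, 2]   # equation (20); equals w3 * y1

    print(cond_var,  s3**2)
    print(cond_mean, w3 * y1)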

1.1.2 Example 2

Here's another example, where two parents have one child. For example, the
price of electricity, $y_1$, from a power station might depend on the price of
gas, $y_2$, and the price of carbon emission rights, $y_3$.

    H_2:   y_2 --> y_1 <-- y_3

\[
y_1 = w_2 y_2 + w_3 y_3 + \nu_1.
\tag{21}
\]
Note that the units in which gas price, electricity price, and carbon price
are measured are all different (pounds per cubic metre, pennies per kWh, and
euros per tonne, for example). So $y_1$, $y_2$, and $y_3$ have different
dimensions from each other. Most people who do data modelling treat their data
as just numbers, but I think it is a useful discipline to keep track of
dimensions and to carry out only dimensionally valid operations.
[Dimensionally valid operations satisfy the two rules of dimensions: (1) only
add, subtract, and compare quantities that have like dimensions; (2) the
arguments of functions like exp, log, and sin must be dimensionless. Rule 2 is
really just a special case of rule 1, since
$\exp(x) = 1 + x + x^2/2 + \cdots$, so to satisfy rule 1, the dimensions of
$x$ must be the same as the dimensions of $1$.]

What is the covariance matrix? Here we assume that the parent variables $y_2$
and $y_3$ are uncorrelated. The covariance matrix is
\[
K = \begin{bmatrix}
K_{11} & K_{12} & K_{13} \\
K_{12} & K_{22} & K_{23} \\
K_{13} & K_{23} & K_{33}
\end{bmatrix}
= \begin{bmatrix}
w_2^2\sigma_2^2 + w_3^2\sigma_3^2 + \sigma_1^2 & w_2\sigma_2^2 & w_3\sigma_3^2 \\
w_2\sigma_2^2 & \sigma_2^2 & 0 \\
w_3\sigma_3^2 & 0 & \sigma_3^2
\end{bmatrix}.
\tag{22}
\]
Notice the zero correlation between the uncorrelated variables $(2,3)$. What
do you think the $(2,3)$ entry in the inverse covariance matrix will be? Let's
work it out in the same way as before. The joint distribution is
\[
P(y_1, y_2, y_3 \mid H_2) = P(y_2)\, P(y_3)\, P(y_1 \mid y_2, y_3)
\tag{23}
\]
\[
= \frac{1}{Z_2} \exp\!\left(-\frac{y_2^2}{2\sigma_2^2}\right)
  \frac{1}{Z_3} \exp\!\left(-\frac{y_3^2}{2\sigma_3^2}\right)
  \frac{1}{Z_1} \exp\!\left(-\frac{(y_1 - w_2 y_2 - w_3 y_3)^2}{2\sigma_1^2}\right).
\tag{24}
\]
We collect all the terms in $y_i y_j$:
\[
P(y_1, y_2, y_3)
= \frac{1}{Z} \exp\!\left( -\frac{1}{2}\left[
  \frac{y_1^2}{\sigma_1^2}
  + y_2^2\!\left(\frac{1}{\sigma_2^2} + \frac{w_2^2}{\sigma_1^2}\right)
  + y_3^2\!\left(\frac{1}{\sigma_3^2} + \frac{w_3^2}{\sigma_1^2}\right)
  - 2 y_1 y_2 \frac{w_2}{\sigma_1^2}
  - 2 y_1 y_3 \frac{w_3}{\sigma_1^2}
  + 2 y_2 y_3 \frac{w_2 w_3}{\sigma_1^2}
  \right] \right)
\]
\[
= \frac{1}{Z} \exp\!\left( -\frac{1}{2}
\begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix}
\begin{bmatrix}
\frac{1}{\sigma_1^2} & -\frac{w_2}{\sigma_1^2} & -\frac{w_3}{\sigma_1^2} \\
-\frac{w_2}{\sigma_1^2} & \frac{1}{\sigma_2^2} + \frac{w_2^2}{\sigma_1^2} & \frac{w_2 w_3}{\sigma_1^2} \\
-\frac{w_3}{\sigma_1^2} & \frac{w_2 w_3}{\sigma_1^2} & \frac{1}{\sigma_3^2} + \frac{w_3^2}{\sigma_1^2}
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
\right).
\]
So the inverse covariance matrix is
\[
K^{-1} = \begin{bmatrix}
\frac{1}{\sigma_1^2} & -\frac{w_2}{\sigma_1^2} & -\frac{w_3}{\sigma_1^2} \\
-\frac{w_2}{\sigma_1^2} & \frac{1}{\sigma_2^2} + \frac{w_2^2}{\sigma_1^2} & \frac{w_2 w_3}{\sigma_1^2} \\
-\frac{w_3}{\sigma_1^2} & \frac{w_2 w_3}{\sigma_1^2} & \frac{1}{\sigma_3^2} + \frac{w_3^2}{\sigma_1^2}
\end{bmatrix}.
\tag{25}
\]
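The same model numerically, as a sketch with invented weights and noise
variances:

    import numpy as np

    # Example 2: two independent parents (y2, y3) and one child (y1).
    w2, w3 = 0.8, 1.1
    s1, s2, s3 = 0.3, 1.0, 0.6

    K = np.array([
        [w2**2*s2**2 + w3**2*s3**2 + s1**2, w2*s2**2, w3*s3**2],
        [w2*s2**2,                          s2**2,    0.0     ],
        [w3*s3**2,                          0.0,      s3**2   ],
    ])
    A = np.linalg.inv(K)

    print(np.round(K, 3))   # K[1, 2] = 0: the two parents are marginally uncorrelated
    print(np.round(A, 3))   # A[1, 2] > 0: conditioned on the child y1, the parents are
                            # anticorrelated (the "explaining away" discussed next)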

Notice (assuming $w_2 > 0$ and $w_3 > 0$) that the off-diagonal term
connecting a parent and the child, $[K^{-1}]_{12}$, is negative, and the
off-diagonal term connecting the two parents, $[K^{-1}]_{23}$, is positive.
This positive term indicates that, conditional on all the other variables
(i.e., $y_1$), the two parents $y_2$ and $y_3$ are anticorrelated. That's
"explaining away". Once you know the price of electricity was average, for
example, you can deduce that if gas was more expensive than normal, carbon
probably was less expensive than normal.

2 Omission of one variable

Consider Example 1.

    H_1:   y_2 <-- y_1 --> y_3

The covariance matrix of all three variables is
\[
K = \begin{bmatrix}
K_{11} & K_{12} & K_{13} \\
K_{12} & K_{22} & K_{23} \\
K_{13} & K_{23} & K_{33}
\end{bmatrix}
= \begin{bmatrix}
\sigma_1^2 & w_2\sigma_1^2 & w_3\sigma_1^2 \\
w_2\sigma_1^2 & w_2^2\sigma_1^2 + \sigma_2^2 & w_2 w_3\sigma_1^2 \\
w_3\sigma_1^2 & w_2 w_3\sigma_1^2 & w_3^2\sigma_1^2 + \sigma_3^2
\end{bmatrix}.
\tag{26}
\]
If we decide we want to talk about the joint distribution of just $y_1$ and
$y_2$, the covariance matrix is simply the sub-matrix:
\[
K^{(2)} = \begin{bmatrix}
K_{11} & K_{12} \\
K_{12} & K_{22}
\end{bmatrix}
= \begin{bmatrix}
\sigma_1^2 & w_2\sigma_1^2 \\
w_2\sigma_1^2 & w_2^2\sigma_1^2 + \sigma_2^2
\end{bmatrix}.
\tag{27}
\]
This follows from the definition of the covariance,
\[
K_{ij} = \langle y_i y_j \rangle.
\tag{28}
\]
The inverse covariance matrix, on the other hand, does not change in such a
simple way. The $3 \times 3$ inverse covariance matrix was
\[
K^{-1} = \begin{bmatrix}
\frac{1}{\sigma_1^2} + \frac{w_2^2}{\sigma_2^2} + \frac{w_3^2}{\sigma_3^2} & -\frac{w_2}{\sigma_2^2} & -\frac{w_3}{\sigma_3^2} \\
-\frac{w_2}{\sigma_2^2} & \frac{1}{\sigma_2^2} & 0 \\
-\frac{w_3}{\sigma_3^2} & 0 & \frac{1}{\sigma_3^2}
\end{bmatrix}.
\]
When we work out the inverse covariance matrix of $y_1$ and $y_2$ alone, all
the terms that originated from the child $y_3$ (the $w_3^2/\sigma_3^2$ in the
$(1,1)$ entry, and the whole third row and column) are lost. So we have
\[
[K^{(2)}]^{-1} = \begin{bmatrix}
\frac{1}{\sigma_1^2} + \frac{w_2^2}{\sigma_2^2} & -\frac{w_2}{\sigma_2^2} \\
-\frac{w_2}{\sigma_2^2} & \frac{1}{\sigma_2^2}
\end{bmatrix}.
\]
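Numerically, with the invented Example 1 parameters from the earlier sketches,
the submatrix of the inverse is visibly not the inverse of the submatrix:

    import numpy as np

    # The marginal covariance of (y1, y2) is the top-left submatrix of K,
    # but the marginal inverse covariance is NOT the top-left submatrix of K^{-1}.
    w2, w3 = 0.9, 1.4
    s1, s2, s3 = 1.0, 0.5, 0.7

    K = np.array([
        [s1**2,     w2*s1**2,            w3*s1**2],
        [w2*s1**2,  w2**2*s1**2 + s2**2, w2*w3*s1**2],
        [w3*s1**2,  w2*w3*s1**2,         w3**2*s1**2 + s3**2],
    ])

    submatrix_of_inverse = np.linalg.inv(K)[:2, :2]   # what you'd naively read off K^{-1}
    inverse_of_submatrix = np.linalg.inv(K[:2, :2])   # the true 2-variable inverse covariance

    print(np.round(submatrix_of_inverse, 3))
    print(np.round(inverse_of_submatrix, 3))          # their (1,1) entries differ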

Specifically, notice that $[K^{(2)}]^{-1}_{11}$ is different from the $(1,1)$
entry of the $3 \times 3$ $K^{-1}$. We conclude:

    Leaving out a variable leaves K unchanged but changes K^{-1}.

This conclusion is important for understanding the answer to the question,
"When working with Gaussian processes, why not parameterize the inverse
covariance function instead of the covariance function?" The answer is: you
can't write down the inverse covariance associated with two points! The
inverse covariance depends capriciously on what the other variables are.

3 Energy models

Sometimes people express probabilistic models in terms of energy functions
that are minimized in the most probable configuration. For example, in
regression with cubic splines, a regularizer is defined which describes the
energy that a steel ruler would have if bent into the shape of the curve. Such
models usually have the form
\[
P(\mathbf{y}) = \frac{1}{Z} e^{-E(\mathbf{y})/T},
\tag{29}
\]
and in simple cases the energy $E(\mathbf{y})$ may be a quadratic function of
$\mathbf{y}$, such as
\[
E(\mathbf{y}) = -\sum_{i<j} J_{ij} y_i y_j + \sum_i a_i y_i.
\tag{30}
\]
If so, then the distribution is a Gaussian (just like (1)), and the couplings
$J_{ij}$ are minus the coefficients in the inverse covariance matrix.

As a simple example, consider a set of three masses coupled by springs and
subjected to thermal perturbations.

    [Figure: three masses, with displacements y_1, y_2, y_3, connected in a
    row between two walls by four springs with constants k_01, k_12, k_23,
    k_34. Three masses, four springs.]

The equilibrium positions are $(y_1, y_2, y_3) = (0, 0, 0)$, and the spring
constants are $k_{ij}$. The extension of the second spring is $y_2 - y_1$. The
energy of this system is
\[
E(\mathbf{y}) = \tfrac{1}{2} k_{01} y_1^2 + \tfrac{1}{2} k_{12} (y_2 - y_1)^2
              + \tfrac{1}{2} k_{23} (y_3 - y_2)^2 + \tfrac{1}{2} k_{34} y_3^2
\]
\[
= \tfrac{1}{2}
\begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix}
\begin{bmatrix}
k_{01} + k_{12} & -k_{12} & 0 \\
-k_{12} & k_{12} + k_{23} & -k_{23} \\
0 & -k_{23} & k_{23} + k_{34}
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}.
\]
So at temperature $T$, the probability distribution of the displacements is
Gaussian with inverse covariance matrix
\[
K^{-1} = \frac{1}{T}
\begin{bmatrix}
k_{01} + k_{12} & -k_{12} & 0 \\
-k_{12} & k_{12} + k_{23} & -k_{23} \\
0 & -k_{23} & k_{23} + k_{34}
\end{bmatrix}.
\tag{31}
\]
Notice that there are zero entries between the displacements $y_1$ and $y_3$,
the two masses that are not directly coupled by a spring.
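A short numpy sketch of this system (spring constants and temperature are
invented for illustration): build the inverse covariance of equation (31) and
invert it to see that the covariance itself has no zeros.

    import numpy as np

    T = 1.0
    k01, k12, k23, k34 = 1.0, 2.0, 1.5, 0.5     # invented spring constants

    A = (1.0 / T) * np.array([
        [k01 + k12, -k12,       0.0      ],
        [-k12,      k12 + k23, -k23      ],
        [0.0,       -k23,       k23 + k34],
    ])
    K = np.linalg.inv(A)

    print(np.round(A, 3))   # sparse: zero between y1 and y3, which share no spring
    print(np.round(K, 3))   # dense: every pair of displacements is correlated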

    [Figure 1: five masses, with displacements y_1, ..., y_5, connected in a
    row between two walls by six identical springs of constant k. Five masses,
    six springs.]

So inverse covariance matrices are sometimes very sparse. If we have five
masses in a row connected by identical springs $k$, for example, then
\[
K^{-1} = \frac{k}{T}
\begin{bmatrix}
 2 & -1 &  0 &  0 &  0 \\
-1 &  2 & -1 &  0 &  0 \\
 0 & -1 &  2 & -1 &  0 \\
 0 &  0 & -1 &  2 & -1 \\
 0 &  0 &  0 & -1 &  2
\end{bmatrix}.
\tag{32}
\]
But this sparsity doesn't carry over to the covariance matrix, which is
\[
K = \frac{T}{k}
\begin{bmatrix}
0.83 & 0.67 & 0.50 & 0.33 & 0.17 \\
0.67 & 1.33 & 1.00 & 0.67 & 0.33 \\
0.50 & 1.00 & 1.50 & 1.00 & 0.50 \\
0.33 & 0.67 & 1.00 & 1.33 & 0.67 \\
0.17 & 0.33 & 0.50 & 0.67 & 0.83
\end{bmatrix}.
\tag{33}
\]

4 Eigenvectors and eigenvalues are meaningless

There seems to be a knee-jerk reaction when people see a square matrix: what
are its eigenvectors? But here, where we are discussing quadratic forms,
eigenvectors and eigenvalues have no fundamental status. They are
dimensionally invalid objects. Any algorithm that features eigenvectors either
didn't need to do so, or shouldn't have done so. (I think the whole idea of
principal component analysis is misguided, for example.)

"Hang on," you say, "what about the three masses example? Don't those three
masses have meaningful normal modes?" Yes, they do, but those modes are not
the eigenvectors of the spring matrix (31). Remember, I didn't tell you what
the masses of the masses were!

I'm not saying that eigenvectors are never meaningful. What I'm saying is: in
the context of quadratic forms
\[
\mathbf{y}^{\mathsf{T}} A \mathbf{y},
\tag{34}
\]
eigenvectors are meaningless and arbitrary. Consider a covariance matrix
describing the correlation between something's mass $y_1$ and its length
$y_2$:
\[
K = \begin{bmatrix}
K_{11} & K_{12} \\
K_{12} & K_{22}
\end{bmatrix}.
\tag{35}
\]
The dimensions of $K_{11}$ are mass squared; $K_{11}$ might be measured in
$\mathrm{kg}^2$, for example. The dimensions of $K_{12}$ are mass times
length; $K_{12}$ might be measured in $\mathrm{kg\,m}$, for example. Here's an
example, which might describe the correlation between weight and height of
some animals in a survey:
\[
K = \begin{bmatrix}
K_{11} & K_{12} \\
K_{12} & K_{22}
\end{bmatrix}
= \begin{bmatrix}
10000\ \mathrm{kg}^2 & 70\ \mathrm{kg\,m} \\
70\ \mathrm{kg\,m} & 1\ \mathrm{m}^2
\end{bmatrix}.
\tag{36}
\]

    [Figure 2: the same dataset of masses and lengths plotted twice, with
    mass (kg) on the horizontal axis and length (in m on the left panel, in
    cm on the right panel) on the vertical axis, together with the
    eigenvectors of its covariance matrix. As the text explains, the
    eigenvectors of covariance matrices are meaningless and arbitrary.]

The knee-jerk reaction is "let's find the principal components of our data",
which means "ignore those silly dimensional units, and just find the
eigenvectors of $\begin{bmatrix} 10000 & 70 \\ 70 & 1 \end{bmatrix}$". But
let's consider what this means. An eigenvector is a vector $\mathbf{e}$
satisfying
\[
\begin{bmatrix}
10000\ \mathrm{kg}^2 & 70\ \mathrm{kg\,m} \\
70\ \mathrm{kg\,m} & 1\ \mathrm{m}^2
\end{bmatrix}
\mathbf{e} = \lambda \mathbf{e}.
\tag{37}
\]
By asking for an eigenvector, we are imagining that two equations are true:
first, the top row,
\[
10000\ \mathrm{kg}^2\, e_1 + 70\ \mathrm{kg\,m}\, e_2 = \lambda e_1,
\tag{38}
\]
and, second, the bottom row,
\[
70\ \mathrm{kg\,m}\, e_1 + 1\ \mathrm{m}^2\, e_2 = \lambda e_2.
\tag{39}
\]
These expressions violate the rules of dimensions. Try all you like, but you
won't be able to find dimensions for $e_1$, $e_2$, and $\lambda$ such that
rule 1 is satisfied.

"No, no," the matlab lover says, "I leave out the dimensions, and I get:

    > [e, v] = eig(S)

    e =
        0.0070002  -0.9999755
       -0.9999755  -0.0070002

    v =
        5.0998e-01   0.0000e+00
        0.0000e+00   1.0000e+04

"I notice that the eigenvectors are $(0.007, -0.9999)$ and
$(-0.9999, -0.007)$, which are almost aligned with the coordinate axes. Very
interesting! I also notice that the eigenvalues are $10^4$ and $0.5$. What an
interestingly large eigenvalue ratio! Wow, that means that there is one very
big principal component, and the second one is much smaller. Ooh, how
interesting."
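The same unit-blind computation in numpy, as a sketch, gives matching numbers:

    import numpy as np

    # The covariance of equation (36) with the units thrown away.
    S = np.array([[10000.0, 70.0],
                  [70.0,     1.0]])

    eigenvalues, eigenvectors = np.linalg.eigh(S)    # eigh: S is symmetric
    print(eigenvalues)       # approximately [0.51, 10000.49]
    print(eigenvectors)      # columns nearly aligned with the coordinate axes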

But this is nonsense. If we change the units in which we measure length from
m to cm, then the covariance matrix can be written
\[
K = \begin{bmatrix}
K_{11} & K_{12} \\
K_{12} & K_{22}
\end{bmatrix}
= \begin{bmatrix}
10000\ \mathrm{kg}^2 & 7000\ \mathrm{kg\,cm} \\
7000\ \mathrm{kg\,cm} & 10000\ \mathrm{cm}^2
\end{bmatrix}.
\tag{40}
\]
This is exactly the same covariance matrix of exactly the same data. But the
eigenvectors and eigenvalues are now:

    e =
       -0.7071    0.7071
        0.7071    0.7071

    v =
        3000         0
           0     17000

Figure 2 illustrates this situation. On the left, a data set of masses and
lengths measured in metres; the arrows show the eigenvectors. (The arrows
don't look orthogonal in this plot because a step of one unit on the x-axis
happens to cover less paper than a step of one unit on the y-axis.) On the
right, exactly the same data set, but with lengths measured in centimetres;
the arrows show the eigenvectors.

In conclusion, eigenvectors of the matrix in a quadratic form are not
fundamentally meaningful. [Properties of that matrix that are meaningful
include its determinant.]

4.1 Aside

This complaint about eigenvectors comes hand in hand with another complaint,
about "steepest descent". A steepest-descent algorithm is dimensionally
invalid. A step in a parameter space does not have the same dimensions as a
gradient. To turn a gradient into a sensible step direction, you need a
metric. The metric defines how big a step is (in rather the same way that
when gnuplot plotted the data above, it chose a vertical scale and a
horizontal scale). Once you know how big alternative steps are, it becomes
meaningful to take the step that is steepest (that is, the direction with the
biggest change in function value per unit distance moved). Without a metric,
steepest-descent algorithms are not covariant; that is, the algorithm would
behave differently if you just changed the units in which one parameter is
measured.

Appendix: Answers to quiz

For the first four questions, you can quickly guess the answers based on
whether the (2,3) entries are zero or not. For a careful answer you should
also check that the matrices really are positive definite (they are) and that
they are realisable by the respective graphical models (which isn't
guaranteed by the preceding constraints).

1. A and B.   2. C and D.   3. C and D.   4. A and B.   5. A is true, B is
false.