arxiv: v2 [stat.ml] 4 Jun 2015 Abstract


Visual Causal Feature Learning

Krzysztof Chalupka, Computation and Neural Systems, California Institute of Technology, Pasadena, CA, USA
Pietro Perona, Electrical Engineering, California Institute of Technology, Pasadena, CA, USA
Frederick Eberhardt, Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA

Abstract

We provide a rigorous definition of the visual cause of a behavior that is broadly applicable to the visually driven behavior in humans, animals, neurons, robots and other perceiving systems. Our framework generalizes standard accounts of causal learning to settings in which the causal variables need to be constructed from micro-variables. We prove the Causal Coarsening Theorem, which allows us to gain causal knowledge from observational data with minimal experimental effort. The theorem provides a connection to standard inference techniques in machine learning that identify features of an image that correlate with, but may not cause, the target behavior. Finally, we propose an active learning scheme to learn a manipulator function that performs optimal manipulations on the image to automatically identify the visual cause of a target behavior. We illustrate our inference and learning algorithms in experiments based on both synthetic and real data.

1 INTRODUCTION

Visual perception is an important trigger of human and animal behavior. The visual cause of a behavior can be easy to define, say, when a traffic light turns green, or quite subtle: apparently it is the increased symmetry of features that leads people to judge some faces more attractive than others (Grammer and Thornhill, 1994). Significant scientific and economic effort is focused on visual causes in advertising, entertainment, communication, design, medicine, robotics and the study of human and animal cognition. Visual causes profoundly influence our daily activity, yet our understanding of what constitutes a visual cause lacks a theoretical basis.
In practice, it is well known that images are composed of millions of variables (the pixels) but it is functions of the pixels (often called "features") that have meaning, rather than the pixels themselves. We present a theoretical framework and inference algorithms for visual causes in images. A visual cause is defined (more formally below) as a function (or feature) of raw image pixels that has a causal effect on the target behavior of a perceiving system of interest. We present three advances:

- We provide a definition of the visual cause of a target behavior as a macro-variable that is constructed from the micro-variables (pixels) that make up the image space. The visual cause is distinguished from other macro-variables in that it contains all the causal information about the target behavior that is available in the image. We place the visual cause within the standard framework of causal graphical models (Spirtes et al., 2000; Pearl, 2009), thereby contributing to an account of how to construct causal variables.
- We prove the Causal Coarsening Theorem (CCT), which shows how observational data can be used to learn the visual cause with minimal experimental effort. It connects the present results to standard classification tasks in machine learning.
- We describe a method to learn the manipulator function, which automatically performs perceptually optimal manipulations on the visual causes.

We illustrate our ideas using synthetic and real-data experiments. Python code that implements our algorithms, as well as reproduces some of the experimental results, is available online at http://vision.caltech.edu/~kchalupk/code.html.

We chose to develop the theory within the context of visual causes as this setting makes the definitions most intuitive and is itself of significant practical interest. However, the framework and results can be equally well applied to extract causal information from any aggregate of micro-variables on which manipulations are possible.
Examples include auditory, olfactory and other sensory stimuli; high-dimensional neural recordings; market data in finance; consumer data in marketing. There, causal feature learning is both of theoretical ("What is the cause?") and practical ("Can we automatically manipulate it?") importance.

1.1 PREVIOUS WORK

Our framework extends the theory of causal graphical models (Spirtes et al., 2000; Pearl, 2009) to a setting in which the input data consists of raw pixel (or other micro-variable) data. In contrast to the standard setting, in which the macro-variables in the statistical dataset already specify the candidate causal relata, the causal variables in our setting have to be constructed from the micro-variables they supervene on, before any causal relations can be established. We emphasize the difference between our method of causal feature learning and methods for causal feature selection (Guyon et al., 2007; Pellet and Elisseeff, 2008). The latter choose the best (under some causal criterion) features from a restricted set of plausible macro-variable candidates. In contrast, our framework efficiently searches the whole space of all the possible macro-variables that can be constructed from an image.

Our approach derives its theoretical underpinnings from computational mechanics (Shalizi and Crutchfield, 2001; Shalizi, 2001), but supports a more explicitly causal interpretation by incorporating the possibility of confounding and interventions. Since we allow for unmeasured common causes of the features in the image and the target behavior, we have to distinguish between the plain conditional probability distribution of the target behavior ($T$) given the (observed) image ($I$) and the distribution of the target behavior given that the observed image was manipulated (i.e. $P(T \mid I)$ vs. $P(T \mid do(I))$). Hoel et al. (2013), who develop a similar model to investigate the relationship between causal micro- and macro-variables, avoid this distinction by assuming that all their data was generated from what in our setting would be the manipulated distribution $P(T \mid do(I))$. We take the distinction between interventional and observational distributions to be one of the key features of a causal analysis.
The extant literature on causal learning from image or video data does not generally consider the aggregation from pixel variables into causal macro-variables, but instead starts from annotated or pre-defined features of the image (see e.g. Fire and Zhu (2013a,b)).

1.2 CAUSAL FEATURE LEARNING: AN EXAMPLE

Fig. 1 presents a paradigmatic case study in visual causal feature learning, which we will use as a running example. The contents of an image $I$ are caused by external, non-visual binary hidden variables $H_1$ and $H_2$ such that if $H_1$ is on, $I$ contains a vertical bar (v-bar [1]) at a random position, and if $H_2$ is on, $I$ contains a horizontal bar (h-bar) at a random position. A target behavior $T \in \{0, 1\}$ is caused by $H_1$ and $I$, such that $T = 1$ is more likely whenever $H_1 = 1$ and whenever the image contains an h-bar.

[1] We take a v-bar (h-bar) to consist of a complete column (row) of black pixels.

We deliberately constructed this example such that the visual cause is clearly identifiable: manipulating the presence of an h-bar in the image will influence the distribution of $T$. Thus, we can call the following function $C\colon \mathcal{I} \to \{0, 1\}$ the causal feature of $I$ or the visual cause of $T$:

$$C(I) = \begin{cases} 1 & \text{if } I \text{ contains an h-bar} \\ 0 & \text{otherwise.} \end{cases}$$

The presence of a v-bar, on the other hand, is not a causal feature. Manipulating the presence of a v-bar in the image has no effect on $H_1$ or $T$. Still, the presence of a v-bar is as strongly correlated with the value of $T$ (via the common cause $H_1$) as the presence of an h-bar is. We will call the following function $S\colon \mathcal{I} \to \{0, 1\}$ the spurious correlate of $T$ in $I$:

$$S(I) = \begin{cases} 1 & \text{if } I \text{ contains a v-bar} \\ 0 & \text{otherwise.} \end{cases}$$

Both the presence of h-bars and the presence of v-bars are good individual (and even better joint) predictors of the target variable, but only one of them is a cause. Identifying the visual cause from the image thus requires the ability to distinguish, among the correlates of the target variable, those that are actually causal, even if the non-causal correlates are (possibly more strongly) correlated with the target.
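The two indicator functions of the running example can be written down directly. The following is a minimal sketch, assuming images are 2-D NumPy arrays in which black pixels are encoded as 1; the function names and the encoding are illustrative choices, not taken from the authors' code.

```python
import numpy as np

def contains_h_bar(image):
    """C(I): 1 if some row consists entirely of black (1) pixels, else 0."""
    return int(np.any(np.all(image == 1, axis=1)))

def contains_v_bar(image):
    """S(I): 1 if some column consists entirely of black (1) pixels, else 0."""
    return int(np.any(np.all(image == 1, axis=0)))

# A 5x5 image with a single horizontal bar in row 2.
img = np.zeros((5, 5), dtype=int)
img[2, :] = 1
```

Here `contains_h_bar(img)` is 1 and `contains_v_bar(img)` is 0; transposing the image swaps the two values.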
While the values of $S$ and $C$ in our example stand in a bijective correspondence to the values of $H_1$ and $H_2$, respectively, this is only to keep the illustration simple. In general, the visual cause and the spurious correlate can be probabilistic functions of any number of (not necessarily independent) hidden variables, and can share the same hidden causes.

2 A THEORY OF VISUAL CAUSAL FEATURES

In our example the identification of the visual cause with the presence of an h-bar is intuitively obvious, as the model is constructed to have an easily describable visual cause. But the example does not provide a theoretical account of what it takes to be a visual cause in the general case when we do not know what the causally relevant pixel configurations are. In this section, we provide a general account of how the visual cause is related to the pixel data.

2.1 VISUAL CAUSES AS MACRO-VARIABLES

A visual cause is a high-level random variable that is a function (or feature) of the image, which in turn is defined by the random micro-variables that determine the pixel values. The functional relation between the image and the visual cause is, in general, surjective, though in principle it could be bijective. While we are interested in identifying the visual causes of a target behavior, the functional relation between the image pixels and the visual cause should not itself be interpreted as causal. Pixels do not cause the features of an image, they constitute them, just as the atoms of a table constitute the table (and its features). The difference between the causal and the constitutive relation is that the former requires the possibility of independent manipulation (at least to some extent), whereas by definition one cannot manipulate the visual cause without manipulating the image pixels. The probability distribution over the visual cause is induced by the probability distribution over the pixels in the image and the functional mapping from the image to the visual cause.

[Figure 1: graphical model with edges $H_1 \to I$, $H_2 \to I$, $H_1 \to T$ and $I \to T$. $P(H_1 = 0) = P(H_2 = 0) = 0.5$; $P(I \mid H_1, H_2)$ is uniform over the set of images selected by $(H_1, H_2)$; $P(T = 0 \mid I, H_1)$ takes the values 0, .33, .66 and 1.]

Figure 1: Our case study generative model. Two binary hidden (non-visual) variables $H_1$ and $H_2$ toss unbiased coins. The content of the image $I$ depends on these variables as follows. If $H_1 = H_2 = 0$, $I$ is chosen uniformly at random from all the images containing no v-bars and no h-bars. If $H_1 = 0$ and $H_2 = 1$, $I$ is chosen uniformly at random from all images containing at least one h-bar but no v-bars. If $H_1 = 1$ and $H_2 = 0$, $I$ is chosen uniformly at random from all the images containing at least one v-bar but no h-bars. Finally, if $H_1 = H_2 = 1$, $I$ is chosen from images containing at least one v-bar and at least one h-bar. The distribution of the binary behavior $T$ depends only on the presence of an h-bar in $I$ and the value of $H_1$. In observational studies, $H_1 = 1$ iff $I$ contains a v-bar. However, a manipulation of any specific image $I = i$ that introduces a v-bar (without changing $H_1$) will in general not change the probability of $T$ occurring. Thus, $T$ does not depend causally on the presence of v-bars in $I$.
But since a visual cause stands in a constitutive relation with the image, we cannot without further explanation describe interventions on the visual cause in terms of the standard do-operation (Pearl, 2009). Our goal will be to define a macro-variable $C$, which contains all the causal information available in an image about a given behavior $T$, and define its manipulation. To make the problem approachable, we introduce two (natural) assumptions about the causal relation between the image and the behavior: (i) the value of the target behavior $T$ is determined subsequently to the image in time, and (ii) the variable $T$ is in no way represented in the image. These assumptions exclude the possibility that $T$ is a cause of features in the image or that $T$ can be seen as causing itself.

2.2 GENERATIVE MODELS: FROM MICRO- TO MACRO-VARIABLES

Let $T \in \{0, 1\}$ represent a target behavior. [2] Let $\mathcal{I}$ be a discrete space of all the images that can influence the target behavior (in our experiments in Section 4, $\mathcal{I}$ is the space of n-dimensional black-and-white images). We use the following generative model to describe the relation between the images and the target behavior: an image is generated by a finite set of unobserved discrete variables $H_1, \dots, H_m$ (we write $H$ for short). The target behavior is then determined by the image and possibly a subset of variables $H_c \subseteq H$ that are confounders of the image and the target behavior:

$$P(T, I) = \sum_H P(T \mid I, H)\, P(I \mid H)\, P(H) = \sum_H P(T \mid I, H_c)\, P(I \mid H)\, P(H). \tag{1}$$

Independent noise that may contribute to the target behavior is marginalized and omitted for the sake of simplicity in the above equation. The noise term incorporates any hidden variables which influence the behavior but stand in no causal relation to the image. Such variables are not directly relevant to the problem. Fig. 2 shows this generative model.
[Figure 2: graphical model in which the hidden variables $H = (H_1, \dots, H_N)$ cause the image $I$, $I$ causes $T$, and the confounders $H_C = (H_2, H_N)$ also cause $T$.]

Figure 2: A general model of visual causation. In our model each image $I$ is caused by a number of hidden non-visual variables $H_i$, which need not be independent. The image itself is the only observed cause of a target behavior $T$. In addition, a (not necessarily proper) subset of the hidden variables can be a cause of the target behavior. These confounders create visual spurious correlates of the behavior in $I$.

Under this model, we can define an observational partition of the space of images $\mathcal{I}$ that groups images into classes that have the same conditional probability $P(T \mid I)$:

Definition 1 (Observational Partition, Observational Class). The observational partition $\Pi_o(T, \mathcal{I})$ of the set $\mathcal{I}$ w.r.t. behavior $T$ is the partition induced by the equivalence relation $\sim$ such that $i \sim j$ if and only if $P(T \mid I = i) = P(T \mid I = j)$. We will denote it as $\Pi_o$ when the context is clear. A cell of an observational partition is called an observational class.

In standard classification tasks in machine learning, the observational partition is associated with class labels. In our case, two images that belong to the same cell of the observational partition assign equal predictive probability to the target behavior. Thus, knowing the observational class of an image allows us to predict the value of $T$. However, the predictive probability assigned to an image does not tell us the causal effect of the image on $T$. For example, a barometer is widely taken to be an excellent predictor of the weather. But changing the barometer needle does not cause an improvement of the weather. It is not a (visual or otherwise) cause of the weather. In contrast, seeing a particular barometer reading may well be a visual cause of whether we pack an umbrella.

[2] An extension of the framework to non-binary, discrete $T$ is easy but complicates the notation significantly. An extension to the continuous case is beyond the scope of this article.

Our notion of a visual cause depends on the ability to manipulate the image.

Definition 2 (Visual Manipulation). A visual manipulation is the operation $man(I = i)$ that changes (the pixels of) the image to image $i \in \mathcal{I}$, while not affecting any other variables (such as $H$ or $T$). That is, the manipulated probability distribution of the generative model in Eq. (1) is given by $P(T \mid man(I = i)) = \sum_{H_c} P(T \mid I = i, H_c)\, P(H_c)$.

The manipulation changes the values of the image pixels, but does not change the underlying "world", represented in our model by the $H_i$ that generated the image. Formally, the manipulation is similar to the do-operator for standard causal models. However, we here reserve the do-operation for interventions on causal macro-variables, such as the visual cause of $T$. We discuss the distinction in more detail below. We can now define the causal partition of the image space (with respect to the target behavior $T$) as:

Definition 3 (Causal Partition, Causal Class). The causal partition $\Pi_c(T, \mathcal{I})$ of the set $\mathcal{I}$ w.r.t. behavior $T$ is the partition induced by the equivalence relation $\sim$ defined on $\mathcal{I}$ such that $i \sim j$ if and only if $P(T \mid man(I = i)) = P(T \mid man(I = j))$ for $i, j \in \mathcal{I}$. When the image space and the target behavior are clear from the context, we will indicate the causal partition by $\Pi_c$. A cell of a causal partition is called a causal class.

The underlying idea is that images are considered causally equivalent with respect to $T$ if they have the same causal effect on $T$. Given the causal partition of the image space, we can now define the visual cause of $T$:

Definition 4 (Visual Cause). The visual cause $C$ of a target behavior $T$ is a random variable whose value stands in a bijective relation to the causal class of $I$.

The visual cause is thus a function over $\mathcal{I}$, whose values correspond to the post-manipulation distributions $C(i) = P(T \mid man(I = i))$. We will write $C(i) = c$ to indicate that the causal class of image $i \in \mathcal{I}$ is $c$, or in other words, that in image $i$, the visual cause $C$ takes value $c$. Knowing $C$ allows us to predict the effects of a visual manipulation $P(T \mid man(I = i))$, as long as we have estimated $P(T \mid man(I = i_k))$ for one representative $i_k$ of each causal class $k$.

2.3 THE CAUSAL COARSENING THEOREM

Our main theorem relates the causal and observational partitions for a given $\mathcal{I}$ and $T$. It turns out that in general the causal partition is a coarsening of the observational partition. That is, the causal partition aligns with the observational partition, but the observational partition may subdivide some of the causal classes.

Theorem 5 (Causal Coarsening). Among all the generative distributions of the form shown in Fig. 2 which induce a given observational partition $\Pi_o$, almost all induce a causal partition $\Pi_c$ that is a coarsening of $\Pi_o$.

Throughout this article, we use "almost all" to mean "all except for a subset of Lebesgue measure zero". Fig. 3 illustrates the relation between the causal and the observational partition implied by the theorem. We note that the measure-zero subset where $\Pi_c$ does not coarsen $\Pi_o$ can indeed be non-empty.
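The relation between the two partitions can be checked by hand on the running example. The sketch below abstracts each image to the pair (v-bar present, h-bar present) and computes both partitions exactly. The assignment of the four values 0, 1/3, 2/3, 1 of $P(T = 0 \mid I, H_1)$ to the four (h-bar, $H_1$) combinations is an assumption, chosen to agree with the statement that $T = 1$ is more likely when $H_1 = 1$ and when an h-bar is present; all names are illustrative.

```python
from fractions import Fraction
from itertools import product

P_H1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Assumed table of P(T=0 | h-bar present, H1), keyed by (h, h1).
P_T0 = {
    (1, 1): Fraction(0),
    (1, 0): Fraction(1, 3),
    (0, 1): Fraction(2, 3),
    (0, 0): Fraction(1),
}

image_types = list(product([0, 1], [0, 1]))  # (v, h) pairs

def p_obs(v, h):
    # Observationally, a v-bar is present iff H1 = 1, so H1 can be read off v.
    return P_T0[(h, v)]

def p_man(v, h):
    # Definition 2: average over the confounder H1, ignoring what v suggests.
    return sum(P_T0[(h, h1)] * P_H1[h1] for h1 in (0, 1))

def partition(value_fn):
    """Group image types into cells with equal value_fn."""
    cells = {}
    for img in image_types:
        cells.setdefault(value_fn(*img), set()).add(img)
    return list(cells.values())

def is_coarsening(coarse, fine):
    """True if every cell of `fine` lies inside some cell of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)

obs_partition = partition(p_obs)
causal_partition = partition(p_man)
```

Under these assumptions the observational partition has four cells and the causal partition has two, with manipulated probabilities 1/6 and 5/6 (the .17 and .83 reported for Fig. 3), and `is_coarsening(causal_partition, obs_partition)` holds.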
We provide such counter-examples in Appendix 7. We prove the CCT in Appendix 6 using a technique that extends that of Meek (1995): We show that (1) restricting the space of all the possible $P(T, H, I)$ to only the distributions compatible with a fixed observational partition puts a linear constraint on the distribution space; (2) requiring that the CCT be false puts a non-trivial polynomial constraint on this subspace, and finally, (3) it follows that the theorem holds for almost all distributions that agree with the given observational partition. The proof strategy indicates a close connection between the CCT and the faithfulness assumption (Spirtes et al., 2000).

Two points are worth noting here: First, the CCT is interesting inasmuch as the visual causes of a behavior do not contain all the information in the image that predicts the behavior. Such information, though not itself a cause of the behavior, can be informative about the state of other non-visual causes of the target behavior. Second, the CCT allows us to take any classification problem in which the data is divided into observational classes, and assume that the causal labels do not change within each observational class. This will help us develop efficient causal inference algorithms in Section 3.

[Figure 3: observational probabilities $P(T = 0 \mid I)$ = 0, .33, .66, 1 (gray frame) and causal probabilities $P(T = 0 \mid do(\cdot))$ = .17, .83 (red frame), shown next to the corresponding partitions of the image space.]

Figure 3: The Causal Coarsening Theorem. The observational probabilities of $T$ given $I$ (gray frame) induce an observational partition on the space of all the images (left, observational partition in gray). The causal probabilities (red frame) induce a causal partition, indicated on the left in red. The CCT allows us to expect that the causal partition is a coarsening of the observational partition. The observational and causal probabilities correspond to the generative model shown in Fig. 1.

2.4 VISUAL CAUSES IN A CAUSAL MODEL CONSISTING OF MACRO-VARIABLES

We can now simplify our generative model by omitting all the information in $I$ unrelated to behavior $T$. Assume that the observational partition $\Pi_o^T$ refines the causal partition $\Pi_c^T$. Each of the causal classes $c_1, \dots, c_K$ delineates a region in the image space $\mathcal{I}$ such that all the images belonging to that region induce the same $P(T \mid man(I))$. Each of those regions, say the $k$-th one, can be further partitioned into sub-regions $s^k_1, \dots, s^k_{M_k}$ such that all the images in the $m$-th sub-region of the $k$-th causal region induce the same observational probability $P(T \mid I)$. By assumption, the observational partition has a finite number of classes, and we can arbitrarily order the observational classes within each causal class. Once such an ordering is fixed, we can assign an integer $m \in \{1, 2, \dots, M_k\}$ to each image $i$ belonging to the $k$-th causal class such that $i$ belongs to the $m$-th observational class among the $M_k$ observational classes contained in $c_k$.
By construction, this integer explains all the variation of the observational class within a given causal class. This suggests the following definition:

Definition 6 (Spurious Correlate). The spurious correlate $S$ is a discrete random variable whose value differentiates between the observational classes contained in any causal class.

The spurious correlate is a well-defined function on $\mathcal{I}$, whose value ranges between 1 and $\max_k M_k$. Like $C$, the spurious correlate $S$ is a macro-variable constructed from the pixels that make up the image. $C$ and $S$ together contain all and only the visual information in $I$ relevant to $T$, but only $C$ contains the causal information:

Theorem 7 (Complete Macro-variable Description). The following two statements hold for $C$ and $S$ as defined above:
1. $P(T \mid I) = P(T \mid C, S)$.
2. Any other variable $X$ such that $P(T \mid I) = P(T \mid X)$ has Shannon entropy $H(X) \geq H(C, S)$.

We prove the theorem in Appendix 8. It guarantees that $C$ and $S$ constitute the smallest-entropy macro-variables that encompass all the information about the relationship between $T$ and $I$. Fig. 4 shows the relationship between $C$, $S$ and $T$, the image space $\mathcal{I}$ and the observational and causal partitions schematically. $C$ is now a cause of $T$, $S$ correlates with $T$ due to the unobserved common causes $H_C$, and any information irrelevant to $T$ is pushed into the independent noise variables (commonly not shown in graphical representations of structural equation models). [3]

Figure 4: A macro-variable model of visual causation. Using our theory of visual causation we can aggregate the information present in visual micro-variables (image pixels) into the visual cause $C$ and spurious correlate $S$. According to Theorem 7, $C$ and $S$ contain all the information about $T$ available in $I$.

The macro-variable model lends itself to the standard treatment of causal graphical models described in Pearl (2009). We can define interventions on the causal variables $\{C, S, T\}$ using the standard do-operation.
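The construction of $C$ and $S$ from the two partitions can be sketched for a small finite image set. The example below uses four hypothetical image types with observational and manipulated probabilities of $T = 0$ matching the values of Fig. 3; the ordering of observational classes within a causal class is arbitrary in the text, and sorting by observational probability is simply one convenient choice made here.

```python
from fractions import Fraction

def macro_variables(images, p_obs, p_man):
    """Return {image: (c, s)}: causal class index c and spurious correlate s."""
    causal_values = sorted({p_man[i] for i in images})
    labels = {}
    for c_idx, c_val in enumerate(causal_values):
        members = [i for i in images if p_man[i] == c_val]
        # Order the observational classes inside this causal class (arbitrary choice).
        obs_values = sorted({p_obs[i] for i in members})
        for i in members:
            labels[i] = (c_idx, obs_values.index(p_obs[i]) + 1)
    return labels

# Hypothetical image types: which bars are present.
images = ["none", "v", "h", "vh"]
p_obs = {"none": Fraction(1), "v": Fraction(2, 3), "h": Fraction(1, 3), "vh": Fraction(0)}
p_man = {"none": Fraction(5, 6), "v": Fraction(5, 6), "h": Fraction(1, 6), "vh": Fraction(1, 6)}

labels = macro_variables(images, p_obs, p_man)

# Statement 1 of Theorem 7 on this toy set: (C, S) determines P(T | I).
by_label = {}
for image, label in labels.items():
    by_label.setdefault(label, set()).add(p_obs[image])
label_determines_p_obs = all(len(v) == 1 for v in by_label.values())
```

With this ordering, $S = 1$ exactly for the image types that contain a v-bar, so the recovered spurious correlate matches the one named in Section 1.2 up to relabeling.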
The do-operator only sets the value of the intervened variable to the desired value, making it independent of its causes, but it does not (directly) affect the other variables in the system or the relationships between them (see the modularity assumption in Pearl (2009)). However, unlike the standard case where causal variables are separated in location (e.g. smoking and lung cancer), the causal variables in an image may involve the same pixels: $C$ may be the average brightness of the image, whereas $S$ may indicate the presence or absence of particular shapes in the image. An intervention on a causal variable using the do-operator thus requires that the underlying manipulation of the image respects the state of the other causal variables:

[3] We note that $C$ may retain predictive information about $T$ that is not causal, i.e. it is not the case that all spurious correlations can be accounted for in $S$. See Appendix 9 for an example.

Definition 8 (Causal Intervention on Macro-variables). Given the set of macro-variables $\{C, S\}$ that take on values $\{c, s\}$ for an image $i \in \mathcal{I}$, an intervention $do(C = c')$ on the macro-variable $C$ is given by the manipulation of the image $man(I = i')$ such that $C(i') = c'$ and $S(i') = s$. The intervention $do(S = s')$ is defined analogously as the change of the underlying image that keeps the value of $C$ constant.

In some cases it can be impossible to manipulate $C$ to a desired value without changing $S$. We do not take this to be a problem special to our case. In fact, in the standard macro-variable setting of causal analysis we would expect interventions to be much more restricted by physical constraints than we are with our interventions in the image space.

3 CAUSAL FEATURE LEARNING: INFERENCE ALGORITHMS

Given the theoretical specification of the concepts of interest in the previous section, we can now develop algorithms to learn $C$, the visual cause of a behavior. In addition, knowledge of $C$ will allow us to specify a manipulator function: a function that, given any image, can return a maximally similar image with the desired causal effect.

Definition 9 (Manipulator Function). Let $C$ be the causal variable of $T$ and $d$ a metric on $\mathcal{I}$. The manipulator function of $C$ is a function $M_C\colon \mathcal{I} \times \mathcal{C} \to \mathcal{I}$ such that $M_C(i, k) = \arg\min_{\hat{\imath} \in C^{-1}(k)} d(i, \hat{\imath})$ for any $i \in \mathcal{I}$, $k \in \mathcal{C}$. In case $d(i, \cdot)$ has multiple minima, we group them together into one equivalence class and leave the choice of the representative to the manipulator function.

The manipulator searches for an image closest to $i$ among all the images with the desired causal effect $k$. The meaning of "closest" depends on the metric $d$ and is discussed further in Section 3.2 below. Note that the manipulator function can find candidates for the image manipulation underlying the desired causal manipulation $do(C = c)$, but it does not check whether other variables in the system (in particular, the spurious correlate) remain in fact unchanged. Using the closest possible image with the desired causal effect is a heuristic approach to fulfilling that requirement.

Algorithm 1: Causal Predictor Training
input : $D_{obs} = \{(i_1, p_1 = p(T \mid i_1)), \dots, (i_N, p_N = p(T \mid i_N))\}$, the observational data
        $P = \{P_1, \dots, P_M\}$, the set of observational classes (so that $p_k \in P$ for all $1 \le k \le N$)
        Train, a neural net training algorithm
output: $C\colon \mathcal{I} \to [0, 1]$, the causal variable
1 Pick $\{i_{k_1}, \dots, i_{k_M}\} \subset \{i_1, \dots, i_N\}$ s.t. $p_{k_m} = P_m$;
2 Estimate $\hat{C}_m \leftarrow P(T \mid man(I = i_{k_m}))$ for each $m$;
3 For all $k$ let $\hat{C}(i_k) \leftarrow \hat{C}_m$ if $p_k = P_m$;
4 $D_{csl} \leftarrow \{(i_1, \hat{C}(i_1)), \dots, (i_N, \hat{C}(i_N))\}$;
5 $C \leftarrow$ Train($D_{csl}$);

There are several reasons why we might want such a manipulator function:

- If our goal is to perform causal manipulations on images, the manipulator function offers an automated solution.
- A manipulator that uses a given $C$ and produces images with the desired causal effect provides strong evidence that $C$ is indeed the visual cause of the behavior.
- Using the manipulator function we can enrich our dataset with new datapoints, in hope of achieving better generalization on both the causal and predictive learning tasks.

The problem of visual causal feature learning can now be posed as follows: Given an image space $\mathcal{I}$ and a metric $d$, learn $C$, the visual cause of $T$, and the manipulator $M_C$.
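On an image space small enough to enumerate, Definition 9 can be evaluated by exhaustive search. The sketch below does this for all 3x3 binary images, taking the h-bar indicator as the causal variable and the L2 distance as the metric; both choices and all function names are assumptions made for illustration, and exhaustive search is not the learning procedure developed in Section 3.2.

```python
import itertools
import numpy as np

def contains_h_bar(image):
    """Illustrative causal variable: 1 if some row is entirely black (1)."""
    return int(np.any(np.all(image == 1, axis=1)))

def l2(a, b):
    return float(np.linalg.norm(a - b))

def manipulator(image, k, candidates, causal_fn, dist):
    """Brute-force M_C(i, k): the candidate closest to `image` with causal class k."""
    feasible = [j for j in candidates if causal_fn(j) == k]
    if not feasible:
        raise ValueError("no candidate image has the requested causal class")
    # Ties are broken by list order, i.e. the choice of representative is arbitrary.
    return min(feasible, key=lambda j: dist(image, j))

# Enumerate the whole (tiny) image space: 2**9 binary 3x3 images.
all_images = [np.array(bits).reshape(3, 3)
              for bits in itertools.product([0, 1], repeat=9)]

blank = np.zeros((3, 3), dtype=int)
with_bar = manipulator(blank, 1, all_images, contains_h_bar, l2)
without_bar = manipulator(with_bar, 0, all_images, contains_h_bar, l2)
```

Starting from a blank image, the closest image with an h-bar has exactly one filled row; removing the bar again costs a single pixel flip. Enumeration grows as $2^n$ in the number of pixels, which is why an approximate search is needed for realistic images.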
3.1 CAUSAL EFFECT PREDICTION

A standard machine learning approach to learning the relation between $I$ and $T$ would be to take an observational dataset $D_{obs} = \{(i_k, P(T \mid i_k))\}_{k=1,\dots,N}$ and learn a predictor $f$ whose training performance guarantees a low test error (so that $f(i^*) \approx P(T \mid i^*)$ for a test image $i^*$). In causal feature learning, low test error on observational data is insufficient; it is entirely possible that $D_{obs}$ contains spurious information useful in predicting test labels which is nevertheless not causal. That is, the prediction may be highly accurate for observational data, but completely inaccurate for a prediction of the effect of a manipulation of the image (recall the barometer example). However, we can use the CCT to obtain a causal dataset from the observational data, and then train a predictor on that dataset. Algorithm 1 uses this strategy to learn a function $C$ that, presented with any image $i \in \mathcal{I}$, returns $C(i) \approx P(T \mid man(I = i))$. We use a fixed neural network architecture to learn $C$, but any differentiable hypothesis class could be substituted instead. Differentiability of $C$ is necessary in Section 3.2 in order to learn the manipulator function.
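Steps 1 to 4 of Algorithm 1 amount to relabeling the observational data with one experiment per observational class. A minimal sketch follows, assuming the observational probabilities are exact so that equal values identify a class (estimated probabilities would first need binning or clustering); `oracle` is a hypothetical stand-in for the experimental estimate of $P(T \mid man(I = i))$, and Step 5 (training) is left out.

```python
def build_causal_dataset(images, obs_probs, oracle):
    """Steps 1-4 of Algorithm 1: one oracle query per observational class."""
    # Step 1: pick one representative image per observational class.
    representatives = {}
    for image, p in zip(images, obs_probs):
        representatives.setdefault(p, image)
    # Step 2: estimate the causal effect of each representative by experiment.
    class_effect = {p: oracle(rep) for p, rep in representatives.items()}
    # Steps 3-4: propagate the causal label to every image in the class.
    return [(image, class_effect[p]) for image, p in zip(images, obs_probs)]

# Toy usage: images are named by the bars they contain; values are P(T=0 | ...).
calls = []

def toy_oracle(image):
    calls.append(image)  # record how many experiments were run
    return 0.17 if "h" in image else 0.83

images = ["vh", "h", "v", "none", "h", "vh", "none", "v"]
obs_probs = [0.0, 0.33, 0.66, 1.0, 0.33, 0.0, 1.0, 0.66]
causal_data = build_causal_dataset(images, obs_probs, toy_oracle)
```

Eight images fall into four observational classes, so the oracle is queried four times; relying on the CCT, every image then inherits the causal label of its class representative.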

In Step 1 the algorithm picks a representative member of each observational class. The CCT tells us that the causal partition coarsens the observational one. That is, in principle (ignoring sampling issues) it is sufficient to estimate $\hat{C}_m = P(T \mid man(I = i_{k_m}))$ for just one image in an observational class $m$ in order to know that $P(T \mid man(I = i)) = \hat{C}_m$ for any other $i$ in the same observational class. The choice of the experimental method of estimating the causal class in Step 2 is left to the user and depends on the behaving agent and the behavior in question. If, for example, $T$ represents whether the spiking rate of a recorded neuron is above a fixed threshold, estimating $P(T \mid man(I = i))$ could consist of recording the neuron's response to $i$ in a laboratory setting multiple times, and then calculating the probability of spiking from the finite sample. The causal dataset created in Step 4 consists of the observational inputs and their causal classes. The causal dataset is acquired through $O(M)$ experiments, where $M$ is the number of observational classes. The final step of the algorithm trains a neural network that predicts the causal labels on unseen images. The choice of the method of training is again left to the user.

3.2 CAUSAL FEATURE MANIPULATION

Once we have learned $C$ we can use the causal neural network to create synthetic examples of images as similar as possible to the originals, but with a different causal label. The meaning of "as similar as possible" depends on the image metric $d$ (see Definition 9). The choice of $d$ is task-specific and crucial to the quality of the manipulations. In our experiments, we use a metric induced by an $L_2$ norm. Alternatives include other $L_p$-induced metrics, distances in implicit feature spaces induced by image kernels (Harchaoui and Bach, 2007; Grauman and Darrell, 2007; Bosch et al., 2007; Vishwanathan, 2010) and distances in learned representation spaces (Bengio et al., 2013).
Algorithm 2 proposes one way to learn the manipulator function using a simple manipulation procedure that approximates the requirements of Definition 9 up to local minima. The algorithm, inspired by the active learning techniques of uncertainty sampling (Lewis and Gale, 1994) and density weighing (Settles and Craven, 2008), starts off by training a causal neural network in Step 2. If only observational data is available, this can be achieved using Algorithm 1. Next, it randomly chooses a set of images to be manipulated, and their target post-manipulation causal labels. The loop that starts in Step 5 then takes each of those images and searches for the image that, among the images with the same desired causal class, is closest to the original image. Note that the causal class boundaries are defined by the current causal neural net $C$. Since $C$ is in general a highly nonlinear function and it can be hard to find its inverse sets, we use an approximate solution. The algorithm thus finds the minimum of a weighted sum of $|C(j) - \hat{c}_{l,k}|$ (the difference of the output image $j$'s label and the desired label $\hat{c}_{l,k}$) and $d(i_{l,k}, j)$ (the distance of the output image $j$ from the original image $i_{l,k}$).

Algorithm 2: Manipulator Function Learning
input : $d\colon \mathcal{I} \times \mathcal{I} \to \mathbb{R}_+$, a metric on the image space
        $D_{csl} = \{(i_1, c_1), \dots, (i_N, c_N)\}$, the causal data
        $\mathcal{C} = \{C_1, \dots, C_M\}$, the set of causal classes (so that $c_i \in \mathcal{C}$ for all $i$)
        Train, a neural net training algorithm
        nIters, the number of experiment iterations
        $Q$, the number of queries per iteration
        $\alpha$, the manipulation tuning parameter
        $A\colon \mathcal{I} \to \mathcal{C}$, an oracle for $P(T \mid do(i))$
output: $M_C\colon \mathcal{I} \times \mathcal{C} \to \mathcal{I}$, the manipulator function
1 for $l \leftarrow 1$ to nIters do
2   $C \leftarrow$ Train($D_{csl}$);
3   Choose manipulation starting points $\{i_{l,1}, \dots, i_{l,Q}\}$ at random from $D_{csl}$;
4   Choose manipulation targets $\{\hat{c}_{l,1}, \dots, \hat{c}_{l,Q}\}$ such that $\hat{c}_{l,k} \neq c_{l,k}$;
5   for $k \leftarrow 1$ to $Q$ do
6     $\hat{\imath}_{l,k} \leftarrow \arg\min_{j \in \mathcal{I}}\ (1 - \alpha)\,|C(j) - \hat{c}_{l,k}| + \alpha\, d(j, i_{l,k})$;
7   end
8   $D_{csl} \leftarrow D_{csl} \cup \{(\hat{\imath}_{l,1}, A(\hat{\imath}_{l,1})), \dots, (\hat{\imath}_{l,Q}, A(\hat{\imath}_{l,Q}))\}$;
9 end
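The objective minimized in Step 6 of Algorithm 2 can be illustrated on binary images with a simple local search. The sketch below swaps in a greedy pixel-flip search for whatever optimizer is used with the differentiable network in the text, and it uses a hand-written soft score (the fraction of black pixels in the fullest row) as a stand-in for the learned causal predictor $C$; the function names, the score and the value of `alpha` are all assumptions.

```python
import numpy as np

def soft_h_bar_score(image):
    """Stand-in for a learned C: fraction of black pixels in the fullest row."""
    return float(image.mean(axis=1).max())

def manipulate(image, target, causal_fn, alpha=0.1, max_sweeps=100):
    """Greedy local search for argmin_j (1 - alpha)|C(j) - target| + alpha * d(j, image)."""
    def objective(j):
        return (1 - alpha) * abs(causal_fn(j) - target) + alpha * np.linalg.norm(j - image)

    current = image.copy()
    best = objective(current)
    for _ in range(max_sweeps):
        improved = False
        for idx in np.ndindex(current.shape):
            candidate = current.copy()
            candidate[idx] = 1 - candidate[idx]  # flip one pixel
            value = objective(candidate)
            if value < best - 1e-12:  # accept only strict improvements
                current, best, improved = candidate, value, True
        if not improved:
            break  # local minimum reached
    return current

start = np.zeros((4, 4), dtype=int)
result = manipulate(start, target=1.0, causal_fn=soft_h_bar_score, alpha=0.1)
```

On a blank 4x4 image with target label 1, the search fills a single row and then stops, since any further flip increases the distance term without improving the label term. Like the procedure described above, it only guarantees a local minimum.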
At each iteration, the algorithm performs $Q$ manipulations and the same number of causal queries to the agent, which result in new datapoints $(\hat{\imath}_{l,1}, A(\hat{\imath}_{l,1})), \dots, (\hat{\imath}_{l,Q}, A(\hat{\imath}_{l,Q}))$. It is natural to claim that the manipulator performs well if $A(\hat{\imath}_{l,k}) \approx \hat{c}_{l,k}$ for many $k$, which means the target causal labels agree with the true causal labels. We thus define the manipulation error of the $l$-th iteration $MErr_l$ as

$$MErr_l = \frac{1}{Q} \sum_{k=1}^{Q} \left| A(\hat{\imath}_{l,k}) - \hat{c}_{l,k} \right|. \tag{2}$$

While it is important that our manipulations are accurate, we also want them to be minimal. Another measure of interest is thus the average manipulation distance

$$MDist_l = \frac{1}{Q} \sum_{k=1}^{Q} d(i_{l,k}, \hat{\imath}_{l,k}). \tag{3}$$

A natural variant of Algorithm 2 is to set nIters to a large integer and break the loop when one or both of these performance criteria reaches a desired value.
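Both performance measures are plain averages over the $Q$ manipulations of one iteration. A minimal sketch, assuming causal labels are numeric and $d$ is the L2 distance used in the experiments (the function names are illustrative):

```python
import numpy as np

def manipulation_error(oracle_labels, target_labels):
    """MErr_l of Eq. (2): mean absolute gap between oracle and target labels."""
    oracle_labels = np.asarray(oracle_labels, dtype=float)
    target_labels = np.asarray(target_labels, dtype=float)
    return float(np.mean(np.abs(oracle_labels - target_labels)))

def manipulation_distance(originals, manipulated):
    """MDist_l of Eq. (3): mean L2 distance between originals and manipulations."""
    return float(np.mean([np.linalg.norm(a - b)
                          for a, b in zip(originals, manipulated)]))

# Toy usage with Q = 4 binary labels and Q = 2 image pairs.
merr = manipulation_error([1, 0, 1, 1], [1, 1, 1, 0])

originals = [np.zeros((2, 2)), np.zeros((2, 2))]
manipulated = [np.array([[1.0, 0.0], [0.0, 0.0]]), np.ones((2, 2))]
mdist = manipulation_distance(originals, manipulated)
```

In the toy data two of four manipulations miss their target, giving an error of 0.5, and the two image pairs differ by L2 distances 1 and 2, giving an average distance of 1.5.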

4 EXPERIMENTS

In order to illustrate the concepts presented in this article we perform two causal feature learning experiments. The first experiment, called GRATING, uses observational and causal data generated by the model from Section 1.2. The GRATING experiment confirms that our system can learn the ground truth cause and ignore the spurious correlates of a behavior. The second experiment, MNIST, uses images of hand-written digits (LeCun et al., 1998) to exemplify the use of the manipulator function on slightly more realistic data: in this example, we transform an image into a maximally similar image with another class label. We chose problems that are simple from the computer vision point of view. Our goal is to develop the theory of visual causal feature learning and show that it has feasible algorithmic solutions; we are at this point not engineering advanced computer vision systems.

4.1 THE GRATING EXPERIMENT

In this experiment we generate data using the model of Fig. 1, with two minor differences: $H_1$ and $H_2$ only induce one v-bar or h-bar in the image, and we restrict our observational dataset to images with only about 3% of the pixels filled with random noise (see Fig. 5). Both restrictions increase the clarity of presentation. We use Algorithms 1 and 2 (with minor modifications imposed by the binary nature of the images) to learn the visual cause of behavior $T$.

Figure 5 (top) shows the progress of the training process. The first step (not shown in the figure) uses the CCT to learn the causal labels on the observational data. We then train a simple neural network (a fully connected network with one hidden layer of 100 units) on this data. The same network is used on Iteration 1 to create new manipulated exemplars. We then follow Algorithm 2 to train the manipulator iteratively. Fig. 5 (bottom) illustrates the difference between the manipulator on Iteration 1 (which fails almost 40% of the time) and Iteration 20, where the error is about 6%.
Each column shows example manipulations of a particular kind. Columns with green labels indicate successful manipulations, of which there are two kinds: switching the causal variable on (0 → 1, adding the h-bar), or switching it off (1 → 0, removing the h-bar). Red-labeled columns show cases in which the manipulator failed to influence the cause: that is, each red column shows an original image and its manipulated version which the manipulator believes should cause a change in $T$, but which does not induce such a change. The red/green horizontal bars show the percentage of success/error for each manipulation direction. Fig. 5 (bottom, a) shows that after training on the causally-coarsened observational dataset, the manipulator fails about 40% of the time. In Fig. 5 (b), after twenty manipulator learning iterations, only six manipulations out of a hundred are unsuccessful. Furthermore, the causally irrelevant image pixels are also much better preserved than at Iteration 1. The fully-trained manipulator correctly learned to manipulate the presence of the h-bar to cause changes in $T$, and ignores the v-bar that is strongly correlated with the behavior but does not cause it.

Figure 5: Manipulator learning for GRATING. Top. The plots show the progress of our manipulator function learning algorithm over ten iterations of experiments for the GRATING problem. The manipulation error decreases quickly with progressing iterations, whereas the manipulation distance stays close to constant. Bottom. Original and manipulated GRATING images: (a) Iteration 1, (b) Iteration 20. See text for the details.

4.2 THE MNIST ON MTURK EXPERIMENT

In this experiment we start with the MNIST dataset of handwritten digits. In our terminology, this as well as any standard vision dataset is already causal data: the labels are assigned in an experimental setting, not in nature.
Consider the following binary human behavior: $T = 1$ if a human observer answers affirmatively to the question "Does this image contain the digit 7?", while $T = 0$ if the observer judges that the image does not contain the digit 7. For simplicity we will assume that for any image either $P(T = 1 \mid man(i)) = 0$ or $P(T = 1 \mid man(i)) = 1$. Our task is to learn the manipulator function that will take any image and modify it minimally such that it will become a 7 if it was not before, or will stop resembling a 7 if it did originally.

We conduct the manipulator training separately for all the ten MNIST digits using human annotators on Amazon Mechanical Turk. The exact training procedure is described in Appendix 10. Fig. 6 (top) shows training progress. As in Fig. 5, the manipulation error decreases with training. Fig. 6 (bottom) visualizes the manipulator training progress. In the first row we see a randomly chosen MNIST 9 being manipulated to resemble a 0, pushed through successive 0-vs-all manipulators trained at iterations 0, 1, ..., 5 (iteration 1 shows what the neural net takes to be the closest manipulation to change the 9 to a 0 purely on the basis of the non-manipulated data). Further rows perform similar experiments for the other digits. The plots show how successive manipulators progressively remove the original digits' features and add target class features to the image.

Figure 6: Manipulator Learning for MNIST ON MTURK. Top. In contrast to the GRATING experiment, here the manipulation distance grows as the manipulation error decreases. This is because a successful manipulator needs to change significant parts of each image (such as continuous strokes). Bottom. Visualization of manipulator training on randomly selected (not cherry-picked) MNIST digits. See text for the details.

5 DISCUSSION

We provide a link between causal reasoning and neural network models that have recently enjoyed tremendous success in the fields of machine learning and computer vision (LeCun et al., 1998; Russakovsky et al., 2014).
Despite very encouraging results in image classification (Krizhevsky et al., 2012), object detection (Dollar et al., 2012) and fine-grained classification (Branson et al., 2014; Zhang et al., 2014), some researchers have found that visual neural networks can be easily fooled using adversarial examples (Szegedy et al., 2014; Goodfellow et al., 2014). The learning procedure for our manipulator function could be viewed as an attempt to train a classifier that is robust against such examples. The procedure uses causal reasoning to improve on the boundaries of a standard, correlational classifier (Figs. 5 and 6 show the improvement). However, the ultimate purpose of a causal manipulator network is to extract truly causal features from data and automatically perform causal manipulations based on those features.

A second contribution concerns the field of causal discovery. Modern causal discovery algorithms presuppose that the set of causal variables is well-defined and meaningful. What exactly this presupposition entails is unclear, but there are clear counter-examples: $x$ and $2x$ cannot be two distinct causal variables. There are also well understood problems when causal variables are aggregates of other variables (Chu et al., 2003; Spirtes and Scheines, 2004). We provide an account of how causal macro-variables can supervene on micro-variables.

This article is an attempt to clarify how one may construct a set of well-defined causal macro-variables that function as basic relata in a causal graphical model. This step strikes us as essential if causal methodology is to be successful in areas where we do not have clearly delineated candidate causes or where causes supervene on micro-variables, such as in climate science, neuroscience, economics and, in our specific case, vision.

Acknowledgements

KC's work was funded by the Qualcomm Innovation Fellowship. KC's and PP's work was supported by the ONR MURI grant N. FE would like to thank Cosma Shalizi for pointers to many relevant results this paper builds on.

6 APPENDIX: PROOF OF THE CAUSAL COARSENING THEOREM

Before we prove the Causal Coarsening Theorem, we prove its less general version in order to split the rather complex proof of the CCT into two parts. This Auxiliary Theorem can be proven using simpler techniques; however, here we deliberately use techniques that transfer directly to the proof of the CCT.

Auxiliary Theorem. Among all the generative models of the form discussed in Fig. 2 (in the main text), the subset of distributions $P(T, H, I)$ for which the causal partition is not a coarsening (proper or improper) of the observational partition has Lebesgue measure zero.

Proof. Our proof is inspired by a proof used by Meek (1995) to prove that almost all distributions compatible with a given causal graph are faithful. The proof strategy is thus first to express the proposition that, for a given distribution, the observational partition does not refine the causal partition as a polynomial equation on the space of all distributions compatible with the model. We then show that this polynomial equation is not trivial, i.e. there is at least one distribution that is not its root. By a simple algebraic lemma, this will prove the theorem.

We extend Meek's proof technique in our usage of Fubini's Theorem for the Lebesgue integral. It allows us to split the polynomial constraint into multiple different constraints along several of the distribution parameters. This allows for additional flexibility in creating useful assumptions (in our proof, the assumption that the datapoints have well-defined causal classes, but the observational class can still vary freely).

Assume that $T$ is binary and $H = (H_1, \ldots, H_M)$, $I$ are discrete variables (say $|H_i| = K_i$, $|I| = N$, though $N$ can be very large; we will use the notation $K \triangleq K_1 \times \cdots \times K_M$ for simplicity later on). The discreteness assumption is not crucial, but will simplify the reasoning. We can factorize the joint as $P(T, H, I) = P(T \mid H, I)\, P(I \mid H)\, P(H)$.
$P(T \mid H, I)$ can be parametrized by $|H_1| \times \cdots \times |H_M| \times |I| = K \times N$ parameters, $P(I \mid H)$ by $(N - 1) \times K$ parameters, and $P(H)$ by another $K$ parameters, all of which are independent. Call the parameters, respectively,

$$\alpha_{h,i} \triangleq P(T = 0 \mid H = h, I = i), \qquad \beta_{i,h} \triangleq P(I = i \mid H = h), \qquad \gamma_h \triangleq P(H = h).$$

We will denote parameter vectors as

$$\alpha = (\alpha_{h_1,i_1}, \ldots, \alpha_{h_K,i_N}) \in \mathbb{R}^{K \times N}, \qquad \beta = (\beta_{i_1,h_1}, \ldots, \beta_{i_{N-1},h_K}) \in \mathbb{R}^{(N-1) \times K}, \qquad \gamma = (\gamma_{h_1}, \ldots, \gamma_{h_K}) \in \mathbb{R}^{K},$$

where the indices are arranged in lexicographical order. This creates a one-to-one correspondence of each possible joint distribution $P(T, H, I)$ with a point $(\alpha, \beta, \gamma) \in P[\alpha, \beta, \gamma] \subset \mathbb{R}^{K^3 \times N \times (N-1)}$, where $P[\alpha, \beta, \gamma]$ is the $K^3 \times N \times (N-1)$-dimensional simplex of multinomial distributions.

To proceed with the proof, we first pick any point in the $P(T \mid H, I) \times P(H)$ space: that is, we fix the values of $\alpha$ and $\gamma$. The only free parameters are now $\beta_{i,h}$ for all values of $i, h$; varying these values creates a subset of the space of all the distributions, which we will call

$$P[\beta; \alpha, \gamma] = \{(\alpha, \beta, \gamma) \mid \beta \in [0, 1]^{(N-1) \times K}\}.$$

$P[\beta; \alpha, \gamma]$ is a subset of $P[\alpha, \beta, \gamma]$ isometric to the $[0, 1]^{(N-1) \times K}$-dimensional simplex of multinomials. We will use the term $P[\beta; \alpha, \gamma]$ to refer both to the subset of $P[\alpha, \beta, \gamma]$ and to the lower-dimensional simplex it is isometric to, remembering that the latter comes equipped with the Lebesgue measure on $\mathbb{R}^{(N-1) \times K}$.

Now we are ready to show that the subset of $P[\beta; \alpha, \gamma]$ which does not satisfy the Causal Coarsening constraint is of measure zero with respect to the Lebesgue measure. To see this, first note that since $\alpha$ and $\gamma$ are fixed, each image $i$ has a well-defined causal class $C(i) = \sum_h \alpha_{h,i} \gamma_h$. The Causal Coarsening constraint says: "For every pair of images $i, j$ such that $P(T \mid i) = P(T \mid j)$ it holds that $C(i) = C(j)$." The subset of $P[\beta; \alpha, \gamma]$ of all distributions that do not satisfy the constraint consists of the $P(T, H, I)$ for which for some $i, j$ it holds that $P(T = 0 \mid i) = P(T = 0 \mid j)$ and $C(i) \neq C(j)$. Take any pair $i, j$ for which $C(i) \neq C(j)$ (if such a pair does not exist, then the Causal Coarsening constraint holds for all the distributions in $P[\beta; \alpha, \gamma]$).
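To make the two quantities in the constraint concrete, the sketch below computes the causal class $C(i) = \sum_h \alpha_{h,i}\gamma_h$ and the observational probability $P(T = 0 \mid i)$ (obtained by marginalizing over $h$) for a small discrete model, and checks whether one partition is a coarsening of the other. The array layout (`alpha[h][i]`, `beta[i][h]`, `gamma[h]`), the tolerance-based grouping and all function names are assumptions of this illustration, not part of the paper.

```python
def causal_class(alpha, gamma, i):
    """C(i) = sum_h alpha[h][i] * gamma[h]."""
    return sum(alpha[h][i] * gamma[h] for h in range(len(gamma)))


def observational_prob(alpha, beta, gamma, i):
    """P(T=0 | i) = sum_h alpha[h][i] beta[i][h] gamma[h] / sum_h beta[i][h] gamma[h]."""
    hs = range(len(gamma))
    numerator = sum(alpha[h][i] * beta[i][h] * gamma[h] for h in hs)
    denominator = sum(beta[i][h] * gamma[h] for h in hs)
    return numerator / denominator


def partition(values, tol=1e-9):
    """Group indices whose values agree (up to tol) with a class representative."""
    classes = []
    for idx, value in enumerate(values):
        for cls in classes:
            if abs(values[cls[0]] - value) <= tol:
                cls.append(idx)
                break
        else:
            classes.append([idx])
    return classes


def is_coarsening(coarse, fine):
    """True if every class of `fine` is contained in some class of `coarse`."""
    return all(any(set(f) <= set(c) for c in coarse) for f in fine)
```

For example, with a binary $H$ and three images one can take `gamma = [0.5, 0.5]`, `alpha = [[0.2, 0.2, 0.8], [0.4, 0.4, 0.6]]` and `beta = [[0.5, 0.2], [0.3, 0.3], [0.2, 0.5]]`. Images 0 and 1 then share a causal class but have different observational probabilities, so the causal partition is a proper coarsening of the observational one.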
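We can write

$$P(T = 0 \mid i) = \sum_h P(T = 0 \mid h, i)\, P(h \mid i) = \frac{1}{P(i)} \sum_h P(T = 0 \mid h, i)\, P(i \mid h)\, P(h).$$

Since the same equation applies to $P(T = 0 \mid j)$, the constraint $P(T \mid i) = P(T \mid j)$ can be rewritten

$$\frac{1}{P(i)} \sum_h P(T = 0 \mid h, i)\, P(i \mid h)\, P(h) = \frac{1}{P(j)} \sum_h P(T = 0 \mid h, j)\, P(j \mid h)\, P(h)$$

$$P(j) \sum_h P(T = 0 \mid h, i)\, P(i \mid h)\, P(h) - P(i) \sum_h P(T = 0 \mid h, j)\, P(j \mid h)\, P(h) = 0,$$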

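which we can rewrite in terms of the independent parameters (after defining $\alpha_{0,h,i} = \alpha_{h,i}$ and $\alpha_{1,h,i} = 1 - \alpha_{h,i}$) and further simplify as

$$\sum_{t \in \{0,1\}} \sum_h \alpha_{t,h,j} \gamma_h \beta_{j,h} \sum_h \alpha_{0,h,i} \gamma_h \beta_{i,h} - \sum_{t \in \{0,1\}} \sum_h \alpha_{t,h,i} \gamma_h \beta_{i,h} \sum_h \alpha_{0,h,j} \gamma_h \beta_{j,h} = 0$$

$$\Big( \sum_h \alpha_{1,h,j} \gamma_h \beta_{j,h} \Big) \Big( \sum_h \alpha_{0,h,i} \gamma_h \beta_{i,h} \Big) - \Big( \sum_h \alpha_{1,h,i} \gamma_h \beta_{i,h} \Big) \Big( \sum_h \alpha_{0,h,j} \gamma_h \beta_{j,h} \Big) = 0$$

$$\Big( \sum_h (1 - \alpha_{h,j}) \gamma_h \beta_{j,h} \Big) \Big( \sum_h \alpha_{h,i} \gamma_h \beta_{i,h} \Big) - \Big( \sum_h (1 - \alpha_{h,i}) \gamma_h \beta_{i,h} \Big) \Big( \sum_h \alpha_{h,j} \gamma_h \beta_{j,h} \Big) = 0$$

$$\Big( \sum_h \gamma_h \beta_{j,h} \Big) \Big( \sum_h \alpha_{h,i} \gamma_h \beta_{i,h} \Big) - \Big( \sum_h \gamma_h \beta_{i,h} \Big) \Big( \sum_h \alpha_{h,j} \gamma_h \beta_{j,h} \Big) = 0, \tag{4}$$

which is a polynomial constraint on $P[\beta; \alpha, \gamma]$ (note that to keep the notation manageable, we have omitted the dependent term $1 - \sum_h \gamma_h$ from the equations). By a simple algebraic lemma (proven by Okamoto, 1973), if the above constraint is not trivial (that is, if there exists $\beta$ for which the constraint does not hold), the subset of $P[\beta; \alpha, \gamma]$ on which it holds is measure zero.

To see that Eq. (4) does not always hold, note that if for any $h^*$ we set $\beta_{i,h^*} = 1$ (and thus $\beta_{i,h} = 0$ for any $h \neq h^*$) and $\beta_{j,h^*} = 1$, the equation reduces to

$$(\gamma_{h^*})^2 (\alpha_{h^*,i} - \alpha_{h^*,j}) = 0.$$

Thus if Eq. (4) was trivially true, we would have $\alpha_{h,i} = \alpha_{h,j}$ or $\gamma_h = 0$ for all $h$. However, this implies $C(i) = C(j)$, which contradicts our assumption.

We have now shown that the subset of $P[\beta; \alpha, \gamma]$ which consists of distributions for which $P(T \mid i) = P(T \mid j)$ (even though $C(i) \neq C(j)$) is Lebesgue measure zero. Since there are only finitely many pairs of images $i, j$ for which $C(i) \neq C(j)$, the subset of $P[\beta; \alpha, \gamma]$ of distributions which violate the Causal Coarsening constraint is also Lebesgue measure zero. The remainder of the proof is a direct application of Fubini's theorem.

For each $\alpha, \gamma$, call the (measure zero) subset of $P[\beta; \alpha, \gamma]$ that violates the Causal Coarsening constraint $z[\alpha, \gamma]$. Let $Z = \bigcup_{\alpha, \gamma} z[\alpha, \gamma] \subset P[\alpha, \beta, \gamma]$ be the set of all the joint distributions which violate the Causal Coarsening constraint. We want to prove that $\mu(Z) = 0$, where $\mu$ is the Lebesgue measure. To show this, we will use the indicator function

$$\hat{z}(\alpha, \beta, \gamma) = \begin{cases} 1 & \text{if } \beta \in z[\alpha, \gamma], \\ 0 & \text{otherwise.} \end{cases}$$

By the basic properties of positive measures we have

$$\mu(Z) = \int_{P[\alpha, \beta, \gamma]} \hat{z} \, d\mu.$$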
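It is a standard application of Fubini's Theorem for the Lebesgue integral to show that the integral in question equals zero. For simplicity of notation, let

$$\mathcal{A} = \mathbb{R}^{K \times N}, \qquad \mathcal{B} = \mathbb{R}^{N \times K}, \qquad \mathcal{G} = \mathbb{R}^{K}.$$

We have

$$\begin{aligned}
\int_{P[\alpha, \beta, \gamma]} \hat{z} \, d\mu &= \int_{\mathcal{A} \times \mathcal{B} \times \mathcal{G}} \hat{z}(\alpha, \beta, \gamma) \, d(\alpha, \beta, \gamma) \\
&= \int_{\mathcal{A} \times \mathcal{G}} \int_{\mathcal{B}} \hat{z}(\alpha, \beta, \gamma) \, d(\beta) \, d(\alpha, \gamma) \\
&= \int_{\mathcal{A} \times \mathcal{G}} \mu(z[\alpha, \gamma]) \, d(\alpha, \gamma) \qquad (5) \\
&= \int_{\mathcal{A} \times \mathcal{G}} 0 \, d(\alpha, \gamma) \\
&= 0.
\end{aligned}$$

Equation (5) follows as $\hat{z}$ restricted to $P[\beta; \alpha, \gamma]$ is the indicator function of $z[\alpha, \gamma]$. This completes the proof that $Z$, the set of joint distributions over $T$, $H$ and $I$ that violate the Causal Coarsening constraint, is measure zero.

We are now ready to prove the main theorem.

Theorem (Causal Coarsening Theorem). Among all the generative models of the form discussed in Fig. 2 (in the main text) that have distributions $P(T, H, I)$ that induce some given observational partition $\Pi_o$, almost all induce a causal partition $\Pi_c$ that is a coarsening of $\Pi_o$.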

Proof. Any variables that appear in this proof without definition are defined in the proof of the Auxiliary Theorem. We take the same $\alpha, \beta, \gamma$ parametrization of distributions. Fixing an observational partition means fixing a set of observational constraints (OCs)

$$\begin{aligned}
P(T \mid i^1_1) &= \cdots = P(T \mid i^1_{N_1}), \\
&\;\;\vdots \\
P(T \mid i^L_1) &= \cdots = P(T \mid i^L_{N_L}),
\end{aligned}$$

where $1 \leq L \leq N$ is the number of observational classes. Since $P(T, H, I) = P(H \mid T, I)\, P(T \mid I)\, P(I)$, $P(T \mid i)$ is an independent parameter in the unrestricted $P(T, H, I)$, and the OCs reduce the number of independent parameters of the joint by $\sum_{l=1}^{L} (N_l - 1)$. We want to express this parameter-space reduction in terms of the $\alpha$, $\beta$ and $\gamma$ parameterization and then apply the proof of the Auxiliary Theorem.

To do this, for each observational class $l$, choose a representative image $\hat{i}_l$ such that $P(T \mid i^l_m) = P(T \mid \hat{i}_l)$ for all $m \in 1 \ldots N_l$. Then for each $i^l_m \neq \hat{i}_l$ it holds that

$$P(T, i^l_m) = P(T \mid \hat{i}_l)\, P(i^l_m)$$

or

$$\sum_h P(T, h, i^l_m) = P(T \mid \hat{i}_l) \sum_h P(h, i^l_m).$$

Picking an arbitrary $h_0$, we can separate the left-hand side as

$$P(T, h_0, i^l_m) = P(T \mid \hat{i}_l) \sum_h P(h, i^l_m) - \sum_{h \neq h_0} P(T, h, i^l_m).$$

Finally, for $T = 0$ this equation can be rewritten in terms of $\alpha$, $\beta$ and $\gamma$ as

$$\alpha_{h_0, i^l_m}\, \beta_{i^l_m, h_0}\, \gamma_{h_0} = P(T = 0 \mid \hat{i}_l) \sum_h \beta_{i^l_m, h} \gamma_h - \sum_{h \neq h_0} \alpha_{h, i^l_m}\, \beta_{i^l_m, h}\, \gamma_h,$$

or

$$\alpha_{h_0, i^l_m} = \frac{P(T = 0 \mid \hat{i}_l) \sum_h \beta_{i^l_m, h} \gamma_h - \sum_{h \neq h_0} \alpha_{h, i^l_m}\, \beta_{i^l_m, h}\, \gamma_h}{\beta_{i^l_m, h_0}\, \gamma_{h_0}}$$

for any $i^l_m \neq \hat{i}_l$. There are precisely $\sum_{l=1}^{L} (N_l - 1)$ such equations, altogether equivalent to the observational constraints. Thus we can express any $P(T, H, I)$ distribution that is consistent with a given observational partition in terms of the full range of $\beta$ and $\gamma$ parameters, and a restricted number of independent $\alpha$ parameters.

The rest of the proof now follows similarly to the proof of the Auxiliary Theorem and shows that within this restricted parameter space, the set of parameters for which the (fixed) observational partition is not a refinement of the causal partition is measure zero.

7 APPENDIX: CCT EXAMPLES AND COUNTER-EXAMPLES

In Fig. 7 we provide examples of three distributions over binary variables $H$, $T$ and three-valued $I$.
The first model induces a causal partition that is a proper coarsening of the observational partition, and thus agrees with the CCT. The second model induces an observational partition that is a proper coarsening of the causal partition. The CCT implies that this is a measure-zero case and that, after fixing the observational partition, we had to carefully tweak the parameters to align the causal partition as it is. The third model induces causal and observational partitions that are incompatible; that is, neither is a coarsening of the other. This is also a measure-zero case. We provide a Tetrad (http://…) file that contains these three models at http://vision.caltech.edu/~kchalupk/code.html. It can be used to verify our observational and causal partition computations.

8 APPENDIX: PROOF OF THE COMPLETE MACRO-VARIABLE DESCRIPTION THEOREM

Theorem (Complete Macro-variable Description). The following two statements hold for $C$ and $S$ as defined in the main text:

1. $P(T \mid I) = P(T \mid C, S)$.
2. Any other variable $X$ such that $P(T \mid I) = P(T \mid X)$ has Shannon entropy $H(X) \geq H(C, S)$.

Proof. The first part follows by construction of $S$. For the second part, note that by the CCT there is a bijective correspondence between the pairs of values $(c, s)$ and the observational probabilities $P(T \mid I)$. Call this correspondence $f$, that is $f(c, s) = P(T \mid c, s)$ and $f^{-1}(p) = (c, s \text{ s.t. } P(T \mid c, s) = p)$. Further, define $g$ as the function on $X$, with $g : x \mapsto P(T \mid x)$. But since $P(T \mid X) = P(T \mid I)$, we have $(c, s) = f^{-1}(g(x))$. That is, the value of $C$ and $S$ is a function of the value of $X$, and thus the entropy of $C$ and $S$ is no larger than the entropy of $X$.

9 APPENDIX: PREDICTIVE NON-CAUSAL INFORMATION IN CAUSAL VARIABLE C

In some cases $C$ retains predictive information that is not causal. Consider the following example: we have a causal graph consisting of three variables $\{I, T, H\}$ where the causal relations are $I \to T$ and $I \leftarrow H \to T$. All three variables are binary and we have a positive distribution over


More information

Consider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx.

Consider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx. Capter 2 Integrals as sums and derivatives as differences We now switc to te simplest metods for integrating or differentiating a function from its function samples. A careful study of Taylor expansions

More information

Time (hours) Morphine sulfate (mg)

Time (hours) Morphine sulfate (mg) Mat Xa Fall 2002 Review Notes Limits and Definition of Derivative Important Information: 1 According to te most recent information from te Registrar, te Xa final exam will be eld from 9:15 am to 12:15

More information

Pre-Calculus Review Preemptive Strike

Pre-Calculus Review Preemptive Strike Pre-Calculus Review Preemptive Strike Attaced are some notes and one assignment wit tree parts. Tese are due on te day tat we start te pre-calculus review. I strongly suggest reading troug te notes torougly

More information

Differential Calculus (The basics) Prepared by Mr. C. Hull

Differential Calculus (The basics) Prepared by Mr. C. Hull Differential Calculus Te basics) A : Limits In tis work on limits, we will deal only wit functions i.e. tose relationsips in wic an input variable ) defines a unique output variable y). Wen we work wit

More information

1 1. Rationalize the denominator and fully simplify the radical expression 3 3. Solution: = 1 = 3 3 = 2

1 1. Rationalize the denominator and fully simplify the radical expression 3 3. Solution: = 1 = 3 3 = 2 MTH - Spring 04 Exam Review (Solutions) Exam : February 5t 6:00-7:0 Tis exam review contains questions similar to tose you sould expect to see on Exam. Te questions included in tis review, owever, are

More information

How to Find the Derivative of a Function: Calculus 1

How to Find the Derivative of a Function: Calculus 1 Introduction How to Find te Derivative of a Function: Calculus 1 Calculus is not an easy matematics course Te fact tat you ave enrolled in suc a difficult subject indicates tat you are interested in te

More information

f a h f a h h lim lim

f a h f a h h lim lim Te Derivative Te derivative of a function f at a (denoted f a) is f a if tis it exists. An alternative way of defining f a is f a x a fa fa fx fa x a Note tat te tangent line to te grap of f at te point

More information

SECTION 1.10: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES

SECTION 1.10: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES (Section.0: Difference Quotients).0. SECTION.0: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES Define average rate of cange (and average velocity) algebraically and grapically. Be able to identify, construct,

More information

Derivatives. By: OpenStaxCollege

Derivatives. By: OpenStaxCollege By: OpenStaxCollege Te average teen in te United States opens a refrigerator door an estimated 25 times per day. Supposedly, tis average is up from 10 years ago wen te average teenager opened a refrigerator

More information

232 Calculus and Structures

232 Calculus and Structures 3 Calculus and Structures CHAPTER 17 JUSTIFICATION OF THE AREA AND SLOPE METHODS FOR EVALUATING BEAMS Calculus and Structures 33 Copyrigt Capter 17 JUSTIFICATION OF THE AREA AND SLOPE METHODS 17.1 THE

More information

Probabilistic Graphical Models Homework 1: Due January 29, 2014 at 4 pm

Probabilistic Graphical Models Homework 1: Due January 29, 2014 at 4 pm Probabilistic Grapical Models 10-708 Homework 1: Due January 29, 2014 at 4 pm Directions. Tis omework assignment covers te material presented in Lectures 1-3. You must complete all four problems to obtain

More information

Material for Difference Quotient

Material for Difference Quotient Material for Difference Quotient Prepared by Stepanie Quintal, graduate student and Marvin Stick, professor Dept. of Matematical Sciences, UMass Lowell Summer 05 Preface Te following difference quotient

More information

Chapter 5 FINITE DIFFERENCE METHOD (FDM)

Chapter 5 FINITE DIFFERENCE METHOD (FDM) MEE7 Computer Modeling Tecniques in Engineering Capter 5 FINITE DIFFERENCE METHOD (FDM) 5. Introduction to FDM Te finite difference tecniques are based upon approximations wic permit replacing differential

More information

HOMEWORK HELP 2 FOR MATH 151

HOMEWORK HELP 2 FOR MATH 151 HOMEWORK HELP 2 FOR MATH 151 Here we go; te second round of omework elp. If tere are oters you would like to see, let me know! 2.4, 43 and 44 At wat points are te functions f(x) and g(x) = xf(x)continuous,

More information

Lecture 15. Interpolation II. 2 Piecewise polynomial interpolation Hermite splines

Lecture 15. Interpolation II. 2 Piecewise polynomial interpolation Hermite splines Lecture 5 Interpolation II Introduction In te previous lecture we focused primarily on polynomial interpolation of a set of n points. A difficulty we observed is tat wen n is large, our polynomial as to

More information

Adaptive Neural Filters with Fixed Weights

Adaptive Neural Filters with Fixed Weights Adaptive Neural Filters wit Fixed Weigts James T. Lo and Justin Nave Department of Matematics and Statistics University of Maryland Baltimore County Baltimore, MD 150, U.S.A. e-mail: jameslo@umbc.edu Abstract

More information

The Laws of Thermodynamics

The Laws of Thermodynamics 1 Te Laws of Termodynamics CLICKER QUESTIONS Question J.01 Description: Relating termodynamic processes to PV curves: isobar. Question A quantity of ideal gas undergoes a termodynamic process. Wic curve

More information

5.1 We will begin this section with the definition of a rational expression. We

5.1 We will begin this section with the definition of a rational expression. We Basic Properties and Reducing to Lowest Terms 5.1 We will begin tis section wit te definition of a rational epression. We will ten state te two basic properties associated wit rational epressions and go

More information

Continuity and Differentiability Worksheet

Continuity and Differentiability Worksheet Continuity and Differentiability Workseet (Be sure tat you can also do te grapical eercises from te tet- Tese were not included below! Typical problems are like problems -3, p. 6; -3, p. 7; 33-34, p. 7;

More information

ch (for some fixed positive number c) reaching c

ch (for some fixed positive number c) reaching c GSTF Journal of Matematics Statistics and Operations Researc (JMSOR) Vol. No. September 05 DOI 0.60/s4086-05-000-z Nonlinear Piecewise-defined Difference Equations wit Reciprocal and Cubic Terms Ramadan

More information

Optimal parameters for a hierarchical grid data structure for contact detection in arbitrarily polydisperse particle systems

Optimal parameters for a hierarchical grid data structure for contact detection in arbitrarily polydisperse particle systems Comp. Part. Mec. 04) :357 37 DOI 0.007/s4057-04-000-9 Optimal parameters for a ierarcical grid data structure for contact detection in arbitrarily polydisperse particle systems Dinant Krijgsman Vitaliy

More information

LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION

LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION LAURA EVANS.. Introduction Not all differential equations can be explicitly solved for y. Tis can be problematic if we need to know te value of y

More information

THE IDEA OF DIFFERENTIABILITY FOR FUNCTIONS OF SEVERAL VARIABLES Math 225

THE IDEA OF DIFFERENTIABILITY FOR FUNCTIONS OF SEVERAL VARIABLES Math 225 THE IDEA OF DIFFERENTIABILITY FOR FUNCTIONS OF SEVERAL VARIABLES Mat 225 As we ave seen, te definition of derivative for a Mat 111 function g : R R and for acurveγ : R E n are te same, except for interpretation:

More information

A Reconsideration of Matter Waves

A Reconsideration of Matter Waves A Reconsideration of Matter Waves by Roger Ellman Abstract Matter waves were discovered in te early 20t century from teir wavelengt, predicted by DeBroglie, Planck's constant divided by te particle's momentum,

More information

Section 2: The Derivative Definition of the Derivative

Section 2: The Derivative Definition of the Derivative Capter 2 Te Derivative Applied Calculus 80 Section 2: Te Derivative Definition of te Derivative Suppose we drop a tomato from te top of a 00 foot building and time its fall. Time (sec) Heigt (ft) 0.0 00

More information

Derivatives of trigonometric functions

Derivatives of trigonometric functions Derivatives of trigonometric functions 2 October 207 Introuction Toay we will ten iscuss te erivates of te si stanar trigonometric functions. Of tese, te most important are sine an cosine; te erivatives

More information

Notes on wavefunctions II: momentum wavefunctions

Notes on wavefunctions II: momentum wavefunctions Notes on wavefunctions II: momentum wavefunctions and uncertainty Te state of a particle at any time is described by a wavefunction ψ(x). Tese wavefunction must cange wit time, since we know tat particles

More information

Near-Optimal conversion of Hardness into Pseudo-Randomness

Near-Optimal conversion of Hardness into Pseudo-Randomness Near-Optimal conversion of Hardness into Pseudo-Randomness Russell Impagliazzo Computer Science and Engineering UC, San Diego 9500 Gilman Drive La Jolla, CA 92093-0114 russell@cs.ucsd.edu Ronen Saltiel

More information

1 Calculus. 1.1 Gradients and the Derivative. Q f(x+h) f(x)

1 Calculus. 1.1 Gradients and the Derivative. Q f(x+h) f(x) Calculus. Gradients and te Derivative Q f(x+) δy P T δx R f(x) 0 x x+ Let P (x, f(x)) and Q(x+, f(x+)) denote two points on te curve of te function y = f(x) and let R denote te point of intersection of

More information

3.4 Worksheet: Proof of the Chain Rule NAME

3.4 Worksheet: Proof of the Chain Rule NAME Mat 1170 3.4 Workseet: Proof of te Cain Rule NAME Te Cain Rule So far we are able to differentiate all types of functions. For example: polynomials, rational, root, and trigonometric functions. We are

More information

On the Identifiability of the Post-Nonlinear Causal Model

On the Identifiability of the Post-Nonlinear Causal Model UAI 9 ZHANG & HYVARINEN 647 On te Identifiability of te Post-Nonlinear Causal Model Kun Zang Dept. of Computer Science and HIIT University of Helsinki Finland Aapo Hyvärinen Dept. of Computer Science,

More information

7.1 Using Antiderivatives to find Area

7.1 Using Antiderivatives to find Area 7.1 Using Antiderivatives to find Area Introduction finding te area under te grap of a nonnegative, continuous function f In tis section a formula is obtained for finding te area of te region bounded between

More information

MAT 145. Type of Calculator Used TI-89 Titanium 100 points Score 100 possible points

MAT 145. Type of Calculator Used TI-89 Titanium 100 points Score 100 possible points MAT 15 Test #2 Name Solution Guide Type of Calculator Used TI-89 Titanium 100 points Score 100 possible points Use te grap of a function sown ere as you respond to questions 1 to 8. 1. lim f (x) 0 2. lim

More information

1watt=1W=1kg m 2 /s 3

1watt=1W=1kg m 2 /s 3 Appendix A Matematics Appendix A.1 Units To measure a pysical quantity, you need a standard. Eac pysical quantity as certain units. A unit is just a standard we use to compare, e.g. a ruler. In tis laboratory

More information

Generic maximum nullity of a graph

Generic maximum nullity of a graph Generic maximum nullity of a grap Leslie Hogben Bryan Sader Marc 5, 2008 Abstract For a grap G of order n, te maximum nullity of G is defined to be te largest possible nullity over all real symmetric n

More information

Recall from our discussion of continuity in lecture a function is continuous at a point x = a if and only if

Recall from our discussion of continuity in lecture a function is continuous at a point x = a if and only if Computational Aspects of its. Keeping te simple simple. Recall by elementary functions we mean :Polynomials (including linear and quadratic equations) Eponentials Logaritms Trig Functions Rational Functions

More information

1 Introduction Radiative corrections can ave a significant impact on te predicted values of Higgs masses and couplings. Te radiative corrections invol

1 Introduction Radiative corrections can ave a significant impact on te predicted values of Higgs masses and couplings. Te radiative corrections invol RADCOR-2000-001 November 15, 2000 Radiative Corrections to Pysics Beyond te Standard Model Clint Eastwood 1 Department of Radiative Pysics California State University Monterey Bay, Seaside, CA 93955 USA

More information

Convergence and Descent Properties for a Class of Multilevel Optimization Algorithms

Convergence and Descent Properties for a Class of Multilevel Optimization Algorithms Convergence and Descent Properties for a Class of Multilevel Optimization Algoritms Stepen G. Nas April 28, 2010 Abstract I present a multilevel optimization approac (termed MG/Opt) for te solution of

More information

Derivatives of Exponentials

Derivatives of Exponentials mat 0 more on derivatives: day 0 Derivatives of Eponentials Recall tat DEFINITION... An eponential function as te form f () =a, were te base is a real number a > 0. Te domain of an eponential function

More information

CS522 - Partial Di erential Equations

CS522 - Partial Di erential Equations CS5 - Partial Di erential Equations Tibor Jánosi April 5, 5 Numerical Di erentiation In principle, di erentiation is a simple operation. Indeed, given a function speci ed as a closed-form formula, its

More information

Technology-Independent Design of Neurocomputers: The Universal Field Computer 1

Technology-Independent Design of Neurocomputers: The Universal Field Computer 1 Tecnology-Independent Design of Neurocomputers: Te Universal Field Computer 1 Abstract Bruce J. MacLennan Computer Science Department Naval Postgraduate Scool Monterey, CA 9393 We argue tat AI is moving

More information

Mathematics 105 Calculus I. Exam 1. February 13, Solution Guide

Mathematics 105 Calculus I. Exam 1. February 13, Solution Guide Matematics 05 Calculus I Exam February, 009 Your Name: Solution Guide Tere are 6 total problems in tis exam. On eac problem, you must sow all your work, or oterwise torougly explain your conclusions. Tere

More information

Domination Problems in Nowhere-Dense Classes of Graphs

Domination Problems in Nowhere-Dense Classes of Graphs LIPIcs Leibniz International Proceedings in Informatics Domination Problems in Nowere-Dense Classes of Graps Anuj Dawar 1, Stepan Kreutzer 2 1 University of Cambridge Computer Lab, U.K. anuj.dawar@cl.cam.ac.uk

More information

0.1 Differentiation Rules

0.1 Differentiation Rules 0.1 Differentiation Rules From our previous work we ve seen tat it can be quite a task to calculate te erivative of an arbitrary function. Just working wit a secon-orer polynomial tings get pretty complicate

More information

Problem Solving. Problem Solving Process

Problem Solving. Problem Solving Process Problem Solving One of te primary tasks for engineers is often solving problems. It is wat tey are, or sould be, good at. Solving engineering problems requires more tan just learning new terms, ideas and

More information

Bounds on the Moments for an Ensemble of Random Decision Trees

Bounds on the Moments for an Ensemble of Random Decision Trees Noname manuscript No. (will be inserted by te editor) Bounds on te Moments for an Ensemble of Random Decision Trees Amit Durandar Received: Sep. 17, 2013 / Revised: Mar. 04, 2014 / Accepted: Jun. 30, 2014

More information

RECOGNITION of online handwriting aims at finding the

RECOGNITION of online handwriting aims at finding the SUBMITTED ON SEPTEMBER 2017 1 A General Framework for te Recognition of Online Handwritten Grapics Frank Julca-Aguilar, Harold Moucère, Cristian Viard-Gaudin, and Nina S. T. Hirata arxiv:1709.06389v1 [cs.cv]

More information

7 Semiparametric Methods and Partially Linear Regression

7 Semiparametric Methods and Partially Linear Regression 7 Semiparametric Metods and Partially Linear Regression 7. Overview A model is called semiparametric if it is described by and were is nite-dimensional (e.g. parametric) and is in nite-dimensional (nonparametric).

More information

Investigating Euler s Method and Differential Equations to Approximate π. Lindsay Crowl August 2, 2001

Investigating Euler s Method and Differential Equations to Approximate π. Lindsay Crowl August 2, 2001 Investigating Euler s Metod and Differential Equations to Approximate π Lindsa Crowl August 2, 2001 Tis researc paper focuses on finding a more efficient and accurate wa to approximate π. Suppose tat x

More information

Dedicated to the 70th birthday of Professor Lin Qun

Dedicated to the 70th birthday of Professor Lin Qun Journal of Computational Matematics, Vol.4, No.3, 6, 4 44. ACCELERATION METHODS OF NONLINEAR ITERATION FOR NONLINEAR PARABOLIC EQUATIONS Guang-wei Yuan Xu-deng Hang Laboratory of Computational Pysics,

More information

Quantum Numbers and Rules

Quantum Numbers and Rules OpenStax-CNX module: m42614 1 Quantum Numbers and Rules OpenStax College Tis work is produced by OpenStax-CNX and licensed under te Creative Commons Attribution License 3.0 Abstract Dene quantum number.

More information

MA455 Manifolds Solutions 1 May 2008

MA455 Manifolds Solutions 1 May 2008 MA455 Manifolds Solutions 1 May 2008 1. (i) Given real numbers a < b, find a diffeomorpism (a, b) R. Solution: For example first map (a, b) to (0, π/2) and ten map (0, π/2) diffeomorpically to R using

More information