Spike train entropy-rate estimation using hierarchical Dirichlet process priors


Published in: Advances in Neural Information Processing Systems 26 (2013).

Karin Knudson, Department of Mathematics
Jonathan W. Pillow, Center for Perceptual Systems, Departments of Psychology & Neuroscience
The University of Texas at Austin

Abstract

Entropy rate quantifies the amount of disorder in a stochastic process. For spiking neurons, the entropy rate places an upper bound on the rate at which the spike train can convey stimulus information, and a large literature has focused on the problem of estimating entropy rate from spike train data. Here we present Bayes least squares and empirical Bayesian entropy rate estimators for binary spike trains using hierarchical Dirichlet process (HDP) priors. Our estimator leverages the fact that the entropy rate of an ergodic Markov chain with known transition probabilities can be calculated analytically, and many stochastic processes that are non-Markovian can still be well approximated by Markov processes of sufficient depth. Choosing an appropriate depth of Markov model presents challenges due to possibly long time dependencies and short data sequences: a deeper model can better account for long time dependencies, but is more difficult to infer from limited data. Our approach mitigates this difficulty by using a hierarchical prior to share statistical power across Markov chains of different depths. We present both a fully Bayesian and empirical Bayes entropy rate estimator based on this model, and demonstrate their performance on simulated and real neural spike train data.

1 Introduction

The problem of characterizing the statistical properties of a spiking neuron is quite general, but two interesting questions one might ask are: (1) what kind of time dependencies are present? and (2) how much information is the neuron transmitting? With regard to the second question, information theory provides quantifications of the amount of information transmitted by a signal without reference to assumptions about how the information is represented or used. The entropy rate is of interest as a measure of uncertainty per unit time, an upper bound on the rate of information transmission, and an intermediate step in computing the mutual information rate between stimulus and neural response. Unfortunately, accurate entropy rate estimation is difficult, and estimates from limited data are often severely biased.

We present a Bayesian method for estimating entropy rates from binary data that uses hierarchical Dirichlet process (HDP) priors to reduce this bias. Our method proceeds by modeling the source of the data as a Markov chain, and then using the fact that the entropy rate of a Markov chain is a deterministic function of its transition probabilities. Fitting the model yields parameters relevant to both questions (1) and (2) above: we obtain both an approximation of the underlying stochastic process as a Markov chain, and an estimate of the entropy rate of the process. For binary data, the HDP reduces to a hierarchy of beta priors, where the prior probability over g, the probability of the next symbol given a long history, is a beta distribution centered on the probability of that symbol given a truncated, one-symbol-shorter history. The posterior over symbols given a certain history is thus smoothed by the probability over symbols given a shorter history. This smoothing is a key feature of the model.

The structure of the paper is as follows. In Section 2, we present definitions and challenges involved in entropy rate estimation, and discuss existing estimators. In Section 3, we discuss Markov models and their relationship to entropy rate. In Sections 4 and 5, we present two Bayesian estimates of entropy rate using the HDP prior, one involving a direct calculation of the posterior mean transition probabilities of a Markov model, the other using Markov Chain Monte Carlo methods to sample from the posterior distribution of the entropy rate. In Section 6 we compare the HDP entropy rate estimators to existing entropy rate estimators, including the context tree weighting entropy rate estimator from [1], the string-parsing method from [2], and finite-length block entropy rate estimators that make use of the entropy estimators of Nemenman, Shafee and Bialek [3] and Miller and Madow [4]. We evaluate the results for simulated and real neural data.

2 Entropy Rate Estimation

In information theory, the entropy of a random variable is a measure of the variable's average unpredictability. The entropy of a discrete random variable X with possible values {x_1, ..., x_n} is

    H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i)    (1)

Entropy can be measured in either bits or nats, depending on whether we use base 2 or base e for the logarithm. Here, all logarithms discussed will be base 2, and all entropies will be given in bits. While entropy is a property of a random variable, entropy rate is a property of a stochastic process, such as a time series, and quantifies the amount of uncertainty per symbol. The neural and simulated data considered here will be binary sequences representing the spike train of a neuron, where each symbol represents either the presence of a spike in a bin (1) or the absence of a spike (0). We view the data as a sample path from an underlying stochastic process. To evaluate the average uncertainty of each new symbol (0 or 1) given the previous symbols - or the amount of new information per symbol - we would like to compute the entropy rate of the process.

For a stochastic process {X_i}_{i=1}^{\infty}, the entropy of the random vector (X_1, ..., X_k) grows with k; we are interested in how it grows. If we define the block entropy H_k to be the entropy of the distribution of length-k sequences of symbols, H_k = H(X_{i+1}, ..., X_{i+k}), then the entropy rate of the stochastic process {X_i}_{i=1}^{\infty} is defined by

    h = \lim_{k \to \infty} \frac{1}{k} H_k    (2)

when the limit exists (which, for stationary stochastic processes, it must). There are two other definitions for entropy rate, which are equivalent to the first for stationary processes:

    h = \lim_{k \to \infty} H_{k+1} - H_k    (3)

    h = \lim_{k \to \infty} H(X_{i+1} | X_i, X_{i-1}, ..., X_{i-k})    (4)

We now briefly review existing entropy rate estimators, to which we will compare our results.

2.1 Block Entropy Rate Estimators

Since much work has been done to accurately estimate entropy from data, Equations (2) and (3) suggest a simple entropy rate estimator, which consists of choosing first a block size k and then a suitable entropy estimator with which to estimate H_k. A simple such estimator is the plugin entropy estimator, which approximates the probability of each length-k block (x_1, ..., x_k) by the proportion of total length-k blocks observed that are equal to (x_1, ..., x_k). For binary data there are 2^k possible length-k blocks. When N denotes the data length and c_i the number of observations of each block in the data, we have:

    \hat{H}_{plugin} = -\sum_{i=1}^{2^k} \frac{c_i}{N} \log \frac{c_i}{N}    (5)

from which we can immediately estimate the entropy rate with \hat{h}_{plugin,k} = \hat{H}_{plugin}/k, for some appropriately chosen k (the subject of appropriate choice will be taken up in more detail later).
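To make the procedure concrete, the following is a minimal sketch (our own illustration, not the authors' code) of the plugin block entropy rate estimator of Equation (5): count the length-k blocks of a binary sequence, form the empirical block distribution, and divide the resulting block entropy by k. The function name and the use of overlapping blocks are assumptions made for this example.

```python
from collections import Counter
import numpy as np

def plugin_entropy_rate(spikes, k):
    """Plugin entropy rate estimate (bits per symbol) from a binary sequence."""
    spikes = np.asarray(spikes)
    n_blocks = len(spikes) - k + 1
    # empirical distribution over the observed length-k blocks
    counts = Counter(tuple(spikes[i:i + k]) for i in range(n_blocks))
    p = np.array(list(counts.values()), dtype=float) / n_blocks
    H_k = -np.sum(p * np.log2(p))   # block entropy, Equation (5)
    return H_k / k                  # entropy rate estimate h_plugin,k

# Example: i.i.d. fair-coin spikes have true entropy rate 1 bit/symbol,
# but a short sequence with a large k yields a downward-biased estimate.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=2000)
print(plugin_entropy_rate(x, k=8))
```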

We would expect that using better block entropy estimators would yield better entropy rate estimators, and so we also consider two other block-based entropy rate estimators. The first uses the Bayesian entropy estimator \hat{H}_{NSB} from Nemenman, Shafee and Bialek [3], which gives a Bayes least squares estimate for entropy given a mixture-of-Dirichlet prior. The second uses the Miller and Madow estimator [4], which gives a first-order correction to the (often significantly biased) plugin entropy estimator of Equation 5:

    \hat{H}_{MM} = -\sum_{i=1}^{2^k} \frac{c_i}{N} \log \frac{c_i}{N} + \frac{A - 1}{2N} \log(e)    (6)

where A is the size of the alphabet of symbols (A = 2 for the binary data sequences presently considered). For a given k, we obtain entropy rate estimators \hat{h}_{NSB,k} = \hat{H}_{NSB}/k and \hat{h}_{MM,k} = \hat{H}_{MM}/k by applying the entropy estimators from [3] and [4] respectively to the empirical distribution of the length-k blocks.

While we can improve the accuracy of these block entropy rate estimates by choosing a better entropy estimator, choosing the block size k remains a challenge. If we choose k to be small, we miss long time dependencies in the data and tend to overestimate the entropy; intuitively, the time series will seem more unpredictable than it actually is, because we are ignoring long-time dependencies. On the other hand, as we consider larger k, limited data leads to underestimates of the entropy rate. See the plots of \hat{h}_{plugin}, \hat{h}_{NSB}, and \hat{h}_{MM} in Figure 2d for an instance of this effect of block size on entropy rate estimates. We might hope that in between the overestimates of entropy rate for short blocks and the underestimates for longer blocks, there is some plateau region where the entropy rate stays relatively constant with respect to block size, which we could use as a heuristic to select the proper block length [1]. Unfortunately, the block entropy rate at this plateau may still be biased, and for data sequences that are short with respect to their time dependencies, there may be no discernible plateau at all ([1], Figure 1).

2.2 Other Entropy Rate Estimators

Not all existing techniques for entropy rate estimation involve an explicit choice of block length. The estimator from [2], for example, parses the full string of symbols in the data by starting from the first symbol, and sequentially removing and counting as a phrase the shortest substring that has not yet appeared. When M is the number of distinct phrases counted in this way, we obtain the estimator \hat{h}_{LZ} = (M/N) \log N, free from any explicit block length parameters.
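A rough sketch of this string-parsing estimator, as we read the description above (this is our own illustration of the parsing rule, not code from [2]):

```python
import numpy as np

def lz_entropy_rate(spikes):
    """String-parsing entropy rate estimate h_LZ = (M/N) log2 N (bits/symbol)."""
    s = ''.join(str(int(b)) for b in spikes)
    N = len(s)
    phrases, i = set(), 0
    while i < N:
        j = i + 1
        # grow the candidate phrase until it has not been seen before
        while j <= N and s[i:j] in phrases:
            j += 1
        phrases.add(s[i:j])   # remove and count the new phrase
        i = j
    M = len(phrases)
    return (M / N) * np.log2(N)

rng = np.random.default_rng(1)
print(lz_entropy_rate(rng.integers(0, 2, size=5000)))  # near 1 bit/symbol
```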
A fixed block length model like the ones described in the previous section uses the entropy of the distribution of all the blocks of some length - e.g. all the blocks in the terminal nodes of a context tree like the one in Figure 1a. In the context tree weighting (CTW) framework of [1], the authors instead use a minimum descriptive length criterion to weight different tree topologies, which have within the same tree terminal nodes corresponding to blocks of different lengths. They use this weighting to generate Monte Carlo samples and approximate the integral \int \int h(\theta) p(\theta | T, data) p(T | data) \, d\theta \, dT, in which T represents the tree topology and \theta represents the transition probabilities associated with the terminal nodes of the tree.

In our approach, the HDP prior combined with a Markov model of our data will be a key tool in overcoming some of the difficulties of choosing a block length appropriately for entropy rate estimation. It will allow us to choose a block length that is large enough to capture possibly important long time dependencies, while easing the difficulty of estimating the properties of these long time dependencies from short data.

3 Markov Models

The usefulness of approximating our data source with a Markov model comes from (1) the flexibility of Markov models, including their ability to approximate well even many processes that are not truly Markovian, and (2) the fact that for a Markov chain with known transition probabilities the entropy rate need not be estimated but is in fact a deterministic function of the transition probabilities.

Figure 1: A depth-3 hierarchical Dirichlet prior for binary data.

A Markov chain is a sequence of random variables that has the property that the probability of the next state depends only on the present state, and not on any previous states. That is, P(X_{i+1} | X_i, ..., X_1) = P(X_{i+1} | X_i). Note that this property does not mean that for a binary sequence the probability of each 0 or 1 depends only on the previous 0 or 1, because we consider the state variables to be strings of symbols of length k rather than individual 0s and 1s. Thus we will discuss depth-k Markov models, where the probability of the next state depends only on the previous k symbols, or what we will call the length-k context of the symbol. With a binary alphabet, there are 2^k states the chain can take, and from each state s, transitions are possible only to two other states (so that, for example, the state 011 can transition to state 110 or state 111, but not to any other state). Because only two transitions are possible from each state, the transition probability distribution from each state s is completely specified by only one parameter, which we denote g_s, the probability of observing a 1 given the context s.

The entropy rate of an ergodic Markov chain with finite state set A is given by:

    h = \sum_{s \in A} p(s) H(x|s)    (7)

where p(s) is the stationary probability associated with state s, and H(x|s) is the entropy of the distribution of possible transitions from state s. The vector of stationary state probabilities p(s) for all s is computed as a left eigenvector of the transition matrix T:

    p(s) T = p(s),    \sum_s p(s) = 1    (8)

Since each row of the transition matrix T contains only two non-zero entries, g_s and 1 - g_s, p(s) can be calculated relatively quickly. With Equations 7 and 8, h can be calculated analytically from the vector of all 2^k transition probabilities {g_s}. A Bayesian estimator of entropy rate based on a Markov model of order k is given by

    \hat{h}_{Bayes} = \int h(g) p(g | data) \, dg    (9)

where g = {g_s : |s| = k}, h is the deterministic function of g given by Equations 7 and 8, and p(g | data) \propto p(data | g) p(g) given some appropriate prior over g. Modeling a time series as a Markov chain requires a choice of the depth of that chain, so we have not avoided the depth selection problem yet. What will actually mitigate the difficulty here is the use of hierarchical Dirichlet process priors.
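As a brief illustration of Equations (7) and (8), here is a sketch (our own, with an assumed bit ordering in which the integer encoding of a context has the most recent symbol as its least significant bit) of computing the entropy rate of a binary depth-k Markov chain from the vector of transition probabilities {g_s}.

```python
import numpy as np

def markov_entropy_rate(g, k):
    """Entropy rate (bits/symbol) of a binary depth-k Markov chain, Eqs. (7)-(8).

    g[s] = P(next symbol is 1 | context s), with contexts encoded as integers
    whose least significant bit is the most recent symbol (our convention).
    """
    n = 2 ** k
    T = np.zeros((n, n))
    for s in range(n):
        next0 = (s << 1) & (n - 1)   # context after observing a 0
        next1 = next0 | 1            # context after observing a 1
        T[s, next1] = g[s]
        T[s, next0] = 1.0 - g[s]
    # stationary distribution: left eigenvector of T with eigenvalue 1
    evals, evecs = np.linalg.eig(T.T)
    p = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    p = p / p.sum()
    # per-state transition entropies H(x|s), guarding against log(0)
    g = np.asarray(g, dtype=float)
    with np.errstate(divide='ignore', invalid='ignore'):
        Hs = -(np.where(g > 0, g * np.log2(g), 0.0)
               + np.where(g < 1, (1 - g) * np.log2(1 - g), 0.0))
    return float(np.dot(p, Hs))   # Equation (7)

# Example: a depth-1 "sticky" chain; its rate is well below 1 bit/symbol.
print(markov_entropy_rate([0.1, 0.9], k=1))
```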

4 Hierarchical Dirichlet Process Priors

We describe a hierarchical beta prior, a special case of the hierarchical Dirichlet process (HDP), which was presented in [5] and applied to problems of natural language processing in [6] and [7].

The true entropy rate h = \lim_{k \to \infty} H_k / k captures time dependencies of infinite depth. Therefore, to calculate the estimate \hat{h}_{Bayes} in Equation 9 we would like to choose some large k. However, it is difficult to estimate transition probabilities for long blocks with short data sequences, so choosing large k may lead to inaccurate posterior estimates for the transition probabilities g. In particular, shorter data sequences may not even have observations of all possible symbol sequences of a given length. This motivates our use of hierarchical priors as follows.

Suppose we have a data sequence in which a particular subsequence, say 0110, is never observed. Then we would not expect to have a very good estimate for g_{0110}; however, we could improve this by using the assumption that, a priori, g_{0110} should be similar to g_{110}. That is, the probability of observing a 1 after the context sequence 0110 should be similar to that of seeing a 1 after 110, since it might be reasonable to assume that context symbols from the more distant past matter less. Thus we choose for our prior:

    g_s | g_{s'} ~ Beta(\alpha_{|s|} g_{s'}, \alpha_{|s|} (1 - g_{s'}))    (10)

where s' denotes the context s with the earliest symbol removed. This choice gives the prior distribution of g_s mean g_{s'}, as desired. We continue constructing the prior with g_{s'} | g_{s''} ~ Beta(\alpha_{|s'|} g_{s''}, \alpha_{|s'|} (1 - g_{s''})), and so on until g_{[]} ~ Beta(\alpha_0 p_0, \alpha_0 (1 - p_0)), where g_{[]} is the probability of a spike given no context information and p_0 is a hyperparameter reflecting our prior belief about the probability of a spike. This hierarchy gives our prior the tree structure shown in Figure 1. A priori, the distribution of each transition probability is centered around the transition probability from a one-symbol-shorter block of symbols. As long as the assumption that more distant contextual symbols matter less actually holds (at least to some degree), this structure allows the sharing of statistical information across different contextual depths. We can obtain reasonable estimates for the transition probabilities from long blocks of symbols, even from data that is so short that we may have few (or no) observations of each of these long blocks of symbols.

We could use any number of distributions with mean g_{s'} to center the prior distribution of g_s at g_{s'}; we use beta distributions because they are conjugate to the likelihood. The \alpha's are concentration parameters which control how concentrated the distribution is about its mean, and they can also be estimated from the data. We assume that there is one value of \alpha for each level in the hierarchy, but one could also fix \alpha to be constant throughout all levels, or let it vary within each level.

This hierarchy of beta distributions is a special case of the hierarchical Dirichlet process. A Dirichlet process (DP) is a stochastic process whose sample paths are each probability distributions. Formally, if G is a finite measure on a set S, then X ~ DP(\alpha, G) if for any finite measurable partition (A_1, ..., A_n) of the sample space we have that (X(A_1), ..., X(A_n)) ~ Dirichlet(\alpha G(A_1), ..., \alpha G(A_n)). Thus, for a partition into only two sets, the Dirichlet process reduces to a beta distribution, which is why when we specialize the HDP to binary data, we obtain a hierarchical beta distribution. In [5] the authors present a hierarchy of DPs where the base measure for each DP is again a DP. In our case we have G_s ~ DP(\alpha_{|s|}, G_{s'}), where G_s = {g_s, 1 - g_s}.

5 Empirical Bayesian Estimator

One can generate a sequence from an HDP by drawing each subsequent symbol from the transition probability distribution associated with its context, which is given recursively by [6]:

    p(1|s) = (c_{s1} + \alpha_{|s|} p(1|s')) / (c_s + \alpha_{|s|})    if s \neq []
    p(1|s) = (c_1 + \alpha_0 p_0) / (N + \alpha_0)                     if s = []    (11)

where N is the length of the data string, p_0 is a hyperparameter representing the a priori probability of observing a 1 given no contextual information, c_{s1} is the number of times the symbol sequence s followed by a 1 was observed, and c_s is the number of times the symbol sequence s was observed.
We can calculate the posterior predictive distribution \hat{g}_{pr}, which is specified by the 2^k values {g_s = p(1|s) : |s| = k}, by using counts c from the data and performing the above recursive calculation to estimate g_s for each of the 2^k states s. Given the estimated Markov transition probabilities \hat{g}_{pr}, we then have an empirical Bayesian entropy rate estimate via Equations 7 and 8. We denote this estimator \hat{h}_{empHDP}. Note that while \hat{g}_{pr} is the posterior mean of the transition probabilities, the entropy rate estimator \hat{h}_{empHDP} is no longer a fully Bayesian estimate, and is not equivalent to the \hat{h}_{Bayes} of Equation 9. We thus lose some clarity and the ability to easily compute Bayesian confidence intervals. However, we gain a good deal of computational efficiency, because calculating \hat{h}_{empHDP} from \hat{g}_{pr} involves only one eigenvector computation, instead of the many needed for the Monte Carlo approximation to the integral in Equation 9. We present a fully Bayesian estimate next.
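The recursion in Equation (11) is easy to implement directly. Below is a schematic sketch (our own; the context-counting helper, the per-level alpha values, and p_0 = 0.5 are assumptions made for illustration) of computing \hat{g}_{pr} for all depth-k contexts, which could then be fed into an entropy rate calculation via Equations (7) and (8), such as the markov_entropy_rate sketch above.

```python
import numpy as np

def count_contexts(spikes, max_depth):
    """counts[s] = [times context s occurred, times s was followed by a 1]."""
    counts = {}
    for t in range(len(spikes)):
        for d in range(max_depth + 1):
            if t - d < 0:
                break
            s = ''.join(str(int(b)) for b in spikes[t - d:t])  # length-d context
            c = counts.setdefault(s, [0, 0])
            c[0] += 1
            c[1] += int(spikes[t])
    return counts

def predictive(s, counts, alpha, p0):
    """Recursive posterior predictive p(1 | context s), Equation (11)."""
    if s == '':
        c_tot, c_one = counts.get('', (0, 0))
        return (c_one + alpha[0] * p0) / (c_tot + alpha[0])
    c_tot, c_one = counts.get(s, (0, 0))
    parent = predictive(s[1:], counts, alpha, p0)   # drop the earliest symbol
    a = alpha[len(s)]
    return (c_one + a * parent) / (c_tot + a)

# Empirical-Bayes transition probabilities g_pr for all 2**k depth-k contexts.
k, alpha, p0 = 3, [10.0] * 10, 0.5            # assumed hyperparameter values
rng = np.random.default_rng(2)
spikes = rng.integers(0, 2, size=500)
counts = count_contexts(spikes, k)
g_pr = {format(i, '0{}b'.format(k)): None for i in range(2 ** k)}
for s in g_pr:
    g_pr[s] = predictive(s, counts, alpha, p0)
print(g_pr)
```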

6 Fully Bayesian Estimator

Here we return to the Bayes least squares estimator \hat{h}_{Bayes} of Equation 9. The integral is not analytically tractable, but we can approximate it using Markov Chain Monte Carlo techniques. We use Gibbs sampling to simulate N_{MC} samples g^{(i)} from the posterior distribution g | data, and then calculate h^{(i)} from each g^{(i)} via Equations 7 and 8 to obtain the Bayesian estimate:

    \hat{h}_{HDP} = \frac{1}{N_{MC}} \sum_{i=1}^{N_{MC}} h^{(i)}    (12)

To perform the Gibbs sampling, we need the posterior conditional probabilities of each g_s. Because the parameters of the model have the structure of a tree, each g_s for |s| < k is conditionally independent from all but its immediate ancestor in the tree, g_{s'}, and its two descendants, g_{0s} and g_{1s}. We have:

    p(g_s | g_{s'}, g_{0s}, g_{1s}, \alpha_{|s|}, \alpha_{|s|+1}) \propto Beta(g_s; \alpha_{|s|} g_{s'}, \alpha_{|s|}(1 - g_{s'})) \, Beta(g_{0s}; \alpha_{|s|+1} g_s, \alpha_{|s|+1}(1 - g_s)) \, Beta(g_{1s}; \alpha_{|s|+1} g_s, \alpha_{|s|+1}(1 - g_s))    (13)

and we can compute these probabilities on a discrete grid since they are each one dimensional, then sample the posterior g_s via this grid. We used a uniform grid of points on the interval [0, 1] for our computation. For the transition probabilities from the bottom level of the tree, {g_s : |s| = k}, the conjugacy of the beta distributions with the binomial likelihood function gives the posterior conditional of g_s a recognizable form: p(g_s | g_{s'}, data) = Beta(\alpha_k g_{s'} + c_{s1}, \alpha_k (1 - g_{s'}) + c_{s0}).

In the HDP model we may treat each \alpha as a fixed hyperparameter, but it is also straightforward to set a prior over each \alpha and then sample \alpha along with the other model parameters with each pass of the Gibbs sampler. The full posterior conditional for \alpha_i with a uniform prior is (from Bayes' theorem):

    p(\alpha_i | {g_s, g_{0s}, g_{1s} : |s| = i - 1}) \propto \prod_{s : |s| = i-1} \frac{(g_{0s} g_{1s})^{\alpha_i g_s} ((1 - g_{0s})(1 - g_{1s}))^{\alpha_i (1 - g_s)}}{B(\alpha_i g_s, \alpha_i (1 - g_s))^2}    (14)

where B denotes the beta function. We sampled \alpha by computing the probabilities above on a grid of values spanning a bounded range. The upper bound on \alpha is rather arbitrary, but we verified that increasing the range for \alpha had little effect on the entropy rate estimate, at least for the ranges and block sizes considered.

In some applications, the Markov transition probabilities g, and not just the entropy rate, may be of interest as a description of the time dependencies present in the data. The Gibbs sampler above yields samples from the distribution g | data, and averaging these N_{MC} samples yields a Bayes least squares estimator of the transition probabilities, \hat{g}_{gibbsHDP}. Note that this estimate is closely related to the estimate \hat{g}_{pr} from the previous section; with more MC samples, \hat{g}_{gibbsHDP} converges to the posterior mean \hat{g}_{pr} (when the \alpha are fixed rather than sampled, to match the fixed \alpha per level used in Equation 11).
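As a rough illustration of the grid-based conditional updates described above (this is our own bare-bones sketch, not the authors' sampler; the grid resolution and hyperparameter values are arbitrary assumptions), one Gibbs update for an internal node g_s and one for a leaf node might look like the following.

```python
import numpy as np
from scipy.stats import beta

def gibbs_update_internal(g_parent, g_child0, g_child1, a_this, a_next,
                          n_grid=100, rng=None):
    """Sample g_s from its grid-discretized conditional, Equation (13)."""
    rng = rng or np.random.default_rng()
    grid = np.linspace(1e-3, 1 - 1e-3, n_grid)
    logp = (beta.logpdf(grid, a_this * g_parent, a_this * (1 - g_parent))
            + beta.logpdf(g_child0, a_next * grid, a_next * (1 - grid))
            + beta.logpdf(g_child1, a_next * grid, a_next * (1 - grid)))
    w = np.exp(logp - logp.max())
    return rng.choice(grid, p=w / w.sum())

def gibbs_update_leaf(g_parent, c_s1, c_s0, a_k, rng=None):
    """Conjugate update for a bottom-level g_s (|s| = k)."""
    rng = rng or np.random.default_rng()
    return rng.beta(a_k * g_parent + c_s1, a_k * (1 - g_parent) + c_s0)

# One update of an internal node and one of a leaf, with assumed values.
print(gibbs_update_internal(0.4, 0.35, 0.5, a_this=20.0, a_next=20.0))
print(gibbs_update_leaf(0.4, c_s1=3, c_s0=7, a_k=20.0))
```

A full sampler would sweep such updates over every node of the tree (and optionally over the alphas), computing an entropy rate from each sweep via Equations (7) and (8) and averaging as in Equation (12).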

7 Results

We applied the model to both simulated data with a known entropy rate and to neural data, where the entropy rate is unknown. We examine the accuracy of the fully Bayesian and empirical Bayesian entropy rate estimators \hat{h}_{HDP} and \hat{h}_{empHDP}, and compare them to the entropy rate estimators \hat{h}_{plugin}, \hat{h}_{NSB}, \hat{h}_{MM}, \hat{h}_{LZ} [2], and \hat{h}_{CTW} [1], which are described in Section 2. We also consider estimates of the Markov transition probabilities g produced by both inference methods.

7.1 Simulation

We considered data simulated from a Markov model with transition probabilities set so that transition probabilities from states with similar suffixes are similar (i.e. the process actually does have the property that more distant context symbols matter less than more recent ones in determining transitions). We used a depth-5 Markov model, whose true transition probabilities are shown in black in Figure 2a.

[Figure 2: panels show (a) estimated transition probabilities p(1|s), (b) estimated entropy rate versus data length, (c) absolute error versus data length, and (d) estimated entropy rate versus block length, for the true rate and the NSB, MM, plugin, LZ, CTW, empHDP, and HDP estimators.]

Figure 2: Comparison of estimated (a) transition probability and (b,c,d) entropy rate for data simulated from a Markov model of depth 5. In (a) and (d), data sets are 500 symbols long. The block-based and HDP estimators in (b) and (c) use block size k = 8. In (b,c,d) results were averaged over 5 data sequences, and (c) plots the average absolute value of the difference between true and estimated entropy rates.

In Figure 2a, each of the 32 points on the x axis represents the true probability that the next symbol is a 1 given the specified 5-symbol context. We compare HDP estimates of the transition probabilities of this simulated data to the plugin estimator of transition probabilities, \hat{g}_s = c_{s1}/c_s, calculated from a 500-symbol sequence. (The other estimators do not include calculating transition probabilities as an intermediate step, and so cannot be included here.) With a series of 500 symbols, we do not expect enough observations of each of the possible transitions to adequately estimate the 2^k transition probabilities, even for rather modest depths such as k = 5. And indeed, the plugin estimates of transition probabilities do not match the true transition probabilities well. On the other hand, the transition probabilities estimated using the HDP prior show the kind of smoothing the prior was meant to encourage, where states corresponding to contexts with the same suffixes have similar estimated transition probabilities.

Lastly, we plot the convergence of the entropy rate estimators with increased length of the data sequence, and the associated error, in Figures 2b,c. If the true depth of the model is no larger than the depth k considered in the estimators, all the estimators considered should converge. We see in Figure 2c that the HDP-based entropy rate estimates converge quickly with increasing data, relative to other models. The motivation of the hierarchical prior was to allow observations of transitions from shorter contexts to inform estimates of transitions from longer contexts. This, it was hoped, would mitigate the drop-off with larger block size seen in block-entropy-based entropy rate estimators. Figure 2d indicates that for simulated data this is indeed the case, although we do see some bias in the fully Bayesian entropy rate estimator for large block lengths. The empirical Bayes and fully Bayesian entropy rate estimators with HDP priors produce estimates that are close to the true entropy rate across a wider range of block sizes.

7.2 Neural Data

We applied the same analysis to neural spike train data collected from primate retinal ganglion cells stimulated with binary full-field movies refreshed at 100 Hz [8]. In this case, the true transition probabilities are unknown (and indeed the process may not be exactly Markovian). However, we calculate the plug-in transition probabilities from a longer data sequence (67,000 bins) so that the estimates are approximately converged (black trace in Figure 3a), and note that transition probabilities from contexts with the same most-recent context symbols do appear to be similar. Thus the estimated transition probabilities reflect the idea that more distant context cues matter less, and the smoothing of the HDP prior appears to be appropriate for this neural data. The true entropy rate is also unknown, but again we estimate it using the plugin estimator on a large data set.
We again note the relatively fast convergence of \hat{h}_{HDP} and \hat{h}_{empHDP} in Figures 3b,c, and the long plateau of the estimators in Figure 3d, indicating the relative stability of the HDP entropy rate estimators with respect to the choice of model depth.

[Figure 3: panels show (a) estimated transition probabilities p(1|s), (b) estimated entropy rate versus data length, (c) mean absolute error versus data length, and (d) estimated entropy rate versus block length, for the converged estimate and the NSB, MM, plugin, LZ, CTW, empHDP, and HDP estimators.]

Figure 3: Comparison of estimated (a) transition probability and (b,c,d) entropy rate for neural data. The converged estimates are calculated from a long stretch of data with 4 ms bins (67,000 symbols). In (a) and (d), training data sequences are 500 symbols (2 s) long. The block-based and HDP estimators in (b) and (c) use block size k = 8. In (b,c,d), results were averaged over 5 data sequences sampled randomly from the full dataset.

8 Discussion

We have presented two estimators of the entropy rate of a spike train or arbitrary binary sequence. The true entropy rate of a stochastic process involves consideration of infinitely long time dependencies. To make entropy rate estimation tractable, one can try to fix a maximum depth of time dependencies to be considered, but it is difficult to choose an appropriate depth that is large enough to take into account long time dependencies and small enough relative to the data at hand to avoid a severe downward bias of the estimate. We have approached this problem by modeling the data as a Markov chain and estimating transition probabilities using a hierarchical prior that links transition probabilities from longer contexts to transition probabilities from shorter contexts. This allowed us to choose a large depth even in the presence of limited data, since the structure of the prior allowed observations of transitions from shorter contexts (of which we have many instances in the data) to inform estimates of transitions from longer contexts (of which we may have only a few instances).

We presented both a fully Bayesian estimator, which allows for Bayesian confidence intervals, and an empirical Bayesian estimator, which provides computational efficiency. Both estimators show excellent performance on simulated and neural data in terms of their robustness to the choice of model depth, their accuracy on short data sequences, and their convergence with increased data. Both methods of entropy rate estimation also yield estimates of the transition probabilities when the data is modeled as a Markov chain, parameters which may be of interest in their own right as descriptive of the statistical structure and time dependencies in a spike train. Our results indicate that tools from modern Bayesian nonparametric statistics hold great promise for revealing the structure of neural spike trains despite the challenges of limited data.

Acknowledgments

We thank V. J. Uzzell and E. J. Chichilnisky for retinal data. This work was supported by a Sloan Research Fellowship, McKnight Scholar's Award, and NSF CAREER Award IIS.

References

[1] Matthew B. Kennel, Jonathon Shlens, Henry D. I. Abarbanel, and E. J. Chichilnisky. Estimating entropy rates with Bayesian confidence intervals. Neural Computation, 17(7):1531-1576, 2005.

[2] Abraham Lempel and Jacob Ziv. On the complexity of finite sequences. IEEE Transactions on Information Theory, 22(1):75-81, 1976.

[3] Ilya Nemenman, Fariel Shafee, and William Bialek. Entropy and inference, revisited. arXiv preprint physics/0108025, 2001.

[4] George Armitage Miller and William Gregory Madow. On the Maximum Likelihood Estimate of the Shannon-Wiener Measure of Information. Operational Applications Laboratory, Air Force Cambridge Research Center, Air Research and Development Command, Bolling Air Force Base, 1954.

[5] Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, and David M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476), 2006.

[6] Yee Whye Teh. A hierarchical Bayesian language model based on Pitman-Yor processes. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2006.

[7] Frank Wood, Cédric Archambeau, Jan Gasthaus, Lancelot James, and Yee Whye Teh. A stochastic memoizer for sequence data. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009.

[8] V. J. Uzzell and E. J. Chichilnisky. Precision of spike trains in primate retinal ganglion cells. Journal of Neurophysiology, 92:780-789, 2004.


More information

Generic maximum nullity of a graph

Generic maximum nullity of a graph Generic maximum nullity of a grap Leslie Hogben Bryan Sader Marc 5, 2008 Abstract For a grap G of order n, te maximum nullity of G is defined to be te largest possible nullity over all real symmetric n

More information

Chapter 2 Ising Model for Ferromagnetism

Chapter 2 Ising Model for Ferromagnetism Capter Ising Model for Ferromagnetism Abstract Tis capter presents te Ising model for ferromagnetism, wic is a standard simple model of a pase transition. Using te approximation of mean-field teory, te

More information

MAT244 - Ordinary Di erential Equations - Summer 2016 Assignment 2 Due: July 20, 2016

MAT244 - Ordinary Di erential Equations - Summer 2016 Assignment 2 Due: July 20, 2016 MAT244 - Ordinary Di erential Equations - Summer 206 Assignment 2 Due: July 20, 206 Full Name: Student #: Last First Indicate wic Tutorial Section you attend by filling in te appropriate circle: Tut 0

More information

LEAST-SQUARES FINITE ELEMENT APPROXIMATIONS TO SOLUTIONS OF INTERFACE PROBLEMS

LEAST-SQUARES FINITE ELEMENT APPROXIMATIONS TO SOLUTIONS OF INTERFACE PROBLEMS SIAM J. NUMER. ANAL. c 998 Society for Industrial Applied Matematics Vol. 35, No., pp. 393 405, February 998 00 LEAST-SQUARES FINITE ELEMENT APPROXIMATIONS TO SOLUTIONS OF INTERFACE PROBLEMS YANZHAO CAO

More information

Math 31A Discussion Notes Week 4 October 20 and October 22, 2015

Math 31A Discussion Notes Week 4 October 20 and October 22, 2015 Mat 3A Discussion Notes Week 4 October 20 and October 22, 205 To prepare for te first midterm, we ll spend tis week working eamples resembling te various problems you ve seen so far tis term. In tese notes

More information

Chapter 5 FINITE DIFFERENCE METHOD (FDM)

Chapter 5 FINITE DIFFERENCE METHOD (FDM) MEE7 Computer Modeling Tecniques in Engineering Capter 5 FINITE DIFFERENCE METHOD (FDM) 5. Introduction to FDM Te finite difference tecniques are based upon approximations wic permit replacing differential

More information

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Statistica Sinica 24 2014, 395-414 doi:ttp://dx.doi.org/10.5705/ss.2012.064 EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Jun Sao 1,2 and Seng Wang 3 1 East Cina Normal University,

More information

Material for Difference Quotient

Material for Difference Quotient Material for Difference Quotient Prepared by Stepanie Quintal, graduate student and Marvin Stick, professor Dept. of Matematical Sciences, UMass Lowell Summer 05 Preface Te following difference quotient

More information

Long Term Time Series Prediction with Multi-Input Multi-Output Local Learning

Long Term Time Series Prediction with Multi-Input Multi-Output Local Learning Long Term Time Series Prediction wit Multi-Input Multi-Output Local Learning Gianluca Bontempi Macine Learning Group, Département d Informatique Faculté des Sciences, ULB, Université Libre de Bruxelles

More information

Optimal parameters for a hierarchical grid data structure for contact detection in arbitrarily polydisperse particle systems

Optimal parameters for a hierarchical grid data structure for contact detection in arbitrarily polydisperse particle systems Comp. Part. Mec. 04) :357 37 DOI 0.007/s4057-04-000-9 Optimal parameters for a ierarcical grid data structure for contact detection in arbitrarily polydisperse particle systems Dinant Krijgsman Vitaliy

More information