GaGa: a Parsimonious and Flexible Hierarchical Model for Microarray Data Analysis
David Rossell
Department of Biostatistics, M.D. Anderson Cancer Center, Houston, TX 77030, USA
rosselldavid@gmail.com

Abstract

Bayesian hierarchical models are attractive for microarray data since they allow sharing information across genes and across different analyses in a coherent manner. Kendziorski et al. (2003) and Newton et al. (2004) introduced the gamma-gamma hierarchical model. The model parsimoniously describes the expression of thousands of genes with a small number of hyper-parameters. This makes the model easy to interpret and analytically tractable. However, we find important limitations of the model when fitting real datasets. The limitations are due to some of the assumptions being too restrictive. We propose a simple extension of the model that improves the fit substantially with almost no increase in complexity. The model allows comparing not only mean expression between groups but also the distributional shape, which we argue to be of biological relevance. We propose a second extension that uses a mixture of gamma distributions to further improve the fit, at the expense of increased computational burden. We propose several approximations that significantly reduce the computational cost. We use our approach for inference about differential gene expression and for class prediction, both in simulated and real datasets. We find that both extensions are preferable to the original formulation of the model, and that they provide advantages over several other popular methods, especially for small samples.

1 Introduction

Two common inference problems with microarray data are differential expression analysis, i.e. the comparison of some measure of gene expression between groups, and class prediction, i.e. the prediction of an unknown group label for a new sample. For both procedures, one of the challenges is that the number of genes greatly exceeds the number of replicated measurements that are obtained for each gene.
That is, data are abundant at an overall level but scarce at the gene level, and therefore there is much potential for methods that allow for the sharing of information across genes.

To whom correspondence should be addressed.
Hierarchical models naturally allow for the sharing of information between genes. Typical examples are Lönnstedt and Speed (2002) and Smyth (2004), who modeled gene-specific parameter estimates via hierarchical empirical Bayes methods to obtain improved testing procedures. Kendziorski et al. (2003), Newton et al. (2001) and Newton and Kendziorski (2003) proposed hierarchical models that depend on few parameters, i.e. they greatly reduce the dimensionality of the problem. This feature is particularly important for small sample sizes.

We build on the gamma-gamma hierarchical model of Kendziorski et al. (2003). The model is used extensively in recent literature (Parmigiani et al., 2003; Müller et al., 2004; Zhao et al., 2005; Chiogna et al., 2007). The gamma-gamma model assumes that the observations for each gene arise from a gamma distribution with a shape parameter common to all genes and a scale parameter that arises from a hierarchical gamma prior. Since the model uses a single gamma prior, we refer to it as the Ga model. We find the gamma choice appealing, for it is a flexible family that can capture a variety of distributional shapes.

In this paper we show that, although this model is elegant and parsimonious, it fails to provide an adequate fit in real datasets, and that this is partly due to the assumption that the shape parameter is common across genes. We propose a simple extension of the model that specifies a gamma prior on both the shape and the inverse mean parameters (GaGa model). The extension is still parsimonious, requiring only one additional hyper-parameter, and it can be fit both via empirical Bayes and via a fully Bayesian approach. We develop an algorithm that implements posterior inference with a computational effort that is comparable to the Ga model. We then develop a second extension that specifies a gamma prior on the shape parameter and a mixture of gamma priors on the inverse mean (MiGaGa model). This provides additional flexibility, albeit at the expense of reduced model parsimony and increased computational cost.
An advantage of the GaGa and MiGaGa hierarchical models is that they allow comparing both the shape and mean parameters between groups. That is, one compares not only mean expression levels but also the distributional shape. This is important because, as argued by Lapointe et al. (2004) and Coombes et al. (2007), the latter comparison may be biologically more meaningful than comparing mean expressions. As an example, consider the case of cancer biomarkers. Many mutations, deletions and translocations affect only a proportion of the diseased individuals. When comparing normal and cancer cells, only that proportion exhibits a modified expression pattern, i.e. the tail behavior or the variability will differ across groups even though the mean expression may be almost unchanged.

The paper is structured as follows. In Section 2 we review the Ga model and extend it to the GaGa model. We derive expressions for posterior probabilities of interest and an MCMC sampling scheme to fit the model. In Section 3 we propose the MiGaGa model as a further generalization. For both extensions, the posterior distributions of the gamma shape parameters are known only up to a constant. We refer to this distribution, which to our knowledge has not been described before, as the gamma shape distribution. In Section 4 we derive useful approximations for this distribution. In Section 5 we explain how to find differentially expressed (DE) genes and perform class prediction. In Section 6 we apply our approach to simulated data and real datasets. Finally, in Section 7 we present some concluding remarks. The methods described in this paper are implemented in the R library gaga.
2 The GaGa model

We assume that the data have been background corrected, normalized and quantified in a sensible manner (Dudoit et al., 2002). Let $x_{ij}$ be the measure of expression for gene $i$, $i = 1, \ldots, n$, in microarray $j$, $j = 1, \ldots, J$. Let $z_j \in \{1, \ldots, K\}$ indicate group membership, e.g. $z_j = 1$ for normal cells and $z_j = 2$ for cancer cells. We denote the vector of observations for gene $i$ as $x_i$ and the whole dataset as $x$. We use $\mathrm{Ga}(\cdot)$ to denote a gamma distribution, $\mathrm{IGa}(\cdot)$ for the inverse gamma, $\mathrm{Mult}(\cdot)$ for the multinomial, $\mathrm{Dirichlet}(\cdot)$ for the Dirichlet and $\mathrm{GaS}(\cdot)$ for the gamma shape distribution. The GaS distribution is defined in Section 4.

In the differential expression problem the investigator is interested in determining the expression pattern that each gene follows. This inference problem can be viewed as a hypothesis testing problem. Throughout we use the terms hypothesis and expression pattern interchangeably. The simplest setup has $K = 2$ groups and 2 hypotheses: pattern 0, under which both groups are equally expressed (null hypothesis), and pattern 1, under which they are differentially expressed (alternative hypothesis). For $K > 2$ we may want to consider more than 2 patterns. For example, if group 1 corresponds to normal cells, group 2 to cells with type A cancer and group 3 to type B cancer, one may be interested in assigning each gene to one of the following patterns:

Pattern 0: Normal $=$ Cancer A $=$ Cancer B
Pattern 1: Normal $\neq$ Cancer A $=$ Cancer B
Pattern 2: Normal $\neq$ Cancer A $\neq$ Cancer B.    (1)

Denote by $H$ the number of hypotheses, and let the latent variable $\delta_i \in \{0, 1, \ldots, H-1\}$ indicate the true expression pattern for gene $i$. We refer to genes with $\delta_i = 0$ as equally expressed (EE) and genes with $\delta_i \neq 0$ as differentially expressed (DE), and we denote $\delta = (\delta_1, \ldots, \delta_n)$.

2.1 The model

The Ga model (Kendziorski et al., 2003; Newton et al., 2001; Newton and Kendziorski, 2003) assumes that the $x_{ij}$ are independent realizations from $\mathrm{Ga}(\alpha_{i,z_j}, \lambda_{i,z_j})$ (i.e. the mean is $\alpha_{i,z_j}/\lambda_{i,z_j}$).
The model assumes $\delta_i \sim \mathrm{Mult}(1, \pi)$, fixes $\alpha_{i,z_j} = \alpha$ for all $i, j$ and specifies the hierarchical prior $\lambda_{i,z_j} \sim \mathrm{Ga}(\alpha_0, \nu)$ for all distinct scale parameters under pattern $\delta_i$. Here $(\alpha_0, \nu, \alpha, \pi)$ are hyper-parameters common to all genes. For $\delta_i = 0$ (EE genes) we have $\lambda_{i,1} = \ldots = \lambda_{i,K}$, and for $\delta_i \neq 0$ some of the $\lambda_{i,z_j}$ are different from each other, according to the specification of the hypotheses.

The Ga model imposes the restriction that $1/\sqrt{\alpha_{i,z_j}}$, the within-groups coefficient of variation (CV), must be constant across all genes and groups. The assumption is analytically convenient, but we have found it not to be reasonable in typical datasets. Figure 1 shows empirical CVs for two datasets described in Section 6. The CVs exhibit a skewed distribution with a substantial amount of variability, which contradicts the assumption of constant CVs. Figure 1 highlights the genes declared DE by the Ga model in both datasets. These are often genes with above average CV. This is due, we believe, to the constant CV assumption, which makes the Ga model view atypical CVs as evidence for differential expression.
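The constant-CV restriction follows from a standard property of the gamma distribution: its CV is $1/\sqrt{\text{shape}}$, whatever the rate. The following is a minimal simulation sketch of that property using only the Python standard library. The shape and rate values are hypothetical and are not taken from the datasets in this paper.

```python
import math
import random

def empirical_cv(shape, rate, n_draws, rng):
    """Sample CV (sd / mean) of n_draws gamma variates with the given shape and rate."""
    # random.gammavariate is parameterised by (shape, scale), with scale = 1 / rate.
    draws = [rng.gammavariate(shape, 1.0 / rate) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((x - mean) ** 2 for x in draws) / (n_draws - 1)
    return math.sqrt(var) / mean

rng = random.Random(1)
alpha = 400.0                      # hypothetical shape shared by all genes
theoretical_cv = 1.0 / math.sqrt(alpha)
# Hypothetical gene-specific rates: the mean changes, the CV does not.
cvs = {rate: empirical_cv(alpha, rate, 20000, rng) for rate in (10.0, 50.0, 250.0)}
for rate, cv in cvs.items():
    print(f"rate={rate:6.1f}  empirical CV={cv:.4f}  1/sqrt(alpha)={theoretical_cv:.4f}")
```

Under a common shape parameter, every gene is therefore forced to have the same CV, which is the restriction that Figure 1 calls into question.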
Figure 1: Sample mean and CV for each gene (marked points denote genes declared DE by the Ga model). Axes: mean expression (log scale) against coefficient of variation. (a): Gould dataset; (b): Armstrong dataset.

When imposing constant $\alpha_{i,z_j}$ one can only compare the scale parameters $\lambda_{i,z_j}$ between groups, while in practice it can be biologically more relevant to compare the full distribution (Lapointe et al., 2004; Coombes et al., 2007). We propose a generalization that addresses this limitation. We introduce gene- and pattern-specific shape parameters $\alpha_{i,z_j}$ and assume $x_{ij} \sim \mathrm{Ga}(\alpha_{i,z_j}, \alpha_{i,z_j}/\lambda_{i,z_j})$ (i.e. $\lambda_{i,z_j}$ is the mean), with the following hierarchical prior

$$\lambda_{i,k} \mid \delta_i, \alpha_0, \nu \sim \mathrm{IGa}(\alpha_0, \alpha_0/\nu), \quad \text{indep. for } i = 1, \ldots, n$$
$$\alpha_{i,k} \mid \delta_i, \beta, \mu \sim \mathrm{Ga}(\beta, \beta/\mu), \quad \text{indep. for } i = 1, \ldots, n, \tag{2}$$

and a prior for $\delta_i$ as before. We refer to (2) as the GaGa model. As in the Ga model, the values of $(\alpha_{i1}, \lambda_{i1}), \ldots, (\alpha_{iK}, \lambda_{iK})$ are tied when $\delta_i = 0$, whereas under $\delta_i \neq 0$ some of them are different from each other (although they still arise from the same marginal distribution). The GaGa model replaces the hyper-parameter $\alpha$ of the Ga model by the pair $(\beta, \mu)$. That is, the additional flexibility is achieved with only one more hyper-parameter. We complete the Bayesian model with hyper-priors:

$$\alpha_0 \sim \mathrm{Ga}(a_{\alpha_0}, b_{\alpha_0}); \quad \nu \sim \mathrm{IGa}(a_\nu, b_\nu); \quad \beta \sim \mathrm{Ga}(a_\beta, b_\beta); \quad \mu \sim \mathrm{IGa}(a_\mu, b_\mu); \quad \pi \sim \mathrm{Dirichlet}(p). \tag{3}$$

We believe that eliciting prior distributions can be advantageous in microarray studies, since there usually is some degree of prior knowledge. For example, the investigator may have an idea about what proportion of DE genes to expect. As another example, many normalization procedures result in values that are below 15 on the log-scale.

The use of hyper-priors is not critical. Alternatively, $(\alpha_0, \nu, \beta, \mu, \pi)$ can be fixed by an empirical Bayes argument, using an expectation-maximization algorithm completely
analogous to that for the Ga model (Kendziorski et al. (2003), Appendix). We implement both methods in our gaga library. Microarray datasets are strongly informative about the parameters in (3), since they are common to all genes. In the datasets in Section 6 we found fairly similar results for two reasonable prior specifications and the empirical Bayes method.

2.2 Posterior distributions

We derive the posterior distribution of the first-stage parameters, assuming fixed hyper-parameters $\omega = (\alpha_0, \nu, \beta, \mu, \pi)$. From (2) we see that, conditional on $\omega$, the gene-specific parameters $(\delta_i, \alpha_{i1}, \ldots, \alpha_{iK}, \lambda_{i1}, \ldots, \lambda_{iK})$ are independent a posteriori across genes $i = 1, \ldots, n$. Therefore, it suffices to derive the posterior for each gene separately. We denote the vectors of parameters for a single gene as $\lambda_i = (\lambda_{i,1}, \ldots, \lambda_{i,K})$ and $\alpha_i = (\alpha_{i,1}, \ldots, \alpha_{i,K})$, and we let $\lambda = (\lambda_1, \ldots, \lambda_n)$ and $\alpha = (\alpha_1, \ldots, \alpha_n)$ be the collections of these parameters.

Let $N_{\delta_i}$ be the number of groups that are distinct under pattern $\delta_i$. In our example in (1) we have $H = 3$ patterns: under pattern 0 we have $N_0 = 1$ distinct groups, and similarly $N_1 = 2$, $N_2 = 3$. Let $S_{i,\delta_i,k}$, for $i = 1, \ldots, n$, $\delta_i = 0, \ldots, H-1$ and $k = 1, \ldots, N_{\delta_i}$, be the sum of observations from gene $i$ that under pattern $\delta_i$ correspond to the $k$-th distinct group, $P_{i,\delta_i,k}$ be the product of the same observations and $J_{i,\delta_i,k}$ be the number of terms in the sum. In our example $S_{10,0,1}$ denotes the sum of all observations from gene 10 (since under pattern 0 there is only one distinct group), $S_{10,1,1}$ denotes the sum of observations from normal samples (since it is the first distinct group under pattern 1) and $S_{10,1,2}$ the sum from cancers of type A and B.

The posterior probability that gene $i$ follows expression pattern $l$, which we denote as $v_{il}$, is given by $v_{il} = P(\delta_i = l \mid x, \omega) \propto f(x_i \mid \delta_i = l, \omega)\,\pi_l$ for $l = 0, \ldots, H-1$, where

$$f(x_i \mid \delta_i, \omega) = \left[ \frac{(\alpha_0/\nu)^{\alpha_0} (\beta/\mu)^{\beta}}{\Gamma(\alpha_0)\Gamma(\beta)} \right]^{N_{\delta_i}} \prod_{k=1}^{N_{\delta_i}} \frac{1}{C(J_{i,\delta_i,k}, \beta, \beta/\mu - \log(P_{i,\delta_i,k}), \alpha_0, \alpha_0/\nu, S_{i,\delta_i,k})}, \tag{4}$$

and $C(\cdot)$ is the gamma shape normalization constant, defined in Section 4.
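The quantities $J_{i,\delta_i,k}$, $S_{i,\delta_i,k}$ and $\log P_{i,\delta_i,k}$ are the only data summaries that enter (4). As a minimal sketch, they could be computed as below for the three patterns in (1). The dictionary encoding of patterns, the function name and the expression values are all hypothetical choices made for illustration; distinct groups are indexed from 0 in the code rather than from 1.

```python
import math

# Hypothetical encoding: each pattern maps a group label to the index of its distinct group.
# Groups: 1 = Normal, 2 = Cancer A, 3 = Cancer B, as in the example in (1).
PATTERNS = {
    0: {1: 0, 2: 0, 3: 0},  # Normal = Cancer A = Cancer B
    1: {1: 0, 2: 1, 3: 1},  # Normal != Cancer A = Cancer B
    2: {1: 0, 2: 1, 3: 2},  # Normal != Cancer A != Cancer B
}

def sufficient_stats(x_i, z, pattern):
    """Return one (J, S, log P) tuple per distinct group under `pattern`.

    x_i: expression values of one gene; z: group label of each array.
    """
    n_distinct = len(set(pattern.values()))
    stats = [[0, 0.0, 0.0] for _ in range(n_distinct)]
    for x, group in zip(x_i, z):
        k = pattern[group]
        stats[k][0] += 1            # J: number of observations
        stats[k][1] += x            # S: sum of observations
        stats[k][2] += math.log(x)  # log P: log of the product of observations
    return [tuple(s) for s in stats]

# Hypothetical gene with two arrays per group.
x_i = [7.9, 8.1, 9.0, 9.2, 9.1, 8.8]
z = [1, 1, 2, 2, 3, 3]
for l, pattern in PATTERNS.items():
    print(l, sufficient_stats(x_i, z, pattern))
```

Working with $\log P$ rather than $P$ avoids overflow when a distinct group contains many arrays, and it is $\log P$ that appears in (4).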
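The posterior distribution of $(\alpha_{ik}, \lambda_{ik})$ conditional on $\delta_i$ is

$$\alpha_{i,k} \mid \delta_i, \omega, x \sim \mathrm{GaS}(J_{i,\delta_i,k}, \beta, \beta/\mu - \log(P_{i,\delta_i,k}), \alpha_0, \alpha_0/\nu, S_{i,\delta_i,k})$$
$$\lambda_{i,k} \mid \alpha_{i,k}, \delta_i, \omega, x \sim \mathrm{IGa}(\alpha_{i,k} J_{i,\delta_i,k} + \alpha_0, \alpha_0/\nu + \alpha_{i,k} S_{i,\delta_i,k}). \tag{5}$$

For any given $\omega$, (5) can be used to obtain posterior credibility intervals in the usual fashion. Note that $\alpha_{i,k}$ and $\lambda_{i,k}$ are not conditionally independent a posteriori given $(\delta_i, \omega)$, as they are a priori.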
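The second line of (5) is a standard inverse gamma update, so it is easy to sketch. The snippet below computes its two parameters for one distinct group and draws from it by inverting gamma variates. The function names, the shape value `alpha` and the data summaries `J` and `S` are hypothetical; `alpha0` and `nu` are set to the posterior means reported for the Armstrong data with 5 arrays per group in Section 6.2.

```python
import random

def lambda_posterior(alpha, J, S, alpha0, nu):
    """Parameters (shape, scale) of the inverse gamma for lambda_{i,k} in (5)."""
    return alpha * J + alpha0, alpha0 / nu + alpha * S

def draw_inverse_gamma(shape, scale, rng):
    """Draw from IGa(shape, scale), whose mean is scale / (shape - 1) for shape > 1."""
    # If G ~ Gamma(shape, rate=scale) then 1/G ~ IGa(shape, scale).
    # random.gammavariate is parameterised by (shape, 1/rate).
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)

alpha, J, S = 20.0, 5, 40.0          # hypothetical shape and data summaries
alpha0, nu = 4.520, 0.314
shape, scale = lambda_posterior(alpha, J, S, alpha0, nu)
posterior_mean = scale / (shape - 1.0)

rng = random.Random(0)
draws = [draw_inverse_gamma(shape, scale, rng) for _ in range(20000)]
print(posterior_mean, sum(draws) / len(draws))
```

In this example the posterior mean of $\lambda_{i,k}$ lies between the prior mean $(\alpha_0/\nu)/(\alpha_0 - 1)$ and the sample mean $S/J$, and moves towards the sample mean as `alpha * J` grows.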
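2.3 Model fitting

The posterior distribution of $\omega = (\alpha_0, \nu, \beta, \mu, \pi)$ given $(\delta, \alpha, \lambda)$ and $x$ is characterized as follows. We find

$$\alpha_0 \mid \delta, \alpha, \lambda, x \sim \mathrm{GaS}\Big( \sum_{i=1}^n N_{\delta_i},\; a_{\alpha_0},\; b_{\alpha_0} + S_\lambda,\; a_\nu,\; b_\nu,\; \tilde{S}_\lambda \Big)$$
$$\nu \mid \alpha_0, \delta, \alpha, \lambda, x \sim \mathrm{IGa}\Big( a_\nu + \alpha_0 \sum_{i=1}^n N_{\delta_i},\; b_\nu + \alpha_0 \tilde{S}_\lambda \Big). \tag{6}$$

The hyper-parameters $(\beta, \mu, \pi)$ are conditionally independent of $(\alpha_0, \nu)$ given $(\delta, \alpha, \lambda, x)$, with

$$\beta \mid \delta, \alpha, \lambda, x \sim \mathrm{GaS}\Big( \sum_{i=1}^n N_{\delta_i},\; a_\beta,\; b_\beta - \tilde{S}_\alpha,\; a_\mu,\; b_\mu,\; S_\alpha \Big)$$
$$\mu \mid \beta, \delta, \alpha, \lambda, x \sim \mathrm{IGa}\Big( a_\mu + \beta \sum_{i=1}^n N_{\delta_i},\; b_\mu + \beta S_\alpha \Big) \tag{7}$$

and

$$\pi \mid \delta, \alpha, \lambda, x \sim \mathrm{Dirichlet}\Big( p_0 + \sum_{i=1}^n I(\delta_i = 0), \ldots, p_{H-1} + \sum_{i=1}^n I(\delta_i = H-1) \Big) \tag{8}$$

conditionally independent of $(\beta, \mu)$. Here $S_\lambda = \sum_{i=1}^n \sum_{k=1}^{N_{\delta_i}} \log \lambda_{i,k}$, $\tilde{S}_\lambda = \sum_{i=1}^n \sum_{k=1}^{N_{\delta_i}} 1/\lambda_{i,k}$, $S_\alpha = \sum_{i=1}^n \sum_{k=1}^{N_{\delta_i}} \alpha_{i,k}$ and $\tilde{S}_\alpha = \sum_{i=1}^n \sum_{k=1}^{N_{\delta_i}} \log(\alpha_{i,k})$ are sums over all distinct $\lambda_{i,k}$ and $\alpha_{i,k}$.

Together with the posteriors given in Section 2.2, this allows us to implement a Gibbs sampling scheme to fit the model (Gelfand and Smith, 1990). The Gibbs sampler is defined by iterative sampling of $(\delta, \alpha, \lambda)$ given $\omega$, and sampling $\omega$ given $(\delta, \alpha, \lambda)$.

3 The MiGaGa model

The GaGa model addresses the problem illustrated in Figure 1 by allowing varying CVs across genes. However, another limitation remains. A well-known feature of the GCRMA normalization procedure (Wu et al., 2004) is that it creates a distinctly bimodal distribution for the gene expressions. Figure 2(a) shows the empirical distribution of $x_{ij}$ for the Armstrong dataset (see Section 6.2), and compares it with the prior predictive under the GaGa model. The model does not capture the bimodality. To address this limitation we introduce a further generalization, by letting $\lambda_{i,k}$ arise from a mixture

$$\lambda_{i,k} \mid \delta_i, \rho, \alpha_0, \nu \sim \sum_{m=1}^{M} \rho_m \, \mathrm{IGa}(\alpha_{0m}, \alpha_{0m}/\nu_m)$$
$$\rho \sim \mathrm{Dirichlet}(r) \tag{9}$$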
and specifying the following priors

$$\alpha_{0m} \sim \mathrm{Ga}(a_{\alpha_0}, b_{\alpha_0}), \quad \text{for } m = 1, \ldots, M \text{ indep.}$$
$$\nu_m \sim \mathrm{IGa}(a_\nu, b_\nu), \quad \text{for } m = 1, \ldots, M \text{ indep.} \tag{10}$$

The rest of the model is as in (2) and (3). The statement of posterior distributions and the Gibbs sampler are largely analogous to those for the GaGa prior. The main difference is that for the MiGaGa one introduces latent variables indicating the cluster to which each gene belongs. Compared to the GaGa prior, the additional flexibility in MiGaGa potentially allows us to obtain a better fit to the data, although this comes at the cost of increased model complexity and computational burden. Figure 2(a) shows how the MiGaGa prior predictive improves the GaGa fit substantially.

Figure 2: Armstrong dataset. (a): marginal distribution of the observed data vs. prior predictive of GaGa and MiGaGa with $M = 2$ and $\omega = \hat{\omega}$ (density against expression levels, log scale); (b): probe 1110_at, a gene with evidence for a change in shape and no change in location. Posterior predictive under the GaGa model with 10 arrays per group (density against expression levels, log scale; separate plotting symbols indicate the 24 ALL and the 18 MLL observations).

4 The Gamma shape distribution

We define the distribution that arises as the posterior of the shape parameter of the gamma distribution under independent sampling and a gamma prior, as in (2). We assume that the gamma distribution is indexed by the shape and mean parameters. This distribution, which we refer to as the gamma shape distribution, has not been described before. It is similar to the distribution that arises when the parameterization is in terms of the shape and scale parameters (Damsleth, 1975; Miller, 1980). To simplify notation
we denote by $y$ a positive continuous random variable that follows this distribution. Its probability density function, indexed by the parameters $a \geq 0$, $b, d, r, s > 0$, $c > -a\log(s/a)$, can be written as:

$$f(y \mid a, b, c, d, r, s) = C(a, b, c, d, r, s) \, \frac{\Gamma(ay + d)}{\Gamma(y)^a} \left( \frac{y}{r + sy} \right)^{ay + d} y^{b - d - 1} e^{-yc} \, I(y > 0), \tag{11}$$

where $C(a, b, c, d, r, s)$ is the normalization constant and $\Gamma(\cdot)$ is the gamma function. For $a = d = 0$, (11) simplifies to a gamma distribution. In general, to obtain random draws from (11) or to evaluate $C(a, b, c, d, r, s)$ one has to resort to numerical methods. This is impractical in our setup, since the posterior simulation requires repeated and computationally efficient simulation from a GaS distribution. Approximations are required to decrease the computational burden.

We start by deriving an approximation to (11) that is appropriate for large values of $y$. By approximating $\Gamma(\cdot)$ with Stirling's formula and evaluating the limit of the resulting expression as $y \to \infty$ we find that (11) is approximately proportional to

$$y^{a/2 + b - 1/2 - 1} \exp\{ -y (c + a\log(s/a)) \}. \tag{12}$$

One can obtain approximate random draws from (11) by drawing from a $\mathrm{Ga}(a/2 + b - 1/2, \; c + a\log(s/a))$. To approximate $C(a, b, c, d, r, s)$, denote as $g(y)$ the probability density function of the gamma approximation, and let $m$ be its mode. Evaluating $g$ and (11) at $m$ gives

$$C(a, b, c, d, r, s) \approx g(m) \, \frac{\Gamma(m)^a}{\Gamma(am + d)} \left( \frac{m}{r + sm} \right)^{-(am + d)} m^{-b + d + 1} e^{mc}. \tag{13}$$

Figure 3 shows the gamma shape distribution and its gamma approximation for two randomly selected cases that were encountered in posterior simulations for the MiGaGa model. The density in panel (a) arises as the posterior of the shape parameter for a single gene with a sample size of 5 observations per group, whereas panel (b) results as the posterior of a parameter shared by a large number of genes, i.e. it represents a situation with a large sample size. In both cases the approximation is very close. In the microarray datasets that we have analyzed so far the approximation worked well.
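As a rough numerical illustration of (11) and (12), the sketch below evaluates the logarithm of the unnormalized gamma shape density and the parameters of its gamma approximation, and draws one approximate sample. The data, hyper-parameter values and function names are hypothetical. The parameter mapping follows the first line of (5), for a single distinct group with 5 observations.

```python
import math
import random

def gas_log_kernel(y, a, b, c, d, r, s):
    """Log of the gamma shape density in (11), up to the constant log C, for y > 0."""
    return (math.lgamma(a * y + d) - a * math.lgamma(y)
            + (a * y + d) * (math.log(y) - math.log(r + s * y))
            + (b - d - 1.0) * math.log(y) - c * y)

def gamma_approx_params(a, b, c, s):
    """Shape and rate of the gamma approximation in (12)."""
    return a / 2.0 + b - 0.5, c + a * math.log(s / a)

# Hypothetical gene with 5 observations on the log scale, hypothetical hyper-parameters.
x = [7.5, 8.0, 8.2, 7.9, 8.4]
beta, mu, alpha0, nu = 0.8, 500.0, 4.5, 0.3
a, b = len(x), beta
c = beta / mu - sum(math.log(v) for v in x)   # beta/mu - log(P)
d, r, s = alpha0, alpha0 / nu, sum(x)

shape, rate = gamma_approx_params(a, b, c, s)
rng = random.Random(0)
approx_draw = rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes (shape, scale)

def tail_gap(y):
    """Difference between the exact log kernel and the log kernel of (12)."""
    return gas_log_kernel(y, a, b, c, d, r, s) - ((shape - 1.0) * math.log(y) - rate * y)

print(shape, rate, approx_draw, tail_gap(1e4) - tail_gap(2e4))
```

For large $y$ the gap between the two log kernels settles to a constant, which is what "approximately proportional" means in (12). Working on the log scale with `math.lgamma` avoids overflow in $\Gamma(ay + d)$.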
In some rare cases we detected that the mode of the approximation did not match that of (11) (indicated by the first derivative of $\log[f(y \mid a, b, c, d, r, s)]$ not being close to zero). In these cases we used a few Newton-Raphson steps to locate the mode and used the gamma approximation that matches the location of the mode as well as the value of the second derivative of $\log[f(y \mid a, b, c, d, r, s)]$ evaluated at the mode.

5 Inference

5.1 Differentially expressed genes

We formalize inference for differential expression by minimizing the Bayesian false negative rate (BFNR) subject to an upper bound $\alpha$ on the Bayesian false discovery rate (BFDR) (Müller et al., 2007). Briefly, BFNR is the posterior expected proportion of
genes declared EE (i.e. assigned to pattern 0) that are actually DE (i.e. do not follow pattern 0), and BFDR is the expected proportion of genes declared DE that are actually EE. This definition remains valid for more than two hypotheses. The Bayes rule is to declare a gene as DE whenever its posterior probability of DE is above a certain threshold. The result extends trivially to our multiple hypotheses setup with a slight adjustment: given that a gene is not classified into pattern 0, we propose assigning it to the pattern with the highest posterior probability. That is, for given BFDR and BFNR we maximize the number of genes correctly classified into their expression pattern.

Figure 3: Gamma approximation to the gamma shape distribution (density against $y$; curves: gamma shape, gamma approximation). Parameter values are (a): a=10; b=0.90; c=[...]; d=[...]; r=[...]; s=65.02; (b): a=1532; b=0.16; c=3469; d=0.16; r=0.016; s=159.5.

Since the posterior probabilities in Section 2.2 are derived under an assumed probability model, deviations from the assumptions may result in poor performance of the procedure. We propose to assess its frequentist operating characteristics following the bootstrap scheme introduced by Storey (2007), which allows estimating the frequentist FDR for any given $\alpha$. In Section 6.3 we apply this procedure to a real dataset. For more details, see Storey (2007) and the supplementary material at [...].

To ease the computational burden of the bootstrap-based procedure, we use an approximation. Instead of using $P(\delta_i = l \mid x)$ we use $v_{il} = P(\delta_i = l \mid x, \hat{\omega})$ as given in (4), where $\hat{\omega}$ is the posterior mean of $\omega$. We have found this strategy to deliver very similar results to those from the exact version of the algorithm, but at a much lower computational cost.
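The decision rule of this section can be sketched in a few lines once the posterior probabilities $v_{il}$ are available. The function below is an illustrative assumption about how the threshold might be chosen, not the gaga implementation. It adds genes in order of increasing posterior probability of EE while the average of $v_{i0}$ over the declared list (its BFDR) stays at or below $\alpha$, then assigns each declared gene to its most probable non-null pattern. The probabilities in the example are made up.

```python
def select_de_genes(v, alpha):
    """Declare DE genes so that the Bayesian FDR of the list is at most alpha.

    v[i][l]: posterior probability that gene i follows pattern l (pattern 0 = EE).
    Returns {gene index: assigned pattern} for the genes declared DE.
    """
    order = sorted(range(len(v)), key=lambda i: v[i][0])  # most likely DE first
    declared, running_sum = [], 0.0
    for i in order:
        # BFDR of the enlarged list = average posterior probability of EE within it.
        # The running average is non-decreasing, so we can stop at the first violation.
        if (running_sum + v[i][0]) / (len(declared) + 1) > alpha:
            break
        running_sum += v[i][0]
        declared.append(i)
    # Among the non-null patterns, pick the one with the highest posterior probability.
    return {i: max(range(1, len(v[i])), key=lambda l: v[i][l]) for i in declared}

# Hypothetical posterior probabilities for 5 genes and H = 3 patterns.
v = [
    [0.01, 0.90, 0.09],
    [0.02, 0.18, 0.80],
    [0.50, 0.30, 0.20],
    [0.90, 0.05, 0.05],
    [0.04, 0.50, 0.46],
]
print(select_de_genes(v, alpha=0.05))
```

In this example the three genes with $v_{i0} \leq 0.04$ are declared DE, with an average $v_{i0}$ of about 0.023. Adding the next gene would raise the average to about 0.14, above the bound.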
5.2 Class prediction

We set the goal of maximizing the number of future samples $x^* = (x_1^*, \ldots, x_n^*)$ that are correctly classified as type $z^*$. The Bayes rule is to assign the new sample to the type $k$ that has the highest posterior probability $P(z^* = k \mid x^*, x)$. As in Section 5.1, we use the approximation $u_k = P(z^* = k \mid x^*, x, \hat{\omega}) \propto f(x^* \mid z^* = k, \hat{\omega}, x) P(z^* = k)$, where $\hat{\omega}$ is the posterior mean of $\omega$, $f(x^* \mid z^* = k, \hat{\omega}, x)$ is the predictive distribution for the measurements of a sample of type $k$ and $P(z^* = k)$ is the prior probability. The prior probabilities can be based, for example, on the prevalence of the disease in the population under study, the presence of risk factors or the outcome of previous tests. For the GaGa model we find

$$f(x^* \mid z^* = k, \hat{\omega}, x) = \prod_{1 \leq i \leq n} \sum_{l=0}^{H-1} f(x_i^* \mid z^* = k, \delta_i = l, \hat{\omega}, x_i) \, v_{il}, \tag{14}$$

where

$$f(x_i^* \mid z^* = k, \delta_i = l, \omega, x_i) = \frac{C(J_{i,\delta_i,k}, \beta, \beta/\mu - \log(P_{i,\delta_i,k}), \alpha_0, \alpha_0/\nu, S_{i,\delta_i,k})}{C(J_{i,\delta_i,k} + 1, \beta, \beta/\mu - \log(P_{i,\delta_i,k}) - \log(x_i^*), \alpha_0, \alpha_0/\nu, x_i^* + S_{i,\delta_i,k})}. \tag{15}$$

An interesting feature of (14) is that the classifier weights the contribution of each gene according to the posterior probabilities $v_{il}$, and that in particular genes with zero posterior probability of being DE do not contribute to the classifier. This is important from a practical standpoint, since frequently one wants to use only a subset of the genes. The classifier is robust with respect to how many genes are chosen. Similar expressions can be obtained for the MiGaGa model by averaging according to the cluster weights $\rho$.

6 Results

We assess the performance of the GaGa and MiGaGa models in simulated and real data. In Section 6.1 we revisit the Gould dataset that Kendziorski et al. (2003) originally used to illustrate the Ga model. In Section 6.2 we analyze the leukemia dataset of Armstrong et al. (2002), and in Section 6.3 we conduct two simulation studies based on this dataset. In all analyses we tried two different prior specifications. Under the first prior we use $a_{\alpha_0} = b_{\alpha_0} = a_\nu = b_\nu = a_\beta = b_\beta = a_\mu = b_\mu =$ [...] and all the elements of $p$ and $r$ equal to 0.1.
Second, we defined a slightly more informative prior taking into account that log-expression levels are rarely above 15 and that coefficients of variation (CV) are usually centered around [...] with large variance. We then found prior parameter values that were consistent with this information and that at the same time allowed for substantial prior uncertainty: $a_{\alpha_0} =$ [...], $b_{\alpha_0} = 10^{-4}$, $a_\nu = 0.016$, $b_\nu =$ [...], $a_\beta = 0.004$, $b_\beta = 10^{-3}$, $a_\mu =$ [...] and $b_\mu = 20$. The GaGa model yielded very similar parameter estimates and lists of differentially expressed genes under both priors. The MiGaGa model was slightly more sensitive to the prior specification, with the non-informative prior resulting in a higher posterior expectation for $\pi_0$ and a shorter list of DE genes. We present only the results arising from the more informative prior. To fit the model we obtain 5,000 posterior samples, assess the convergence with trajectory plots and save
the Monte Carlo output when convergence is judged to have been reached (typically well before 1,000 samples for the GaGa model and 2,500 samples for the MiGaGa). We ran the Markov chain a second time with different starting values and verified that it converged to the same target distribution.

For differential expression analysis, we compare our methodology with the empirical Bayes procedure of Smyth (2005), adjusting the p-values via the beta-uniform mixture approach of Pounds and Morris (2003) (EBayes-BUM), and with two-sample Wilcoxon tests, adjusting p-values both via BUM (Wilcoxon-BUM) and via the Benjamini and Hochberg (1995) method (Wilcoxon-BH). We also perform class prediction, comparing our methodology with Fisher's linear discriminant analysis (LDA). We use EBayes as implemented in the R/Bioconductor functions lmfit and ebayes, BUM as implemented in Bum and BH as in p.adjust (libraries limma (Smyth, 2005), ClassComparison (Coombes, 2005) and stats, respectively). All methods were set up to control the FDR below [...].

6.1 Gould data

We used the already pre-processed version of the Gould dataset provided with the R library EBarrays. We fit the model to log-transformed data, since this reduced the effect of outliers and resulted in a better performance of the model. Data are available for 5,000 genes and 4 inbred lines: 2 parental and 2 offspring. The parental lines are a Copenhagen (COP) rat strain resistant to mammary carcinogenesis and a Wistar-Furth (WF) rat strain that is highly susceptible. The two offspring lines are obtained by crossing the parental lines, in such a way that they are homozygous COP/COP throughout the genome except for a small region in which they are homozygous WF/WF. In one of the offspring lines the COP/COP region is approximately 30cM (line CI) and in the other it is 1.5cM (line CII). Therefore, it seems reasonable to expect gene expression for the CI and CII lines to be somewhere between COP and WF. Further, CI should be closer to the COP line than CII, while CII should be closer to WF. The dataset contains 1 microarray for the COP group, 2 for CI, 5 for CII and 2 for WF.
To illustrate our approach we analyze a randomly chosen subset containing 2 microarrays from each of CI, CII and WF (we ignore the COP group for lack of replicates). This will allow us to assess how well the model fits the 3 CII microarrays that are not used to fit the model. For each gene, the experimenters considered four expression patterns:

Pattern 0: CI $=$ CII $=$ WF
Pattern 1: CI $\neq$ CII $=$ WF
Pattern 2: CI $=$ CII $\neq$ WF
Pattern 3: CI $\neq$ CII $\neq$ WF.    (16)

6.1.1 Model fit

The Ga model estimates $\hat{\alpha}_0 =$ [...], $\hat{\nu} = 0.463$, $\hat{\pi} = (0.980, 0.001, 0.001, 0.018)$ and fixes $\hat{\alpha}_{i,j} = \hat{\alpha} =$ [...]. The GaGa extension yields posterior means $\hat{\alpha}_0 =$ [...], $\hat{\nu} = 0.152$, $\hat{\beta} = 1.089$, $\hat{\mu} =$ [...] and $\hat{\pi} = (0.999, 0, 0.001, 0)$. That is, the Ga model views 98.0% of the genes as being EE, while for the GaGa model it is 99.9%. Also, Ga assumes a
constant within-groups CV of $1/\sqrt{\hat{\alpha}} = 0.047$, while GaGa estimates it to vary across genes as $1/\sqrt{\mathrm{Ga}(1.089, [...])}$, indicating that the CVs are not constant. In Figure 1(a) we detected lack of fit of the Ga model at an overall level, which the GaGa model overcomes.

Figure 4: Gould data. Observed expression values vs. predictive distribution for the two genes with highest posterior probability of DE according to the Ga model, shown by group (CI, CII, WF). (a): Ga model; (b): GaGa model. Large black symbols are actual observations, small gray symbols are draws from the posterior predictive.

Next we consider the fit for individual genes. In particular, one should make sure that the fit is reasonable for the genes that are declared to be DE. Otherwise the inference would be suspect. We select the two genes that are deemed the most interesting by the Ga model, i.e. those with the lowest probability of being EE. We compare their observed expression levels with draws from the posterior predictive distribution for these two genes. Figure 4(a) reveals that the Ga model seriously underestimates the variability, inaccurately predicting that the observations fall into 3 clearly separated groups. Figure 4(b) presents the same plot for the GaGa model. Here the model-based predictions appropriately reflect the variability. We can no longer see any separation between the 3 groups. We conclude that these genes are found by Ga due to model lack-of-fit.

6.1.2 Differential expression analysis

The Ga model allocates 2 genes to pattern 1, 1 to pattern 2 and 78 to pattern 3, while GaGa allocates all genes to pattern 0. Under the GaGa model the largest probability of DE for any gene is [...]. As we saw in Figure 1, the genes found by the Ga model tend to be those with CV above average. For comparison, performing F-tests via EBayes-BUM did not identify any DE genes either.
6.1.3 Class prediction

Since there is little evidence that any gene is differentially expressed, we do not expect this dataset to allow us to build a good classifier. We conducted a small simulation study that confirmed the lack of predictive power, regardless of how many genes were used to build the classifier.

6.2 Armstrong data

The data, obtained from [...], consisted of 24 Affymetrix U95A arrays from acute lymphoblastic leukemia (ALL) samples, 18 U95A arrays from lymphoblastic leukemia with MLL translocations (MLL), and 2 U95Av2 arrays also from the MLL group. The U95Av2 arrays were obtained at a later date than the rest, possibly under different experimental conditions, so we excluded them from the analysis. The dataset also contained samples with acute myelogenous leukemia, but for illustration we restrict attention to the ALL and MLL groups.

The data were background corrected, normalized and summarized using the function just.gcrma from the R library gcrma (Wu and Irizarry, 2007). Note that different pre-processing algorithms result in different distributional shapes for the observed gene expression quantifications, and hence the quality of the model fit can be affected by the choice of pre-processing method. To explore the effect of the distributional shape, in addition to analyzing the GCRMA normalized data we apply a monotonic transformation to enforce unimodality and we analyze the result with a GaGa model. Within each microarray, the transformation maps sample quantiles to the corresponding quantiles of a gamma distribution with matching mean and variance.

6.2.1 Model fit

Figure 1(b) reveals a violation of the constant CV assumption of the Ga model, and that the model tends to flag genes with large CVs as DE. Figure 2(a) shows that a MiGaGa fit with $M = 2$ components describes the data better than a single-component GaGa. An analogous plot shows how the monotone transformation improves the fit of the GaGa model substantially.
This plot and further assessment of goodness-of-fit can be found in the supplementary material at [...].

To study the behavior of the methods under small sample sizes and evaluate the reliability of the results, we start by fitting the model to 5 randomly chosen arrays from each group. We then add 5 more arrays per group, then 10, and finally we analyze the full dataset.

For the GCRMA data with 5 arrays per group the GaGa posterior means are $\hat{\alpha}_0 = 4.520$, $\hat{\nu} = 0.314$, $\hat{\beta} = 0.826$, $\hat{\mu} =$ [...] and $\hat{\pi} = (0.954, 0.046)$. That is, the gene-specific shape parameters are estimated to arise from a gamma with mean [...] and standard deviation [...]. With more than 5 arrays similar $(\hat{\alpha}_0, \hat{\nu}, \hat{\beta}, \hat{\mu})$ were obtained. However, $\hat{\pi}_1$ increased to 0.121, [...] and [...] when analyzing 10 arrays, 15 arrays and the full dataset, respectively. The MiGaGa estimates behaved in a similar manner, and so did the GaGa estimates for the monotonically transformed data, obtaining similar $\hat{\pi}_1$ in all models. Since we did not observe this phenomenon on simulated data, we believe that it is due to some component of the model being misspecified, e.g. assuming conditional
independence of $\delta_i$ given $\omega$. In our experience, in real datasets many methods provide estimated proportions of DE genes that change widely with sample size. For instance, the EBayes-BUM and Wilcoxon-BUM estimates increase from [...] to [...] and from [...] to 0.333, respectively. Compared to these other two procedures, under the GaGa and MiGaGa models $\hat{\pi}_1$ is relatively stable across sample sizes.

Table 1: Gene discoveries in the Armstrong dataset. # DE: number of genes declared DE; % rep.: percentage of # DE also found when analyzing the full dataset. ODP reports the mean of two analyses, each using B=100 permutations. Columns: # DE and % rep. for 5, 10 and 15 arrays per group, and # DE for all data. Rows: GaGa, MiGaGa (M = 2), GaGa (transf.), EBayes-BUM, Wilcoxon-BUM, Wilcoxon-BH.

6.2.2 Differential expression analysis

Table 1 shows the number of genes declared DE when analyzing a subset of 5, 10 and 15 arrays per group, as well as the full dataset. The table also provides the percentage of reproducibility, i.e. how many amongst those genes were found again when analyzing the full dataset. For instance, with 5 arrays per group MiGaGa found 339 genes, 79.6% of which were confirmed in the full data. We see that the three variants of our model find more genes than EBayes-BUM, Wilcoxon-BUM and Wilcoxon-BH, with a reproducibility around 80%. The reproducibility of the competing methods is very high for small sample sizes but it diminishes as more data become available. In fact, they find fewer DE genes in the full dataset than in a subset with 15 arrays per group.

One should be careful in comparing the number of genes detected by each method, for our models compare not only mean expression but also the distributional shape between ALL and MLL samples. To highlight this feature we inspect the 1096 genes that were reported as DE under the GaGa model with 10 arrays per group. Figure 2(b) shows the predictive distribution for a gene that has a 95% credibility interval for $\lambda_{i1} - \lambda_{i2}$ containing 0 and an interval for $\alpha_{i1}/\alpha_{i2}$ not containing 1.
This is a gene with stronger evidence of a difference in shape between groups than in location. The observed data, which include the 22 samples not used to fit the model, indeed suggest a larger tail for ALL than for MLL samples.

Class prediction

We select the 100 genes with the largest posterior probability of being differentially expressed according to each of the three models (GaGa, MiGaGa and GaGa applied to the transformed data) when fit with 5 observations per group, and predict the class for the rest of the dataset. The three models correctly classify all samples. The same result is found when only using the top 10 genes to build the classifier. For comparison, using Fisher's Linear Discriminant Analysis with the 5 genes having smallest p-values according to Wilcoxon (BH) correctly classifies 16 ALL and 13 MLL samples.

6.3 Simulation study

We conduct two simulation studies. First, we assess the frequentist operating characteristics of the Bayesian procedure. For this purpose, we apply the bootstrap procedure described in Storey (2007) to the Armstrong dataset, so that in the generated data all genes are equally expressed. We then compute posterior probabilities of differential expression for each gene (setting $\omega$ to its posterior mean) and apply the Bayes rule of Müller et al. (2004), as described in Section 5.1, controlling the Bayesian FDR at 5%. We repeat this process 500 times and obtain an estimate of the frequentist FDR, as described in Storey (2007). We find that the GaGa model appropriately controls the frequentist FDR below the desired 5%, both when applied to the original and to the monotonically transformed data. MiGaGa controlled the FDR below 5% when analyzing 5, 10 and 15 arrays per group, but when analyzing the full dataset the estimated frequentist FDR was 6.4%. For more details, see the supplementary material.

For the second simulation study, we generate 12,626 gene expression values for 5 MLL and 5 ALL samples from a GaGa model. We set $\omega$ to its estimated value for the Armstrong dataset: $\alpha_0 = 4.520$, $\nu = 0.314$, $\beta = 0.826$, $\mu$ = and $\pi = (0.954, 0.046)$. Fitting the GaGa model to the simulated data provides the posterior means $\hat{\alpha}_0 = 4.630$, $\hat{\nu} = 0.312$, $\hat{\beta} = 0.906$, $\hat{\mu}$ = and $\hat{\pi} = (0.955, 0.045)$. All the 95% credibility intervals contained the true value, except the one for $\beta$, which was (0.882, 0.930). MiGaGa with M = 2 estimates $\hat{\rho}$ = (0.004, ), i.e. it correctly assigns a negligible posterior probability to one of the clusters. The posterior means are $\hat{\alpha} = 4.618$, $\hat{\nu}$ = (for the second cluster), $\hat{\beta} = 0.870$, $\hat{\mu} = 514.6$, $\hat{\pi} = (0.956, 0.044)$.
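The Bayesian FDR rule applied in this section can be sketched in a few lines. The following is a minimal illustration, assuming the rule takes the common form of declaring DE the largest set of genes whose average posterior probability of not being DE stays at or below the target level; Section 5.1 remains the reference for the exact rule. The function name `bayes_fdr_rule` and the example probabilities are hypothetical and are not taken from the paper.

```python
def bayes_fdr_rule(post_prob, fdr=0.05):
    """Return the indices of genes declared DE (illustrative sketch).

    post_prob[i] is the posterior probability that gene i is DE.
    Assumption: the expected Bayesian FDR of a set of declared genes is
    the average of (1 - post_prob[i]) over that set.
    """
    # Visit genes from the strongest to the weakest evidence of DE.
    order = sorted(range(len(post_prob)),
                   key=lambda i: post_prob[i], reverse=True)
    selected = []
    expected_false = 0.0  # running sum of (1 - posterior probability)
    for i in order:
        candidate = expected_false + (1.0 - post_prob[i])
        # The running average never decreases along this ordering,
        # so we can stop the first time it exceeds the target.
        if candidate / (len(selected) + 1) > fdr:
            break
        expected_false = candidate
        selected.append(i)
    return selected


if __name__ == "__main__":
    # Hypothetical posterior probabilities for five genes.
    probs = [0.10, 0.99, 0.60, 0.98, 0.90]
    print(sorted(bayes_fdr_rule(probs, fdr=0.05)))
```

With these hypothetical inputs the three genes with probabilities 0.99, 0.98 and 0.90 are declared DE: their average error is about 0.043, while adding the gene with probability 0.60 would raise it above 0.05.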
GaGa and MiGaGa tagged 469 and 468 genes as differentially expressed, respectively, 4.5% of which were false positives. EBayes-BUM found 449 genes (7.1% false positives), whereas Wilcoxon-BUM and Wilcoxon-BH did not find any significant genes.

7 Discussion

We have introduced two simple extensions of the Ga model. The GaGa model relaxes the constant coefficient of variation assumption. This results in a parsimonious model, which substantially improves the quality of the model fit and the reliability of the resulting inference. The increased generality comes at a negligible computational cost. We derived an approximation for the posterior distribution of the gamma shape parameter that further simplifies computation. The second extension, the MiGaGa, increases the model flexibility by incorporating a mixture prior, at the expense of model parsimony. In practice, a mixture with as few as two components may suffice to provide a satisfactory fit, as we have illustrated with the Armstrong dataset. Further, we have shown how to improve the quality of the GaGa fit by using a simple monotonic transformation that guarantees unimodality of the marginal distribution of the data.
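The constant coefficient of variation assumption that the GaGa model relaxes can be illustrated numerically. A gamma distribution with shape $\alpha$ has coefficient of variation $1/\sqrt{\alpha}$ whatever its mean, so a shape parameter shared by all genes forces every gene to have the same coefficient of variation, while gene-specific shapes do not. The sketch below checks this by simulation; the helper name `gamma_cv` and the shape and mean values are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def gamma_cv(shape, mean, n=20000, seed=0):
    """Empirical coefficient of variation of gamma draws (illustrative).

    random.gammavariate takes (shape, scale), so a gamma with the
    requested mean has scale = mean / shape.
    """
    rng = random.Random(seed)
    scale = mean / shape
    draws = [rng.gammavariate(shape, scale) for _ in range(n)]
    return statistics.stdev(draws) / statistics.mean(draws)


if __name__ == "__main__":
    # Same shape, very different means: the CV stays near 1/sqrt(shape).
    for mean in (10.0, 1000.0):
        print(mean, round(gamma_cv(4.0, mean), 3), 1 / math.sqrt(4.0))
    # A different shape gives a different CV, as with gene-specific shapes.
    print(round(gamma_cv(1.0, 10.0), 3), 1 / math.sqrt(1.0))
```

The empirical values agree with $1/\sqrt{\alpha}$ only up to simulation error, which is why the script prints the theoretical value next to each estimate.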
The hierarchical nature of the models allows for the sharing of information across genes. In simulations and in real data we have shown how both GaGa and MiGaGa find more genes than three competing methods. For instance, when analyzing a subset with 5 arrays per group from the Armstrong dataset we detect around 200 differentially expressed genes, while the most that any competing method finds is 47. The fact that around 80% of these genes were found again when analyzing the full data gives us confidence that these are not spurious findings. Also, we conducted simulations under the null hypothesis which estimated the FDR to be around the desired 5%.

The differences in the number of genes found by each method are partly due to the fact that they are testing different hypotheses. Our models not only seek differences in mean expression as the competing methods do, but also test for differences in the distributional shape. We believe that this may frequently be of biological interest, since many mutations, deletions and translocations affect only a proportion of the diseased individuals, and hence one expects to see differences in the tail behavior between groups. Since our models are built to be sensitive to tail behavior, the presence of outlying values can have an effect on the inference. If the experimenter believes that differences in shape lack biological relevance and hence that this sensitivity is undesirable, one can easily use the model output to focus on genes with a location shift. For instance, one can compute posterior credibility intervals for the difference between group means and disregard those genes for which the interval contains zero.

In addition, we have shown a fully Bayesian approach for class prediction. The approach allows one to specify prior probabilities that take into account, for instance, the prevalence of the disease under study and the outcome of previous tests. In the Gould dataset the model revealed that the data lacked predictive power, while in the Armstrong dataset the proposed approach correctly classified all the 32 samples that had not been used to fit the model.
In both cases, we showed how the classifier is robust with respect to the number of genes used to build it.

As a limitation, we have not explicitly modeled the dependence between genes. In datasets with strong correlations, we expect that this may have a stronger effect on class prediction than on inference about gene expression. Not learning about the dependence structure also limits the use of the model in finding gene networks or gene interactions. Interesting future work will be to include dependence. Other possibilities are extending the model to include covariate information and study-specific random effects, which would make it appealing for meta-analysis purposes, or using the model for sample size calculation as in Müller et al. (2004) or sequential sample size calculation. In the latter application, the computational efficiency of the GaGa model should prove a major asset.

Acknowledgments

We thank Peter Müller for his very useful comments.

References

S.A. Armstrong, J.E. Staunton, L.B. Silverman, R. Pieters, M.L. Boer, M.D. Minden, E.S. Sallan, E.S. Lander, T.R. Golub, and S.J. Korsmeyer. MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nature Genetics, 30:41–47.

Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B, 57.

M. Chiogna, M.S. Massa, and C. Romualdi. Effect of normalizations on detecting differentially expressed genes with cDNA microarray experiments. Technical report, Universita degli studi di Padova, Dipartimento di scienze statistiche.

K.R. Coombes, J. Wang, and K.A. Baggerly. A statistical method for finding biomarkers from microarray data, with application to prostate cancer. Technical report, M.D. Anderson Cancer Center. URL utmdabtr00704.pdf.

Kevin R. Coombes. ClassComparison: Classes and methods for class comparison problems on microarrays. R package version 1.1.

E. Damsleth. Conjugate classes for gamma distributions. Scandinavian Journal of Statistics, 2:80–84.

S. Dudoit, H.Y. Yang, M.J. Callow, and T.P. Speed. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica, 12.

A.E. Gelfand and A.F.M. Smith. Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association, 85.

C.M. Kendziorski, M.A. Newton, H. Lan, and M.N. Gould. On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine, 22.

J. Lapointe, C. Li, J.P. Higgins, M. Rijn, E. Bair, K. Montgomery, M. Ferrari, L. Egevad, W. Rayford, U. Bergerheim, P. Ekman, A.M. DeMarzo, R. Tibshirani, D. Botstein, P.O. Brown, J.D. Brooks, and J.R. Pollack. Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proceedings of the National Academy of Sciences, 101.

I. Lönnstedt and T. Speed. Replicated microarray data. Statistica Sinica, 12(1).

R.B. Miller. Bayesian analysis of the two-parameter gamma distribution. Technometrics, 22:65–69.

P. Müller, G. Parmigiani, C. Robert, and J. Rousseau. Optimal sample size for multiple testing: the case of gene expression microarrays. Journal of the American Statistical Association, 99.

P. Müller, G. Parmigiani, and K. Rice. FDR and Bayesian Multiple Comparisons Rules. Oxford University Press.
M.A. Newton and C.M. Kendziorski. Parametric Empirical Bayes Methods for Microarrays. Springer Verlag, New York.

M.A. Newton, C.M. Kendziorski, C.S. Richmond, F.R. Blattner, and K.W. Tsui. On differential variability of expression ratios: Improving statistical inference about gene expression changes from microarray data. Journal of Computational Biology, 8:37–52.

M.A. Newton, A. Noueiry, D. Sarkar, and P. Ahlquist. Detecting differential gene expression with a semiparametric hierarchical mixture model. Biostatistics, 5.

G. Parmigiani, E.S. Garrett, R.A. Irizarry, and S.L. Zeger, editors. The Analysis of Gene Expression Data. Springer.

S. Pounds and S.W. Morris. Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of p-values. Bioinformatics, 10.

G.K. Smyth. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3.

G.K. Smyth. Limma: linear models for microarray data. In R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, and W. Huber, editors, Bioinformatics and Computational Biology Solutions using R and Bioconductor. Springer, New York.

J.D. Storey. The optimal discovery procedure: A new approach to simultaneous significance testing. Journal of the Royal Statistical Society B, 69.

J. Wu and J.M.J. Irizarry, R. with contributions from Gentry. gcrma: Background Adjustment Using Sequence Information. R package version.

Z. Wu, R.A. Irizarry, R. Gentleman, F.M. Murillo, and F. Spencer. A model based background adjustment for oligonucleotide expression arrays. Technical report, Johns Hopkins University, Dept. of Biostatistics.

Y. Zhao, M.C. Li, and R. Simon. An adaptive method for cDNA microarray normalization. Bioinformatics, 6:28.
Mdule 4: General Frmulatin f Electric Circuit Thery 4. General Frmulatin f Electric Circuit Thery All electrmagnetic phenmena are described at a fundamental level by Maxwell's equatins and the assciated
More informationSimulation Based Optimal Design
BAYESIAN STATISTICS 6, pp. 000 000 J. M. Bernard, J. O. Berger, A. P. Dawid and A. F. M. Smith (Eds.) Oxfrd University Press, 1998 Simulatin Based Optimal Design PETER MULLER Duke University SUMMARY We
More informationLab 1 The Scientific Method
INTRODUCTION The fllwing labratry exercise is designed t give yu, the student, an pprtunity t explre unknwn systems, r universes, and hypthesize pssible rules which may gvern the behavir within them. Scientific
More informationA NOTE ON BAYESIAN ANALYSIS OF THE. University of Oxford. and. A. C. Davison. Swiss Federal Institute of Technology. March 10, 1998.
A NOTE ON BAYESIAN ANALYSIS OF THE POLY-WEIBULL MODEL F. Luzada-Net University f Oxfrd and A. C. Davisn Swiss Federal Institute f Technlgy March 10, 1998 Summary We cnsider apprximate Bayesian analysis
More informationx 1 Outline IAML: Logistic Regression Decision Boundaries Example Data
Outline IAML: Lgistic Regressin Charles Suttn and Victr Lavrenk Schl f Infrmatics Semester Lgistic functin Lgistic regressin Learning lgistic regressin Optimizatin The pwer f nn-linear basis functins Least-squares
More informationHow do scientists measure trees? What is DBH?
Hw d scientists measure trees? What is DBH? Purpse Students develp an understanding f tree size and hw scientists measure trees. Students bserve and measure tree ckies and explre the relatinship between
More informationContributions to the Theory of Robust Inference
Cntributins t the Thery f Rbust Inference by Matías Salibián-Barrera Licenciad en Matemáticas, Universidad de Buens Aires, Argentina, 1994 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
More informationCHM112 Lab Graphing with Excel Grading Rubric
Name CHM112 Lab Graphing with Excel Grading Rubric Criteria Pints pssible Pints earned Graphs crrectly pltted and adhere t all guidelines (including descriptive title, prperly frmatted axes, trendline
More informationChE 471: LECTURE 4 Fall 2003
ChE 47: LECTURE 4 Fall 003 IDEL RECTORS One f the key gals f chemical reactin engineering is t quantify the relatinship between prductin rate, reactr size, reactin kinetics and selected perating cnditins.
More informationIntroduction: A Generalized approach for computing the trajectories associated with the Newtonian N Body Problem
A Generalized apprach fr cmputing the trajectries assciated with the Newtnian N Bdy Prblem AbuBar Mehmd, Syed Umer Abbas Shah and Ghulam Shabbir Faculty f Engineering Sciences, GIK Institute f Engineering
More informationCHAPTER 3 INEQUALITIES. Copyright -The Institute of Chartered Accountants of India
CHAPTER 3 INEQUALITIES Cpyright -The Institute f Chartered Accuntants f India INEQUALITIES LEARNING OBJECTIVES One f the widely used decisin making prblems, nwadays, is t decide n the ptimal mix f scarce
More informationMargin Distribution and Learning Algorithms
ICML 03 Margin Distributin and Learning Algrithms Ashutsh Garg IBM Almaden Research Center, San Jse, CA 9513 USA Dan Rth Department f Cmputer Science, University f Illinis, Urbana, IL 61801 USA ASHUTOSH@US.IBM.COM
More informationTHE LIFE OF AN OBJECT IT SYSTEMS
THE LIFE OF AN OBJECT IT SYSTEMS Persns, bjects, r cncepts frm the real wrld, which we mdel as bjects in the IT system, have "lives". Actually, they have tw lives; the riginal in the real wrld has a life,
More information7.0 Heat Transfer in an External Laminar Boundary Layer
7.0 Heat ransfer in an Eternal Laminar Bundary Layer 7. Intrductin In this chapter, we will assume: ) hat the fluid prperties are cnstant and unaffected by temperature variatins. ) he thermal & mmentum
More informationRandomized Quantile Residuals
Randmized Quantile Residuals Peter K. Dunn and Grdn K. Smyth Department f Mathematics, University f Queensland, Brisbane, Q 47, Australia. 4 April 996 Abstract In this paper we give a general definitin
More informationPhysical Layer: Outline
18-: Intrductin t Telecmmunicatin Netwrks Lectures : Physical Layer Peter Steenkiste Spring 01 www.cs.cmu.edu/~prs/nets-ece Physical Layer: Outline Digital Representatin f Infrmatin Characterizatin f Cmmunicatin
More informationAdministrativia. Assignment 1 due thursday 9/23/2004 BEFORE midnight. Midterm exam 10/07/2003 in class. CS 460, Sessions 8-9 1
Administrativia Assignment 1 due thursday 9/23/2004 BEFORE midnight Midterm eam 10/07/2003 in class CS 460, Sessins 8-9 1 Last time: search strategies Uninfrmed: Use nly infrmatin available in the prblem
More informationInterference is when two (or more) sets of waves meet and combine to produce a new pattern.
Interference Interference is when tw (r mre) sets f waves meet and cmbine t prduce a new pattern. This pattern can vary depending n the riginal wave directin, wavelength, amplitude, etc. The tw mst extreme
More informationEquilibrium of Stress
Equilibrium f Stress Cnsider tw perpendicular planes passing thrugh a pint p. The stress cmpnents acting n these planes are as shwn in ig. 3.4.1a. These stresses are usuall shwn tgether acting n a small
More informationDiscussion on Regularized Regression for Categorical Data (Tutz and Gertheiss)
Discussin n Regularized Regressin fr Categrical Data (Tutz and Gertheiss) Peter Bühlmann, Ruben Dezeure Seminar fr Statistics, Department f Mathematics, ETH Zürich, Switzerland Address fr crrespndence:
More information