Factorized Multi-Modal Topic Model


Seppo Virtanen 1, Yangqing Jia 2, Arto Klami 1, Trevor Darrell 2
1 Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University
2 UC Berkeley EECS and ICSI

Abstract

Multi-modal data collections, such as corpora of paired images and text snippets, require analysis methods beyond single-view component and topic models. For continuous observations the current dominant approach is based on extensions of canonical correlation analysis, factorizing the variation into components shared by the different modalities and those private to each of them. For count data, multiple variants of topic models attempting to tie the modalities together have been presented. All of these, however, lack the ability to learn components private to one modality, and consequently will try to force dependencies even between minimally correlating modalities. In this work we combine the two approaches by presenting a novel HDP-based topic model that automatically learns both shared and private topics. The model is shown to be especially useful for querying the contents of one domain given samples of the other.

1 INTRODUCTION

Analysis of objects represented by multiple modalities has been an active research direction over the past few years. If the analysis of a single modality is characterized as learning some sort of components that describe the data, the task in analysis of multiple modalities can be summarized as learning components that describe both the variation within each modality but also the variation shared between them (Klami and Kaski, 2008; Jia et al., 2010). The fundamental problem is in learning how to correctly factorize the variation into the shared and private components, so that the components can be intuitively interpreted.
For continuous vector-valued samples the problem can be solved efficiently by a structural sparsity assumption (Jia et al., 2010; Virtanen et al., 2011), resulting in an extension of canonical correlation analysis (CCA) that models not only the correlations but also components private to each modality.

One prototypical example of multi-modal analysis is that of modeling collections of images and associated text snippets, such as captions or contents of a web page. While both text and image content can naturally be represented with bag-of-words-type vectors, the assumptions made by the above methods fail. Instead, such count data calls for topic models such as latent Dirichlet allocation (LDA): several extensions of LDA have been presented for multi-modal setups, including Blei and Jordan (2003); Mimno and McCallum (2008); Salomatin et al. (2009); Yakhnenko and Honavar (2009); Rasiwasia et al. (2010) and Putthividhya et al. (2011). However, none of these extensions are able to find shared and private topics in the same sense as the CCA-based models do for continuous data. Instead, the models attempt to enforce strong correlation between the modalities, which is a reasonable assumption when analyzing e.g. multi-lingual textual corpora with similar languages, but one that does not hold for analysis of images associated with free-flowing text. In most cases, the images will contain a considerable amount of information not related to the text snippet, and it is not even guaranteed that the text is related at all to the visual content of the image.

In this work, we introduce a novel topic model that combines the two above lines of work. It builds on the correlated topic models (CTM) by Blei and Lafferty (2007) and Paisley et al. (2011), by modeling correlations between topic allocations and by using a hierarchical Dirichlet process (HDP) formulation for automatically learning the number of topics.
The proposed factorized multi-modal topic model integrates the technical improvements of these single-modality topic models into the multi-modal application, and in particular automatically learns to make some topics specific to each of the modalities, implementing the factorization idea of Klami and Kaski (2008) and Jia et al. (2010) used for continuous data. The component selection plays a crucial role in implementing this property, implying that the HDP-based technique for automatically selecting the complexity is even more important for factorized multi-modal models than it would be for a regular topic model. The primary advantage of the new model is that it does not enforce correlations between the modalities, like the earlier multi-modal topic models do, but instead factorizes the variation into interpretable topics describing shared and private structure. The model is very flexible and does not enforce any particular factorization structure, but instead learns it from the data. For example, the model can completely ignore the shared topics in case the modalities are independent, or find almost solely shared topics when they are strongly correlated. In this work we demonstrate the model in analyzing modalities that have only weak relationships, a scenario for which the previous models would not work. In particular, we analyze a collection of Wikipedia pages that consist of images and the whole text on the page. Such a collection has relatively low between-modality correlation and in particular includes a considerable amount of text that is not related to the image at all, necessitating topics private to the text modality. The proposed model is shown to clearly outperform alternative HDP-based topic models as well as correspondence LDA (Blei and Jordan, 2003) in the task of inferring the contents of a missing modality.

2 BACKGROUND: TOPIC MODELS

To briefly summarize the topic models and to introduce the notation used in the paper, we describe the standard topic model of latent Dirichlet allocation (LDA) (Blei et al., 2003) through its generative process. We assume that words occurring in a document are drawn from K topics.
Each topic specifies a multinomial probability distribution over the vocabulary, parameterized through eta_k drawn from the Dirichlet distribution Dir(gamma 1), and the topic proportions are multinomial with parameters theta ~ Dir(nu 1). The documents are generated by repeatedly sampling a topic indicator z ~ Multi(theta) and then drawing a word from the corresponding topic as x ~ Multi(eta_z).

We will also heavily depend on the concept of correlated topic models (CTM) (Blei and Lafferty, 2007). In the standard LDA the topic proportions theta drawn from the Dirichlet distribution become independent except for weak negative correlation stemming from the normalization constraint. CTM replaces this choice by a logistic normal distribution, first drawing an auxiliary variable from a Gaussian distribution xi ~ N(mu, Sigma) and specifying the topic distribution as theta ∝ exp(xi). The topics become correlated when Sigma is not diagonal, and empirical experiments show increased predictive accuracy.

Finally, our model will be formulated through a hierarchical Dirichlet process (HDP) formulation (Teh et al., 2006), to enable automatic choice of the number of topics. As mentioned in the introduction, the choice is even more critical for multi-modal models, since we will have several sets of topics instead of just a single one; specifying the complexity for all of those in advance would not be feasible. Our model will use elements from the recently introduced discrete infinite logistic normal (DILN) model by Paisley et al. (2011), which incorporates HDP into CTM. The key idea of DILN is that the topic distributions theta are made sparse by multiplying exp(xi) by sparse topic-selection terms. The topic distribution is given by theta_k ~ Gamma(beta p_k, exp(-xi_k)), where both beta and p come from a stick-breaking process: beta is the second-level concentration parameter, and p_k = V_k prod_{i=1}^{k-1} (1 - V_i), where V_k ~ Beta(1, alpha) with alpha as the first-level concentration parameter. The expected value of theta_k is proportional to beta p_k exp(xi_k), illustrating the way the different parameters influence the topic weights.
For any finite data collection, p_k > 0 only for a finite subset of topics and hence the model automatically selects the number of topics.

3 FACTORIZED MULTI-MODAL TOPIC MODEL

Consider a collection of documents each containing M weakly correlated modalities, where each modality has its own vocabulary. In the application of this paper the two vocabularies are textual and visual words collected from Wikipedia pages with text and a single image (though the model would directly generalize to multiple images). We introduce a novel multi-modal topic model that can be used to learn dependencies between these modalities, enabling e.g. predicting the textual content associated with a novel image. The problem is made particularly challenging by the weak relationship between the modalities; several of the documents will contain large amounts of text not related to the image content. For modeling the data, we will use M separate vocabularies, so that words (or visual words) for each modality are drawn from separate dictionaries eta^(m) specific to each view m. The topic proportions theta^(m) will also be specific to each modality, whereas the actual words are sampled independently for each modality given the topic proportions. The essential modeling question is then how the topic proportions are tied with each other, in order to achieve the factorization into shared and private topics. In brief, we will do this by (i) modeling dependencies between topics both within and across modalities and (ii) automatically selecting the number of topics for each type (shared or private to any of the modalities).

The topic proportions theta^(m) are made dependent by introducing auxiliary variables xi^(m), denoting by xi = (xi^(1), ..., xi^(M)) the concatenation of them, and using the CTM prior xi ~ N(mu, Sigma). This part of the model corresponds to the multi-field CTM with different topic sets by Salomatin et al. (2009), and the different blocks in Sigma describe different types of dependencies between the topic proportions. In particular, the blocks around the diagonal describe dependencies between the topic proportions of each modality, whereas the off-diagonal blocks describe dependencies in topic proportions between the modalities.

Having a CTM for the joint topic distribution is not yet sufficient for separating the shared topics from private ones, since we can only control the correlation between the topic proportions. A large correlation between two topics for different modalities would imply that it is shared, but lack of correlation (that is, Sigma_kl = 0) would not make either component private. Instead, the weights would simply be determined independently. To create separate sets of shared and private topics we need to be able to switch some of the topics off in one or more of the modalities, similarly to how Jia et al. (2010) and Virtanen et al. (2011) switch off components to make the same distinction in continuous data models. In the case of the multi-field CTM this could only be done by driving mu_k (the mean of the Gaussian prior for xi_k) towards minus infinity, which is not encouraged by the model and is difficult to achieve with mean-field updates.
We implement the shared/private choice by separate HDPs, one for each modality, switching a subset of topics off for each modality separately by a mechanism similar to how the single-view DILN model (Paisley et al., 2011) selects the topics. We introduce beta^(m) and p^(m) for each modality m = 1, ..., M, and draw them from separate HDPs, resulting in theta^(m)_k ~ Gamma(beta^(m) p^(m)_k, exp(-xi^(m)_k)) as the final topic proportions. The topic distributions are still shared through the xi^(m) that were drawn from a single high-dimensional Gaussian, but for each modality the stick weights p^(m) select different subsets of topics to be switched off. In the end, a finite number of topics remain for each modality, and the private topics can be identified as ones that have non-zero weight for one modality and are not correlated with topics active in other modalities.

Figure 1: A graphical representation of the factorized multi-modal topic model. The data has D documents described by M modalities. For each modality, the words x^(m) are drawn from a dictionary specific to that modality, according to topic proportions theta^(m) also specific to the modality. The topic proportions are generated by logistic transformation of latent variables xi^(m) that model the correlations between the topics both within and across modalities, followed by topic selection with a HDP (denoted by V and beta in the plate; see text for details) for each modality. As a result, the model learns both topics modeling correlations between the modalities as well as topics private to each modality.

The final generative model motivated by the above discussion results in a collection of M correlated BOW data sets X^(m), generated as follows (see Figure 1 for a graphical representation). For the whole collection we:

- create a dictionary of T^(m) topics for each modality by drawing eta^(m)_k ~ Dir(gamma^(m) 1) for k = 1, ..., T^(m)
- draw the parameters alpha^(m), beta^(m), V^(m) of the DILN distribution for each modality from the stick-breaking formulation and construct p^(m).
For each document we then draw xi ~ N(mu, Sigma) and partition it into the different modalities as xi = (xi^(1), ..., xi^(M)). For each modality, we then generate the words independently as follows:

- form the topic proportions by drawing Y^(m)_k ~ Gamma(beta^(m) p^(m)_k, exp(-xi^(m)_k)) and setting theta^(m)_k = Y^(m)_k / sum_{i=1}^{T^(m)} Y^(m)_i
- draw N^(m) words by choosing a topic z ~ Multi(theta^(m)) and drawing a word x ~ Multi(eta^(m)_z).
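The per-document generative process above can be summarized as a forward-sampling sketch. For brevity the xi^(m) are drawn here from independent Gaussians per dimension, standing in for the full coupled N(mu, Sigma) draw; all names and the parameter layout are ours:

```python
import math
import random

def generate_document(mu, var, beta, p, eta, n_words, rng):
    """Draw one multi-modal document. Per modality m:
    xi^(m)_k ~ N(mu[m][k], var[m][k])   (diagonal stand-in for N(mu, Sigma)),
    Y^(m)_k  ~ Gamma(shape=beta[m]*p[m][k], scale=exp(xi^(m)_k)),
    theta^(m) = normalized Y^(m); then n_words[m] draws of
    z ~ Multi(theta^(m)), x ~ Multi(eta[m][z])."""
    words = []
    for m in range(len(mu)):
        xi = [rng.gauss(mk, math.sqrt(vk)) for mk, vk in zip(mu[m], var[m])]
        y = [rng.gammavariate(beta[m] * pk, math.exp(xk))
             for pk, xk in zip(p[m], xi)]
        total = sum(y)
        theta = [yk / total for yk in y]
        doc_m = []
        for _ in range(n_words[m]):
            z = rng.choices(range(len(theta)), weights=theta)[0]
            x = rng.choices(range(len(eta[m][z])), weights=eta[m][z])[0]
            doc_m.append(x)
        words.append(doc_m)
    return words
```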

3.1 INFERENCE

For learning the model parameters we use a truncated variational approximation following closely the algorithm given by Paisley et al. (2011), the main difference being that we have M separate sets of eta, beta and p, one for each modality. The above generative process is truncated by setting V^(m)_{T^(m)} = 1, forcing the stick lengths beyond the truncation level T^(m) to be zero, and the resulting factorized approximation is given by

Q = prod_{m=1}^{M} prod_{d=1}^{D} [ prod_{n_m=1}^{N_m} q(z^(m)_{d,n_m}) ] q(Y^(m)_d) q(xi^(m)_d) prod_{k=1}^{T} q(eta^(m)_k) q(V^(m)_k) q(alpha^(m)) q(beta^(m)) q(mu) q(Sigma),

where to simplify notation we assume T^(m) = T for all m. The algorithm proceeds by updating each factor in turn while keeping the others fixed, using either gradient ascent or an analytic solution for maximizing the lower bound of the approximation for each of the terms (see Paisley et al. (2011) for details). The main difference in the algorithms comes from updating xi, since in our case it goes over M sets of topics instead of just one, yet the activities within each set are governed by separate HDPs. We use a diagonal Gaussian factor q(xi) = N(xi, diag(v)), where v denotes the variances of the dimensions, and use gradient ascent for jointly updating the parameters. To simplify notation we use xi and v to denote the expectation and variance of the factorial distribution. The relevant part of the lower bound is

L_{xi,v} = - sum_{m=1}^{M} beta^(m) p^(m)T xi^(m) - sum_{m=1}^{M} E[theta^(m)]^T E[exp(-xi^(m))] - (xi - mu)^T Sigma^{-1} (xi - mu)/2 - diag(Sigma^{-1})^T v/2 + log(v)^T 1/2.    (1)

Here Sigma^{-1} couples the separate xi^(m) terms in the partial derivatives as

dL_{xi,v}/dxi^(m) = -beta^(m) p^(m) + E[theta^(m)] E[exp(-xi^(m))] - (Sigma^{-1})_{m,m} (xi^(m) - mu^(m)) - sum_{j != m} (Sigma^{-1})_{m,j} (xi^(j) - mu^(j)),

with (Sigma^{-1})_{i,j} denoting the block of Sigma^{-1} corresponding to modalities i and j. The inverse of Sigma remains constant during the gradient ascent, and hence only needs to be evaluated once for every time the factor q(xi) is updated. We use maximum marginal likelihood to update mu and Sigma, resulting in the closed-form updates

mu = (1/D) sum_{d=1}^{D} xi_d,
Sigma = sum_{d=1}^{D} ( (xi_d - mu)(xi_d - mu)^T + diag(v_d) ) / D.
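The closed-form updates for mu and Sigma are straightforward to implement; a minimal numpy sketch (variable names ours), where the diag(v_d) term accounts for the variances of the diagonal q(xi_d) factors:

```python
import numpy as np

def update_gaussian_prior(xi_mean, xi_var):
    """Maximum marginal likelihood update of the CTM prior N(mu, Sigma).
    xi_mean: (D, K) variational means of the concatenated xi_d;
    xi_var:  (D, K) variances of the diagonal q(xi_d) factors.
    Returns mu = mean of xi_d over documents, and
    Sigma = sum_d [(xi_d - mu)(xi_d - mu)^T + diag(v_d)] / D."""
    D = xi_mean.shape[0]
    mu = xi_mean.mean(axis=0)
    centered = xi_mean - mu
    sigma = (centered.T @ centered + np.diag(xi_var.sum(axis=0))) / D
    return mu, sigma
```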
3.2 PREDICTION

The model structure is well suited for prediction tasks, where the task is to infer missing modalities for a new document given that one of them is observed (e.g. infer the caption given the image content). This is because the correlations between the topic proportions provide a direct link between the modalities, and the private topics explain away all the variation that is not useful for predictions. Here we present the details of the prediction for the special case with just one observed modality (j) and one missing modality (i). Given the observed data we first infer the topic proportions theta_hat^(j) and then the auxiliary variable xi_hat^(j) by maximizing a cost similar to (1), but only using the newly inferred topic proportions of the observed modality and the corresponding part of Sigma. As xi_hat comes from a Gaussian distribution we can infer xi_hat^(i) given xi_hat^(j) with the standard conditional expectation as

xi_hat^(i) = mu^(i) + Sigma_{i,j} Sigma_{j,j}^{-1} (xi_hat^(j) - mu^(j)) = mu^(i) + W (xi_hat^(j) - mu^(j)).    (2)

Here W involves the corresponding part of the between-topic covariance matrix Sigma as indicated above, and can be seen as a projection matrix transforming the components of one modality to another. Finally, the newly estimated xi_hat^(i) for the missing views is converted back to the expected topic proportions theta_hat by exponentiation and multiplying with the corresponding stick lengths p^(i).

3.3 SHARED AND PRIVATE TOPICS

The key novelty of the model is its capability to learn both topics that are shared and topics that are private to each modality, without needing to specify them in advance. Since the way these topics appear is by no means transparent in the above formulation, we will here discuss the property in more detail. In brief, the distinct nature of the topics comes from an interplay between the correlations between the topics of different modalities and the HDP procedure that turns some of the topics off

for each modality. In particular, neither of these properties alone would be sufficient.

As mentioned already in Section 3, merely having separate xi^(m) drawn from a single Gaussian is not sufficient for finding private topics. At best, the correlation structure can specify that the weights will be independent for the modalities. Next we explain how the other key element of the model, separate selection of active topics for each modality, is not sufficient alone either. We do that by considering a special case of the model that assumes equal xi = xi^(m) for all views but has separate stick-breaking processes switching some of the topics off for each of the views. We call this alternative model mmdiln, due to the way it implements the multi-modal LDA of Blei and Jordan (2003) with DILN-style component selection.

Intuitively, the mmdiln model could find private topics simply by setting p^(m)_k to a small value for topics that are not needed in that modality. However, it cannot make correct predictions from one modality to another, and hence fails in achieving one of the primary goals for shared-private factorizations. If p^(m)_k is small then the model has no information for inferring xi_k from that view, and hence also all other elements xi_l that correlate with xi_k will be incorrect. If xi_k was an important topic for the other view, the predictions will be severely biased. Our model avoids this issue by having the separate xi^(m) parameters, leading to correct across-modality predictions as described in the previous section. In the experimental section we will empirically compare the proposed model with mmdiln, demonstrating how mmdiln indeed has very poor predictive accuracy despite modeling the training data almost as well. Hence, even though the structure is in principle sufficient for learning private topics, the model has no practical value as a shared-private factorization.
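The cross-modality prediction of Section 3.2, which the mmdiln baseline cannot perform reliably, reduces to the standard Gaussian conditional expectation of Eq. (2); a minimal numpy sketch (function name ours):

```python
import numpy as np

def predict_missing_xi(xi_obs, mu_i, mu_j, sigma_ij, sigma_jj):
    """xi_hat^(i) = mu^(i) + Sigma_{i,j} Sigma_{j,j}^{-1} (xi_hat^(j) - mu^(j)).
    W = Sigma_{i,j} Sigma_{j,j}^{-1} acts as a projection from the observed
    modality's topic space to the missing one."""
    W = sigma_ij @ np.linalg.inv(sigma_jj)
    return mu_i + W @ (xi_obs - mu_j)
```

The predicted xi_hat^(i) is then exponentiated and multiplied by the stick lengths p^(i) to recover the expected topic proportions, as described in Section 3.2.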
In order to recognize the nature of each of the topics, we need to look at both the covariance Sigma between the topic weights and the modality-specific stick weights p^(m). Since the topics can be (potentially strongly) correlated both within and across modalities, we can identify private topics only by searching for topics that do not correlate with any topic that would be active in any other modality. In the experiments we demonstrate how the topics can be ranked according to how strongly they are shared with another modality, by inspecting the elements of Sigma.

4 RELATED WORK

In this section we relate the model to other approaches for modeling multi-modal count data.

4.1 MULTI-MODAL TOPIC MODELS

The multi-modal extension of LDA (mmlda) by Blei and Jordan (2003) and its non-parametric version mmhdp by Yakhnenko and Honavar (2009) assume all modalities share the same topic proportions, and essentially extend LDA only by having separate dictionaries for each modality and generating the words for the domains independently. For many real-world data sets the assumption of identical topic proportions is too strong, and the model tries to enforce correlations even when they do not exist. While the assumption may help in picking up topics that would be weak in either modality alone, it makes identifying the true correlations almost impossible. Such models fail especially when modeling data having strong private topics in one modality. Since the topic proportions are shared, the topic must be present in other modalities as well and becomes associated with a dictionary that merely replicates the overall distribution of the words. Such topics are particularly harmful for prediction tasks. When the dictionary of a topic matches that of the background word distribution, it will be present in every document in that modality.
For example, when predicting text from images we could learn to associate politics (a strong topic private to the text modality) with the overall visual word distribution, resulting in all of the predictions including terms from the politics topic.

Salomatin et al. (2009) took a step towards our model with their multi-field CTM. It extends CTM by introducing separate xi^(m) for each modality, similarly to our model. However, as described in the previous section, the separate topic proportions are not yet sufficient for separating the shared topics from the private ones.

4.2 CONDITIONAL TOPIC MODELS

Much recent work on multi-modal topic modeling has focused on building conditional models, largely for the image annotation task. Correspondence LDA (corrlda), proposed simultaneously with mmlda in (Blei and Jordan, 2003), is a prominent example, assuming that the image is generated first and the text depends on the image content. Both modalities are assumed to share the same topic weights. While such models are very useful for modeling the conditional relationship, they do not treat the modalities symmetrically as our model does. Recently Putthividhya et al. (2011) proposed an extension of corrlda, replacing the identical topic distributions with a regression module from image topics to the textual annotation topics. The added flexibility results in better predictive performance, but the model remains a directional one,

in contrast to our model that generates all modalities with equal importance. For applications treating only two modalities and having a specific task that makes one of them more important (say, image annotation) the conditional models often work well. However, they do not easily generalize to multiple modalities and are not flexible in terms of the eventual application.

Other conditional models focus on conditioning on meta-data, such as author or link structure (Mimno and McCallum, 2008; Hennig et al., 2012). Such models allow integrating data that are not necessarily in count format, but the same distinction of directional versus generative applies. However, this family of models could be integrated with our solution, incorporating a meta-data link into our multi-modal model. In essence, the choice of whether meta-data is modeled or not is independent of the choice of how many count-data modalities the data has.

4.3 CANONICAL CORRELATIONS

As described earlier, the model bears a close resemblance to how CCA models correlations between continuous data, the similarities being most apparent with the recent re-interpretations of CCA as shared-private factorization (Klami and Kaski, 2008; Jia et al., 2010). The technical details of the solutions are, however, very different, as the normalization of topic proportions makes the techniques used for continuous data infeasible for topic models. Despite the mismatch of data types, CCA can be used for modeling count data as well. The most promising direction would be to apply kernel CCA, but there are no obvious choices for the kernel function that would directly match the analysis of image-text pairs. As one practical remedy, Rasiwasia et al. (2010) combined CCA and LDA directly by first estimating a separate LDA model for each modality and then combining the resulting topic proportions with CCA. Our approach does not rely on two separate analysis steps that do not result in directly interpretable private topics.
5 EXPERIMENTS AND RESULTS

5.1 DATA AND MEASURES

We validate the model on real data collected from Wikipedia.1 We constructed a data collection with D = 20,000 documents, each consisting of a single image represented with 5000 SIFT patches and text (the contents of the whole Wikipedia page) represented with a vocabulary of the 7500 most frequent terms, after stopword removal. We made a random 50/50 split into test and training data.

1 Available from ~jiayq/

To demonstrate the ability of the proposed model to correctly model the relationships between the two modalities, we evaluate the model with the conditional perplexity of a missing modality for a new sample:

P^(m)_train = exp( - sum_{d in D_train} log p(x^(m)_d) / sum_{d in D_train} N^(m)_d ),
P^(i)_test = exp( - sum_{d in D_test} log p(x^(i)_d | x^(j)_d) / sum_{d in D_test} N^(i)_d ),

where x^(m)_d denotes the concatenation of the N^(m)_d words. These quantities measure how well the model can relate the visual content to the textual content, corresponding to the document completion task of Wallach et al. (2009) but computed across modalities.

We compare our model to three alternatives representing various kinds of multi-modal topic models: mmdiln (Section 3.3), mmhdp (Section 4.1) and corrlda (Section 4.2). Both mmdiln and mmhdp are comparable to our model in making automatic topic number selection and modeling both modalities symmetrically. Consequently, the experiments will focus on demonstrating the importance of finding the correct factorization into shared and private topics. The corrlda is included as an example of a conditional model that gives an alternative approach to solving a similar prediction task. Note that we need to learn two separate corrlda models, one for predicting text from images and one for the other direction, whereas the other models can do both types of predictions. For corrlda we use 100 topics (the threshold we use for the nonparametric models).

5.2 INFERENCE SPEED

First we show that the variational approximation used for inference is efficient.
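The perplexity measures above amount to an exponentiated negative average log-likelihood per word; as a sketch (function name ours):

```python
import math

def perplexity(doc_log_probs, doc_word_counts):
    """exp( - sum_d log p(x_d) / sum_d N_d ). Works both for the training
    perplexity (marginal per-document log-likelihoods) and for the
    conditional test perplexity, where log p(x_d^(i) | x_d^(j)) is
    plugged in per document."""
    return math.exp(-sum(doc_log_probs) / sum(doc_word_counts))
```

For example, a document of four words each predicted with probability 0.25 gives a perplexity of 4, matching the intuition of an effective vocabulary size.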
Figure 2 shows how the algorithm converges, for both N = 400 and N = 10,000 documents, already after some tens of iterations. For both experiments we used a maximum of T = 100 topics. The convergence of mmhdp and mmdiln is similar (not shown).

5.3 PREDICTING TEXT FROM IMAGES AND VICE VERSA

Figure 3 shows the evaluation for training and test sets for the proposed model and the comparison methods, measured as the perplexity on training data and the conditional perplexity of images given the text and text given the images. The proposed method, which is more flexible than the alternatives, reaches better

Figure 2: Training perplexity as a function of algorithm iterations.

(lower) perplexity on the training and testing data due to being able to describe the variation not shared by the other modality without needing to introduce noise topics. A notable observation is that the baseline methods perform worse at predicting text from images as the amount of training data increases. This illustrates clearly the fundamental problem in modeling multi-modal collections without separate private topics. Since the text documents are easier to model than the images, the alternative models start to focus more and more on modeling the text when there is a large amount of data. The dominant topics start describing the text alone, yet they are also active in the image modality but with a topic that does not contain any information. Given a new image sample, the estimated topic proportions will be arbitrary and hence do not enable meaningful prediction. The proposed model, however, learns to make those textual topics private to the text modality, while capturing the weaker correlations between the two modalities with shared topics. The model still cannot predict textual information not correlated with the image content, but it learns correctly not to even attempt that and manages to make accurate predictions for the aspects that are correlated.

5.4 SHARED AND PRIVATE TOPICS

To illustrate how the HDP formulation chooses the topics, we visualize the stick parameters p in Figure 4. First, we notice that the last sticks have close to zero weight, indicating that the chosen truncation level T = 100 is sufficient. More importantly, we see that the weights for the text and image topics are different (the image topics are more spread out), motivating the choice of separate weights for the modalities. To further understand how the proposed model is able to find both shared and private topics, we explore the nature of the individual topics.
Since the SIFT vocabulary is not easily interpretable by visual inspection, we illustrate the property for the textual topics.

Figure 4: Visualization of the stick parameters p of the proposed model for the text modality (a) and the image modality (b) reveals how they are not identical for the two modalities. Both figures show the weights for two models learned with 400 and 10,000 documents, revealing how the distribution is learned fairly accurately already from a small collection.

For each textual topic we measure the amount of correlation with the other modality by inspecting the correlation structure in Sigma, and then rank the topics according to this measure. This results in a ranked list of the text topics, the first ones being strongly shared by the two modalities while the last ones are private to the text modality. More specifically, denoting the separate blocks in the covariance matrix as

Sigma = ( Sigma_{t,t}  Sigma_{t,i} ; Sigma_{i,t}  Sigma_{i,i} ),    (3)

we convert it to a correlation matrix Omega, threshold small values out (we use a threshold of 0.2) and extract the cross-correlation between textual (rows) and visual (columns) topics to get Omega_{t,i}. Then for each textual topic we define the visual relevance rho as the row mean of absolute values of Omega_{t,i}, written as rho = (1/T) |Omega_{t,i}| 1.2 This quantity captures general and rich visual combinations that co-occur with the textual topics, and it is worth noticing how the measure is very general: it allows multiple visual topics to correlate with one textual topic (and vice versa), and includes both positive and negative correlations that are typically equally relevant (negative correlation can be seen as absence of a visual component); see Figure 5 for a demonstration. The textual topics are ranked according to rho in Figure 6. There are a few very strongly shared topics between the text and image modalities, and at the end of the list we have several topics private to the text modality, indicated by zero correlation with the image modality.
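The ranking procedure above can be sketched with numpy (the 0.2 threshold follows the text; the helper name is ours):

```python
import numpy as np

def visual_relevance(sigma, n_text, threshold=0.2):
    """Rank text topics by visual relevance rho: turn the topic covariance
    Sigma into a correlation matrix Omega, zero out entries below the
    threshold in absolute value, cut out the text-vs-image block Omega_{t,i}
    (text topics as rows), and take row means of absolute correlations."""
    d = np.sqrt(np.diag(sigma))
    omega = sigma / np.outer(d, d)
    omega = np.where(np.abs(omega) < threshold, 0.0, omega)
    cross = omega[:n_text, n_text:]
    rho = np.abs(cross).mean(axis=1)
    order = np.argsort(-rho)  # most strongly shared text topics first
    return order, rho
```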
This matches the intuition that the full text of a Wikipedia page cannot be mapped to the image content in all cases. Table 1 summarizes the six text topics most strongly correlating with the image modality, as well as six topics that are private

2 We also tried using the maximum element instead of the mean; it results in fairly similar rankings.

Figure 3: Training and test perplexities (lower is better) for the two modalities. For training data we show the perplexity of modeling the text (a) and the images (b) separately. For test data, we show the conditional perplexity of predicting text from images (c) and predicting images from text (d), corresponding to the document completion task used for evaluating topic models. The proposed method outperforms the comparison ones in all respects. The comparison methods mmhdp, mmdiln and corrlda, which are not able to extract topics private to either modality, are not able to learn good predictive models, demonstrated especially by the error increasing as a function of training samples in (c). The image prediction perplexity for mmdiln is outside the range depicted in (d), above 5400 for all training set sizes.

Figure 5: Illustration of part of the cross-correlation between text topics and image topics, corresponding to a subset of Omega_{t,i}, where yellow represents positive correlations and blue represents negative ones. The size of the boxes corresponds to the absolute value.

to the text modality, revealing very clear interpretations. The most strongly correlating topic covers airplanes, which are known to be easy to recognize from images due to the distinct shapes and background. The second topic is about maps that also have a clear visual correspondence, and the other strongly correlated topics also cover clearly visual concepts like buildings, cars and railroads. The topics private to the text domain, in turn, are about concepts with no clear visual counterpart: economy, politics, history and research. In summary, the model has separated the components nicely into shared and private ones, and provides additional interpretability beyond regular multi-modal topic models.
Figure 6: Text topics ordered according to visual relevance ρ. We see that there are a few strongly correlating topics, and that the model has found roughly 10 topics that are private to the text domain. Note that such topics may still be important for modeling the whole multi-modal corpus, whereas they do not contribute to the cross-modal information transfer.

Table 1: Text topics ranked according to visual relevance, summarized by the words with highest probability. The topic indices match the ranking in Figure 6. The shared topics have clear visual counterparts, whereas the private ones do not relate to any kind of visual content.

Shared topics
T1: airport flight airlines air international aircraft aviation terminal passengers airline boeing flights airways service airports passenger accident
T2: format ms lat m longm latm longs lats launched mi broken mill sold renamed dec captured rapids class feet coordinates built lake located
T3: building house built buildings street hall st century tower houses west designed design castle south north east side main square large end site
T4: car engine cars model models ford engines race rear series front racing wheel year river speed vehicles vehicle production hp motor drive
T5: retrieved album song music video released single awards number billboard chart top release mtv songs media love show uk jackson hot albums
T6: line railway station rail trains train service lines bus transport services system railways stations built railroad passenger main metro transit

Topics private to the text domain
T95: president washington post united american national states secretary december november september times military dc kennedy press security
T96: ottoman turkish turkey kosovo armenian war greek serbia bulgarian serbian government border bulgaria turks forces croatian albanian republic
T97: research science development institute university management scientific technology design world national engineering work human international
T98: government state national european policy council international states members act union political countries system nations article parliament
T99: nuclear weapons anti power protest bomb people protests united protesters government strike peace states march reactor atomic april test
T100: economic trade economy world production industry oil million growth development government agricultural market agriculture industrial

6 DISCUSSION

Our paper ties together two separate lines of work for the analysis of multi-modal data. In particular, we created a novel multi-modal topic model which extends earlier tools for the analysis of multi-modal count data by incorporating elements found useful in the continuous-valued case. We explained how learning topics private to each modality is of crucial importance when modeling modalities with potentially weak correlations, and demonstrated
empirically how such a property can only be obtained by combining two separate elements: modeling correlations between separate topic weights for each modality, and learning modality-specific indicators that switch unnecessary topics off. For implementing these elements we combine state-of-the-art techniques in topic models, integrating the DILN distribution (Paisley et al., 2011) into a model similar to the multi-field correlated topic model of Salomatin et al. (2009), to create an efficient learning algorithm readily applicable to relatively large document collections.

Acknowledgements

AK and SK were supported by the COIN Finnish Center of Excellence and the FuNeSoMo exchange project. AK was additionally supported by the Academy of Finland (decision number ) and the PASCAL2 European Network of Excellence.

References

Blei, D., Ng, A. and Jordan, M. (2003). Latent Dirichlet allocation. JMLR, 3.
Blei, D. and Jordan, M. (2003). Modeling annotated data. In SIGIR.
Blei, D. and Lafferty, J. (2007). A correlated topic model of science. Annals of Applied Statistics, 1.
Hennig, P., Stern, D., Herbrich, R. and Graepel, T. (2012). Kernel topic models. In AISTATS.
Jia, Y., Salzmann, M. and Darrell, T. (2010). Factorized latent spaces with structured sparsity. In NIPS 23.
Klami, A. and Kaski, S. (2008). Probabilistic approach to detecting dependencies between data sets. Neurocomputing, 72(1-3).
Mimno, D. and McCallum, A. (2008). Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. In UAI.
Paisley, J., Wang, C. and Blei, D. (2011). Discrete infinite logistic normal distribution. In AISTATS.
Puttividhya, D., Attias, H. and Nagarajan, S. (2011). Topic-regression multi-modal latent Dirichlet allocation for image annotation. In CVPR.
Puttividhya, D., Attias, H. and Nagarajan, S. (2009). Independent factor topic models. In ICML.
Rasiwasia, N., Pereira, J., Coviello, E., Doyle, G., Lanckriet, G., Levy, R. and Vasconcelos, N. (2010). A new approach to cross-modal multimedia retrieval. In ACM Multimedia.
Salomatin, K., Yang, Y. and Lad, A. (2009). Multi-field correlated topic modeling. In SDM.
Teh, Y., Blei, D. and Jordan, M. (2006). Hierarchical Dirichlet processes. JASA, 101(476).
Virtanen, S., Klami, A. and Kaski, S. (2011). Bayesian CCA via structured sparsity. In ICML.
Wallach, H. M., Murray, I., Salakhutdinov, R. and Mimno, D. (2009). Evaluation methods for topic models. In ICML.
Yakhnenko, O. and Honavar, V. (2009). Multi-modal hierarchical Dirichlet process model for predicting image annotation and image-object label correspondence. In SDM.
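As a brief computational aside, the perplexity used throughout the evaluation of Figure 3 is the standard exponentiated negative average per-word predictive log-likelihood. A minimal self-contained sketch, with placeholder probabilities rather than values from the experiments:

```python
import math

def perplexity(word_log_probs):
    # perplexity = exp(-(1/N) * sum of per-word predictive log-probabilities),
    # i.e. the inverse geometric mean of the predictive probabilities.
    return math.exp(-sum(word_log_probs) / len(word_log_probs))

# Placeholder predictive probabilities for the held-out words of one document;
# in the document-completion setting these would be p(word | observed half).
probs = [0.01, 0.02, 0.005]
ppl = perplexity([math.log(p) for p in probs])  # ≈ 100
```

Lower values mean the model assigns higher probability to the held-out words, which is why the conditional variants in Figure 3(c)-(d) directly measure cross-modal prediction quality.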


Influence of weight initialization on multilayer perceptron performance Influence of weight initialization on multilayer perceptron performance M. Karouia (1,2) T. Denœux (1) R. Lengellé (1) (1) Université e Compiègne U.R.A. CNRS 817 Heuiasyc BP 649 - F-66 Compiègne ceex -

More information

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Analyzing Tensor Power Method Dynamics in Overcomplete Regime Journal of Machine Learning Research 18 (2017) 1-40 Submitte 9/15; Revise 11/16; Publishe 4/17 Analyzing Tensor Power Metho Dynamics in Overcomplete Regime Animashree Ananumar Department of Electrical

More information

Construction of the Electronic Radial Wave Functions and Probability Distributions of Hydrogen-like Systems

Construction of the Electronic Radial Wave Functions and Probability Distributions of Hydrogen-like Systems Construction of the Electronic Raial Wave Functions an Probability Distributions of Hyrogen-like Systems Thomas S. Kuntzleman, Department of Chemistry Spring Arbor University, Spring Arbor MI 498 tkuntzle@arbor.eu

More information

Inverse Theory Course: LTU Kiruna. Day 1

Inverse Theory Course: LTU Kiruna. Day 1 Inverse Theory Course: LTU Kiruna. Day Hugh Pumphrey March 6, 0 Preamble These are the notes for the course Inverse Theory to be taught at LuleåTekniska Universitet, Kiruna in February 00. They are not

More information

Transmission Line Matrix (TLM) network analogues of reversible trapping processes Part B: scaling and consistency

Transmission Line Matrix (TLM) network analogues of reversible trapping processes Part B: scaling and consistency Transmission Line Matrix (TLM network analogues of reversible trapping processes Part B: scaling an consistency Donar e Cogan * ANC Eucation, 308-310.A. De Mel Mawatha, Colombo 3, Sri Lanka * onarecogan@gmail.com

More information

1 Heisenberg Representation

1 Heisenberg Representation 1 Heisenberg Representation What we have been ealing with so far is calle the Schröinger representation. In this representation, operators are constants an all the time epenence is carrie by the states.

More information

Tutorial on Maximum Likelyhood Estimation: Parametric Density Estimation

Tutorial on Maximum Likelyhood Estimation: Parametric Density Estimation Tutorial on Maximum Likelyhoo Estimation: Parametric Density Estimation Suhir B Kylasa 03/13/2014 1 Motivation Suppose one wishes to etermine just how biase an unfair coin is. Call the probability of tossing

More information

A NONLINEAR SOURCE SEPARATION APPROACH FOR THE NICOLSKY-EISENMAN MODEL

A NONLINEAR SOURCE SEPARATION APPROACH FOR THE NICOLSKY-EISENMAN MODEL 6th European Signal Processing Conference EUSIPCO 28, Lausanne, Switzerlan, August 25-29, 28, copyright by EURASIP A NONLINEAR SOURCE SEPARATION APPROACH FOR THE NICOLSKY-EISENMAN MODEL Leonaro Tomazeli

More information

Capacity Analysis of MIMO Systems with Unknown Channel State Information

Capacity Analysis of MIMO Systems with Unknown Channel State Information Capacity Analysis of MIMO Systems with Unknown Channel State Information Jun Zheng an Bhaskar D. Rao Dept. of Electrical an Computer Engineering University of California at San Diego e-mail: juzheng@ucs.eu,

More information

Agmon Kolmogorov Inequalities on l 2 (Z d )

Agmon Kolmogorov Inequalities on l 2 (Z d ) Journal of Mathematics Research; Vol. 6, No. ; 04 ISSN 96-9795 E-ISSN 96-9809 Publishe by Canaian Center of Science an Eucation Agmon Kolmogorov Inequalities on l (Z ) Arman Sahovic Mathematics Department,

More information

Convergence of Random Walks

Convergence of Random Walks Chapter 16 Convergence of Ranom Walks This lecture examines the convergence of ranom walks to the Wiener process. This is very important both physically an statistically, an illustrates the utility of

More information

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion Hybri Fusion for Biometrics: Combining Score-level an Decision-level Fusion Qian Tao Raymon Velhuis Signals an Systems Group, University of Twente Postbus 217, 7500AE Enschee, the Netherlans {q.tao,r.n.j.velhuis}@ewi.utwente.nl

More information

Polynomial Inclusion Functions

Polynomial Inclusion Functions Polynomial Inclusion Functions E. e Weert, E. van Kampen, Q. P. Chu, an J. A. Muler Delft University of Technology, Faculty of Aerospace Engineering, Control an Simulation Division E.eWeert@TUDelft.nl

More information

The Press-Schechter mass function

The Press-Schechter mass function The Press-Schechter mass function To state the obvious: It is important to relate our theories to what we can observe. We have looke at linear perturbation theory, an we have consiere a simple moel for

More information

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS Yannick DEVILLE Université Paul Sabatier Laboratoire Acoustique, Métrologie, Instrumentation Bât. 3RB2, 8 Route e Narbonne,

More information

Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis

Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis Chuang Wang, Yonina C. Elar, Fellow, IEEE an Yue M. Lu, Senior Member, IEEE Abstract We present a high-imensional analysis

More information

Number of wireless sensors needed to detect a wildfire

Number of wireless sensors needed to detect a wildfire Number of wireless sensors neee to etect a wilfire Pablo I. Fierens Instituto Tecnológico e Buenos Aires (ITBA) Physics an Mathematics Department Av. Maero 399, Buenos Aires, (C1106ACD) Argentina pfierens@itba.eu.ar

More information