RECURRENT SUPPORT VECTOR MACHINES FOR SPEECH RECOGNITION

Shi-Xiong Zhang, Rui Zhao, Chaojun Liu, Jinyu Li and Yifan Gong


Microsoft Corporation, One Microsoft Way, Redmond, WA 98052
{zhashi, ruzhao, chaojunl, jinyli, ygong}@microsoft.com

ABSTRACT

Recurrent Neural Networks (RNNs) using the Long Short-Term Memory (LSTM) architecture have demonstrated state-of-the-art performance on speech recognition. Most deep RNNs use the softmax activation function in the last layer for classification. This paper illustrates a small but consistent advantage of replacing the softmax layer in the RNN with a Support Vector Machine (SVM). The parameters of the RNN and the SVM are jointly learned using a sequence-level max-margin criterion, instead of cross-entropy. The resulting model is termed the Recurrent SVM. Conventional SVMs need a predefined feature space and have no internal state to deal with arbitrarily long-term dependencies in sequences. The proposed Recurrent SVM uses an LSTM to learn the feature space and to capture temporal dependencies, while using the SVM (in the last layer) for sequence classification. The model is evaluated on the Windows Phone task for large vocabulary continuous speech recognition.

Index Terms — Deep learning, LSTM, SVM, maximum margin, sequence training

1. INTRODUCTION

Recurrent Neural Networks (RNNs) [1] and Structured SVMs [2, 3] are two successful models for sequence classification tasks such as speech recognition [3, 4]. RNNs are universal models in the sense that they can approximate any sequence-to-sequence mapping to arbitrary accuracy [5, 6]. However, RNNs have three major drawbacks. First, training usually requires solving a highly nonlinear optimization problem that has many local minima. Second, they tend to overfit limited data if training goes on too long [1]. Third, the number of neurons in an RNN is fixed empirically. Alternatively, the Support Vector Machine (SVM), which embodies the maximum-margin classifier idea [7], has attracted extensive research interest [8-10]. The SVM was originally proposed for binary classification; it can be extended to handle multiclass classification or sequence recognition. The SVMs modified for multiclass classification and sequence recognition are known as the multiclass SVM [11] and the structured SVM [2, 3, 12], respectively. The SVM has several prominent features. First, it has been proven that maximizing the margin is equivalent to minimizing an upper bound on the generalization error [7]. Second, the SVM optimization problem is convex, so it is guaranteed to have a globally optimal solution. Third, the size of an SVM model is determined by the number of support vectors [7], which is learned from the training data, instead of being fixed by design as in RNNs. However, SVMs are shallow architectures, whereas deep architectures of feedforward and recurrent neural networks have shown state-of-the-art performance in speech recognition [4, 13-15]. Recently, deep learning using SVMs in combination with CNNs/DNNs has shown promising results [16, 17].

In this work, a novel model using an SVM in combination with an RNN is proposed. Conventional RNNs use the softmax activation function at the last layer for classification. This work illustrates the advantages of replacing the softmax layer with an SVM. Two training algorithms, at frame and sequence level, are proposed to jointly learn the parameters of the SVM and the RNN under the max-margin criterion. The resulting model is termed the Recurrent SVM. Conventional SVMs use a predefined feature space and have no internal state to deal with arbitrarily long-term dependencies. The proposed Recurrent SVM uses an RNN to learn the feature space and to capture temporal dependencies, while using the SVM (in the last layer) for classification.
Its decoding process is similar to that of the RNN-HMM hybrid system, but with the frame-level posterior probabilities replaced by scores from the SVM. We verify its effectiveness on the Windows Phone task for speech recognition.

2. RECURRENT NEURAL NETWORKS

Given an input sequence X_{1:T} = x_1, ..., x_T, the simple RNN computes the hidden vectors h_1, ..., h_T by iterating the following equation for t = 1, ..., T,

    h_t = \mathcal{H}(W_{xh} x_t + W_{hh} h_{t-1} + b_h)    (1)

where W denotes a weight matrix (e.g., W_{xh} is the input-hidden weight matrix), b denotes a bias vector (e.g., b_h is the hidden bias vector), and \mathcal{H}(\cdot) is the recurrent hidden layer function. Denoting the corresponding state labels as s_1, ..., s_T, the state-label posterior for frame t is computed by the softmax

    P(s_t | X_{1:t}) = \frac{\exp(w_{s_t}^T h_t + b_{s_t})}{\sum_{s=1}^{N} \exp(w_s^T h_t + b_s)}    (2)
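As a concrete illustration of equations (1) and (2), the following numpy sketch runs the simple-RNN recursion with a sigmoid \mathcal{H}(\cdot) and turns the hidden states into per-frame posteriors. The function name, toy dimensions and random weights are assumptions made for illustration, not details taken from the paper.

```python
# A minimal sketch of equations (1) and (2): the simple RNN recursion and
# the per-frame softmax posterior over N state labels.
import numpy as np

def rnn_softmax_forward(X, W_xh, W_hh, b_h, W_out, b_out):
    """X: (T, d) input sequence; returns hidden states (T, h) and posteriors (T, N)."""
    T = X.shape[0]
    H = np.zeros((T, W_hh.shape[0]))
    h_prev = np.zeros(W_hh.shape[0])
    for t in range(T):
        # Equation (1): h_t = H(W_xh x_t + W_hh h_{t-1} + b_h), sigmoid H(.)
        h_prev = 1.0 / (1.0 + np.exp(-(W_xh @ X[t] + W_hh @ h_prev + b_h)))
        H[t] = h_prev
    # Equation (2): softmax over the N state labels for each frame
    logits = H @ W_out.T + b_out                 # (T, N)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    return H, P

# Toy usage with assumed sizes: 5 frames, 3-dim features, 4 hidden units, 2 states.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
H, P = rnn_softmax_forward(X, rng.normal(size=(4, 3)), rng.normal(size=(4, 4)),
                           np.zeros(4), rng.normal(size=(2, 4)), np.zeros(2))
```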

where N is the total number of labels and w_{s_t} is the weight vector connecting the hidden layer to the output state s_t (for simplicity, the state bias b_{s_t} is ignored for the rest of the paper).

\mathcal{H}(\cdot) in equation (1) is typically an element-wise sigmoid function. However, [18] found that using the Long Short-Term Memory (LSTM) architecture can alleviate the gradient vanishing/exploding issue. A simple LSTM memory block is illustrated in Figure 1. In the LSTM-RNN, the recurrent hidden layer function \mathcal{H}(\cdot) is implemented by the following composite function,

    z_t = \tanh(W_{zx} x_t + W_{zh} h_{t-1} + b_z)                      (block input)
    i_t = \sigma(W_{ix} x_t + W_{ih} h_{t-1} + W_{ic} c_{t-1} + b_i)    (input gate)
    f_t = \sigma(W_{fx} x_t + W_{fh} h_{t-1} + W_{fc} c_{t-1} + b_f)    (forget gate)
    c_t = i_t \odot z_t + f_t \odot c_{t-1}                             (cell state)      (3)
    o_t = \sigma(W_{ox} x_t + W_{oh} h_{t-1} + W_{oc} c_t + b_o)        (output gate)
    h_t = o_t \odot \tanh(c_t)                                          (block output)

where the vectors i_t, f_t and o_t are the activations of the input, forget and output gates respectively, the W_{*x} are the weight matrices for the input, the W_{*h} are the weight matrices for the recurrent input, and the W_{*c} are the diagonal weight matrices for the peephole connections. \tanh(\cdot), \sigma(\cdot) and \odot are the element-wise hyperbolic tangent, logistic sigmoid and multiplication functions, respectively.
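A minimal numpy sketch of the composite function in equation (3), including the diagonal peephole connections, is given below. All weights here are random placeholders with assumed toy sizes; the sketch illustrates the recurrence only, not the paper's trained model.

```python
# One LSTM memory-block update following equation (3); peepholes are diagonal,
# so they act as element-wise multiplications by a vector.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h_prev, c_prev, W):
    """W is a dict of weight matrices/vectors keyed by the subscripts in (3)."""
    z = np.tanh(W['zx'] @ x + W['zh'] @ h_prev + W['bz'])                    # block input
    i = sigmoid(W['ix'] @ x + W['ih'] @ h_prev + W['ic'] * c_prev + W['bi']) # input gate
    f = sigmoid(W['fx'] @ x + W['fh'] @ h_prev + W['fc'] * c_prev + W['bf']) # forget gate
    c = i * z + f * c_prev                                                   # cell state
    o = sigmoid(W['ox'] @ x + W['oh'] @ h_prev + W['oc'] * c + W['bo'])      # output gate
    return o * np.tanh(c), c                                                 # block output

# Toy usage with assumed input size d=3 and hidden size n=4.
rng = np.random.default_rng(1)
d, n = 3, 4
W = {k: rng.normal(scale=0.1, size=(n, d)) for k in ('zx', 'ix', 'fx', 'ox')}
W.update({k: rng.normal(scale=0.1, size=(n, n)) for k in ('zh', 'ih', 'fh', 'oh')})
W.update({k: rng.normal(scale=0.1, size=n) for k in ('ic', 'fc', 'oc', 'bz', 'bi', 'bf', 'bo')})
h, c = lstm_step(rng.normal(size=d), np.zeros(n), np.zeros(n), W)
```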
3. RECURRENT SVM

In the LSTM-RNN, the softmax normalization term in Eq. (2) is a constant across all labels; thus, it can be ignored during decoding. For example, given an input sequence X_{1:t}, the most likely state label at time t can be inferred by

    arg max_s log P(s | X_{1:t}) = arg max_s w_s^T h_t    (4)

For the multiclass SVM [11], the classification function is

    arg max_s w_s^T \phi(x_t)    (5)

where \phi(\cdot) is the predefined feature space of the SVM and w_s is the weight parameter for class/state s. If an RNN is used to derive the feature space, e.g., \phi(x_t) = h_t, the decoding of the multiclass SVM and of the RNN is the same. This inspires us to replace the softmax layer in the RNN with an SVM. The resulting model is named the Recurrent SVM. Its architecture is illustrated in Figure 1.

Fig. 1. The architecture of the Recurrent SVM with LSTM units. The dashed arrows illustrate the connections with a time lag. The SVM in the last layer can be a multiclass/structured SVM.

Note that RNNs can be trained using frame-level cross-entropy, connectionist temporal classification (CTC) [19] or sequence-level MMI/MBR criteria [15, 20]. In this work, two training algorithms using the max-margin criterion, at frame level (Section 3.1) and sequence level (Section 3.2), are proposed. Each algorithm consists of two iterative steps. The first step estimates the parameters of the SVM in the last layer using quadratic programming. The second step updates the parameters of the RNN in all previous layers using subgradient approaches.

3.1. Frame-level training (Recurrent multiclass SVM)

In the frame-level max-margin training, given training inputs and state labels {(x_t, s_t)}_{t=1}^T, let \phi(x_t) = h_t be the feature space derived from the RNN recurrent states. The parameters of the last layer are first estimated using the multiclass SVM training algorithm [11],

    min_{w, \xi}  \frac{1}{2} \sum_{s=1}^{N} ||w_s||_2^2 + C \sum_{t=1}^{T} \xi_t^2    (6)
    s.t. for every training frame t = 1, ..., T and every competing state s \in {1, ..., N}, s \neq s_t:
         w_{s_t}^T h_t - w_s^T h_t \geq 1 - \xi_t

where \xi_t \geq 0 is the slack variable which penalizes the data points that violate the margin requirement. Equation (6) basically says that the score of the correct state label, w_{s_t}^T h_t, has to be greater than the score of any other state, w_s^T h_t, by a margin. Substituting \xi_t from the constraints into the objective function, equation (6) can be reformulated as minimizing

    F_{frm}(w, W) = \frac{1}{2} \sum_{s=1}^{N} ||w_s||_2^2 + C \sum_{t=1}^{T} [1 - w_{s_t}^T h_t + \max_{s \neq s_t} w_s^T h_t]_+^2    (7)

where h_t depends on all the weights W in the previous layers shown in equation (3), and [x]_+ = max(0, x) is the hinge function.
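The following sketch evaluates the frame-level objective (7) in numpy, assuming the hidden states H have already been computed by the LSTM. The function name and inputs are illustrative assumptions; minimizing this objective in w (step one) is the convex quadratic program of [11], while the RNN weights enter only through H.

```python
# Toy evaluation of the squared-hinge frame-level objective in equation (7).
import numpy as np

def frame_objective(W_svm, H, labels, C=1.0):
    """W_svm: (N, h) last-layer weights; H: (T, h) RNN states; labels: (T,) ints."""
    T = len(labels)
    scores = H @ W_svm.T                            # (T, N): w_s^T h_t for all states s
    correct = scores[np.arange(T), labels]          # w_{s_t}^T h_t for the reference state
    scores[np.arange(T), labels] = -np.inf          # exclude the correct state from the max
    hinge = np.maximum(0.0, 1.0 - correct + scores.max(axis=1))
    return 0.5 * np.sum(W_svm ** 2) + C * np.sum(hinge ** 2)
```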

Note that the maximum of a set of linear functions is convex; thus equation (7) is convex w.r.t. w. The first step of Recurrent SVM training is to optimize w by minimizing (7). The second step is to optimize the parameters in the previous layers, e.g., W_{ih} in equation (3). These parameters can be updated by back-propagating the gradients from the top layer,

    \frac{\partial F_{frm}}{\partial W} = \sum_t \frac{\partial F_{frm}}{\partial h_t} \frac{\partial h_t}{\partial W}    (8)

Note that \partial h_t / \partial W is the same as in the standard RNN and can be computed using the BPTT algorithm. The key here is to compute the derivative of F_{frm} w.r.t. h_t. However, equation (7) is not differentiable because of the max(\cdot). To handle this, the subgradient method [21] is applied. Given the current SVM parameters w in the last layer, the subgradient of the objective function (7) w.r.t. h_t can be derived as

    \frac{\partial F_{frm}}{\partial h_t} = -2C [1 - w_{s_t}^T h_t + w_{\bar{s}_t}^T h_t]_+ (w_{s_t} - w_{\bar{s}_t})    (9)

where \bar{s}_t = arg max_{s \neq s_t} w_s^T h_t is the most competing state label. After this point, the BPTT algorithm works exactly the same as for the standard RNN. Note that, after the SVM training of the current iteration, most of the data may be classified correctly and lie beyond the margin, i.e., [1 - w_{s_t}^T h_t + w_{\bar{s}_t}^T h_t]_+ = 0 for these frames. Interestingly, this means that only the data located within the margin (known as the support vectors) have non-zero gradients.

3.2. Sequence-level training (Recurrent structured SVM)

In the max-margin sequence training, for simplicity, let us consider one training utterance (X, S), where X = x_1, ..., x_T is the input and S = s_1, ..., s_T is the reference state sequence. The parameters of the model can be estimated by maximizing the margin

    \min_{S' \neq S} \log \frac{P(S | X)}{P(S' | X)}

Here the margin is defined as the minimum distance between the reference state sequence and the competing state sequences in the log-posterior domain, as illustrated in Fig. 2. Note that, unlike in MMI/MBR sequence training, the normalization term of the posterior probability cancels out, as it appears in both numerator and denominator. For clarity, the language model probability is not shown here.

Fig. 2. The sequence-level max-margin criterion. The margin is defined in the log-posterior domain between the reference state sequence S and the most competing state sequence.

To generalize the above objective function, a loss function L(S, S') is introduced to control the size of the margin, a hinge function [\cdot]_+ is applied to ignore the data beyond the margin, and a prior P(w) is incorporated to further reduce the generalization error. Thus the criterion becomes minimizing

    -\log P(w) + \left[\max_{S' \neq S}\left(L(S, S') - \log \frac{P(S | X)}{P(S' | X)}\right)\right]_+^2    (10)

For the Recurrent SVM, the unnormalized log posterior of a state sequence can be computed via

    \sum_t \left(w_{s_t}^T h_t - \log P(s_t) + \log P(s_t | s_{t-1})\right) = w^T \phi(X, S)    (11)

where \phi(X, S) is the joint feature [22], which characterizes the dependencies between the two sequences X and S,

    \phi(X, S) = \left[\sum_t \delta(s_t = 1) h_t ; \ldots ; \sum_t \delta(s_t = N) h_t ; \sum_t \left(\log P(s_t | s_{t-1}) - \log P(s_t)\right)\right],
    w = [w_1 ; \ldots ; w_N ; 1]    (12)

where \delta(\cdot) is the Kronecker delta (indicator) function. Here the prior P(w) is assumed to be a Gaussian with zero mean and a scaled identity covariance matrix CI; thus -\log P(w) = -\log N(0, CI) \propto \frac{1}{2C} w^T w. Substituting the prior and equation (11) into criterion (10), the parameters of the Recurrent SVM can be estimated by minimizing (for simplicity, only one training utterance is shown)

    F_{seq}(w, W) = \frac{1}{2} ||w||_2^2 + C \left[\max_{S' \neq S}\left(L(S, S') + w^T \phi(X, S')\right) - w^T \phi(X, S)\right]_+^2    (13)

where \phi(\cdot, \cdot) depends on h_t, and h_t depends on all the weights W in the previous layers shown in equation (3). The first step of sequence training is to optimize w = [w_1, ..., w_s, ..., w_N] in the last layer by minimizing (13). Similar to F_{frm}, F_{seq} is also convex w.r.t. w. Interestingly, given the feature space \phi, equation (13) is the same as the training criterion of the structured SVM [12]. This optimization can be solved using the cutting plane algorithm [2]. The second step of sequence training is to optimize the parameters W in the previous layers, e.g., W_{ih} in equation (3). These parameters can be updated by back-propagating the gradients from the top layer,

    \frac{\partial F_{seq}}{\partial W} = \sum_t \frac{\partial F_{seq}}{\partial h_t} \frac{\partial h_t}{\partial W}    (14)

Again, \partial h_t / \partial W is the same as in the standard RNN. The key is to calculate the subgradient of F_{seq} w.r.t. h_t for each frame t,

    \frac{\partial F_{seq}}{\partial h_t} = -2C \left[L(S, \bar{S}) + w^T \bar{\phi} - w^T \phi\right]_+ (w_{s_t} - w_{\bar{s}_t})    (15)

where L is the loss between the reference S and its most competing state sequence \bar{S} = arg max_{S' \neq S} (L(S, S') + w^T \phi(X, S')), and \bar{\phi} is short for the feature \phi(X, \bar{S}).
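A sketch of the per-frame subgradient in equation (15) is given below, assuming the most competing sequence \bar{S} has already been found (e.g., by the lattice search of Section 3.4) and that its loss and the two sequence scores are passed in as scalars; all names are illustrative.

```python
# Per-frame subgradient of F_seq w.r.t. h_t, following equation (15).
import numpy as np

def seq_subgradient_wrt_h(w_states, H, S, S_bar, loss, score_ref, score_comp, C):
    """w_states: (N, h); H: (T, h); S, S_bar: (T,) int state sequences.
    score_ref = w^T phi(X, S); score_comp = w^T phi(X, S_bar).
    Returns dF_seq/dh_t for every frame t, stacked as a (T, h) array."""
    violation = max(0.0, loss + score_comp - score_ref)  # structured hinge term
    if violation == 0.0:
        return np.zeros_like(H)                          # utterance outside the margin
    # d(w^T phi)/dh_t = w_{s_t}, so the reference/competing difference appears per frame
    return -2.0 * C * violation * (w_states[S] - w_states[S_bar])
```

The early return makes the support-vector property explicit: an utterance whose reference score beats the loss-augmented competitor contributes nothing to the gradient.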
After this point, the BPTT algorithm works exactly the same as for the standard RNN. The hinge term [L(S, \bar{S}) + w^T \bar{\phi} - w^T \phi]_+ in equation (15) indicates that only the utterances located within the margin (the support vectors) have non-zero gradients.
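For completeness, the joint feature of equation (12) and the linear sequence score of equation (11) can be sketched as follows. The state priors and transition probabilities here are made-up placeholders, and the layout of \phi (per-state sums of h_t plus one prior/transition component with its weight fixed to 1) follows the reconstruction of (12) above.

```python
# Joint feature phi(X, S) of equation (12) and the score w^T phi(X, S) of (11).
import numpy as np

def joint_feature(H, S, log_prior, log_trans):
    """H: (T, h) hidden states; S: (T,) states; returns phi of size N*h + 1."""
    T, h = H.shape
    N = log_prior.shape[0]
    phi = np.zeros(N * h + 1)
    for t in range(T):
        phi[S[t] * h:(S[t] + 1) * h] += H[t]          # h_t routed to the slot of state s_t
        lm = log_trans[S[t - 1], S[t]] if t > 0 else 0.0
        phi[-1] += lm - log_prior[S[t]]               # transition minus prior term
    return phi

def sequence_score(w_states, H, S, log_prior, log_trans):
    """Equation (11): sum_t (w_{s_t}^T h_t - log P(s_t) + log P(s_t | s_{t-1}))."""
    w = np.concatenate([w_states.ravel(), [1.0]])     # last weight fixed to 1, as in (12)
    return w @ joint_feature(H, S, log_prior, log_trans)
```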

3.3. Inference

The decoding process is similar to that of the RNN-HMM hybrid system, but with the log posterior probabilities, log P(s_t | X_{1:t}), replaced by the scores from the Recurrent SVM, w_{s_t}^T h_t. Note that searching for the most likely state sequence during decoding is essentially the same as searching for the most competing state sequence in equation (13), except for the loss L(S, S'). Both can be solved using the same Viterbi algorithm.

3.4. Practical Issues

An efficient implementation of the algorithm is important for speech recognition. In this section, several design options for Recurrent SVM training are described that have a substantial influence on computational efficiency.

Form of prior. Previously, a zero-mean Gaussian prior was introduced. However, a more proper mean of the prior is one that can yield the LSTM performance, log P(w) = log N(w_RNN, CI). Thus the term \frac{1}{2} ||w||_2^2 in (7) and (13) becomes \frac{1}{2} ||w - w_RNN||_2^2. Since a better mean is applied, a smaller variance C can be used, which reduces the number of training iterations [23].

Lattices. Solving the optimization (13) requires searching for the most competing state sequence efficiently. The computational load during training is dominated by this search process. To speed up the training, denominator lattices with state alignments are used to constrain the search space. A lattice-based forward-backward search [3, 23] is then applied to find the most competing state sequence.

Caching. To further reduce the cost of searching for \bar{S} in (13), the five most recently used state sequences for each utterance are cached during the training iterations. This reduces the number of calls to the lattice search.
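To make the inference of Section 3.3 concrete, here is a toy Viterbi sketch in which the frame scores are the Recurrent SVM scores w_s^T h_t rather than log posteriors. The transition matrix stands in for the HMM/LM terms that the paper folds into the same search; the lattice constraints of Section 3.4 are omitted for brevity.

```python
# Viterbi search over SVM frame scores: frame_scores = H @ W_svm.T.
import numpy as np

def viterbi(frame_scores, log_trans):
    """frame_scores: (T, N); log_trans: (N, N). Returns the best state path."""
    T, N = frame_scores.shape
    delta = frame_scores[0].copy()                  # best score ending in each state
    back = np.zeros((T, N), dtype=int)              # backpointers
    for t in range(1, T):
        cand = delta[:, None] + log_trans           # (prev state, current state)
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + frame_scores[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):                   # trace back the best path
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Adding the loss L(S, S') to the frame scores of non-reference states turns this same routine into the loss-augmented search needed for equation (13).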
4. EXPERIMENTS

The Windows Phone short message dictation task is used to evaluate the effectiveness of the Recurrent SVM proposed in Section 3. The training data consist of 60 hours of transcribed US-English speech. The test set consists of 3 hours of data and 16k words. All the experiments were implemented based on the Computational Network Toolkit (CNTK) [24]. 87-dimensional log-filter-bank features are used for the LSTM and the Recurrent SVM. The feature context window is 11 frames. The baseline DNN has 5 hidden layers; each layer includes 2048 hidden nodes. All the systems use 1812 tied triphone states. The baseline LSTM has four layers; each has 1024 hidden nodes, and the output size of each layer is reduced to 512 using a linear projection layer [25]. The state label is delayed by 5 frames for the LSTM, as described in [25]. No frame stacking was applied. During LSTM training, the back-propagation-through-time (BPTT) step is set to 20. The forget gate bias b_f in (3) is initialized to 1, following [26].

The Recurrent SVM is constructed based on the baseline LSTM (four LSTM layers), with an SVM in place of the last softmax layer. In the frame-level training, a scaled 0/1 loss is used in equation (6) and the CE-trained LSTM is used as the prior; the scalar is tuned on the cross-validation set. In the sequence training, the state loss [15] is applied in equation (13) and the MMI-trained LSTM is used as the prior.

Model          | Frame-level training | Sequence training
DNN            | 23.06% (CE)          | - (MMI)
LSTM-RNN       | 21.14% (CE)          | 20.40% (MMI)
Recurrent SVM  | 20.69% (MM)          | 19.83% (MM)

Table 1. The results (in word error rate) for the DNN, LSTM-RNN and Recurrent SVM systems, trained using frame-level and sequence-level criteria. CE, MMI and MM are short for the cross-entropy, maximum mutual information and maximum margin criteria.

Training       | Recurrent SVM: softmax layer | Recurrent SVM: previous layers | LSTM baseline
frame-level    | -                            | 20.69%                         | 21.14%
sequence-level | 20.19%                       | 19.83%                         | 20.40%

Table 2. The impact of each layer in the Recurrent SVM under the frame-level and sequence-level max-margin training.

The results of all systems are shown in Table 1. In frame-level training, the proposed Recurrent SVM improved over the LSTM (CE) by 2.1% and over the DNN (CE) by 10.3% relative. In sequence-level training, the Recurrent SVM outperformed the LSTM (MMI) by 2.8% and the DNN (MMI) by 6% relative. Table 2 suggests that, under both training criteria, more than 65% of the gains of the Recurrent SVM come from updating the recurrent layers. Note that, although the improvement of the Recurrent SVM over the LSTM is not large, it does not add any latency or computation cost to runtime decoding.

5. CONCLUSION AND FUTURE WORK

A new type of recurrent model is described in this work. Traditional RNNs use the softmax activation at the last layer for classification. The proposed model instead uses an SVM at the last layer. Two training algorithms, at frame and sequence level, are proposed to learn the parameters of the SVM and the RNN jointly under the maximum-margin criterion. In frame-level training, the new model is shown to be related to the multiclass SVM with RNN features; in sequence-level training, it is related to the structured SVM with RNN features and state-transition features. The proposed model, named the Recurrent SVM, yields a 2.8% improvement over the MMI-trained LSTM on the Windows Phone task without increasing runtime latency. Unlike conventional SVMs, which need a predefined feature space, the proposed Recurrent SVM uses an LSTM to learn the feature space and to capture temporal dependencies, while using a linear SVM in the last layer for classification. Future work will investigate the Recurrent SVM with non-linear kernels.

6. REFERENCES

[1] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[2] T. Joachims, T. Finley, and C.-N. J. Yu, "Cutting-plane training of structural SVMs," Machine Learning, vol. 77, no. 1, 2009.
[3] S.-X. Zhang and M. J. F. Gales, "Structured SVMs for automatic speech recognition," IEEE Transactions on Audio, Speech and Language Processing, vol. 21, no. 3, 2013.
[4] A. Graves, A. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in ICASSP. IEEE, 2013.
[5] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, 1986.
[6] G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, 1989.
[7] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[8] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, 1998.
[9] T. Joachims, "Text categorization with support vector machines: Learning with many relevant features," in Proceedings of the 10th European Conference on Machine Learning, 1998.
[10] J. Salomon, S. King, and M. Osborne, "Framewise phone classification using support vector machines," in Proceedings of Interspeech, 2002.
[11] K. Crammer and Y. Singer, "On the algorithmic implementation of multiclass kernel-based vector machines," Journal of Machine Learning Research, vol. 2, 2001.
[12] S.-X. Zhang and M. J. F. Gales, "Structured support vector machines for noise robust continuous speech recognition," in Proceedings of Interspeech, Florence, Italy, 2011.
[13] G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, 2012.
[14] G. Hinton, L. Deng, D. Yu, G. E. Dahl, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, 2012.
[15] K. Veselý, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in INTERSPEECH, 2013.
[16] Y. Tang, "Deep learning using linear support vector machines," in International Conference on Machine Learning, 2013.
[17] S.-X. Zhang, C. Liu, K. Yao, and Y. Gong, "Deep neural support vector machines for speech recognition," in ICASSP. IEEE, April 2015.
[18] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, Nov. 1997.
[19] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks," in ICML, 2006.
[20] H. Sak, A. Senior, K. Rao, O. Irsoy, A. Graves, F. Beaufays, and J. Schalkwyk, "Learning acoustic frame labeling for speech recognition with recurrent neural networks," in ICASSP. IEEE, 2015.
[21] S. Shalev-Shwartz, Y. Singer, and N. Srebro, "Pegasos: Primal estimated sub-gradient solver for SVM," in Proceedings of ICML, 2007.
[22] S.-X. Zhang, A. Ragni, and M. J. F. Gales, "Structured log linear models for noise robust speech recognition," IEEE Signal Processing Letters, vol. 17, 2010.
[23] S.-X. Zhang, Structured Support Vector Machines for Speech Recognition, Ph.D. thesis, Cambridge University, March 2014.
[24] D. Yu, A. Eversole, M. Seltzer, K. Yao, et al., "An introduction to computational networks and the computational network toolkit," Tech. Rep., Microsoft Research, 2014.
[25] H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in INTERSPEECH, 2014.
[26] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in ICML, 2015.
