Question Classification Using Language Modeling

Wei Li
Center for Intelligent Information Retrieval
Department of Computer Science
University of Massachusetts, Amherst, MA

ABSTRACT
Question classification assigns a particular class to a question based on the type of answer entity the question expects. In this report, I present two approaches: the traditional regular expression model, which is both efficient and effective for some questions but insufficient when dealing with others, and the language model, a probabilistic approach to the problem. Two types of language models have been constructed: unigram models and bigram models. Several issues are explored, such as how to smooth the probabilities and how to combine the two types of models. As expected, the language model outperforms the regular expression model, and an even better result can be achieved by combining the two approaches.

1. INTRODUCTION
Question answering is a variant of information retrieval that retrieves specific information rather than documents. A QA system takes a natural language question as input, transforms the question into a query and forwards it to an IR module. When a set of relevant documents is retrieved, the QA system extracts an answer to the question. There are different ways of identifying answers. One of them makes use of a predefined set of entity classes. Given a particular question, the QA system classifies it into one of those classes based on the type of entity it is looking for, identifies entity instances in the documents, and selects the most likely one from all the entities with the same class as the question. This approach involves two tasks. First, we must be able to identify named entities. This is a problem in the Information Extraction area [1], and we can make use of an existing entity tagger. Second, we need to classify questions into different classes, and this is the problem addressed here.

One approach to question classification is to determine the question type based on the sentence structure and key words, which represent syntactic and semantic information respectively. A set of patterns is defined and hard-coded, often with regular expressions. When a new question comes in, it is matched against those patterns to find the class it belongs to. As the pattern set becomes more complete and accurate, the performance of this approach improves, so improving this model always means defining more and more question patterns. To make question classification more dynamic and automatic, we make use of language modeling, a statistical approach that has recently gained much attention in the IR area [2]. In this approach, the models can be constructed automatically from a training set, and its performance is competitive with other approaches. For the QA task, we build one language model for every class of questions based on the training data. To classify a question, the probability of generating it is calculated for each class under that class's language model, and the highest probability determines the classification.

For the rest of this report, I present the implementation of these two approaches and discuss their performance, with the focus on language modeling. Section 2 covers two preparation steps, defining question classes and preprocessing questions before classification; Section 3 describes the regular expression model and its pros and cons; Section 4 discusses the language models, the two experiments and the combination with the regular expression model; Section 5 examines performance in different cases; Section 6 introduces related work; and Section 7 concludes.

2. PREPARATION
Defining question classes is the first step in classification. One important principle when defining these classes is that every class we use to mark questions should be recognizable as an entity type in the documents. This is because question classification is not an independent job but a component of the QA task. Two kinds of classes are used. Some entity classes are naturally related to question classes, such as person, location, number and so on. Other classes are created for particular types of questions. For example, a frequently asked type of question is "Who is sb.?". Typically people want quite detailed information about the person with this question. We do not have a good object class corresponding to this type of question, so we use the term "biography" to denote its answer type and add it to the question class set.

Another preprocessing step is to re-form the question to make its underlying pattern clearer. For example, the "Who is sb.?" questions always ask for a biography entity no matter what person's name appears in the question. In other words, the important thing is to know that the question contains a person entity; we do not care about the specific entity. So we can safely change questions of this pattern into "Who is <PERSON>?" without losing any information useful for determining the question type. What we actually do is run an entity recognizer, the major part of which is IdentiFinder [3], over the questions and replace all entities with their entity class names.

3. REGULAR EXPRESSION MODEL
The basic idea of this model is to determine the question type based on the sentence pattern, which includes the interrogative word, certain sequences of words and some representative terms of particular question classes. Those patterns are defined with regular expressions. For example, a question starting with "how many" is very likely to be looking for a number, and a question starting with "where" is probably a location question. For a "what" question, we can look for key words to make the decision; for example, "agency", "company" and "university" are related to the organization class. Here are some regular expressions used for certain classes of questions:

Questions that start with "what" and ask for a person entity:
    (actor|actress|attorney|teacher|...|senator)s?

Questions that start with "how" and ask for a length entity:
    (long|short|wide|far|close|big|.*(diameter|radius))

This approach is very efficient and effective on some question patterns, such as "how many" questions, where it seldom makes mistakes. But there are difficult cases that it can hardly handle. For instance, the answer to a "who" question might be a person, an organization, or even a location. Take the question "Who is the largest producer of laptop computers in the world?" as an example. People can easily tell that it asks for an organization, but our program cannot decide its type just from the question pattern; we need additional semantic information, which is not available in the regular expression model.
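The report's full rule set is hand-written and considerably richer, but a minimal sketch of how such a pattern table can be applied might look like the following; the class names and patterns here are illustrative rather than the report's actual rules.

```python
import re

# Illustrative patterns only; each class maps to a regular expression that is
# matched against the (entity-replaced) question string.
PATTERNS = {
    "NUMBER":   re.compile(r"^how (many|much)\b", re.IGNORECASE),
    "LOCATION": re.compile(r"^where\b", re.IGNORECASE),
    "PERSON":   re.compile(r"^what\b.*\b(actor|actress|attorney|teacher|senator)s?\b", re.IGNORECASE),
    "LENGTH":   re.compile(r"^how (long|short|wide|far|big)\b", re.IGNORECASE),
}

def regex_classes(question):
    """Return every class whose pattern is compatible with the question."""
    return [cls for cls, pat in PATTERNS.items() if pat.search(question)]

print(regex_classes("How many islands does Fiji have?"))  # ['NUMBER']
print(regex_classes("Where is the Taj Mahal?"))           # ['LOCATION']
```

Returning every compatible class, rather than forcing a single decision, is what later lets the regular expressions act as a filter on the language model's ranked output (Section 4.3).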

The same problem occurs with the "where" questions: many of them are classified as location while they are actually organization questions. The only way to solve this kind of problem within the model is to build an ever more complete and accurate pattern set, which involves a great deal of human work. Instead of building a larger and larger question pattern model, we turned to a more automatic and flexible approach: language modeling.

4. LANGUAGE MODEL
The basic idea of language modeling is that every piece of text can be viewed as being generated by a language model. If we have two pieces of text, we can define the degree of relevance between them as the probability that they were generated by the same language model. In the information retrieval setting, we build one language model for each document. Given a query, we can decide whether a document is relevant based on the probability that its language model generates that query. Suppose the query Q is composed of n tokens w_1, w_2, ..., w_n. We can calculate the probability as

P(Q|D) = P(w_1|D) * P(w_2|D, w_1) * ... * P(w_n|D, w_1, w_2, ..., w_{n-1})

So to build the language model of a document, we need to estimate those term probabilities. Usually, a k-gram assumption is made to simplify the estimation:

P(w_i|D, w_1, w_2, ..., w_{i-1}) = P(w_i|D, w_{i-k+1}, ..., w_{i-2}, w_{i-1})

That is, the probability that w_i occurs in document D depends only on the preceding (k-1) tokens [4].

Similar ideas carry over to the question classification task. We build one language model for each category C of sample questions. When a new question Q comes in, we calculate the probability P(Q|C) for each C and pick the category with the highest probability. The major advantage of the language model over the regular expression model is its flexibility. The regular expression model is composed of hard-coded rules, which must be modified by hand to handle new cases, whereas the language model can be maintained automatically, and we believe that with larger training sets its performance can be improved further. Two experiments have been conducted, and both include two language models, a unigram and a bigram model. The experiments differ in the smoothing technique and the combination method, but they provide similar performance. The details are discussed below.
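A minimal sketch of this decision rule, assuming hypothetical estimators p_unigram(w, c) and p_bigram(w, prev, c) (they are defined concretely by the two experiments below); log probabilities are used only to avoid numerical underflow, which does not change the argmax.

```python
import math

def log_prob_bigram(tokens, category, p_unigram, p_bigram):
    """log P(Q|C) under the bigram factorization
    P(Q|C) = P(w1|C) * P(w2|C, w1) * ... * P(wn|C, w_{n-1})."""
    logp = math.log(p_unigram(tokens[0], category))
    for prev, cur in zip(tokens, tokens[1:]):
        logp += math.log(p_bigram(cur, prev, category))
    return logp

def classify(tokens, categories, p_unigram, p_bigram):
    """Pick the category whose language model is most likely to have generated the question."""
    return max(categories, key=lambda c: log_prob_bigram(tokens, c, p_unigram, p_bigram))
```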

4.1 EXPERIMENT 1
The unigram and bigram models are the two simplest to construct, where

P(Q|C) = P(w_1|C) * P(w_2|C) * ... * P(w_n|C)

and

P(Q|C) = P(w_1|C) * P(w_2|C, w_1) * ... * P(w_n|C, w_{n-1})

respectively. For the unigram model, we need to estimate the probability P(w|C) of a token w occurring in category C. Intuitively, it should be proportional to the term frequency F(w|C). The tricky part is how to deal with tokens that never occur in this category. We do not want them to have a probability of 0, so some probability mass must be assigned to them and the probabilities of the other words adjusted accordingly. This kind of smoothing can be done in several ways; for this experiment, we used an absolute discount method. A small constant amount of probability is assigned to all zero-occurrence tokens, and the probabilities of the other tokens are discounted accordingly [4]. Here is the formula. Let Total0 be the number of zero-occurrence tokens in category C and S be the smoothing discount. Then

P(w|C) = F(w|C) * (1 - S)   if F(w|C) ≠ 0
P(w|C) = S / Total0         if F(w|C) = 0

The bigram model is built similarly, where we need to estimate the conditional probability P(w_2|C, w_1). Let Total0(C, w_1) be the number of tokens that never occur after w_1 in category C, and S be the smoothing discount. There are two cases to consider.

Case 1: F(w_1|C) ≠ 0, where the total probability reserved for all unseen w_2 is S. So we have

P(w_2|C, w_1) = F(w_2|C, w_1) * (1 - S)   if F(w_2|C, w_1) ≠ 0
P(w_2|C, w_1) = S / Total0(C, w_1)        if F(w_2|C, w_1) = 0

Case 2: F(w_1|C) = 0, where all w_2 are unseen. In this case P(w_2|C, w_1) should be the same for every w_2, which is calculated as

P(w_2|C, w_1) = 1 / Total0(C, w_1)

To make the estimation more accurate, we try to combine the two models. Linear combination is a straightforward way, where

P(Q|C) = λ * P_u(Q|C) + (1 - λ) * P_b(Q|C)

Different values of λ have been tested, and the best one is chosen.
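A minimal sketch of the unigram estimate with absolute discounting, treating F(w|C) as the relative frequency of w in the category's training questions and assuming a fixed vocabulary; the variable names and example data are illustrative, not taken from the report.

```python
from collections import Counter

def unigram_absolute_discount(train_tokens, vocab, S=0.1):
    """Estimate P(w|C) for one category from its training tokens.

    Seen tokens keep their relative frequency scaled by (1 - S); the reserved
    mass S is shared equally among the vocabulary tokens never seen in C.
    """
    counts = Counter(train_tokens)
    total = sum(counts.values())
    unseen = [w for w in vocab if w not in counts]
    probs = {}
    for w in vocab:
        if w in counts:
            probs[w] = (counts[w] / total) * (1.0 - S)
        else:
            probs[w] = S / len(unseen) if unseen else 0.0
    return probs

# Tiny example for one hypothetical category.
probs = unigram_absolute_discount(
    ["where", "is", "the", "capital", "of", "france"],
    vocab={"where", "is", "the", "capital", "of", "france", "who", "when"},
)
assert abs(sum(probs.values()) - 1.0) < 1e-9
```

The bigram estimate applies the same discount per preceding token (the two cases above), and the unigram and bigram scores are then interpolated with the weight λ.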

4.2 EXPERIMENT 2
In this experiment, we still build the unigram and bigram models, but a different smoothing technique and combination method are used. For the unigram model, we make use of the Good-Turing estimate [5] for tokens that occur a small number of times or never occur. According to the Good-Turing estimate, P(w|C) should have the following structure:

P(w|C) = α * F(w|C)   if Count(w|C) > M
P(w|C) = q_i          if Count(w|C) = i and 0 ≤ i ≤ M

The choices of α, the q_i and M must satisfy Σ_w P(w|C) = 1 and q_{i-1} < q_i. There are several ways to derive the estimates, and the result is as follows. Let N be the size of the corpus and n_i be the number of tokens that occur i times in C. (Strictly speaking we should use E(n_i), the expected value of n_i, but this value is not available, so we use the directly observed count instead.) Then

q_i = (i + 1) * n_{i+1} / (N * n_i)   for Count(w|C) = i, 0 ≤ i ≤ M,

with α chosen so that the probabilities sum to 1, and M is the largest number that satisfies (i + 1) * n_{i+1} < i * n_i for every i = 1, ..., M.

While the unigram model is built with the Good-Turing estimate, a Back-Off model [4] is developed for the bigrams. The basic idea of the Back-Off model is that P(w_2|C, w_1) should be proportional to F(w_2|C, w_1) only when the count of (w_1, w_2) in C is larger than a certain number; otherwise, we just use P(w_2|C) to estimate P(w_2|C, w_1). Here is the formula:

P(w_2|C, w_1) = α * F(w_2|C, w_1)   if Count(w_2|C, w_1) > K
P(w_2|C, w_1) = β * P(w_2|C)        if Count(w_2|C, w_1) ≤ K

α is a discount that removes some probability mass from the high-count bigrams, and we used the same discount as in the Good-Turing estimate. β is chosen for normalization, so that Σ_{w_2} P(w_2|C, w_1) = 1; it is a function of w_1. K should be a small number, and we found that K = 0 provides the best performance on our data. The Back-Off model naturally combines the unigram and bigram models, so to calculate the probability P(Q|C) we can simply use the bigram result, i.e., P(Q|C) = P_b(Q|C).

4.3 COMBINED WITH RE MODEL
Although the language model seems more attractive, it still has drawbacks. One of them is unpredictability. For example, since we place no restriction on the classification result of the language model, it is possible to classify a question that starts with "how many" as a person question. This kind of pattern, on the other hand, is easy to capture with the regular expression model, so we tried to combine the two to improve performance. The language model is modified to generate a ranked list of categories based on the belief score, and the regular expression model returns all categories compatible with the question pattern. The combination policy is that the highest-ranked category accepted by the regular expression model is the final answer. In this way, the mistake mentioned above is avoided.
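A minimal sketch of this combination policy, assuming a function lm_ranked(question) that returns the categories sorted by language-model belief and reusing a regex_classes(question) helper like the one sketched in Section 3; the fall-back to the language model's top choice when no ranked category is compatible is an assumption, since the report does not specify that case.

```python
def combine(question, lm_ranked, regex_classes):
    """Return the highest-ranked language-model category that the RE model also accepts."""
    ranked = lm_ranked(question)               # e.g. ["PERSON", "ORGANIZATION", ...]
    compatible = set(regex_classes(question))  # classes whose patterns match the question
    for category in ranked:
        if category in compatible:
            return category
    # Assumed fall-back: the report does not state what happens when nothing is compatible.
    return ranked[0] if ranked else None
```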

5. EVALUATION
A set of 693 TREC questions has been used for evaluation. They belong to the following classes:

Class Name      # of questions
PERSON          116
LOCATION        126
DATE             73
ORGANIZATION     64
NUMBER           74
OBJECT          121
REFERENCE       119

When testing the language models, we need a training set to build them, so the experiments were run as follows. The whole question set was randomly divided into five equally large, disjoint parts. One part is chosen as the test data, while the other four serve as the training data. Accuracy is calculated by comparing the classification result with the manually assigned class. The same process is repeated five times, each time with a different part as the test set, and the average accuracy is used to measure performance.

Here are the test results for all the models discussed above:

Model                                       Accuracy
Regular Expression Model only                57.57%
Experiment 1: LM only                        81.54%
Experiment 1: LM combined with RE Model      85.43%
Experiment 2: LM only                        80.96%
Experiment 2: LM combined with RE Model      83.56%

The results show that the language model performs better than the regular expression model, and that performance can be improved further by combining the two. Somewhat surprisingly, the language model in the first experiment outperforms the second one. We were expecting the reverse, since both the Good-Turing estimate and the Back-Off model have been shown to perform well in practice. One possible explanation is that our data set is too small for the Good-Turing estimate. As discussed above, we used n_i in place of E(n_i); these two values should be close when the data set is large enough, but with only around 700 questions this estimate might be quite poor.

6. RELATED WORK
Question classification is a common component of QA systems. The basic idea is the same, to classify questions and identify corresponding entities in documents, but it can be achieved in different ways. Many systems use techniques similar to the regular expression model described above. MURAX is an earlier QA system that makes use of an online encyclopedia [6]. Its heuristic is simple: classify questions based on the interrogative words. For "what" questions, which may ask for several types of entities, the encyclopedia is searched for the noun phrase after "what", and the question type is determined accordingly.

Another QA system using named entities and question classification is the GuruQA system described by Prager [7]. It maintains a set of patterns and compares questions with them to determine their types. The question type is used as a query term, and the documents have been processed so that types are attached to the named entities. In this way, a document containing a named entity with the same type as the question is more likely to be retrieved.

7. CONCLUSION
Question answering differs from information retrieval in that it needs to retrieve specific facts rather than whole documents. This might involve excessive computation if there is no guidance about possible answers. By classifying questions and named entities into the same set of classes, we can eliminate a large amount of irrelevant information. This report has investigated two approaches to question classification: the regular expression model and language modeling. The regular expression model is a simple approach that has been put into practice in many systems. Language modeling is a probabilistic approach imported from IR, in which the models are constructed in a more flexible and automatic way. We have built two types of models: a linear combination of unigram and bigram models with an absolute-discount smoothing technique, and a Back-Off bigram model with a Good-Turing estimate. The test results show that the language model outperforms the regular expression model, and an even better result can be achieved when the two models are combined. Although the Good-Turing and Back-Off models have proved effective in practice, the second language model does not improve performance over the first one.

ACKNOWLEDGMENTS
This material is based on work supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant #EIA. The author would like to thank David Pinto, Bruce Croft, Andres Corrada-Emmanuel and David Fisher for their help and support. Any opinions, findings and conclusions or recommendations expressed in this material are the author's and do not necessarily reflect those of the sponsors.

REFERENCES
[1] R. Srihari and W. Li, Information Extraction Supported Question Answering.
[2] J. M. Ponte and W. B. Croft, A Language Modeling Approach to Information Retrieval.
[3] BBN official site about the IdentiFinder.
[4] C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing.
[5] F. Jelinek, Statistical Methods for Speech Recognition.
[6] J. Kupiec, MURAX: A Robust Linguistic Approach for Question Answering Using an On-Line Encyclopedia.
[7] J. Prager, E. Brown and A. Coden, Question-Answering by Predictive Annotation.
