Blocky models via the L1/L2 hybrid norm
Jon Claerbout

ABSTRACT

This paper seeks to define robust, efficient solvers of regressions of L1 nature with two goals: (1) straightforward parameterization, and (2) blocky solutions. It uses an L1/L2 hybrid norm characterized by a residual R_d of transition between L1 and L2 for data fitting and another R_m for model styling. Both the steepest descent and conjugate direction methods are included. The 1-D blind deconvolution problem is formulated in a manner intended to lead to both a blocky impedance function and a source waveform. No results are given.

INTRODUCTION

I've seen many applications improved when least-squares (L2) model fitting was changed to least absolute values (L1). I've never seen the reverse. Nevertheless we always return to L2 because the solving method is easier and faster. It does not require us to specify parameters of numerical analysis whose meaning is unclear. Another reason to re-investigate L1 is its natural ability to estimate blocky models. Sedimentary sections tend to fluctuate randomly, but sometimes there is a homogeneous material continuing for some distance. A function with such homogeneous regions is called blocky. The derivative of such a function is called sparse. L2 gives huge penalties to large values and minuscule penalties to small ones, hence it never really produces sparse functions and their integrals are never really blocky. If we had an easy, reliable L1 solver, we could expect to see many more realistic solutions.

There are reasons to abandon strict L1 and revert to an L1/L2 hybrid solver. A hybrid solver has a parameter, a threshold, at which L2 behavior transitions to L1. We have good reasons to use two different hybrid solvers, one for the data fitting, the other for the model styling (prior knowledge or regularization). Each requires a threshold of residual; let us call it R_d for the data fitting, and R_m for the model styling.
Processes that require parameters are detestable when we have a poor idea of the meaning of the parameters (especially if they relate to numerical analysis); however, the meaning of the thresholds R_d and R_m is quite clear. When we look at a shot gather and see that about 30% of the area is covered with ground roll, it is clear we would like to choose R_d at about the 70th percentile of the fitting residual. As for the model styling, if we'd like to see blocks about 20 points long, we'd like our spikes to average about 20 points apart, so we would like R_m at about the 95th percentile, allowing 5% of the spikes to be of unlimited size while the others stay small.
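The percentile recipe can be made concrete. A minimal sketch (NumPy; the synthetic residuals and the 70th/95th percentile choices follow the shot-gather and spike-spacing examples above, everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic fitting residuals: ~70% small noise plus ~30% large
# ground-roll-like values, as in the shot-gather example.
fitting_residual = np.concatenate([rng.normal(0.0, 1.0, 700),
                                   rng.normal(0.0, 10.0, 300)])

# Choose R_d at the 70th percentile of |residual|, so the largest
# 30% of residuals fall into the L1 (outlier-tolerant) region.
R_d = np.percentile(np.abs(fitting_residual), 70)

# Choose R_m at the 95th percentile of the model-styling residual,
# allowing about 5% of points to become unlimited-size spikes.
model_residual = rng.laplace(0.0, 1.0, 1000)
R_m = np.percentile(np.abs(model_residual), 95)

print(R_d, R_m)
```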
I was first attracted to strict L1 by its potential for blocky models. But then I realized that for each nonspike (zero) on the time axis, theory says I would need a basis equation. That implies an immense number of iterations, so it is unacceptable in imaging applications. With the hybrid solvers, instead of exact zeros we have a large region driven down by the L2 norm and a small L1 region where large spikes are welcomed.

MODEL DERIVATIVES

Here is the usual definition of the residual r_i of theoretical data sum_j F_ij m_j from observed data d_i:

    r_i = (sum_j F_ij m_j) - d_i,   or   r = F m - d                     (1)

Let C() be a convex function (C'' >= 0) of a scalar. The penalty function (or norm) of residuals is expressed by

    N(m) = sum_i C(r_i)                                                  (2)

We denote a column vector g with components g_i by g = vec(g_i). We soon require the derivative of C(r) at each residual r_i:

    g = vec[ dC(r_i)/dr ]                                                (3)

We often update models in the direction of the gradient of the norm of the residual:

    Δm_k = dN/dm_k = sum_i C'(r_i) dr_i/dm_k = sum_i g(r_i) F_ik,  i.e.  Δm = F^T g   (4)

Define a model update direction by Δm = F^T g. Since r = F m - d, we see the residual update direction will be Δr = F Δm. To find the distance α to move in those directions we choose the scalar α to minimize:

    m <- m + α Δm                                                        (5)
    r <- r + α Δr                                                        (6)
    N(α) = sum_i C(r_i + α Δr_i)                                         (7)

The sum in equation (7) is a sum of dishes, shapes between L2 parabolas and L1 V's. The i-th dish is centered on α = -r_i / Δr_i. It is steep and narrow if Δr_i is large,
and low and flat where Δr_i is small. The positive sum of convex functions is convex. There are no local minima. We can get to the bottom by following the gradient. Next we consider some choices for convex functions. We'll need them and their first and second derivatives.

Some convex functions and their derivatives

LEAST SQUARES:
    C   = r^2 / 2                                                        (8)
    C'  = r                                                              (9)
    C'' = 1 >= 0                                                         (10)

L1 NORM:
    C   = |r|                                                            (11)
    C'  = sgn(r)                                                         (12)
    C'' = 0 or ∞, >= 0                                                   (13)

HYBRID:
    C   = R^2 (sqrt(1 + r^2/R^2) - 1)                                    (14)
    C'  = r / sqrt(1 + r^2/R^2)                                          (15)
    C'' = 1 / (1 + r^2/R^2)^(3/2) >= 0                                   (16)

HUBER:
    C   = |r| - R/2      if |r| >= R;   r^2/(2R)  otherwise              (17)
    C'  = sgn(r)         if |r| >= R;   r/R       otherwise              (18)
    C'' = 0 or ∞         if |r| >= R;   1/R       otherwise, >= 0        (19)

I have scaled Hybrid so it naturally approaches the least squares limit as R -> ∞. As R -> 0, it tends to C = R|r|, scaled L1. Because of the erratic behavior of C'' for L1 and Huber, and our planned use of second-order Taylor series, we will not be using the L1 and Huber norms here. Also, we should prepare ourselves for danger as Hybrid approaches the L1 limit. (I thank Mandy for reminding me of the infinite second derivative, and I thank Mohammad and Nader for demonstrating numerically erratic behavior.)
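The Hybrid formulas (14)-(16) and their limiting behavior are easy to check numerically. A sketch (NumPy; the function names are mine):

```python
import numpy as np

def hybrid(r, R):
    """Hybrid penalty, equation (14): C = R^2 (sqrt(1 + r^2/R^2) - 1)."""
    return R**2 * (np.sqrt(1.0 + (r / R)**2) - 1.0)

def hybrid_d1(r, R):
    """First derivative, equation (15): C' = r / sqrt(1 + r^2/R^2)."""
    return r / np.sqrt(1.0 + (r / R)**2)

def hybrid_d2(r, R):
    """Second derivative, equation (16): C'' = (1 + r^2/R^2)^(-3/2) >= 0."""
    return (1.0 + (r / R)**2) ** -1.5

r = 0.1
print(hybrid(r, R=1e6))          # large R: close to r^2/2 (least squares)
print(hybrid(r, R=1e-6))         # small R: close to R*|r| (scaled L1)
print(hybrid_d2(0.0, R=1.0))     # curvature at the origin is 1, as for L2
```

Note how C' saturates near R for residuals far beyond the threshold, which is exactly the outlier tolerance the text wants from the L1 side.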
PLANE SEARCH

The most universally used method of solving immense linear regressions such as imaging problems is the Conjugate Gradient (CG) method. It has the remarkable property that with exact arithmetic, the exact solution is found in a finite number of iterations. A simpler method with the same property is the Conjugate Direction method. It is debatable which has the better numerical roundoff properties, so we generally use the Conjugate Direction method, as it is simpler to comprehend. It says not to move along the gradient direction line, but somewhere in the plane of the gradient and the previous step. The best move in that plane requires us to find two scalars: one, α, to scale the gradient; the other, β, to scale the previous step. That is all for L2 optimization. We proceed here in the same way with other norms and hope for the best.

So here we are, embedded in a giant multivariate regression where we have a bivariate regression (two unknowns). From the multivariate regression we are given three vectors in data space: r, g, and s. You will recognize these as the current residual, the gradient (Δr), and the previous step. (The gradient and previous step appearing here have previously been transformed to data space (the conjugate space) by the operator F.) Our next residual will be a perturbation of the old one. We seek to minimize, by variation of (α, β),

    r_i <- r_i + α g_i + β s_i                                           (20)
    N(α, β) = sum_i C(r_i + α g_i + β s_i)                               (21)

Let the coefficients (C_i, C'_i, C''_i) refer to a Taylor expansion of C(r) about r_i:

    N(α, β) = sum_i [ C_i + (α g_i + β s_i) C'_i + (α g_i + β s_i)^2 C''_i / 2 ]   (22)

We have two unknowns, (α, β), in a quadratic form. We set to zero the α derivative of the quadratic form, likewise the β derivative, getting

    0 = sum_i C'_i g_i + sum_i C''_i g_i (α g_i + β s_i)
    0 = sum_i C'_i s_i + sum_i C''_i s_i (α g_i + β s_i)                 (23)

resulting in a 2x2 set of equations to solve for α and β:

    [ sum_i C''_i g_i g_i   sum_i C''_i g_i s_i ] [α]     [ sum_i C'_i g_i ]
    [ sum_i C''_i s_i g_i   sum_i C''_i s_i s_i ] [β] = - [ sum_i C'_i s_i ]   (24)

The solution of any 2x2 set of simultaneous equations is generally trivial. The only difficulties arise when the determinant vanishes, which here is easy (luckily) to
understand. Generally the gradient should not point in the direction of the previous step if the previous move went the proper distance. Hence the determinant should not vanish. Practice shows that the determinant will vanish when all the inputs are zero, and it may vanish if you do so many iterations that you should have stopped already; in other words, when the gradient and previous step are both tending to zero. Using the newly found (α, β), update the residual r_i at each location (and update the model). Then go back to re-evaluate C'_i and C''_i at the new r_i locations. Iterate.

In what way do we hope/expect this new bivariate solver embedded in a conjugate direction solver to perform better than old IRLS solvers? After paying the inevitable price, a substantial price, of computing F^T r and F Δm, the iteration above does some serious thinking, not a simple linearization, before paying the price again. If the convex function C(r) were least squares, subsequent iterations would do nothing. Although the Taylor series of the second iteration would expand about different residuals r_i than the first iteration, the new second-order Taylor series is the exact representation of the least squares penalty function, i.e. the same as the first, so the next iteration goes nowhere. Will this computational method work (converge fast enough) in the L1 limit? I don't know. Perhaps we'll do better to approach that limit (if we actually want that limit) via gradually decreasing the threshold R.

BLOCKY LOGS: BOTH FITTING AND REGULARIZATION

Here we set out to find blocky functions, such as well logs. We will do data fitting with a somewhat L2-like convex penalty function while doing model styling with a more L1-like function. We might define the composite norm threshold residual R_d for the data fitting at the 60th percentile, and that for regularization seeking spiky models (with blocky integrals) as R_m at the 5th percentile. The data fitting goal and the model regularization goal at each z are independent from those at all other z values.
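Stepping back to the plane search: equations (20)-(24) can be sketched as code. This is my toy illustration, not the author's program; F, d, and the starting step are invented, the hybrid derivatives of equations (15)-(16) supply C' and C'', and R is set comparable to the residual scale:

```python
import numpy as np

def hybrid_d1(r, R):                       # C', equation (15)
    return r / np.sqrt(1.0 + (r / R)**2)

def hybrid_d2(r, R):                       # C'', equation (16)
    return (1.0 + (r / R)**2) ** -1.5

def plane_search(r, g, s, R):
    """Solve the 2x2 system of equation (24) for (alpha, beta)."""
    c1, c2 = hybrid_d1(r, R), hybrid_d2(r, R)
    M = np.array([[np.sum(c2 * g * g), np.sum(c2 * g * s)],
                  [np.sum(c2 * s * g), np.sum(c2 * s * s)]])
    rhs = -np.array([np.sum(c1 * g), np.sum(c1 * s)])
    return np.linalg.solve(M, rhs)

rng = np.random.default_rng(1)
F = rng.normal(size=(50, 10))              # toy modeling operator
d = rng.normal(size=50)                    # toy data
R = 3.0                                    # threshold near the residual scale
m = np.zeros(10)
r = F @ m - d
step_m = rng.normal(size=10) * 1e-3        # fake "previous step" to start
step_r = F @ step_m                        # its image in data space

for _ in range(30):
    dm = F.T @ hybrid_d1(r, R)             # gradient, equation (4)
    g = F @ dm                             # moved to data space
    alpha, beta = plane_search(r, g, step_r, R)
    step_m = alpha * dm + beta * step_m    # combined move in model space
    step_r = alpha * g + beta * step_r     # same move seen in data space
    m += step_m
    r += step_r                            # keeps r = F m - d exactly
```

Each iteration pays for two operator applications (F^T and F) and then solves only a 2x2 system, mirroring the conjugate-direction recipe in the text.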
The fitting goal says the reflectivity m(z) should be equal to its measurement d(z) (the seismogram). The model styling goal says the reflectivity m(z) should vanish:

    0 ≈ r_d(z) = m(z) - d(z)                                             (25)
    0 ≈ r_m(z) = ε m(z)                                                  (26)

These two goals are in direct contradiction to each other. With the L2 norm the answer would be simply m = d/(1 + ε^2). With the L1 norm, the answer would be either m = d or m = 0, depending on the numerical choice of ε. Let us denote the convex function and its derivatives for data space at the residual as (B, B', B'') and for model space as (C, C', C''). Remember, m and d, while normally vectors, are here
scalars (independently for each z).

    loop over all time points {
        m = d / (1 + ε^2)                          # These are scalars!
        loop over non-linear iterations {
            r_d = m - d
            r_m = m
            Get derivatives of hybrid norm B'(r_d) and B''(r_d) for the data goal.
            Get derivatives of hybrid norm C'(r_m) and C''(r_m) for the model goal.
            # Plan to find α to update m = m + α
            # Taylor series for data penalty:   N(r_d) = B + B' α + B'' α^2 / 2
            # Taylor series for model penalty:  N(r_m) = C + C' α + C'' α^2 / 2
            # 0 = d/dα ( N(r_d) + ε N(r_m) )
            α = -(B' + ε C') / (B'' + ε C'')
            m = m + α
        } end of loop over non-linear iterations
    } end of loop over all time points

To help us understand the choice of parameters R_d, R_m, and ε, we examine the theoretical relation between m and d implied by the above code as a function of ε and R_m in the limit R_d -> ∞; in other words, when the data has normal behavior and we are mostly interested in the role of the regularization drawing weak signals down towards zero. The data fitting penalty is then B = (m - d)^2 / 2 and its derivative B' = m - d. The derivative of the model penalty (from equation (15)) is C' = m / sqrt(1 + m^2/R_m^2). Setting the sum of the derivatives to zero, we have

    0 = B' + ε C' = m - d + ε m / sqrt(1 + m^2/R_m^2)                    (27)

This says m is mostly a little smaller than d, but it gets more interesting near (m, d) ≈ 0. There the slope is m/d = 1/(1 + ε), which says an ε = 4 will damp the signal (where small) by a factor of 5. Moving away from m = 0, we see the damping power of ε diminishes uniformly as m exceeds R_m.

UNKNOWN SHOT WAVEFORM

A one-dimensional seismogram d(t) is unknown reflectivity c(t) convolved with an unknown source waveform s(t). The number of data points ND ≈ NC is less than the number of unknowns NC + NS. Clearly we need a smart regularization. Let us see how this problem can be set up so that the reflectivity c(t) comes out with sparse spikes, so the integral of c(t) is blocky. This is a nonlinear problem because the convolution of the unknowns is made of their product. Nonlinear problems elicit well-warranted fear of multiple solutions leading to us getting stuck in the wrong one.
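The scalar loop above translates directly into code. A sketch (NumPy; I take R_d large so the data fitting is nearly L2, matching the R_d -> ∞ analysis, and the sample values are invented):

```python
import numpy as np

def d1(r, R):   # C' of the hybrid norm, equation (15)
    return r / np.sqrt(1.0 + (r / R)**2)

def d2(r, R):   # C'' of the hybrid norm, equation (16)
    return (1.0 + (r / R)**2) ** -1.5

def blocky_point(d, eps, R_d, R_m, niter=20):
    """One sample of the blocky-log fit: min_m B(m - d) + eps*C(m)."""
    m = d / (1.0 + eps**2)                 # L2 starting guess
    for _ in range(niter):
        r_d, r_m = m - d, m
        num = d1(r_d, R_d) + eps * d1(r_m, R_m)
        den = d2(r_d, R_d) + eps * d2(r_m, R_m)
        m -= num / den                     # alpha = -(B' + eps C')/(B'' + eps C'')
    return m

# Small inputs are damped by about 1/(1 + eps); large inputs pass
# with only about eps*R_m subtracted, as equation (27) predicts.
print(blocky_point(0.1, eps=4.0, R_d=100.0, R_m=0.5))    # near 0.02
print(blocky_point(10.0, eps=4.0, R_d=100.0, R_m=0.5))   # near 8.0
```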
The key to avoiding this pitfall is
starting close enough to the correct solution. The way to get close enough (besides luck and a good starting guess) is to define a linear problem that takes us to the neighborhood where a nonlinear solver can be trusted. We will do that first.

Block cyclic solver

In the Block Cyclic solver (I hope this is the correct term), we have two half cycles. In the first half we take one of the variables as known and the other as unknown. We solve for the unknown. In the next half we switch the known for the unknown. The beauty of this approach is that each half cycle is a linear problem, so its solution is independent of the starting location. Hooray! Even better, repeating the cycles enough times should converge to the correct solution. Hooray again! The convergence may be slow, however, so at some stage (maybe just one or two cycles) you can safely switch over to the nonlinear method, which converges faster because it deals directly with the interactions of the two variables.

We could begin from the assumption that the shot waveform is an impulse and the reflectivity is the data. Then either half cycle can be the starting point. Suppose we assume we know the reflectivity, say c, and solve for the shot waveform s. We use the reflectivity c to make a convolution matrix C. The regression pair for finding s is

    0 ≈ C s - d                                                          (28)
    0 ≈ ε_s I s                                                          (29)

These would be solved for s by familiar least squares methods. It is a very easy problem because s has many fewer components than c. Now with our source estimate s we can define the operator S that convolves it on the reflectivity c. The second half of the cycle is to solve for the reflectivity c. This is a little trickier. The data fitting may still be done by an L2-type method, but we need something like an L1 method for the regularization to pull the small values closer to zero, to yield a more spiky c(t).

    0 ≈ r_d = S c - d        (L2)                                        (30)
    0 ≈ r_m = I c            (L1)                                        (31)

Normally we expect an ε_c in equation (31), but now it comes in later.
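The two half cycles of regressions (28)-(31) can be sketched as alternating damped least squares. This toy uses L2 for both halves (the text argues the c half really wants an L1-like regularization); the signals, sizes, and eps value are invented:

```python
import numpy as np

def conv_matrix(x, n):
    """Matrix whose product with a length-n vector convolves it with x."""
    M = np.zeros((len(x) + n - 1, n))
    for j in range(n):
        M[j:j + len(x), j] = x
    return M

# Invented truth: sparse reflectivity and a short wavelet.
c_true = np.zeros(30)
c_true[[3, 11, 22]] = [1.0, -0.7, 0.5]
s_true = np.array([1.0, -0.5, 0.2])
d = np.convolve(c_true, s_true)            # length 32

# Start from the impulse-wavelet assumption: reflectivity = data.
c = d[:30].copy()
eps = 0.1
for _ in range(10):
    # Half cycle 1: fix c, solve the damped regressions (28)-(29) for s.
    C = conv_matrix(c, 3)
    s = np.linalg.lstsq(np.vstack([C, eps * np.eye(3)]),
                        np.concatenate([d, np.zeros(3)]), rcond=None)[0]
    # Half cycle 2: fix s, solve for c (damped L2 stand-in for (30)-(31)).
    S = conv_matrix(s, 30)
    c = np.linalg.lstsq(np.vstack([S, eps * np.eye(30)]),
                        np.concatenate([d, np.zeros(30)]), rcond=None)[0]
```

Each half cycle is linear, so the pair can run from any starting point; after a cycle or two, the text suggests handing (c, s) to the nonlinear solver, which also deals with the scale and impulse-wavelet ambiguities this L2 toy leaves unresolved.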
(It might seem that the regularization (29) is not necessary, but without it, c might get smaller and smaller while s gets larger and larger. We should be able to neglect regression (29) if we simply rescale appropriately at each iteration.)

We can take the usual L2 norm to define a gradient vector for the model perturbation Δc = S^T r_d. From it we get the residual perturbation Δr_d = S Δc. We need to find an unknown distance α to move in those directions. We take the norm of the data fitting residual, add to it a bit ε of the model styling residual, and set the derivative to zero:

    0 = d/dα [ N_d(r_d + α Δr) + ε N_m(c + α Δc) ]                       (32)
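Setting the α derivative in equation (32) to zero, with a second-order Taylor expansion of each norm, gives a closed-form step. A sketch (NumPy; A', A'', B', B'' are arrays of first and second derivatives evaluated at the current residuals, and the names are mine):

```python
import numpy as np

def alpha_step(A1, A2, B1, B2, dr, dc, eps):
    """Steepest-descent step length from the zeroed alpha derivative:
    alpha = -(sum A'*dr + eps*sum B'*dc) / (sum A''*dr^2 + eps*sum B''*dc^2)."""
    num = np.sum(A1 * dr) + eps * np.sum(B1 * dc)
    den = np.sum(A2 * dr**2) + eps * np.sum(B2 * dc**2)
    return -num / den

# Sanity check: with least squares (A' = r, A'' = 1) and eps = 0 this
# reduces to the classic alpha = -(r . dr)/(dr . dr).
r = np.array([1.0, 2.0, -1.0])
dr = np.array([1.0, 1.0, 0.5])
alpha = alpha_step(r, np.ones(3), np.zeros(3), np.zeros(3), dr, np.zeros(3), 0.0)
print(alpha)
```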
We need derivatives of each norm at each residual. We base these on the convex function C(r) of the Hybrid norm. Let us call these A for the data fitting and B for the model styling:

    A_i = C(R_d, r_i)                                                    (33)
    B_i = C(R_m, c_i)                                                    (34)

(Actually, we don't need A, because for least squares A'_i = r_i and A''_i = 1, but I include it here in case we wish to deal with noise bursts in the data.) As earlier, expanding the norms in Taylor series, equation (32) becomes

    0 = sum_i A'_i Δr_i + α sum_i A''_i Δr_i^2 + ε ( sum_i B'_i Δc_i + α sum_i B''_i Δc_i^2 )   (35)

which gives the α we need to update the model c and the residual r_d:

    α = - ( sum_i A'_i Δr_i + ε sum_i B'_i Δc_i ) / ( sum_i A''_i Δr_i^2 + ε sum_i B''_i Δc_i^2 )   (36)

This is the steepest descent method. For the conjugate directions method there is a 2x2 equation like equation (24).

Non-linear solver

The non-linear approach is a little more complicated, but it explicitly deals with the interaction between s and c, so it converges faster. We represent everything as a known part plus a perturbation part, which we will find and add into the known part. This is most easily expressed in the Fourier domain. Linearize by dropping ΔS ΔC:

    0 ≈ (S + ΔS)(C + ΔC) - D                                             (37)
    0 ≈ S ΔC + C ΔS + (C S - D)                                          (38)

Let us change to the time domain with a matrix notation. Put the unknowns ΔC and ΔS in vectors Δc and Δs. Put the knowns C and S in convolution matrices C and S. Express C S - D as a column vector Δd. Its time domain coefficients are Δd_0 = c_0 s_0 - d_0 and Δd_1 = c_0 s_1 + c_1 s_0 - d_1, etc. The data fitting regression is now

    0 ≈ S Δc + C Δs + Δd                                                 (39)

This regression is expressed more explicitly below.
    0 ≈ [ S  C ] [ Δc ] + Δd                                             (40)
                 [ Δs ]

where, for 7 reflectivity points, a 3-point wavelet, and 9 data points,

    S = [ s_0   .    .    .    .    .    .  ]      C = [ c_0   .    .  ]
        [ s_1  s_0   .    .    .    .    .  ]          [ c_1  c_0   .  ]
        [ s_2  s_1  s_0   .    .    .    .  ]          [ c_2  c_1  c_0 ]
        [  .   s_2  s_1  s_0   .    .    .  ]          [ c_3  c_2  c_1 ]
        [  .    .   s_2  s_1  s_0   .    .  ]          [ c_4  c_3  c_2 ]
        [  .    .    .   s_2  s_1  s_0   .  ]          [ c_5  c_4  c_3 ]
        [  .    .    .    .   s_2  s_1  s_0 ]          [ c_6  c_5  c_4 ]
        [  .    .    .    .    .   s_2  s_1 ]          [  .   c_6  c_5 ]
        [  .    .    .    .    .    .   s_2 ]          [  .    .   c_6 ]

    Δc = (Δc_0, ..., Δc_6)^T,   Δs = (Δs_0, Δs_1, Δs_2)^T,   Δd = (Δd_0, ..., Δd_8)^T.

The model styling regression is simply 0 ≈ ΔC + C, which in familiar matrix form is

    0 ≈ I Δc + c                                                         (41)

It is this regression, along with a composite norm and its associated threshold, that makes c(t) come out sparse. Now we have the danger that c -> 0 while s -> ∞, so we need one more regression:

    0 ≈ I Δs + s                                                         (42)

We can use ordinary least squares on the data fitting regression and the shot waveform regression. Thus

    0 ≈ [ r_d ] = [ S     C   ] [ Δc ] + [ Δd   ]
        [ r_s ]   [ 0   ε_s I ] [ Δs ]   [ ε_s s ]                       (43)

    r = F Δm + Δd                                                        (44)

The model styling regression is where we seek spiky behavior:

    0 ≈ r_c = I Δc + c                                                   (45)

The big picture is that we minimize the sum

    min_{Δc, Δs}  N_d(r) + ε N_m(r_c)                                    (46)

Inside the big picture we have updating steps

    Δm = F^T r                                                           (47)
    g_d = Δr = F Δm                                                      (48)

We also have a gradient for changing c, namely g_c = Δc. (I need to be sure g_d and g_c are commensurate. Maybe we need an ε here.) One update step is to choose a line search for α:

    min_α  N_d(r + α g_d) + ε N_m(Δc + α g_c)                            (49)
That was steepest descent. The extension to conjugate direction is straightforward. As with all nonlinear problems there is the danger of bizarre behavior and multiple minima. To avoid frustration, while learning you should spend about half of your effort directed toward finding a good starting solution. This normally amounts to defining and solving one or two linear problems. In this application we might get our starting solution for s(t) and c(t) from conventional deconvolution analysis, or we might get it from the block cyclic solver.
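The linearized regression (39) and the stacked least-squares system (43) can be sketched end to end. This toy (NumPy) uses the same sizes as the display in equation (40): 7 reflectivity points, a 3-point wavelet, 9 data points; all values are invented, and eps_s plays the role of the damping on Δs:

```python
import numpy as np

def conv_matrix(x, n):
    """Matrix whose product with a length-n vector convolves it with x."""
    M = np.zeros((len(x) + n - 1, n))
    for j in range(n):
        M[j:j + len(x), j] = x
    return M

rng = np.random.default_rng(3)
c = rng.normal(size=7)                 # current reflectivity estimate
s = np.array([1.0, -0.4, 0.1])         # current wavelet estimate
c_star = np.zeros(7)
c_star[[1, 4]] = [1.0, -0.6]
d = np.convolve(c_star, np.array([1.0, -0.5, 0.2]))   # invented data, length 9

S = conv_matrix(s, 7)                  # 9 x 7, acts on a perturbation dc
C = conv_matrix(c, 3)                  # 9 x 3, acts on a perturbation ds
dd = np.convolve(c, s) - d             # current residual, the Delta-d of (39)

eps_s = 0.1
# Stacked system of equation (43): data-fitting rows over damping rows.
F = np.block([[S, C],
              [np.zeros((3, 7)), eps_s * np.eye(3)]])
dm = np.linalg.lstsq(F, -np.concatenate([dd, eps_s * s]), rcond=None)[0]
dc, ds = dm[:7], dm[7:]
```

One such Gauss-Newton step drives the linearized residual S·Δc + C·Δs + Δd nearly to zero; the sparsity-promoting regression (45) with the hybrid norm would be layered on top of this least-squares core.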
More informationLinear Regression Analysis: Terminology and Notation
ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented
More informationNotes on Frequency Estimation in Data Streams
Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to
More information36.1 Why is it important to be able to find roots to systems of equations? Up to this point, we have discussed how to find the solution to
ChE Lecture Notes - D. Keer, 5/9/98 Lecture 6,7,8 - Rootndng n systems o equatons (A) Theory (B) Problems (C) MATLAB Applcatons Tet: Supplementary notes rom Instructor 6. Why s t mportant to be able to
More informationGrover s Algorithm + Quantum Zeno Effect + Vaidman
Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the
More informationWeek3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity
Week3, Chapter 4 Moton n Two Dmensons Lecture Quz A partcle confned to moton along the x axs moves wth constant acceleraton from x =.0 m to x = 8.0 m durng a 1-s tme nterval. The velocty of the partcle
More informationIntegrals and Invariants of Euler-Lagrange Equations
Lecture 16 Integrals and Invarants of Euler-Lagrange Equatons ME 256 at the Indan Insttute of Scence, Bengaluru Varatonal Methods and Structural Optmzaton G. K. Ananthasuresh Professor, Mechancal Engneerng,
More informationReview of Taylor Series. Read Section 1.2
Revew of Taylor Seres Read Secton 1.2 1 Power Seres A power seres about c s an nfnte seres of the form k = 0 k a ( x c) = a + a ( x c) + a ( x c) + a ( x c) k 2 3 0 1 2 3 + In many cases, c = 0, and the
More informationCHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE
CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng
More informationAssortment Optimization under MNL
Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.
More informationExercises. 18 Algorithms
18 Algorthms Exercses 0.1. In each of the followng stuatons, ndcate whether f = O(g), or f = Ω(g), or both (n whch case f = Θ(g)). f(n) g(n) (a) n 100 n 200 (b) n 1/2 n 2/3 (c) 100n + log n n + (log n)
More informationNeural networks. Nuno Vasconcelos ECE Department, UCSD
Neural networs Nuno Vasconcelos ECE Department, UCSD Classfcaton a classfcaton problem has two types of varables e.g. X - vector of observatons (features) n the world Y - state (class) of the world x X
More informationELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM
ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM An elastc wave s a deformaton of the body that travels throughout the body n all drectons. We can examne the deformaton over a perod of tme by fxng our look
More informationAdvanced Circuits Topics - Part 1 by Dr. Colton (Fall 2017)
Advanced rcuts Topcs - Part by Dr. olton (Fall 07) Part : Some thngs you should already know from Physcs 0 and 45 These are all thngs that you should have learned n Physcs 0 and/or 45. Ths secton s organzed
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationLecture 10: Euler s Equations for Multivariable
Lecture 0: Euler s Equatons for Multvarable Problems Let s say we re tryng to mnmze an ntegral of the form: {,,,,,, ; } J f y y y y y y d We can start by wrtng each of the y s as we dd before: y (, ) (
More information15-381: Artificial Intelligence. Regression and cross validation
15-381: Artfcal Intellgence Regresson and cross valdaton Where e are Inputs Densty Estmator Probablty Inputs Classfer Predct category Inputs Regressor Predct real no. Today Lnear regresson Gven an nput
More informationAPPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14
APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce
More informationwhere the sums are over the partcle labels. In general H = p2 2m + V s(r ) V j = V nt (jr, r j j) (5) where V s s the sngle-partcle potental and V nt
Physcs 543 Quantum Mechancs II Fall 998 Hartree-Fock and the Self-consstent Feld Varatonal Methods In the dscusson of statonary perturbaton theory, I mentoned brey the dea of varatonal approxmaton schemes.
More informationBernoulli Numbers and Polynomials
Bernoull Numbers and Polynomals T. Muthukumar tmk@tk.ac.n 17 Jun 2014 The sum of frst n natural numbers 1, 2, 3,..., n s n n(n + 1 S 1 (n := m = = n2 2 2 + n 2. Ths formula can be derved by notng that
More informationCIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M
CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute
More informationPhysics 5153 Classical Mechanics. D Alembert s Principle and The Lagrangian-1
P. Guterrez Physcs 5153 Classcal Mechancs D Alembert s Prncple and The Lagrangan 1 Introducton The prncple of vrtual work provdes a method of solvng problems of statc equlbrum wthout havng to consder the
More informationLecture 2: Prelude to the big shrink
Lecture 2: Prelude to the bg shrnk Last tme A slght detour wth vsualzaton tools (hey, t was the frst day... why not start out wth somethng pretty to look at?) Then, we consdered a smple 120a-style regresson
More informationStanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011
Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected
More informationFeb 14: Spatial analysis of data fields
Feb 4: Spatal analyss of data felds Mappng rregularly sampled data onto a regular grd Many analyss technques for geophyscal data requre the data be located at regular ntervals n space and/or tme. hs s
More informationEEE 241: Linear Systems
EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they
More informationMidterm Examination. Regression and Forecasting Models
IOMS Department Regresson and Forecastng Models Professor Wllam Greene Phone: 22.998.0876 Offce: KMC 7-90 Home page: people.stern.nyu.edu/wgreene Emal: wgreene@stern.nyu.edu Course web page: people.stern.nyu.edu/wgreene/regresson/outlne.htm
More informationEconomics 101. Lecture 4 - Equilibrium and Efficiency
Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of
More informationSupport Vector Machines. Vibhav Gogate The University of Texas at dallas
Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest
More information10-701/ Machine Learning, Fall 2005 Homework 3
10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40
More information4DVAR, according to the name, is a four-dimensional variational method.
4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The
More informationSIO 224. m(r) =(ρ(r),k s (r),µ(r))
SIO 224 1. A bref look at resoluton analyss Here s some background for the Masters and Gubbns resoluton paper. Global Earth models are usually found teratvely by assumng a startng model and fndng small
More informationSupport Vector Machines
Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x n class
More informationa b a In case b 0, a being divisible by b is the same as to say that
Secton 6.2 Dvsblty among the ntegers An nteger a ε s dvsble by b ε f there s an nteger c ε such that a = bc. Note that s dvsble by any nteger b, snce = b. On the other hand, a s dvsble by only f a = :
More informationp 1 c 2 + p 2 c 2 + p 3 c p m c 2
Where to put a faclty? Gven locatons p 1,..., p m n R n of m houses, want to choose a locaton c n R n for the fre staton. Want c to be as close as possble to all the house. We know how to measure dstance
More informationExample: (13320, 22140) =? Solution #1: The divisors of are 1, 2, 3, 4, 5, 6, 9, 10, 12, 15, 18, 20, 27, 30, 36, 41,
The greatest common dvsor of two ntegers a and b (not both zero) s the largest nteger whch s a common factor of both a and b. We denote ths number by gcd(a, b), or smply (a, b) when there s no confuson
More informationThe Geometry of Logit and Probit
The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.
More informationCSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography
CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve
More informationVector Norms. Chapter 7 Iterative Techniques in Matrix Algebra. Cauchy-Bunyakovsky-Schwarz Inequality for Sums. Distances. Convergence.
Vector Norms Chapter 7 Iteratve Technques n Matrx Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematcs Unversty of Calforna, Berkeley Math 128B Numercal Analyss Defnton A vector norm
More informationMath1110 (Spring 2009) Prelim 3 - Solutions
Math 1110 (Sprng 2009) Solutons to Prelm 3 (04/21/2009) 1 Queston 1. (16 ponts) Short answer. Math1110 (Sprng 2009) Prelm 3 - Solutons x a 1 (a) (4 ponts) Please evaluate lm, where a and b are postve numbers.
More informatione i is a random error
Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown
More information