XII. LINEAR ALGEBRA: SOLVING SYSTEMS OF EQUATIONS

Today we are going to talk about solving systems of linear equations. These are problems that give a couple of equations with a couple of unknowns, like:

6 = x_1 + 2x_2
7 = 4x_1 + 5x_2

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when it is unique?

Provided that the system is fairly simple, it might be easiest to solve using successive substitution. Given a system that looks like this:

b_1 = a_11 x_1 + a_12 x_2 + a_13 x_3
b_2 = a_21 x_1 + a_22 x_2 + a_23 x_3
b_3 = a_31 x_1 + a_32 x_2 + a_33 x_3

(For simplicity, most of the things I show here will be 3x3 systems, but everything works just as well with more variables.) You pick any equation and any variable, and solve that equation for that variable in terms of the constants and the other variables. Let's say we pick equation one and x_1:

x_1 = (1/a_11)(b_1 - a_12 x_2 - a_13 x_3)

Then we substitute this value of x_1 back into the other two equations:

b_2 = a_21 (1/a_11)(b_1 - a_12 x_2 - a_13 x_3) + a_22 x_2 + a_23 x_3
b_3 = a_31 (1/a_11)(b_1 - a_12 x_2 - a_13 x_3) + a_32 x_2 + a_33 x_3

And then we have two linear equations in two unknowns:

b_2 - (a_21/a_11) b_1 = (a_22 - (a_21/a_11) a_12) x_2 + (a_23 - (a_21/a_11) a_13) x_3
b_3 - (a_31/a_11) b_1 = (a_32 - (a_31/a_11) a_12) x_2 + (a_33 - (a_31/a_11) a_13) x_3

Once again, we pick one equation and solve it in terms of a particular variable. Writing a'_22 for a_22 - (a_21/a_11) a_12, b'_2 for b_2 - (a_21/a_11) b_1, and so on for the other new coefficients, solving the first of these for x_2 gives:

x_2 = (1/a'_22)(b'_2 - a'_23 x_3)

After substituting into the remaining equation, we get a single expression for the last of the variables:

x_3 = (b'_3 - (a'_32/a'_22) b'_2) / (a'_33 - (a'_32/a'_22) a'_23)

Summer 200_ math class notes, page 85
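The substitution recipe can be sketched in a few lines of code. This is a rough Python illustration (mine, not part of the notes), applied to a 2x2 system like the one at the start of the section; the names a11, a12, and so on mirror the notation above.

```python
# Successive substitution on the 2x2 system
#   6 = 1*x1 + 2*x2
#   7 = 4*x1 + 5*x2
a11, a12, b1 = 1.0, 2.0, 6.0
a21, a22, b2 = 4.0, 5.0, 7.0

# Solve equation one for x1:  x1 = (b1 - a12*x2) / a11,
# then substitute into equation two, leaving one equation in x2 alone:
#   b2 - (a21/a11)*b1 = (a22 - (a21/a11)*a12) * x2
x2 = (b2 - (a21 / a11) * b1) / (a22 - (a21 / a11) * a12)

# Back-substitute to recover x1:
x1 = (b1 - a12 * x2) / a11
```

Back-substitution then works exactly as in the text: once the last variable is known, each earlier one follows from the equation it was solved out of.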
Knowing what x_3 is, we can find the value of x_2 and then x_1. However, this is a tiring process, especially when you start off with a bunch of equations and there are no apparent simple substitutions. It's going to be easier to do this in matrix form. Let A be the matrix of coefficients on the system of equations, and b the vector of constants. We can write this system of equations as:

b_1 = a_11 x_1 + a_12 x_2 + a_13 x_3        [ b_1 ]   [ a_11 a_12 a_13 ] [ x_1 ]
b_2 = a_21 x_1 + a_22 x_2 + a_23 x_3        [ b_2 ] = [ a_21 a_22 a_23 ] [ x_2 ]        b = Ax
b_3 = a_31 x_1 + a_32 x_2 + a_33 x_3        [ b_3 ]   [ a_31 a_32 a_33 ] [ x_3 ]

And the question is how to solve this system for the vector x of unknowns. There are three ways, more or less.

In the first method, we essentially use Gaussian elimination in matrix form. First, we write out the augmented matrix:

[ a_11 a_12 a_13 | b_1 ]
[ a_21 a_22 a_23 | b_2 ]
[ a_31 a_32 a_33 | b_3 ]

This is shorthand for saying that the left-hand block times the vector x equals the right-hand column. Now, if the left-hand side equals the identity matrix,

[ 1 0 0 | c_1 ]
[ 0 1 0 | c_2 ]
[ 0 0 1 | c_3 ]

what we have is that the identity matrix times the vector x (which equals x itself) equals the right-hand side, so x = c. Whenever the left-hand side equals the identity matrix, the right-hand side is a solution for x.

Given the augmented matrix corresponding to the system of linear equations, our mission (should we choose to accept it) is to get the left-hand side into the form of the identity matrix, using only these three elementary row operations:

1. interchanging two rows of the matrix;
2. adding (or subtracting) a multiple of one row to another row; and
3. multiplying each element in a row by the same nonzero number.

We perform these operations on every element of the row, on both the left-hand side and the right-hand side. With the particular matrix given above, these are what the permissible elementary row operations look like:
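The three elementary row operations are easy to express on an augmented matrix in code. A small Python/numpy sketch (mine, not from the notes), using the example system solved later in this section: each operation changes the matrix but leaves the solution set untouched.

```python
import numpy as np

# Augmented matrix [A | b] for the example system used later in the notes.
M = np.array([[-1.,  2.,  3.,  7.],
              [ 4.,  5.,  6.,  5.],
              [ 7.,  8.,  9.,  5.]])

x = np.array([-1., -3., 4.])   # the solution of this particular system

M[[0, 2]] = M[[2, 0]]          # op 1: interchange rows 0 and 2
M[1] += 2.5 * M[0]             # op 2: add a multiple of one row to another
M[2] *= -3.0                   # op 3: scale a row by a nonzero number

# Every row of [A | b] still reads  (row of A) . x = (entry of b):
residual = M[:, :3] @ x - M[:, 3]
```

This is the whole point of the method: since no operation changes the solution set, we may keep applying them until the left block becomes the identity.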
Interchanging rows one and two:

[ a_11 a_12 a_13 | b_1 ]      [ a_21 a_22 a_23 | b_2 ]
[ a_21 a_22 a_23 | b_2 ]  ->  [ a_11 a_12 a_13 | b_1 ]
[ a_31 a_32 a_33 | b_3 ]      [ a_31 a_32 a_33 | b_3 ]

Adding γ times row one to row two:

[ a_11         a_12         a_13         | b_1        ]
[ a_21 + γa_11 a_22 + γa_12 a_23 + γa_13 | b_2 + γb_1 ]
[ a_31         a_32         a_33         | b_3        ]

Multiplying row one through by γ (nonzero):

[ γa_11 γa_12 γa_13 | γb_1 ]
[ a_21  a_22  a_23  | b_2  ]
[ a_31  a_32  a_33  | b_3  ]

My strategy for solving these is usually first to arrange the equations in a way that makes sense (with experience, you'll figure out what's easiest). Then I divide the first row through by the constant a_11:

[ 1    a_12/a_11 a_13/a_11 | b_1/a_11 ]
[ a_21 a_22      a_23      | b_2      ]
[ a_31 a_32      a_33      | b_3      ]

Then I subtract a_21 times the first row off of the second, and a_31 times the first row off of the third:

[ 1  a_12/a_11             a_13/a_11             | b_1/a_11            ]
[ 0  a_22 - a_21 a_12/a_11 a_23 - a_21 a_13/a_11 | b_2 - a_21 b_1/a_11 ]
[ 0  a_32 - a_31 a_12/a_11 a_33 - a_31 a_13/a_11 | b_3 - a_31 b_1/a_11 ]

To keep the notation from exploding, write a'_22 for a_22 - a_21 a_12/a_11, and likewise for the other new entries. I do a similar thing for the second row now, dividing through by the coefficient on the x_2 term in the second row:

[ 1  a_12/a_11 a_13/a_11   | b_1/a_11   ]
[ 0  1         a'_23/a'_22 | b'_2/a'_22 ]
[ 0  a'_32     a'_33       | b'_3       ]

In order to get zeros in the second places of the first and third rows, I multiply the second row by the appropriate constant and subtract it off:

[ 1  0  a_13/a_11 - (a_12/a_11)(a'_23/a'_22) | b_1/a_11 - (a_12/a_11)(b'_2/a'_22) ]
[ 0  1  a'_23/a'_22                          | b'_2/a'_22                         ]
[ 0  0  a'_33 - a'_32 a'_23/a'_22            | b'_3 - a'_32 b'_2/a'_22            ]

And so on. Though this looks really nasty when presented this way, it turns out usually to work pretty well. Let's try an example:

7 = -x_1 + 2x_2 + 3x_3        [ -1 2 3 ] [ x_1 ]   [ 7 ]        [ -1 2 3 | 7 ]
5 = 4x_1 + 5x_2 + 6x_3        [  4 5 6 ] [ x_2 ] = [ 5 ]        [  4 5 6 | 5 ]
5 = 7x_1 + 8x_2 + 9x_3        [  7 8 9 ] [ x_3 ]   [ 5 ]        [  7 8 9 | 5 ]

The first step is to divide the first row by the coefficient in the top left (in this case, that turns out to be negative one). Then we subtract the top row times four from the second row, and the top row times seven from the bottom row:
[ -1 2 3 | 7 ]      [ 1 -2 -3 | -7 ]      [ 1 -2 -3 | -7 ]
[  4 5 6 | 5 ]  ->  [ 4  5  6 |  5 ]  ->  [ 0 13 18 | 33 ]
[  7 8 9 | 5 ]      [ 7  8  9 |  5 ]      [ 0 22 30 | 54 ]

Then we divide the second row by 13 in order to get a leading 1, add two times the second row to the first row, and subtract 22 times the second row from the last:

[ 1 -2 -3    | -7    ]      [ 1 0 -3/13 | -25/13 ]
[ 0  1 18/13 | 33/13 ]  ->  [ 0 1 18/13 |  33/13 ]
[ 0 22 30    | 54    ]      [ 0 0 -6/13 | -24/13 ]

Finally, we divide the last row by -6/13, and subtract the appropriate amounts off from the first and second rows:

[ 1 0 -3/13 | -25/13 ]      [ 1 0 0 | -1 ]
[ 0 1 18/13 |  33/13 ]  ->  [ 0 1 0 | -3 ]
[ 0 0  1    |   4    ]      [ 0 0 1 |  4 ]

The right-hand side of the matrix now tells us what the vector x should equal: x = (-1, -3, 4). We should now go back and verify (by multiplying it into the original problem) that this works.

Sometimes, you might try to work one of these systems and end up with a very funny (contradictory) result at the end, or an entire row might turn into zeros (which leaves you with no chance of turning its diagonal element into a one). Most likely, this is a sign that you have made an arithmetic error; but if you go back and check your steps and this is still the outcome, then you have encountered a system with no solution or with infinitely many solutions. I'll talk more about these later.

The second way of solving a system of equations is so simple that people often overlook it. Suppose we have the system:

b = Ax

Provided that A is an invertible n x n matrix, we can solve this by premultiplying both sides by A^-1:

x = A^-1 b

and then performing the appropriate matrix multiplication. Let's look at that example again:

[ 7 ]   [ -1 2 3 ] [ x_1 ]             [ x_1 ]   [ -1 2 3 ]^-1 [ 7 ]
[ 5 ] = [  4 5 6 ] [ x_2 ]   implies   [ x_2 ] = [  4 5 6 ]    [ 5 ]
[ 5 ]   [  7 8 9 ] [ x_3 ]             [ x_3 ]   [  7 8 9 ]    [ 5 ]

Using the formula for matrix inversion, we find this:
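The elimination procedure above mechanizes nicely. Here is a minimal Gauss-Jordan sketch in Python/numpy (my own illustration, not from the notes); it uses exactly the three elementary row operations, and for robustness it swaps in the largest available pivot rather than always taking the diagonal entry as the text does.

```python
import numpy as np

def gauss_jordan(A, b):
    """Reduce the augmented matrix [A | b] to [I | x] by elementary row
    operations. A minimal sketch: assumes A is square and nonsingular."""
    A = np.asarray(A, dtype=float)
    M = np.hstack([A, np.asarray(b, dtype=float).reshape(-1, 1)])
    n = M.shape[0]
    for i in range(n):
        # Row op 1: swap in the row with the largest pivot in column i.
        pivot = i + int(np.argmax(np.abs(M[i:, i])))
        if M[pivot, i] == 0.0:
            raise ValueError("matrix is singular: no unique solution")
        M[[i, pivot]] = M[[pivot, i]]
        # Row op 3: scale the pivot row so the diagonal entry becomes 1.
        M[i] /= M[i, i]
        # Row op 2: subtract multiples of the pivot row from every other row.
        for j in range(n):
            if j != i:
                M[j] -= M[j, i] * M[i]
    return M[:, -1]            # right-hand column is the solution

x = gauss_jordan([[-1, 2, 3], [4, 5, 6], [7, 8, 9]], [7, 5, 5])
```

Run on the worked example, it reproduces the solution x = (-1, -3, 4) found by hand above.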
             [ -3   6  -3 ] [ 7 ]   [ -1 ]
x = A^-1 b = (1/6) [  6 -30  18 ] [ 5 ] = [ -3 ]
             [ -3  22 -13 ] [ 5 ]   [  4 ]

Pretty nifty that we can do it two ways and get the same solution, huh? Of course, this method works only when the matrix is invertible; later, I'll show how being singular corresponds to a system with many or no solutions.

If we look at the matrix inversion method, we observe an interesting pattern arising. In the three-by-three case, writing A_ij for the (i,j) cofactor of A, what we have is that:

x_1 = (A_11 b_1 + A_21 b_2 + A_31 b_3) / det(A)
x_2 = (A_12 b_1 + A_22 b_2 + A_32 b_3) / det(A)
x_3 = (A_13 b_1 + A_23 b_2 + A_33 b_3) / det(A)

What does this look like? Well, the numerators bear a remarkable resemblance to the formula for determinants:

A_11 b_1 + A_21 b_2 + A_31 b_3 = det(B_1)
A_12 b_1 + A_22 b_2 + A_32 b_3 = det(B_2)
A_13 b_1 + A_23 b_2 + A_33 b_3 = det(B_3)

Each is a cofactor expansion down a column of the matrix B_i formed by replacing the i-th column of A with b. So in fact all we have to do to solve this system of equations (much easier than inverting a matrix) is to say that x_i equals the determinant of the matrix formed by replacing the i-th column of A with the vector b, divided by the determinant of A. This is known as Cramer's Rule.

Theorem: Let A be a nonsingular n x n matrix. Then the system of equations

[ b_1 ]   [ a_11 ... a_1n ] [ x_1 ]
[  :  ] = [  :        :   ] [  :  ]        (b = Ax)
[ b_n ]   [ a_n1 ... a_nn ] [ x_n ]

has the unique solution

x_i = det(B_i) / det(A),

where B_i is the matrix formed by replacing the i-th column of A with the vector b.
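Both the inverse method and Cramer's Rule are a few lines in Python/numpy. This sketch (mine, not part of the notes) applies each to the running example and gets the same answer as the elimination did.

```python
import numpy as np

A = np.array([[-1., 2., 3.],
              [ 4., 5., 6.],
              [ 7., 8., 9.]])
b = np.array([7., 5., 5.])

# Method two: premultiply by the inverse.
x_inv = np.linalg.inv(A) @ b

# Method three, Cramer's Rule: x_i = det(B_i) / det(A), where B_i is A
# with its i-th column replaced by the constant vector b.
def cramer(A, b):
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        B_i = A.copy()
        B_i[:, i] = b               # replace the i-th column with b
        x[i] = np.linalg.det(B_i) / d
    return x

x_cramer = cramer(A, b)
```

As the theorem requires, both methods presuppose det(A) is nonzero (here det(A) = 6); for a singular matrix, np.linalg.inv raises an error and the Cramer formula divides by zero.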
Provided that you can remember this formula, this is usually the most efficient way to solve a system of equations.

Recall that if we imagine a matrix as a bunch of vectors, the determinant measures the area (or volume) spanned by these vectors. This area is largest when the vectors are most at odds with one another; the closer they are to being orthogonal, the less they have in common. The first column of A is where x_1 does all of its explaining of the outcome:

b_1 = a_11 x_1 + a_12 x_2 + ... + a_1n x_n
b_2 = a_21 x_1 + a_22 x_2 + ... + a_2n x_n

If x_1 is very large (relative to the other variables), then the first column of A should be very similar in direction to the outcome b, right? Only the magnitudes might differ. In order to test how large this effect is, we take out this first column and stick in b instead. If it's true that x_1 has the most effect on the outcome, then this substitution should not change the shape of the area spanned by the matrix much, only its size. Another way of thinking of this is that if variables other than x_1 had relatively little effect on the outcome b, then b would be fairly orthogonal to the columns of A other than the first. This would mean that the area spanned by b and these other vectors would be relatively large.

It might be useful to make up some numbers for a two-by-two matrix A, and to represent its determinant graphically. Then make up a vector for x, and see what the implied values for b are. Draw the areas spanned by B_1 and B_2. Does it seem that the relative size of these areas corresponds to the relative sizes of the two x variables?

Not all systems of equations have a unique solution. Some have infinitely many, and some have none. Here is one simple example:

3 = 2x_1 + 2x_2
6 = 4x_1 + 4x_2

In some sense, the second equation gives us no more information than the first, since it simply has all the constants doubled. This system can be fulfilled by a lot of points, all lying along a line. In contrast, the system:

3 = 2x_1 + 2x_2
6 = 2x_1 + 2x_2

has no solution. Effectively, we have been given two contradictory pieces of information: by transitivity, they imply that 3 = 6, which is absurd.
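Numerically, both failure modes announce themselves the same way: the coefficient matrix is singular. A short Python/numpy check (my illustration, not from the notes) on the "doubled equation" system:

```python
import numpy as np

# The "doubled equation" system: 3 = 2x1 + 2x2, 6 = 4x1 + 4x2.
A = np.array([[2., 2.],
              [4., 4.]])
b = np.array([3., 6.])

d = np.linalg.det(A)            # zero: the rows are linearly dependent

try:
    np.linalg.solve(A, b)       # numpy refuses: no unique solution exists
    solved = True
except np.linalg.LinAlgError:
    solved = False
```

Note that the solver cannot tell you which failure you have; det(A) = 0 covers both the infinitely-many case (this b) and the no-solution case (e.g. b = (3, 7)).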
When we have a system of n equations in n unknowns, the lack of a unique solution happens if and only if two (or more) equations suggest either that the same relationship between variables produces the same outcome, or that it produces different outcomes.
In short, the lack of a unique solution happens if and only if two equations suggest the same relationship between variables. Here are some examples of systems of equations that suggest the same relationship, also represented in matrix form:

3 = 2x_1 + 2x_2        [ 2 2 ] [ x_1 ]   [ 3 ]
6 = 4x_1 + 4x_2        [ 4 4 ] [ x_2 ] = [ 6 ]

3 = 2x_1 + 2x_2        [ 2 2 ] [ x_1 ]   [ 3 ]
6 = 2x_1 + 2x_2        [ 2 2 ] [ x_2 ] = [ 6 ]

1 = 4x_1 + 2x_2 + 5x_3        [ 4 2 5 ] [ x_1 ]   [ 1 ]
3 = 6x_1 + 4x_2 + 3x_3        [ 6 4 3 ] [ x_2 ] = [ 3 ]
2 = 5x_1 + 3x_2 + 4x_3        [ 5 3 4 ] [ x_3 ]   [ 2 ]

In each case, either two rows are the same, one row is a multiple of another, or one row is a linear combination of two others (in the last system, the bottom row is the average of the first two). If we look at the determinants of the matrices on the right-hand side, we'll see something else these equations have in common (other than the lack of a unique solution): all these matrices are singular. So here's the law for square matrices:

Unique solution <=> Full rank <=> Linear independence <=> Nonsingular <=> Invertible

I think that's it. If there are any other desirable properties of square matrices, they are most likely also equivalent. The old principle about being able to solve n equations in n unknowns works if and only if these are n linearly independent equations.

What about when you have k equations in n unknowns? Well, as you probably knew before, k < n generally means that there is an infinite number of solutions, whereas k > n generally implies no solution at all.

Systems of inequalities
Intersection of lines => intersection of half-spaces

References:
Harville, Matrix Algebra from a Statistician's Perspective
Greene, Econometric Analysis (Chapter ...)
Eves, Elementary Matrix Theory
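The chain of equivalences can be checked numerically. A Python/numpy sketch (mine, not from the notes), using a 3x3 matrix whose bottom row is the average of the first two, in the spirit of the last example above:

```python
import numpy as np

# A 3x3 matrix whose bottom row is the average of the first two rows,
# so its rows are linearly dependent.
A = np.array([[4., 2., 5.],
              [6., 4., 3.],
              [5., 3., 4.]])

rank = np.linalg.matrix_rank(A)   # 2 < 3: not full rank
det = np.linalg.det(A)            # equivalently, A is singular
```

Rank below n, zero determinant, and non-invertibility all diagnose the same defect: one equation repeats a relationship already expressed by the others.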