🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 40 (from laksa043)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄 INDEXABLE
✅ CRAWLED (4 days ago)
🤖 ROBOTS ALLOWED

Page Info Filters

| Filter | Status | Condition | Details |
| --- | --- | --- | --- |
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.2 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
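The conditions in this table read like predicates over a single crawl record. As a rough, hypothetical sketch (not the inspector's actual logic; the record values below are invented, only the field names come from the table), they could be combined like this:

```
from datetime import datetime, timedelta

# Hypothetical crawl record; only the field names mirror the filter table above.
record = {
    "download_http_code": 200,
    "download_stamp": datetime(2026, 4, 14, 9, 53, 6),
    "history_drop_reason": None,
    "fh_dont_index": 0,
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html",
}

def is_indexable(rec, now):
    """Illustrative re-statement of the five PASS/FAIL checks from the table."""
    checks = {
        "HTTP status": rec["download_http_code"] == 200,
        "Age cutoff": rec["download_stamp"] > now - timedelta(days=6 * 30),
        "History drop": rec["history_drop_reason"] is None,
        "Spam/ban": rec["fh_dont_index"] != 1 and rec["ml_spam_score"] == 0,
        "Canonical": rec["meta_canonical"] in (None, "", rec["src_unparsed"]),
    }
    return all(checks.values()), checks

ok, checks = is_indexable(record, datetime(2026, 4, 18))
for name, passed in checks.items():
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
print("INDEXABLE" if ok else "NOT INDEXABLE")
```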

Page Details

| Property | Value |
| --- | --- |
| URL | https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html |
| Last Crawled | 2026-04-14 09:53:06 (4 days ago) |
| First Indexed | not set |
| HTTP Status Code | 200 |
| Meta Title | Eigenvectors and Eigenvalues — Linear Algebra, Geometry, and Computation |
| Meta Description | null |
| Meta Canonical | null |
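As an independent spot check of these fields (a sketch only, assuming requests and beautifulsoup4 are installed and the page is reachable), one can fetch the URL and read the status code and meta tags directly:

```
import requests
from bs4 import BeautifulSoup

url = "https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html"
resp = requests.get(url, timeout=10)
print(resp.status_code)  # the table above reports HTTP 200

soup = BeautifulSoup(resp.text, "html.parser")
title = soup.find("title")
canonical = soup.find("link", rel="canonical")
description = soup.find("meta", attrs={"name": "description"})

print(title.get_text(strip=True) if title else None)
print(canonical.get("href") if canonical else None)          # Meta Canonical: null above
print(description.get("content") if description else None)  # Meta Description: null above
```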
Markdown
Eigenvalues](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#invertibility-and-eigenvalues) - [Eigenvectors and Difference Equations](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#eigenvectors-and-difference-equations) ``` # for lecture use notebook %matplotlib inline qr_setting = None qrviz_setting = 'save' # %config InlineBackend.figure_format='retina' # import libraries import numpy as np import matplotlib as mpf import pandas as pd import matplotlib.pyplot as plt import laUtilities as ut import slideUtilities as sl import demoUtilities as dm import pandas as pd from importlib import reload from datetime import datetime from IPython.display import Image from IPython.display import display_html from IPython.display import display from IPython.display import Math from IPython.display import Latex from IPython.display import HTML; ``` # Eigenvectors and Eigenvalues[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#eigenvectors-and-eigenvalues "Permalink to this headline") ``` cohortMatrix = np.array([ [56, 42, 49, 39, 40, 31, 33, 46, 55, 91, 83, 129, 153, 178], [43, 36, 27, 34, 24, 29, 39, 56, 74, 69, 111, 136, 168, 171], [32, 24, 22, 21, 26, 25, 44, 64, 52, 77, 105, 130, 139, 175], [25, 16, 19, 28, 24, 30, 37, 40, 49, 56, 79, 93, 90, 126]]) nyears = np.shape(cohortMatrix)[1] # index rows by time, and columns by cohort Year = pd.DateOffset(years=1) # need to fliplr because spreadsheet rows are in decreasing cohort order datestrs=['09-2004','09-2005','09-2006','09-2007','09-2008','09-2009','09-2010','09-2011','09-2012','09-2013','09-2014','09-2015','09-2016','09-2017'] dates = [datetime.strptime(x, '%m-%Y') for x in datestrs[:nyears]] cohorts = pd.DataFrame(np.fliplr(cohortMatrix.T),index=dates,columns=pd.Index(['U1','U2','U3','U4'])) # learning the model b = np.array(cohorts[1:]) a = np.array(cohorts[:-1]) x, resid, rank, s = np.linalg.lstsq(a,b,rcond=None) A = x.T # cohorts.sum(axis=1).plot() plt.ylim(ymin=0) # plt.legend(['Model','Actual'],loc='best') plt.title('Number of BU CS Majors',size=20); ``` ![\_images/L16Eigenvectors\_2\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_2_0.png) Currently the number of CS majors is growing very rapidly. We’d like to have some estimate of where the numbers might be going. Modeling the number of students in the CS major is challenging because students enter and leave the major at various points during their undergraduate degree. We will use a simple linear model that we will train on historical data. Our state variable is \\\[\\begin{split}\\mathbf{x}\_t = \\begin{bmatrix}x\_{1,t}\\\\ x\_{2,t}\\\\ x\_{3,t}\\\\ x\_{4,t}\\end{bmatrix}\\end{split}\\\] where \\(x\_{i,t}\\) is the number of students who are declared CS majors and in their \\(i\\)th year at BU, in the fall of year \\(t\\). Our model is a linear dynamical system (or linear recurrence): \\\[ \\mathbf{x}\_{t+1} = A \\mathbf{x}\_t.\\\] Given historical data which are measurements of our state variable from past years, we can estimate \\(A\\) by a method called *least squares* (which we will cover later in the semester.) 
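To make the estimation step concrete before the fitted matrix is printed below, here is a minimal, self-contained sketch of the same np.linalg.lstsq idea on invented data (these numbers are not the BU cohort counts):

```
import numpy as np

# Invented state vectors x_t for five consecutive years (one row per year).
X = np.array([
    [100.0,  80.0, 60.0, 50.0],
    [110.0,  85.0, 70.0, 55.0],
    [125.0,  95.0, 75.0, 62.0],
    [140.0, 105.0, 85.0, 70.0],
    [160.0, 120.0, 95.0, 78.0],
])

a = X[:-1]   # states at time t
b = X[1:]    # states at time t + 1

# Solve a @ coef ~= b in the least-squares sense; then x_{t+1} ~= coef.T @ x_t,
# which is how the lecture code above recovers A (A = x.T).
coef, resid, rank, sv = np.linalg.lstsq(a, b, rcond=None)
A_hat = coef.T

print(np.round(A_hat, 3))
print(np.round(A_hat @ X[-1], 1))   # one-step prediction from the latest state
```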
``` print(A) ``` ``` [[ 0.62110633 0.28792787 0.0204491 0.10003897] [ 0.65239203 0.55574243 0.30323349 -0.16349735] [ 0.33101614 1.24636712 -0.26153545 0.07684781] [ 0.49319575 -0.30684656 1.00419585 0.07532737]] ``` Now, using \\(A\\), we can predict growth into the future via \\(\\mathbf{x}\_{t+1} = A \\mathbf{x}\_t.\\) ``` OneYearAhead = cohorts.dot(A.T).shift(1,freq=Year) OneYearAhead.columns=pd.Index(['U1','U2','U3','U4']) OneYearAhead.sum(axis=1).plot() cohorts.sum(axis=1).plot() plt.ylim(ymin=0) plt.legend(['Model','Actual'],loc='best'); ``` ![\_images/L16Eigenvectors\_8\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_8_0.png) Now, the matrix \\(A\\) captures the complex way that one year’s student population relates to the next year’s population. It can be hard to understand what is going on just by looking at the values of \\(A\\). But, when we look at the plot above, it appears that there is a relatively simple kind of exponential growth going on. Can we “extract” this exponential growth model from \\(A\\)? How? Today’s lecture will start to develop the tools to answer such questions (and more). ## Eigenvectors and Eigenvalues[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#id1 "Permalink to this headline") As we’ve seen, linear transformations (thinking geometrically) can “move” a vector to a new location. For example, a linear transformation \\(A\\mathbf{x}\\) can do the following to \\(\\mathbf{x}\\): - rotate \\(\\mathbf{x}\\) - reflect \\(\\mathbf{x}\\) - project \\(\\mathbf{x}\\) - scale \\(\\mathbf{x}\\) Of the above transformations, one is a bit different: **scaling**. That is because, if a matrix \\(A\\) scales \\(\\mathbf{x}\\), that transformation could also have been expressed **without** a matrix-vector multiplication i.e., as \\(\\lambda\\mathbf{x}\\) for some scalar value \\(\\lambda\\). In many ways, life would be simpler if we could avoid all of the complexities of rotation, reflection, and projection … and only work in terms of scaling. That is the idea we will develop today. Demo: **eigshow**. ``` # import eigshow # eigshow.eigshow() ``` Let’s start with a simple example: a **shear** matrix: \\\[\\begin{split} A = \\begin{bmatrix}1&0.1\\\\0&1\\end{bmatrix} \\end{split}\\\] Let’s look at what this matrix does to each point in the plane. ``` A = np.array([[1, 0.1], [0, 1]]) ax = ut.plotSetup(-7,7,-7, 7, size=(12,8)) ut.centerAxes(ax) for x in range(-6,7): for y in range(-6,7): v = np.array([x,y]) ut.plotVec(ax, v) ut.plotVec(ax, A.dot(v),'b') ``` ![\_images/L16Eigenvectors\_20\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_20_0.png) We can use arrows to show how each red point is moved to the corresponding blue point. ``` sl.hide_code_in_slideshow() ax = ut.plotSetup(-7,7,-7, 7, size=(12,8)) ut.centerAxes(ax) A = np.array([[1,0.1],[0,1]]) for x in range(-6,7): for y in range(-6,7): ut.plotArrowVec(ax, A.dot(np.array([x,y])), [x,y]) ``` ![\_images/L16Eigenvectors\_22\_1.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_22_1.png) Now let’s look at a more complicated situation. Consider the matrix \\(A = \\begin{bmatrix}1.0&0.2\\\\0.1&0.9\\end{bmatrix}\\). 
``` sl.hide_code_in_slideshow() V = np.array([[2,-1],[1,1]]) L = np.array([[1.1,0], [0,0.8]]) A = V.dot(L).dot(np.linalg.inv(V)) ax = ut.plotSetup(-7,7,-7, 7, size=(12,8)) ut.centerAxes(ax) for x in range(-6,7): for y in range(-6,7): v = np.array([x,y]) ut.plotArrowVec(ax, A.dot(v), v) ``` ![\_images/L16Eigenvectors\_24\_1.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_24_1.png) The starting point for understanding what \\(A\\) does is to look for special vectors which **do not change their direction** when multiplied by \\(A\\). **Example.** Let \\(A = \\begin{bmatrix}3&-2\\\\1&0\\end{bmatrix}, \\mathbf{u} = \\begin{bmatrix}-1\\\\1\\end{bmatrix}, \\mathbf{v}=\\begin{bmatrix}2\\\\1\\end{bmatrix}.\\) The images of \\(\\mathbf{u}\\) and \\(\\mathbf{v}\\) under multiplication by \\(A\\) are shown here: ``` ax = ut.plotSetup(size=(12,8)) ut.centerAxes(ax) A = np.array([[3,-2],[1,0]]) u = np.array([-1,1]) v = np.array([2,1]) # ut.plotArrowVec(ax, v, [0,0], color='Blue') ut.plotArrowVec(ax, A.dot(v), [0,0], color='Blue') ax.text(v[0],v[1]+0.2,r'${\bf v}$',size=20) ax.text(A.dot(v)[0],A.dot(v)[1]+0.2,r'$A{\bf v}$',size=20) # ut.plotArrowVec(ax, u, [0,0], color='Red') ut.plotArrowVec(ax, A.dot(u), [0,0], color='Red') ax.text(u[0]-0.5,u[1]+0.1,r'${\bf u}$',size=20) ax.text(A.dot(u)[0]-0.7,A.dot(u)[1]+0.3,r'$A{\bf u}$',size=20); ``` ![\_images/L16Eigenvectors\_27\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_27_0.png) For a “typical” vector (like \\(\\mathbf{u}\\)) the action of \\(A\\) is to scale, reflect, and/or rotate it. But for some special vectors (like \\(\\mathbf{v}\\)) the action of \\(A\\) is only to “stretch” it (without any rotation or reflection). In fact, \\(A\\mathbf{v}\\) is simply \\(2\\mathbf{v}.\\) Today we will by studying equations such as \\\[ A\\mathbf{x} = 2\\mathbf{x}\\;\\;\\;\\;\\mbox{or}\\;\\;\\;\\;A\\mathbf{x} = -4\\mathbf{x}\\\] where special vectors are transformed by \\(A\\) into scalar multiples of themselves. **Definition.** An **eigenvector** of an \\(n\\times n\\) matrix \\(A\\) is a nonzero vector \\(\\mathbf{x}\\) such that \\(A\\mathbf{x} = \\lambda\\mathbf{x}\\) for some scalar \\(\\lambda.\\) A scalar \\(\\lambda\\) is called an **eigenvalue** of \\(A\\) if there is a nontrivial solution \\(\\mathbf{x}\\) of \\(A\\mathbf{x} = \\lambda\\mathbf{x}.\\) Such an \\(\\mathbf{x}\\) is called an *eigenvector corresponding to \\(\\lambda.\\)* ## Checking Eigenvectors and Eigenvalues[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#checking-eigenvectors-and-eigenvalues "Permalink to this headline") Before considering how to compute eigenvectors, let’s look at simply how to determine if a given vector is an eigenvector of a matrix. **Example.** Let \\(A = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}, \\mathbf{u} = \\begin{bmatrix}6\\\\-5\\end{bmatrix}, \\mathbf{v}=\\begin{bmatrix}3\\\\-2\\end{bmatrix}.\\) Are \\(\\mathbf{u}\\) and \\(\\mathbf{v}\\) eigenvectors of \\(A\\)? 
**Solution.** \\\[\\begin{split}A\\mathbf{u} = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}\\begin{bmatrix}6\\\\-5\\end{bmatrix} = \\begin{bmatrix}-24\\\\20\\end{bmatrix} = -4\\begin{bmatrix}6\\\\-5\\end{bmatrix} = -4\\mathbf{u}.\\end{split}\\\] \\\[\\begin{split}A\\mathbf{v} = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}\\begin{bmatrix}3\\\\-2\\end{bmatrix} = \\begin{bmatrix}-9\\\\11\\end{bmatrix} \\neq \\lambda\\begin{bmatrix}3\\\\-2\\end{bmatrix}.\\end{split}\\\] So \\(\\mathbf{u}\\) is an eigenvector corresponding to the eigenvalue \\(-4\\), but \\(\\mathbf{v}\\) is not an eigenvector of \\(A\\). **Example.** Show that 7 is an eigenvalue of the matrix \\(A\\), and find the corresponding eigenvectors. For 7 to be an eigenvalue of \\(A\\), it must be the case that \\\[ A\\mathbf{x}=7\\mathbf{x}\\\] has a nontrivial solution. This is equivalent to: \\\[ A\\mathbf{x} - 7\\mathbf{x} = {\\bf 0}\\\] or \\\[ (A - 7I)\\mathbf{x} = {\\bf 0}.\\\] To solve this homogeneous equation, form the matrix \\\[\\begin{split}A - 7I = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}-\\begin{bmatrix}7&0\\\\0&7\\end{bmatrix} = \\begin{bmatrix}-6&6\\\\5&-5\\end{bmatrix}.\\end{split}\\\] The columns of \\(A-7I\\) are obviously linearly dependent, so \\((A - 7I)\\mathbf{x} = {\\bf 0}\\) has a nontrivial solution. So 7 is an eigenvalue of \\(A\\). To find the corresponding eigenvectors, we solve \\((A - 7I)\\mathbf{x} = {\\bf 0}\\) using row operations: \\\[\\begin{split}\\begin{bmatrix}-6&6&0\\\\5&-5&0\\end{bmatrix}\\sim\\begin{bmatrix}1&-1&0\\\\0&0&0\\end{bmatrix}.\\end{split}\\\] This says that \\(x\_1 = x\_2,\\) and \\(x\_2\\) is free. So the general solution has the form \\(x\_2\\begin{bmatrix}1\\\\1\\end{bmatrix}.\\) Each vector of this form with \\(x\_2 \\neq 0\\) is an eigenvector corresponding to \\(\\lambda = 7.\\) Question TIme! Q16.1 ## Eigenspace[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#eigenspace "Permalink to this headline") Now, let’s generalize what we just did. From the previous example, we can conclude that \\(\\lambda\\) is an eigenvalue of an \\(n\\times n\\) matrix \\(A\\) if and only if the equation \\\[(A-\\lambda I)\\mathbf{x} = {\\bf 0}\\\] has a nontrivial solution. Notice that the set of *all* solutions of that equation is just **the null space of the matrix \\(A-\\lambda I.\\)** So the set of all eigenvectors corresponding to a particular \\(\\lambda\\) is a *subspace of \\(\\mathbb{R}^n\\)* and is called the **eigenspace** of \\(A\\) corresponding to \\(\\lambda\\). **Example.** For matrix \\(A = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix},\\) we found that the general solution for the eigenvector corrsponding to \\(\\lambda = 7\\) is the expression \\(x\_2\\begin{bmatrix}1\\\\1\\end{bmatrix}.\\) This means that the eigenspace corresponding to \\(\\lambda = 7\\) consists of all multiples of \\(\\begin{bmatrix}1\\\\1\\end{bmatrix},\\) which is the line through \\((1,1)\\) and the origin. We could also show that the eigenspace corresponding to \\(\\lambda = -4\\) is the line through \\((-6,5)\\). **Demo:** confirm using eigshow. 
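Alongside the eigshow demo, a quick numerical check (a sketch using the matrix from this example) confirms both eigenpairs claimed above:

```
import numpy as np

A = np.array([[1, 6],
              [5, 2]])

v7 = np.array([1, 1])    # claimed eigenvector for lambda = 7
v4 = np.array([-6, 5])   # claimed eigenvector for lambda = -4

print(A @ v7, 7 * v7)    # [7 7] [7 7]
print(A @ v4, -4 * v4)   # [24 -20] [24 -20]

# np.linalg.eig recovers the same eigenvalues (order may vary; eigenvectors are normalized).
evals, evecs = np.linalg.eig(A)
print(evals)             # 7 and -4
```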
``` ax = ut.plotSetup(-10,10,-6,10,size=(12,8)) ut.centerAxes(ax) A = np.array([[1,6],[5,2]]) u = np.array([1,1]) v = np.array([3./2,-5./4]) # ax.plot(v[0],v[1],'bo') ax.plot(A.dot(v)[0],A.dot(v)[1],'bo') ax.text(v[0],v[1]+0.2,r'${\bf v}$',size=20) ax.text(A.dot(v)[0]+0.2,A.dot(v)[1]+0.2,r'$A{\bf v}$',size=20) ax.plot([-10,10],[5*10/6.0,-5*10/6.0],'b-',lw=2) # ax.plot(u[0],u[1],'ro') ax.plot(A.dot(u)[0],A.dot(u)[1],'ro') ax.text(u[0]+0.7,u[1],r'${\bf u}$',size=20) ax.text(A.dot(u)[0]+0.7,A.dot(u)[1],r'$A{\bf u}$',size=20) ax.plot([-6,10],[-6,10],'r-',lw=2) # ax.annotate('', xy=(A.dot(u)[0], A.dot(u)[1]), xycoords='data', xytext=(u[0], u[1]), textcoords='data', size=18, #bbox=dict(boxstyle="round", fc="0.8"), arrowprops={'arrowstyle': 'simple', 'fc': '0.5', 'ec': 'none', 'connectionstyle' : 'arc3,rad=0.5'}, ) ax.annotate('', xy=(A.dot(v)[0], A.dot(v)[1]), xycoords='data', xytext=(v[0], v[1]), textcoords='data', size=18, #bbox=dict(boxstyle="round", fc="0.8"), arrowprops={'arrowstyle': 'simple', 'fc': '0.5', 'ec': 'none', 'connectionstyle' : 'arc3,rad=-0.5'}, ) ax.text(3,9,r'Eigenspace for $\lambda$ = 7',size=16) ax.text(-9.5,8,r'Eigenspace for $\lambda = -4$',size=16); ``` ![\_images/L16Eigenvectors\_54\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_54_0.png) **Example.** Let \\(A = \\begin{bmatrix}4&-1&6\\\\2&1&6\\\\2&-1&8\\end{bmatrix}.\\) An eigenvalue of \\(A\\) is \\(2.\\) Find a basis for the the corresponding eigenspace. **Solution.** The eigenspace we are looking for is the nullspace of \\(A - 2I.\\) We will use the strategy for finding a basis for a nullspace that we learned in the last lecture. Form \\\[\\begin{split}A - 2I = \\begin{bmatrix}4&-1&6\\\\2&1&6\\\\2&-1&8\\end{bmatrix}-\\begin{bmatrix}2&0&0\\\\0&2&0\\\\0&0&2\\end{bmatrix} = \\begin{bmatrix}2&-1&6\\\\2&-1&6\\\\2&-1&6\\end{bmatrix}\\end{split}\\\] and row reduce the augmented matrix for \\((A-2I)\\mathbf{x}={\\bf 0}\\): \\\[\\begin{split}\\begin{bmatrix}2&-1&6&0\\\\2&-1&6&0\\\\2&-1&6&0\\end{bmatrix} \\sim \\begin{bmatrix}2&-1&6&0\\\\0&0&0&0\\\\0&0&0&0\\end{bmatrix}\\end{split}\\\] At this point, it is clear that 2 is indeed an eigenvalue of \\(A\\) because the equation \\((A-2I)\\mathbf{x} = {\\bf 0}\\) has free variables. 
The general solution is \\(x\_1 = \\frac{1}{2}x\_2 +-3x\_3.\\) Expressed as a vector, the general solution is: \\\[\\begin{split}\\begin{bmatrix}\\frac{1}{2}x\_2 - 3x\_3\\\\x\_2\\\\x\_3\\end{bmatrix}.\\end{split}\\\] Expressed as a vector sum, this is: \\\[\\begin{split}x\_2\\begin{bmatrix}1/2\\\\1\\\\0\\end{bmatrix} + x\_3\\begin{bmatrix}-3\\\\0\\\\1\\end{bmatrix}\\;\\;\\;\\;\\mbox{with \$x\_2\$ and \$x\_3\$ free.}\\end{split}\\\] So this eigenspace is a two-dimensional subspace of \\(\\mathbb{R}^3,\\) and a basis for this subspace is \\\[\\begin{split}\\left\\{\\begin{bmatrix}1/2\\\\1\\\\0\\end{bmatrix},\\begin{bmatrix}-3\\\\0\\\\1\\end{bmatrix}\\right\\}.\\end{split}\\\] ``` fig = ut.three_d_figure((16, 1), fig_desc = '2D Eigenspace for lambda = 2', xmin = -3, xmax = 3, ymin = -3, ymax = 3, zmin = -3, zmax = 3, qr = qr_setting, figsize = (12, 8)) v = np.array([1.,2.,0]) v = v/np.sqrt(np.sum(v*v)) u = np.array([-3.0,0,1.0]) u = u/np.sqrt(np.sum(u*u)) w = -v-2.0*u w = w/np.sqrt(np.sum(w*w)) pts = np.array([v,u,w]) #ax.text(v[0],v[1],v[2],r'$\bf v$',size=20) #ax.text(u[0],u[1],u[2],r'$\bf u$',size=20) fig.text(-2, -2, -3, r'Eigenspace for $\lambda=2$', 'Eigenspace for lambda = 2', size=20) #ax.text(0.2,0.2,-4,r'$\bf 0$',size=20) # plotting the span of v fig.plotSpan(u, v, 'Green') fig.plotPoint(u[0], u[1], u[2], 'r') fig.plotPoint(v[0], v[1], v[2], 'r') fig.plotPoint(w[0], w[1], w[2], 'r') fig.plotPoint(0, 0, 0, 'k') plt.quiver(pts[:,0],pts[:,1],pts[:,2],pts[:,0],pts[:,1],pts[:,2],length=1.0,color='Red',lw=2) pts = 2.0*pts plt.quiver(pts[:,0],pts[:,1],pts[:,2],pts[:,0],pts[:,1],pts[:,2],length=1.0,color='Blue',lw=2) u = 2*u v = 2*v w = 2*w fig.plotPoint(u[0], u[1], u[2], 'b') fig.plotPoint(v[0], v[1], v[2], 'b') fig.plotPoint(w[0], w[1], w[2], 'b') # plotting the axes # ut.plotIntersection3d(ax,[0,0,1,0],[0,1,0,0]) # ut.plotIntersection3d(ax,[0,0,1,0],[1,0,0,0]) fig.plotIntersection([0,1,0,0], [1,0,0,0]) fig.save(); ``` ![\_images/L16Eigenvectors\_63\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_63_0.png) Question Time! Q16.2 ## Eigenvalues of a Triangular Matrix[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#eigenvalues-of-a-triangular-matrix "Permalink to this headline") **Theorem.** The eigenvalues of a triangular matrix are the entries on its main diagonal. **Proof.** We’ll consider the \\(3\\times 3\\) case. If \\(A\\) is upper triangular, then \\(A-\\lambda I\\) has the form \\\[\\begin{split}A - \\lambda I = \\begin{bmatrix}a\_{11}\&a\_{12}\&a\_{13}\\\\0\&a\_{22}\&a\_{23}\\\\0&0\&a\_{33}\\end{bmatrix} - \\begin{bmatrix}\\lambda&0&0\\\\0&\\lambda&0\\\\0&0&\\lambda\\end{bmatrix}\\end{split}\\\] \\\[\\begin{split} = \\begin{bmatrix}a\_{11}-\\lambda\&a\_{12}\&a\_{13}\\\\0\&a\_{22}-\\lambda\&a\_{23}\\\\0&0\&a\_{33}-\\lambda\\end{bmatrix}\\end{split}\\\] Now, \\(\\lambda\\) is an eigenvalue of \\(A\\) if and only if the equation \\((A-\\lambda I)\\mathbf{x} = {\\bf 0}\\) has a nontrivial solution – in other words, if and only if the equation has a free variable. The special pattern that the zero entries have in \\(A\\) means that there will be a free variable if any diagonal entry is also zero, because then there will be a column without a pivot. That is, \\((A-\\lambda I)\\mathbf{x} = {\\bf 0}\\) has a free variable if and only if at least one of the entries on the diagonal of \\((A-\\lambda I)\\) is zero. 
This happens if and only if \\(\\lambda\\) equals one of the entries \\(a\_{11}, a\_{22},\\) or \\(a\_{33}\\) on the diagonal of \\(A\\). **Example.** Let \\(A = \\begin{bmatrix}3&6&-8\\\\0&0&6\\\\0&0&2\\end{bmatrix}\\) and \\(B = \\begin{bmatrix}4&0&0\\\\-2&1&0\\\\5&3&4\\end{bmatrix}\\). Then the eigenvalues of \\(A\\) are 3, 0, and 2. The eigenvalues of \\(B\\) are 4 and 1. ## Invertibility and Eigenvalues[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#invertibility-and-eigenvalues "Permalink to this headline") So far we haven’t probed what it means for a matrix to have an eigenvalue of 0. This happens if and only if the equation \\(A\\mathbf{x} = 0\\mathbf{x}\\) has a nontrivial solution. But that equation is equivalent to \\(A\\mathbf{x} = {\\bf 0}\\) which has a nontrivial solution if and only if \\(A\\) is not invertible. This means that 0 is an eigenvalue of \$A\$ if and only if \$A\$ is not invertible. This draws an important connection between invertibility and zero eigenvalues. So we have **yet another** addition to the Invertible Matrix Theorem\! Question Time! Q16.3 ## Eigenvectors and Difference Equations[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#eigenvectors-and-difference-equations "Permalink to this headline") Let’s return to the problem we considered at the outset: predicting future values of \\(\\mathbf{x}\_t\\) (the number of CS majors of each class in year \\(t\\)). We modeled the future as \\(\\mathbf{x}\_{t+1} = A\\mathbf{x}\_t\\;\\;\\;(t = 0,1,2,\\dots).\\) So then \\(\\mathbf{x}\_{t} = A^t\\mathbf{x}\_0\\;\\;\\;(t = 0,1,2,\\dots).\\) A **solution** of such an equation is an explicit description of \\(\\{\\mathbf{x}\_t\\}\\) whose formula for each \\(\\mathbf{x}\_t\\) does not depend directly on \\(A\\) or on the preceding terms in the sequence other than the initial term \\(\\mathbf{x}\_0.\\) The simplest way to build a solution is to take an eigenvector \\(\\mathbf{x}\_0\\) and its corresponding eigenvalue \\(\\lambda\\). Then \$\\( \\mathbf{x}\_t = \\lambda^t \\mathbf{x}\_0 \\;\\;\\;(t = 0,1,2,\\dots)\\)\$ This sequence is a solution because \\\[A\\mathbf{x}\_t = A(\\lambda^t\\mathbf{x}\_0) = \\lambda^t(A\\mathbf{x}\_0) = \\lambda^k(\\lambda\\mathbf{x}\_0) = \\lambda^{t+1}\\mathbf{x}\_0 = \\mathbf{x}\_{t+1}.\\\] We can extend this to combinations of eigenvectors. If \\(A\\) has two eigenvectors \\(\\mathbf{u}\\) and \\(\\mathbf{v}\\) corresponding to eigenvalues \\(\\lambda\\) and \\(\\mu\\), then another solution is \\\[ x\_t = \\lambda^t\\mathbf{u} + \\mu^t \\mathbf{v}\\\] To address the problem of predicting the CS major population, we assume that we can express the state vector \\(\\mathbf{x}\\) as a linear combination of eigenvectors \\(\\mathbf{u}\_1, \\mathbf{u}\_2, \\dots,\\mathbf{u}\_p\\). That is, \\\[\\mathbf{x}\_0 = a\_1\\mathbf{u}\_1 + a\_2\\mathbf{u}\_2+\\dots+a\_p\\mathbf{u}\_p.\\\] Then at time \\(t\\), \\\[\\mathbf{x}\_t = \\lambda\_1^ta\_1\\mathbf{u}\_1 + \\lambda\_2^ta\_2\\mathbf{u}\_2 + \\dots + \\lambda\_p^ta\_p\\mathbf{u}\_p.\\\] Which gives us an explicit solution for the number of students in the major in any future year. Note that the expression above is exponential in each of the \\(\\lambda\_1,\\dots,\\lambda\_p\\). The resulting value will be essentially determined by the largest \\(\\lambda\_i\\). To see this, let’s say that \\(\\lambda\_1\\) is the largest eigenvalue, and \\(\\lambda\_2\\) is the second largest. 
Then the proportion of \\(\\mathbf{u}\_1\\) to \\(\\mathbf{u}\_2\\) contained in \\(\\mathbf{x}\_t\\) will be given by \\((\\lambda\_1/\\lambda\_2)^t\\). Since this ratio is larger than 1, the relative proportion of \\(\\mathbf{u}\_1\\) will grow exponentially. This shows the importance of the largest eigenvalue in the value of \\(\\mathbf{x}\_t\\) for large \\(t\\). In the long run, \\(\\mathbf{x}\_t\\) will be very close to \\(\\lambda\_1^t\\mathbf{u}\_1.\\) The largest eigenvalue shows the asymptotic rate of growth (or shrinkage) of the state vector \\(\\mathbf{x}\_t\\). For the matrix \\(A\\) that describes CS major population growth, the largest eigenvalue is 1.19. In other words, this models the population growth rate of CS majors at that time as increasing about 19% per year.

By Mark Crovella © Copyright 2020-2021.
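The dominant-eigenvalue claim can be checked directly; a short sketch using the fitted matrix printed earlier on this page (the text reports the largest eigenvalue as about 1.19):

```
import numpy as np

# The fitted transition matrix printed earlier on this page.
A = np.array([[ 0.62110633,  0.28792787,  0.0204491 ,  0.10003897],
              [ 0.65239203,  0.55574243,  0.30323349, -0.16349735],
              [ 0.33101614,  1.24636712, -0.26153545,  0.07684781],
              [ 0.49319575, -0.30684656,  1.00419585,  0.07532737]])

evals, evecs = np.linalg.eig(A)
dominant = max(evals, key=abs)

print(np.round(evals, 3))
print(dominant)                 # the text reports roughly 1.19
print(abs(dominant) - 1.0)      # implied asymptotic annual growth rate (about 19%)
```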
Readable Markdown
``` cohortMatrix = np.array([ [56, 42, 49, 39, 40, 31, 33, 46, 55, 91, 83, 129, 153, 178], [43, 36, 27, 34, 24, 29, 39, 56, 74, 69, 111, 136, 168, 171], [32, 24, 22, 21, 26, 25, 44, 64, 52, 77, 105, 130, 139, 175], [25, 16, 19, 28, 24, 30, 37, 40, 49, 56, 79, 93, 90, 126]]) nyears = np.shape(cohortMatrix)[1] # index rows by time, and columns by cohort Year = pd.DateOffset(years=1) # need to fliplr because spreadsheet rows are in decreasing cohort order datestrs=['09-2004','09-2005','09-2006','09-2007','09-2008','09-2009','09-2010','09-2011','09-2012','09-2013','09-2014','09-2015','09-2016','09-2017'] dates = [datetime.strptime(x, '%m-%Y') for x in datestrs[:nyears]] cohorts = pd.DataFrame(np.fliplr(cohortMatrix.T),index=dates,columns=pd.Index(['U1','U2','U3','U4'])) # learning the model b = np.array(cohorts[1:]) a = np.array(cohorts[:-1]) x, resid, rank, s = np.linalg.lstsq(a,b,rcond=None) A = x.T # cohorts.sum(axis=1).plot() plt.ylim(ymin=0) # plt.legend(['Model','Actual'],loc='best') plt.title('Number of BU CS Majors',size=20); ``` ![\_images/L16Eigenvectors\_2\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_2_0.png) Currently the number of CS majors is growing very rapidly. We’d like to have some estimate of where the numbers might be going. Modeling the number of students in the CS major is challenging because students enter and leave the major at various points during their undergraduate degree. We will use a simple linear model that we will train on historical data. Our state variable is \\\[\\begin{split}\\mathbf{x}\_t = \\begin{bmatrix}x\_{1,t}\\\\ x\_{2,t}\\\\ x\_{3,t}\\\\ x\_{4,t}\\end{bmatrix}\\end{split}\\\] where \\(x\_{i,t}\\) is the number of students who are declared CS majors and in their \\(i\\)th year at BU, in the fall of year \\(t\\). Our model is a linear dynamical system (or linear recurrence): \\\[ \\mathbf{x}\_{t+1} = A \\mathbf{x}\_t.\\\] Given historical data which are measurements of our state variable from past years, we can estimate \\(A\\) by a method called *least squares* (which we will cover later in the semester.) ``` print(A) ``` ``` [[ 0.62110633 0.28792787 0.0204491 0.10003897] [ 0.65239203 0.55574243 0.30323349 -0.16349735] [ 0.33101614 1.24636712 -0.26153545 0.07684781] [ 0.49319575 -0.30684656 1.00419585 0.07532737]] ``` Now, using \\(A\\), we can predict growth into the future via \\(\\mathbf{x}\_{t+1} = A \\mathbf{x}\_t.\\) ``` OneYearAhead = cohorts.dot(A.T).shift(1,freq=Year) OneYearAhead.columns=pd.Index(['U1','U2','U3','U4']) OneYearAhead.sum(axis=1).plot() cohorts.sum(axis=1).plot() plt.ylim(ymin=0) plt.legend(['Model','Actual'],loc='best'); ``` ![\_images/L16Eigenvectors\_8\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_8_0.png) Now, the matrix \\(A\\) captures the complex way that one year’s student population relates to the next year’s population. It can be hard to understand what is going on just by looking at the values of \\(A\\). But, when we look at the plot above, it appears that there is a relatively simple kind of exponential growth going on. Can we “extract” this exponential growth model from \\(A\\)? How? Today’s lecture will start to develop the tools to answer such questions (and more). ## Eigenvectors and Eigenvalues[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#id1 "Permalink to this headline") As we’ve seen, linear transformations (thinking geometrically) can “move” a vector to a new location. 
For example, a linear transformation \\(A\\mathbf{x}\\) can do the following to \\(\\mathbf{x}\\): - rotate \\(\\mathbf{x}\\) - reflect \\(\\mathbf{x}\\) - project \\(\\mathbf{x}\\) - scale \\(\\mathbf{x}\\) Of the above transformations, one is a bit different: **scaling**. That is because, if a matrix \\(A\\) scales \\(\\mathbf{x}\\), that transformation could also have been expressed **without** a matrix-vector multiplication i.e., as \\(\\lambda\\mathbf{x}\\) for some scalar value \\(\\lambda\\). In many ways, life would be simpler if we could avoid all of the complexities of rotation, reflection, and projection … and only work in terms of scaling. That is the idea we will develop today. Demo: **eigshow**. ``` # import eigshow # eigshow.eigshow() ``` Let’s start with a simple example: a **shear** matrix: \\\[\\begin{split} A = \\begin{bmatrix}1&0.1\\\\0&1\\end{bmatrix} \\end{split}\\\] Let’s look at what this matrix does to each point in the plane. ``` A = np.array([[1, 0.1], [0, 1]]) ax = ut.plotSetup(-7,7,-7, 7, size=(12,8)) ut.centerAxes(ax) for x in range(-6,7): for y in range(-6,7): v = np.array([x,y]) ut.plotVec(ax, v) ut.plotVec(ax, A.dot(v),'b') ``` ![\_images/L16Eigenvectors\_20\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_20_0.png) We can use arrows to show how each red point is moved to the corresponding blue point. ``` sl.hide_code_in_slideshow() ax = ut.plotSetup(-7,7,-7, 7, size=(12,8)) ut.centerAxes(ax) A = np.array([[1,0.1],[0,1]]) for x in range(-6,7): for y in range(-6,7): ut.plotArrowVec(ax, A.dot(np.array([x,y])), [x,y]) ``` ![\_images/L16Eigenvectors\_22\_1.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_22_1.png) Now let’s look at a more complicated situation. Consider the matrix \\(A = \\begin{bmatrix}1.0&0.2\\\\0.1&0.9\\end{bmatrix}\\). ``` sl.hide_code_in_slideshow() V = np.array([[2,-1],[1,1]]) L = np.array([[1.1,0], [0,0.8]]) A = V.dot(L).dot(np.linalg.inv(V)) ax = ut.plotSetup(-7,7,-7, 7, size=(12,8)) ut.centerAxes(ax) for x in range(-6,7): for y in range(-6,7): v = np.array([x,y]) ut.plotArrowVec(ax, A.dot(v), v) ``` ![\_images/L16Eigenvectors\_24\_1.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_24_1.png) The starting point for understanding what \\(A\\) does is to look for special vectors which **do not change their direction** when multiplied by \\(A\\). **Example.** Let \\(A = \\begin{bmatrix}3&-2\\\\1&0\\end{bmatrix}, \\mathbf{u} = \\begin{bmatrix}-1\\\\1\\end{bmatrix}, \\mathbf{v}=\\begin{bmatrix}2\\\\1\\end{bmatrix}.\\) The images of \\(\\mathbf{u}\\) and \\(\\mathbf{v}\\) under multiplication by \\(A\\) are shown here: ``` ax = ut.plotSetup(size=(12,8)) ut.centerAxes(ax) A = np.array([[3,-2],[1,0]]) u = np.array([-1,1]) v = np.array([2,1]) # ut.plotArrowVec(ax, v, [0,0], color='Blue') ut.plotArrowVec(ax, A.dot(v), [0,0], color='Blue') ax.text(v[0],v[1]+0.2,r'${\bf v}$',size=20) ax.text(A.dot(v)[0],A.dot(v)[1]+0.2,r'$A{\bf v}$',size=20) # ut.plotArrowVec(ax, u, [0,0], color='Red') ut.plotArrowVec(ax, A.dot(u), [0,0], color='Red') ax.text(u[0]-0.5,u[1]+0.1,r'${\bf u}$',size=20) ax.text(A.dot(u)[0]-0.7,A.dot(u)[1]+0.3,r'$A{\bf u}$',size=20); ``` ![\_images/L16Eigenvectors\_27\_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_27_0.png) For a “typical” vector (like \\(\\mathbf{u}\\)) the action of \\(A\\) is to scale, reflect, and/or rotate it. 
But for some special vectors (like \\(\\mathbf{v}\\)) the action of \\(A\\) is only to “stretch” it (without any rotation or reflection). In fact, \\(A\\mathbf{v}\\) is simply \\(2\\mathbf{v}.\\) Today we will by studying equations such as \\\[ A\\mathbf{x} = 2\\mathbf{x}\\;\\;\\;\\;\\mbox{or}\\;\\;\\;\\;A\\mathbf{x} = -4\\mathbf{x}\\\] where special vectors are transformed by \\(A\\) into scalar multiples of themselves. **Definition.** An **eigenvector** of an \\(n\\times n\\) matrix \\(A\\) is a nonzero vector \\(\\mathbf{x}\\) such that \\(A\\mathbf{x} = \\lambda\\mathbf{x}\\) for some scalar \\(\\lambda.\\) A scalar \\(\\lambda\\) is called an **eigenvalue** of \\(A\\) if there is a nontrivial solution \\(\\mathbf{x}\\) of \\(A\\mathbf{x} = \\lambda\\mathbf{x}.\\) Such an \\(\\mathbf{x}\\) is called an *eigenvector corresponding to \\(\\lambda.\\)* ## Checking Eigenvectors and Eigenvalues[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#checking-eigenvectors-and-eigenvalues "Permalink to this headline") Before considering how to compute eigenvectors, let’s look at simply how to determine if a given vector is an eigenvector of a matrix. **Example.** Let \\(A = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}, \\mathbf{u} = \\begin{bmatrix}6\\\\-5\\end{bmatrix}, \\mathbf{v}=\\begin{bmatrix}3\\\\-2\\end{bmatrix}.\\) Are \\(\\mathbf{u}\\) and \\(\\mathbf{v}\\) eigenvectors of \\(A\\)? **Solution.** \\\[\\begin{split}A\\mathbf{u} = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}\\begin{bmatrix}6\\\\-5\\end{bmatrix} = \\begin{bmatrix}-24\\\\20\\end{bmatrix} = -4\\begin{bmatrix}6\\\\-5\\end{bmatrix} = -4\\mathbf{u}.\\end{split}\\\] \\\[\\begin{split}A\\mathbf{v} = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}\\begin{bmatrix}3\\\\-2\\end{bmatrix} = \\begin{bmatrix}-9\\\\11\\end{bmatrix} \\neq \\lambda\\begin{bmatrix}3\\\\-2\\end{bmatrix}.\\end{split}\\\] So \\(\\mathbf{u}\\) is an eigenvector corresponding to the eigenvalue \\(-4\\), but \\(\\mathbf{v}\\) is not an eigenvector of \\(A\\). **Example.** Show that 7 is an eigenvalue of the matrix \\(A\\), and find the corresponding eigenvectors. For 7 to be an eigenvalue of \\(A\\), it must be the case that \\\[ A\\mathbf{x}=7\\mathbf{x}\\\] has a nontrivial solution. This is equivalent to: \\\[ A\\mathbf{x} - 7\\mathbf{x} = {\\bf 0}\\\] or \\\[ (A - 7I)\\mathbf{x} = {\\bf 0}.\\\] To solve this homogeneous equation, form the matrix \\\[\\begin{split}A - 7I = \\begin{bmatrix}1&6\\\\5&2\\end{bmatrix}-\\begin{bmatrix}7&0\\\\0&7\\end{bmatrix} = \\begin{bmatrix}-6&6\\\\5&-5\\end{bmatrix}.\\end{split}\\\] The columns of \\(A-7I\\) are obviously linearly dependent, so \\((A - 7I)\\mathbf{x} = {\\bf 0}\\) has a nontrivial solution. So 7 is an eigenvalue of \\(A\\). To find the corresponding eigenvectors, we solve \\((A - 7I)\\mathbf{x} = {\\bf 0}\\) using row operations: \\\[\\begin{split}\\begin{bmatrix}-6&6&0\\\\5&-5&0\\end{bmatrix}\\sim\\begin{bmatrix}1&-1&0\\\\0&0&0\\end{bmatrix}.\\end{split}\\\] This says that \\(x\_1 = x\_2,\\) and \\(x\_2\\) is free. So the general solution has the form \\(x\_2\\begin{bmatrix}1\\\\1\\end{bmatrix}.\\) Each vector of this form with \\(x\_2 \\neq 0\\) is an eigenvector corresponding to \\(\\lambda = 7.\\) Question TIme! Q16.1 ## Eigenspace[¶](https://www.cs.bu.edu/fac/snyder/cs132-book/L16Eigenvectors.html#eigenspace "Permalink to this headline") Now, let’s generalize what we just did. 
Question Time! Q16.1

## Eigenspace

Now, let's generalize what we just did. From the previous example, we can conclude that \(\lambda\) is an eigenvalue of an \(n\times n\) matrix \(A\) if and only if the equation

\[(A-\lambda I)\mathbf{x} = {\bf 0}\]

has a nontrivial solution.

Notice that the set of *all* solutions of that equation is just **the null space of the matrix \(A-\lambda I.\)** So the set of all eigenvectors corresponding to a particular \(\lambda\), together with the zero vector, is a *subspace of \(\mathbb{R}^n\)* and is called the **eigenspace** of \(A\) corresponding to \(\lambda\).

**Example.** For matrix \(A = \begin{bmatrix}1&6\\5&2\end{bmatrix},\) we found that the general solution for the eigenvectors corresponding to \(\lambda = 7\) is the expression \(x_2\begin{bmatrix}1\\1\end{bmatrix}.\)

This means that the eigenspace corresponding to \(\lambda = 7\) consists of all multiples of \(\begin{bmatrix}1\\1\end{bmatrix},\) which is the line through \((1,1)\) and the origin. We could also show that the eigenspace corresponding to \(\lambda = -4\) is the line through \((-6,5)\) and the origin.

**Demo:** confirm using eigshow.

```
ax = ut.plotSetup(-10, 10, -6, 10, size=(12,8))
ut.centerAxes(ax)
A = np.array([[1, 6], [5, 2]])
u = np.array([1, 1])
v = np.array([3./2, -5./4])
#
ax.plot(v[0], v[1], 'bo')
ax.plot(A.dot(v)[0], A.dot(v)[1], 'bo')
ax.text(v[0], v[1]+0.2, r'${\bf v}$', size=20)
ax.text(A.dot(v)[0]+0.2, A.dot(v)[1]+0.2, r'$A{\bf v}$', size=20)
ax.plot([-10, 10], [5*10/6.0, -5*10/6.0], 'b-', lw=2)
#
ax.plot(u[0], u[1], 'ro')
ax.plot(A.dot(u)[0], A.dot(u)[1], 'ro')
ax.text(u[0]+0.7, u[1], r'${\bf u}$', size=20)
ax.text(A.dot(u)[0]+0.7, A.dot(u)[1], r'$A{\bf u}$', size=20)
ax.plot([-6, 10], [-6, 10], 'r-', lw=2)
#
ax.annotate('', xy=(A.dot(u)[0], A.dot(u)[1]), xycoords='data',
            xytext=(u[0], u[1]), textcoords='data',
            size=18,
            #bbox=dict(boxstyle="round", fc="0.8"),
            arrowprops={'arrowstyle': 'simple',
                        'fc': '0.5',
                        'ec': 'none',
                        'connectionstyle': 'arc3,rad=0.5'},
            )
ax.annotate('', xy=(A.dot(v)[0], A.dot(v)[1]), xycoords='data',
            xytext=(v[0], v[1]), textcoords='data',
            size=18,
            #bbox=dict(boxstyle="round", fc="0.8"),
            arrowprops={'arrowstyle': 'simple',
                        'fc': '0.5',
                        'ec': 'none',
                        'connectionstyle': 'arc3,rad=-0.5'},
            )
ax.text(3, 9, r'Eigenspace for $\lambda$ = 7', size=16)
ax.text(-9.5, 8, r'Eigenspace for $\lambda = -4$', size=16);
```

![_images/L16Eigenvectors_54_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_54_0.png)
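Looking ahead a bit (a sketch, since we have not yet discussed how eigenvalues are computed numerically), `np.linalg.eig` recovers both eigenpairs of this matrix at once:

```
import numpy as np

A = np.array([[1, 6],
              [5, 2]])

# eig returns the eigenvalues and unit-length eigenvectors (one per column);
# the order of the eigenvalues is not guaranteed
lam, V = np.linalg.eig(A)
print(lam)   # 7 and -4, possibly in a different order

# each column really is an eigenvector: A v = lambda v
for l, vec in zip(lam, V.T):
    print(l, np.allclose(A @ vec, l * vec))   # True for both

# the eigenvector for 7 is parallel to (1, 1); the one for -4 is parallel to (-6, 5)
```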
**Example.** Let \(A = \begin{bmatrix}4&-1&6\\2&1&6\\2&-1&8\end{bmatrix}.\) An eigenvalue of \(A\) is \(2.\) Find a basis for the corresponding eigenspace.

**Solution.** The eigenspace we are looking for is the nullspace of \(A - 2I.\) We will use the strategy for finding a basis for a nullspace that we learned in the last lecture.

Form

\[\begin{split}A - 2I = \begin{bmatrix}4&-1&6\\2&1&6\\2&-1&8\end{bmatrix}-\begin{bmatrix}2&0&0\\0&2&0\\0&0&2\end{bmatrix} = \begin{bmatrix}2&-1&6\\2&-1&6\\2&-1&6\end{bmatrix}\end{split}\]

and row reduce the augmented matrix for \((A-2I)\mathbf{x}={\bf 0}\):

\[\begin{split}\begin{bmatrix}2&-1&6&0\\2&-1&6&0\\2&-1&6&0\end{bmatrix} \sim \begin{bmatrix}2&-1&6&0\\0&0&0&0\\0&0&0&0\end{bmatrix}\end{split}\]

At this point, it is clear that 2 is indeed an eigenvalue of \(A\) because the equation \((A-2I)\mathbf{x} = {\bf 0}\) has free variables. The general solution is \(x_1 = \frac{1}{2}x_2 - 3x_3,\) with \(x_2\) and \(x_3\) free.

Expressed as a vector, the general solution is:

\[\begin{split}\begin{bmatrix}\frac{1}{2}x_2 - 3x_3\\x_2\\x_3\end{bmatrix}.\end{split}\]

Expressed as a vector sum, this is:

\[\begin{split}x_2\begin{bmatrix}1/2\\1\\0\end{bmatrix} + x_3\begin{bmatrix}-3\\0\\1\end{bmatrix}\;\;\;\;\mbox{with $x_2$ and $x_3$ free.}\end{split}\]

So this eigenspace is a two-dimensional subspace of \(\mathbb{R}^3,\) and a basis for this subspace is

\[\begin{split}\left\{\begin{bmatrix}1/2\\1\\0\end{bmatrix},\begin{bmatrix}-3\\0\\1\end{bmatrix}\right\}.\end{split}\]

```
fig = ut.three_d_figure((16, 1), fig_desc = '2D Eigenspace for lambda = 2',
                        xmin = -3, xmax = 3, ymin = -3, ymax = 3, zmin = -3, zmax = 3,
                        qr = qr_setting, figsize = (12, 8))
v = np.array([1., 2., 0])
v = v / np.sqrt(np.sum(v*v))
u = np.array([-3.0, 0, 1.0])
u = u / np.sqrt(np.sum(u*u))
w = -v - 2.0*u
w = w / np.sqrt(np.sum(w*w))
pts = np.array([v, u, w])
#ax.text(v[0],v[1],v[2],r'$\bf v$',size=20)
#ax.text(u[0],u[1],u[2],r'$\bf u$',size=20)
fig.text(-2, -2, -3, r'Eigenspace for $\lambda=2$', 'Eigenspace for lambda = 2', size=20)
#ax.text(0.2,0.2,-4,r'$\bf 0$',size=20)
# plotting the span of v
fig.plotSpan(u, v, 'Green')
fig.plotPoint(u[0], u[1], u[2], 'r')
fig.plotPoint(v[0], v[1], v[2], 'r')
fig.plotPoint(w[0], w[1], w[2], 'r')
fig.plotPoint(0, 0, 0, 'k')
plt.quiver(pts[:,0], pts[:,1], pts[:,2], pts[:,0], pts[:,1], pts[:,2], length=1.0, color='Red', lw=2)
pts = 2.0 * pts
plt.quiver(pts[:,0], pts[:,1], pts[:,2], pts[:,0], pts[:,1], pts[:,2], length=1.0, color='Blue', lw=2)
u = 2*u
v = 2*v
w = 2*w
fig.plotPoint(u[0], u[1], u[2], 'b')
fig.plotPoint(v[0], v[1], v[2], 'b')
fig.plotPoint(w[0], w[1], w[2], 'b')
# plotting the axes
# ut.plotIntersection3d(ax,[0,0,1,0],[0,1,0,0])
# ut.plotIntersection3d(ax,[0,0,1,0],[1,0,0,0])
fig.plotIntersection([0,1,0,0], [1,0,0,0])
fig.save();
```

![_images/L16Eigenvectors_63_0.png](https://www.cs.bu.edu/fac/snyder/cs132-book/_images/L16Eigenvectors_63_0.png)

Question Time! Q16.2
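We can check this basis directly: every vector in the proposed basis (and hence every linear combination of them) should satisfy \(A\mathbf{x} = 2\mathbf{x}\). A minimal sketch:

```
import numpy as np

A = np.array([[4, -1, 6],
              [2,  1, 6],
              [2, -1, 8]])

basis = [np.array([0.5, 1, 0]), np.array([-3, 0, 1])]

# each basis vector of the eigenspace satisfies A x = 2 x
for x in basis:
    print(np.allclose(A @ x, 2 * x))    # True, True

# so does any (nonzero) linear combination of them
x = 3 * basis[0] - 2 * basis[1]
print(np.allclose(A @ x, 2 * x))        # True
```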
## Eigenvalues of a Triangular Matrix

**Theorem.** The eigenvalues of a triangular matrix are the entries on its main diagonal.

**Proof.** We'll consider the \(3\times 3\) case. If \(A\) is upper triangular, then \(A-\lambda I\) has the form

\[\begin{split}A - \lambda I = \begin{bmatrix}a_{11}&a_{12}&a_{13}\\0&a_{22}&a_{23}\\0&0&a_{33}\end{bmatrix} - \begin{bmatrix}\lambda&0&0\\0&\lambda&0\\0&0&\lambda\end{bmatrix}\end{split}\]

\[\begin{split} = \begin{bmatrix}a_{11}-\lambda&a_{12}&a_{13}\\0&a_{22}-\lambda&a_{23}\\0&0&a_{33}-\lambda\end{bmatrix}\end{split}\]

Now, \(\lambda\) is an eigenvalue of \(A\) if and only if the equation \((A-\lambda I)\mathbf{x} = {\bf 0}\) has a nontrivial solution; in other words, if and only if the equation has a free variable.

Because of the triangular pattern of zeros in \(A - \lambda I\), there is a free variable exactly when one of its diagonal entries is zero, since that produces a column without a pivot. That is, \((A-\lambda I)\mathbf{x} = {\bf 0}\) has a free variable if and only if at least one of the entries on the diagonal of \((A-\lambda I)\) is zero. This happens if and only if \(\lambda\) equals one of the entries \(a_{11}, a_{22},\) or \(a_{33}\) on the diagonal of \(A\).

**Example.** Let \(A = \begin{bmatrix}3&6&-8\\0&0&6\\0&0&2\end{bmatrix}\) and \(B = \begin{bmatrix}4&0&0\\-2&1&0\\5&3&4\end{bmatrix}\). Then the eigenvalues of \(A\) are 3, 0, and 2. The eigenvalues of \(B\) are 4 and 1.

## Invertibility and Eigenvalues

So far we haven't probed what it means for a matrix to have an eigenvalue of 0.

This happens if and only if the equation \(A\mathbf{x} = 0\mathbf{x}\) has a nontrivial solution. But that equation is equivalent to \(A\mathbf{x} = {\bf 0},\) which has a nontrivial solution if and only if \(A\) is not invertible.

This means that 0 is an eigenvalue of \(A\) if and only if \(A\) is not invertible. This draws an important connection between invertibility and zero eigenvalues. So we have **yet another** addition to the Invertible Matrix Theorem!
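Both facts are easy to confirm numerically; here is a minimal sketch using the two triangular matrices from the example above:

```
import numpy as np

A = np.array([[3, 6, -8],
              [0, 0,  6],
              [0, 0,  2]])
B = np.array([[ 4, 0, 0],
              [-2, 1, 0],
              [ 5, 3, 4]])

# the eigenvalues of a triangular matrix are its diagonal entries
print(np.linalg.eigvals(A))   # 3, 0, 2 (in some order)
print(np.linalg.eigvals(B))   # 4, 1, 4 (in some order)

# A has 0 as an eigenvalue, so A is not invertible; B does not, so B is invertible
print(np.linalg.det(A))       # 0.0
print(np.linalg.det(B))       # 16.0 (nonzero)
```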
Question Time! Q16.3

## Eigenvectors and Difference Equations

Let's return to the problem we considered at the outset: predicting future values of \(\mathbf{x}_t\) (the number of CS majors of each class in year \(t\)).

We modeled the future as

\[\mathbf{x}_{t+1} = A \mathbf{x}_t\;\;\;(t = 0,1,2,\dots),\]

so then

\[\mathbf{x}_{t} = A^t\mathbf{x}_0\;\;\;(t = 0,1,2,\dots).\]

A **solution** of such an equation is an explicit description of \(\{\mathbf{x}_t\}\) whose formula for each \(\mathbf{x}_t\) does not depend directly on \(A\) or on the preceding terms in the sequence other than the initial term \(\mathbf{x}_0.\)

The simplest way to build a solution is to take an eigenvector \(\mathbf{x}_0\) and its corresponding eigenvalue \(\lambda\). Then

\[\mathbf{x}_t = \lambda^t \mathbf{x}_0 \;\;\;(t = 0,1,2,\dots).\]

This sequence is a solution because

\[A\mathbf{x}_t = A(\lambda^t\mathbf{x}_0) = \lambda^t(A\mathbf{x}_0) = \lambda^t(\lambda\mathbf{x}_0) = \lambda^{t+1}\mathbf{x}_0 = \mathbf{x}_{t+1}.\]

We can extend this to combinations of eigenvectors. If \(A\) has two eigenvectors \(\mathbf{u}\) and \(\mathbf{v}\) corresponding to eigenvalues \(\lambda\) and \(\mu\), then another solution is

\[\mathbf{x}_t = \lambda^t\mathbf{u} + \mu^t \mathbf{v}.\]

To address the problem of predicting the CS major population, we assume that we can express the initial state vector \(\mathbf{x}_0\) as a linear combination of eigenvectors \(\mathbf{u}_1, \mathbf{u}_2, \dots,\mathbf{u}_p\). That is,

\[\mathbf{x}_0 = a_1\mathbf{u}_1 + a_2\mathbf{u}_2+\dots+a_p\mathbf{u}_p.\]

Then at time \(t\),

\[\mathbf{x}_t = \lambda_1^t a_1\mathbf{u}_1 + \lambda_2^t a_2\mathbf{u}_2 + \dots + \lambda_p^t a_p\mathbf{u}_p,\]

which gives us an explicit solution for the number of students in the major in any future year.

Note that the expression above is exponential in each of the \(\lambda_1,\dots,\lambda_p\), so the resulting value will be essentially determined by the largest \(\lambda_i\). To see this, let's say that \(\lambda_1\) is the largest eigenvalue and \(\lambda_2\) is the second largest. Then the ratio of the \(\mathbf{u}_1\) component of \(\mathbf{x}_t\) to its \(\mathbf{u}_2\) component is \((\lambda_1/\lambda_2)^t\,(a_1/a_2)\). Since \(\lambda_1/\lambda_2\) is larger than 1, the relative contribution of \(\mathbf{u}_1\) grows exponentially with \(t\).

This shows the importance of the largest eigenvalue to the value of \(\mathbf{x}_t\) for large \(t\): in the long run, \(\mathbf{x}_t\) will be very close to \(\lambda_1^t a_1\mathbf{u}_1.\) The largest eigenvalue gives the asymptotic rate of growth (or shrinkage) of the state vector \(\mathbf{x}_t\).

For the matrix \(A\) that describes CS major population growth, the largest eigenvalue is 1.19. In other words, this model predicts that the number of CS majors was growing by about 19% per year at that time.
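To see the dominant eigenvalue at work, here is a minimal sketch using the matrix \(A = \begin{bmatrix}1.0&0.2\\0.1&0.9\end{bmatrix}\) from earlier in this lecture, which was constructed to have eigenvalues 1.1 and 0.8. After enough iterations of \(\mathbf{x}_{t+1} = A\mathbf{x}_t\), the state grows by roughly a factor of 1.1 per step and lines up with the dominant eigenvector.

```
import numpy as np

A = np.array([[1.0, 0.2],
              [0.1, 0.9]])      # eigenvalues 1.1 and 0.8, by construction

lam, V = np.linalg.eig(A)
k = np.argmax(np.abs(lam))      # index of the dominant eigenvalue
print(lam[k])                   # approximately 1.1

x = np.array([1.0, 0.0])        # an arbitrary starting state
for t in range(50):
    x = A @ x

# the per-step growth factor approaches the dominant eigenvalue ...
print((A @ x)[0] / x[0])        # approximately 1.1
# ... and the state lines up with the dominant eigenvector (up to sign)
print(x / np.linalg.norm(x))
print(V[:, k])
```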
Shard: 40 (laksa)
Root Hash: 13121796707767007440
Unparsed URL: edu,bu!cs,www,/fac/snyder/cs132-book/L16Eigenvectors.html s443