


How To Find Change Of Basis Matrix

Knowing how to convert a vector to a different basis has many practical applications. Gilbert Strang has a nice quote about the importance of basis changes in his book [1] (emphasis mine):

The standard basis vectors for \mathbb{R}^n and \mathbb{R}^m are the columns of I. That choice leads to a standard matrix, and T(v)=Av in the normal way. But these spaces also have other bases, so the same T is represented by other matrices. A main theme of linear algebra is to choose the bases that give the best matrix for T.

This should serve as a good motivation, but I'll leave the applications for future posts; in this one, I will focus on the mechanics of basis change, starting from first principles.

Example: finding a component vector

Let's use \mathbb{R}^2 as an example. U=(2,3), (4,5) is an ordered basis for \mathbb{R}^2 (since the two vectors in it are independent). Say we have v=(2,4). What is [v]_{\text{\tiny U}}? We'll need to solve the system of equations:

\[\begin{pmatrix} 2 \\ 4 \end{pmatrix}=c_1\begin{pmatrix} 2 \\ 3\end{pmatrix}+c_2\begin{pmatrix} 4 \\ 5 \end{pmatrix}\]

In the 2-D case this is trivial - the solution is c_1=3 and c_2=-1. Therefore:

\[[v]_{\text {\tiny U}}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}\]

In the more general case of \mathbb{R}^n, this is akin to solving a linear system of n equations with n variables. Since the basis vectors are, by definition, linearly independent, solving the system is simply inverting a matrix [3].
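As a quick sketch of this computation (NumPy isn't part of the post, just a convenient tool), we can lay the basis vectors of U out as the columns of a matrix and solve the resulting linear system:

```python
import numpy as np

# Basis vectors of U laid out as the columns of a matrix.
U = np.array([[2.0, 4.0],
              [3.0, 5.0]])

v = np.array([2.0, 4.0])

# Solve U @ c = v for the component vector c = [v]_U.
# np.linalg.solve is preferred over explicitly inverting the matrix.
c = np.linalg.solve(U, v)
print(c)  # [ 3. -1.]
```

This matches the hand-computed c_1=3, c_2=-1 above.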

Change of basis matrix

Now comes the key part of the post. Say we have two different ordered bases for the same vector space V: U=u_1,u_2,...,u_n and W=w_1,w_2,...,w_n. For some v\in V, we can find [v]_{\text{\tiny U}} and [v]_{\text{\tiny W}}. How are these two related?

Surely, given [v]_{\text{\tiny U}} we can find its coefficients in basis W the same way as we did in the example above [4]. It involves solving a linear system of n equations. We'll have to redo this operation for every vector v we want to convert. Is there a simpler way?

Luckily for science, yes. The key here is to find how the basis vectors of U look in basis W. In other words, we have to find [u_1]_{\text{\tiny W}}, [u_2]_{\text{\tiny W}} and so on to [u_n]_{\text{\tiny W}}.

Let's say we do that and find the coefficients to be a_{ij} such that:

\[\begin{matrix} u_1=a_{11}w_1+a_{21}w_2+...+a_{n1}w_n \\ u_2=a_{12}w_1+a_{22}w_2+...+a_{n2}w_n \\ ... \\ u_n=a_{1n}w_1+a_{2n}w_2+...+a_{nn}w_n \end{matrix}\]

Now, given some vector v \in V, suppose its components in basis U are:

\[[v]_{\text{\tiny U}}=\begin{pmatrix} c_1 \\ c_2 \\ ... \\ c_n \end{pmatrix}\]

Let's try to figure out how it looks in basis W. The above equation (by definition of components) is equivalent to:

\[v=c_1u_1+c_2u_2+...+c_nu_n\]

Substituting the expansion of the u's in basis W, we get:

\[v=\begin{matrix} c_1(a_{11}w_1+a_{21}w_2+...+a_{n1}w_n)+ \\ c_2(a_{12}w_1+a_{22}w_2+...+a_{n2}w_n)+ \\ ... \\ c_n(a_{1n}w_1+a_{2n}w_2+...+a_{nn}w_n) \end{matrix}\]

Reordering a bit to find the multipliers of each w:

\[v=\begin{matrix} (c_1a_{11}+c_2a_{12}+...+c_na_{1n})w_1+ \\ (c_1a_{21}+c_2a_{22}+...+c_na_{2n})w_2+ \\ ... \\ (c_1a_{n1}+c_2a_{n2}+...+c_na_{nn})w_n \end{matrix}\]

By our definition of vector components, this equation is equivalent to:

\[[v]_{\text{\tiny W}}=\begin{pmatrix} c_1a_{11}+c_2a_{12}+...+c_na_{1n} \\ c_1a_{21}+c_2a_{22}+...+c_na_{2n} \\ ... \\ c_1a_{n1}+c_2a_{n2}+...+c_na_{nn} \end{pmatrix}\]

Now we're in vector notation again, so we can decompose the column vector on the right-hand side to:

\[[v]_{\text{\tiny W}}=\begin{pmatrix} a_{11} & a_{12} & ... & a_{1n} \\ a_{21} & a_{22} & ... & a_{2n} \\ ... & ... & ... \\ a_{n1} & a_{n2} & ... & a_{nn} \end{pmatrix}\begin{pmatrix}c_1 \\ c_2 \\ ... \\ c_n \end{pmatrix}\]

This is a matrix times a vector. The vector on the right is [v]_{\text{\tiny U}}. The matrix should look familiar too, because it consists of those a_{ij} coefficients we've defined above. In fact, this matrix just represents the basis vectors of U expressed in basis W. Let's call this matrix A_{\text{\tiny U}\rightarrow \text{\tiny W}} - the change of basis matrix from U to W. It has [u_1]_{\text{\tiny W}} to [u_n]_{\text{\tiny W}} laid out in its columns:

\[A_{\text{\tiny U}\rightarrow \text{\tiny W}}=\begin{pmatrix}[u_1]_{\text{\tiny W}},[u_2]_{\text{\tiny W}},...,[u_n]_{\text{\tiny W}}\end{pmatrix}\]

So we have:

\[[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}\]

To recap, given two bases U and W, we can spend some effort to compute the "change of basis" matrix A_{\text{\tiny U}\rightarrow \text{\tiny W}}, but then we can easily convert any vector in basis U to basis W if we simply left-multiply it by this matrix.
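A small sketch of this construction in Python with NumPy (using the bases from the running example; the code itself is not from the post): each column of the change of basis matrix is a basis vector of U expressed in W, so we can solve for all the columns at once.

```python
import numpy as np

# Bases U and W from the running example, as matrix columns.
U = np.array([[2.0, 4.0],
              [3.0, 5.0]])
W = np.array([[-1.0, 1.0],
              [ 1.0, 1.0]])

# Column i of A_{U->W} is [u_i]_W, i.e. the solution of W @ x = u_i.
# np.linalg.solve handles all columns of U in one call.
A_U_to_W = np.linalg.solve(W, U)
print(A_U_to_W)
# [[0.5 0.5]
#  [2.5 4.5]]

# Converting [v]_U to [v]_W is now a single matrix multiplication.
v_U = np.array([3.0, -1.0])
print(A_U_to_W @ v_U)  # [1. 3.]
```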

A reasonable question to ask at this point is - what about converting from W to U? Well, since the computations above are completely generic and don't special-case either basis, we can just flip the roles of U and W and get another change of basis matrix, A_{\text{\tiny W}\rightarrow \text{\tiny U}} - it converts vectors in basis W to vectors in basis U as follows:

\[[v]_{\text{\tiny U}}=A_{\text{\tiny W}\rightarrow \text{\tiny U}}[v]_{\text{\tiny W}}\]

And this matrix is:

\[A_{\text{\tiny W}\rightarrow \text{\tiny U}}=\begin{pmatrix}[w_1]_{\text{\tiny U}},[w_2]_{\text{\tiny U}},...,[w_n]_{\text{\tiny U}}\end{pmatrix}\]

We will soon see that the two change of basis matrices are intimately related; but first, an example.

Example: changing bases with matrices

Let's work through another concrete example in \mathbb{R}^2. We've used the basis U=(2,3), (4,5) earlier; let's use it again, and also add the basis W=(-1,1), (1,1). We've already seen that for v=(2,4) we have:

\[[v]_{\text {\tiny U}}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}\]

Similarly, we can solve a set of two equations to find [v]_{\text {\tiny W}}:

\[[v]_{\text {\tiny W}}=\begin{pmatrix} 1 \\ 3 \end{pmatrix}\]

OK, let's see how a change of basis matrix can be used to easily compute one given the other. First, to find A_{\text{\tiny U}\rightarrow \text{\tiny W}} we'll need [u_1]_{\text {\tiny W}} and [u_2]_{\text {\tiny W}}. We know how to do that. The result is:

\[[u_1]_{\text {\tiny W}}=\begin{pmatrix} 0.5 \\ 2.5 \end{pmatrix}\qquad[u_2]_{\text {\tiny W}}=\begin{pmatrix} 0.5 \\ 4.5 \end{pmatrix}\]

Now we can verify that given [v]_{\text {\tiny U}} and A_{\text{\tiny U}\rightarrow \text{\tiny W}}, we can easily find [v]_{\text {\tiny W}}:

\[[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}=\begin{pmatrix} 0.5 & 0.5 \\ 2.5 & 4.5 \end{pmatrix}\begin{pmatrix} 3 \\ -1 \end{pmatrix}=\begin{pmatrix} 1 \\ 3 \end{pmatrix}\]

Indeed, it checks out! Let's also verify the other direction. To find A_{\text{\tiny W}\rightarrow \text{\tiny U}} we'll need [w_1]_{\text {\tiny U}} and [w_2]_{\text {\tiny U}}:

\[[w_1]_{\text {\tiny U}}=\begin{pmatrix} 4.5 \\ -2.5 \end{pmatrix}\qquad[w_2]_{\text {\tiny U}}=\begin{pmatrix} -0.5 \\ 0.5 \end{pmatrix}\]

And now to find [v]_{\text {\tiny U}}:

\[[v]_{\text{\tiny U}}=A_{\text{\tiny W}\rightarrow \text{\tiny U}}[v]_{\text{\tiny W}}=\begin{pmatrix} 4.5 & -0.5 \\ -2.5 & 0.5 \end{pmatrix}\begin{pmatrix} 1 \\ 3 \end{pmatrix}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}\]

Checks out again! If you have a keen eye, or have recently spent some time solving linear algebra problems, you'll notice something interesting about the two basis change matrices used in this example. One is the inverse of the other! Is this some sort of coincidence? No - in fact, it's always true, and we can prove it.

The inverse of a change of basis matrix

We've derived the change of basis matrix from U to W to perform the conversion:

\[[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}\]

Left-multiplying this equation by A_{\text{\tiny W}\rightarrow \text{\tiny U}}:

\[A_{\text{\tiny W}\rightarrow \text{\tiny U}}[v]_{\text{\tiny W}}=A_{\text{\tiny W}\rightarrow \text{\tiny U}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}\]

But the left-hand side is now, by our earlier definition, equal to [v]_{\text{\tiny U}}, so we get:

\[[v]_{\text{\tiny U}}=A_{\text{\tiny W}\rightarrow \text{\tiny U}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}\]

Since this is true for every vector [v]_{\text{\tiny U}}, it must be that:

\[A_{\text{\tiny W}\rightarrow \text{\tiny U}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}=I\]

From this, we can infer that A_{\text{\tiny W}\rightarrow \text{\tiny U}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}^{-1} and vice versa [5].
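We can also check this numerically on the example bases. A short sketch (again using NumPy, which is my addition, not the post's):

```python
import numpy as np

U = np.array([[2.0, 4.0],
              [3.0, 5.0]])
W = np.array([[-1.0, 1.0],
              [ 1.0, 1.0]])

# Build both change of basis matrices: column i of A_{U->W} is [u_i]_W,
# and column i of A_{W->U} is [w_i]_U.
A_U_to_W = np.linalg.solve(W, U)
A_W_to_U = np.linalg.solve(U, W)

# Their product should be the identity matrix,
# i.e. each is the inverse of the other.
print(A_W_to_U @ A_U_to_W)
# [[1. 0.]
#  [0. 1.]]
```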

Changing to and from the standard basis

You may have noticed that in the examples above, we short-circuited a little bit of rigor by making up a vector (such as v=(2,4)) without explicitly specifying the basis its components are relative to. This is because we're so used to working with the "standard basis" we often forget it's there.

The standard basis (let's call it E) consists of unit vectors pointing in the directions of the axes of a Cartesian coordinate system. For \mathbb{R}^2 we have the basis vectors:

\[e_1=\begin{pmatrix} 1 \\ 0 \end{pmatrix}\qquad e_2=\begin{pmatrix} 0 \\ 1 \end{pmatrix}\]

And more generally in \mathbb{R}^n we have an ordered list of n vectors \left\{ e_i:1\leq i \leq n \right\} where e_i has 1 in the ith position and zeros elsewhere.

So when we say v=(2,4), what we really mean is:

\[\begin{matrix} v=2e_1+4e_2 \\[1em] [v]_{\text {\tiny E}}=\begin{pmatrix} 2 \\ 4 \end{pmatrix} \end{matrix}\]

The standard basis is so ingrained in our intuition of vectors that we usually neglect to mention it. This is fine, as long as we're only dealing with the standard basis. Once a change of basis is required, it's worthwhile to stick to a more consistent notation to avoid confusion. Moreover, it's often useful to change a vector's basis to or from the standard one. Let's see how that works. Recall how we use the change of basis matrix:

\[[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}\]

Replacing the arbitrary basis W by the standard basis E in this equation, we get:

\[[v]_{\text{\tiny E}}=A_{\text{\tiny U}\rightarrow \text{\tiny E}}[v]_{\text{\tiny U}}\]

And A_{\text{\tiny U}\rightarrow \text{\tiny E}} is the matrix with [u_1]_{\text {\tiny E}} to [u_n]_{\text {\tiny E}} in its columns. But wait, these are just the basis vectors of U! So finding the matrix A_{\text{\tiny U}\rightarrow \text{\tiny E}} for any given basis U is trivial - simply line up U's basis vectors as columns in their order to get a matrix. This means that any square, invertible matrix can be seen as a change of basis matrix from the basis spelled out in its columns to the standard basis. This is a natural consequence of how multiplying a matrix by a vector works: by linearly combining the matrix's columns.

OK, so we know how to find [v]_{\text {\tiny E}} given [v]_{\text {\tiny U}}. What about the other way around? We'll need A_{\text{\tiny E}\rightarrow \text{\tiny U}} for that, and we know that:

\[A_{\text{\tiny E}\rightarrow \text{\tiny U}}=A_{\text{\tiny U}\rightarrow \text{\tiny E}}^{-1}\]

Therefore:

\[[v]_{\text{\tiny U}}=A_{\text{\tiny E}\rightarrow \text{\tiny U}}[v]_{\text{\tiny E}}=A_{\text{\tiny U}\rightarrow \text{\tiny E}}^{-1}[v]_{\text{\tiny E}}\]
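Both directions are easy to sketch in code (a NumPy illustration of my own, using the running example):

```python
import numpy as np

# A_{U->E} is just U's basis vectors lined up as columns.
A_U_to_E = np.array([[2.0, 4.0],
                     [3.0, 5.0]])

v_E = np.array([2.0, 4.0])  # [v]_E

# [v]_U = A_{U->E}^{-1} [v]_E; solving the system beats
# computing the inverse explicitly.
v_U = np.linalg.solve(A_U_to_E, v_E)
print(v_U)  # [ 3. -1.]

# And back to the standard basis: [v]_E = A_{U->E} [v]_U.
print(A_U_to_E @ v_U)  # [2. 4.]
```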

Chaining basis changes

What happens if we change a vector from one basis to another, and then change the resulting vector to yet another basis? That is, for bases U, W and T and some arbitrary vector v, we'll do:

\[A_{\text{\tiny W}\rightarrow \text{\tiny T}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}\]

This is just applying the change of basis by matrix multiplication equation, twice:

\[A_{\text{\tiny W}\rightarrow \text{\tiny T}}(A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}})=A_{\text{\tiny W}\rightarrow \text{\tiny T}}[v]_{\text{\tiny W}}=[v]_{\text{\tiny T}}\]

What this means is that changes of basis can be chained, which isn't surprising given their linear nature. It also means that we've just found A_{\text{\tiny U}\rightarrow \text{\tiny T}}, since we found how to transform [v]_{\text{\tiny U}} to [v]_{\text{\tiny T}} (using an intermediary basis W).

\[A_{\text{\tiny U}\rightarrow \text{\tiny T}}=A_{\text{\tiny W}\rightarrow \text{\tiny T}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}\]

Finally, let's say that the intermediary basis is not just some arbitrary W, but the standard basis E. Then we have:

\[A_{\text{\tiny U}\rightarrow \text{\tiny T}}=A_{\text{\tiny E}\rightarrow \text{\tiny T}}A_{\text{\tiny U}\rightarrow \text{\tiny E}}=A_{\text{\tiny T}\rightarrow \text{\tiny E}}^{-1}A_{\text{\tiny U}\rightarrow \text{\tiny E}}\]

We prefer the last form, since finding A_{\text{\tiny U}\rightarrow \text{\tiny E}} for any basis U is, as we've seen above, trivial.

Example: standard basis and chaining

It's time to solidify the ideas of the last two sections with a concrete example. We'll use our familiar bases U=(2,3), (4,5) and W=(-1,1), (1,1) from the previous example, along with the standard basis for \mathbb{R}^2. Previously, we transformed a vector v from U to W and vice versa using the change of basis matrices between these bases. This time, let's do it by chaining via the standard basis.

We'll pick v=(2,4). Formally, the components of v relative to the standard basis are:

\[[v]_{\text{\tiny E}} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}\]

In the last example we've already computed the components of v relative to U and W:

\[[v]_{\text {\tiny U}}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}\qquad [v]_{\text {\tiny W}}=\begin{pmatrix} 1 \\ 3 \end{pmatrix}\]

Previously, one was computed from the other using the "direct" basis change matrices from U to W and vice versa. Now we can use chaining via the standard basis to achieve the same result. For example, we know that:

\[[v]_{\text{\tiny W}}=A_{\text{\tiny E}\rightarrow \text{\tiny W}}A_{\text{\tiny U}\rightarrow \text{\tiny E}}[v]_{\text{\tiny U}}\]

Finding the change of basis matrix from some basis to E is just laying out the basis vectors as columns, so we immediately know that:

\[A_{\text{\tiny U}\rightarrow \text{\tiny E}}=\begin{pmatrix} 2 & 4\\ 3 & 5 \end{pmatrix}\qquad \qquad A_{\text{\tiny W}\rightarrow \text{\tiny E}}=\begin{pmatrix} -1 & 1\\ 1 & 1 \end{pmatrix}\]

The change of basis matrix from E to some basis is the inverse, so by inverting the above matrices we find:

\[A_{\text{\tiny E}\rightarrow \text{\tiny U}}=A_{\text{\tiny U}\rightarrow \text{\tiny E}}^{-1}=\begin{pmatrix} -2.5 & 2 \\ 1.5 & -1 \end{pmatrix}\qquad \qquad A_{\text{\tiny E}\rightarrow \text{\tiny W}}=A_{\text{\tiny W}\rightarrow \text{\tiny E}}^{-1}=\begin{pmatrix} -0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix}\]

At present we have all we need to find [v]_{\text{\tiny W}} from [v]_{\text{\tiny U}}:

\[[v]_{\text{\tiny W}}=A_{\text{\tiny E}\rightarrow \text{\tiny W}}A_{\text{\tiny U}\rightarrow \text{\tiny E}}[v]_{\text{\tiny U}}=\begin{pmatrix} -0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix}\begin{pmatrix} 2 & 4\\ 3 & 5 \end{pmatrix}\begin{pmatrix} 3 \\ -1 \end{pmatrix}=\begin{pmatrix} 1 \\ 3 \end{pmatrix}\]

The other direction can be done similarly.
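The whole chained computation, in both directions, can be sketched as follows (a NumPy illustration of my own, not part of the post):

```python
import numpy as np

# Change of basis matrices to the standard basis: just the
# basis vectors of U and W laid out as columns.
A_U_to_E = np.array([[2.0, 4.0],
                     [3.0, 5.0]])
A_W_to_E = np.array([[-1.0, 1.0],
                     [ 1.0, 1.0]])

# Chain through E: [v]_W = A_{E->W} A_{U->E} [v]_U,
# where A_{E->W} = A_{W->E}^{-1}.
A_U_to_W = np.linalg.inv(A_W_to_E) @ A_U_to_E

v_U = np.array([3.0, -1.0])
v_W = A_U_to_W @ v_U
print(v_W)  # [1. 3.]

# The other direction, chaining through E the same way.
A_W_to_U = np.linalg.inv(A_U_to_E) @ A_W_to_E
print(A_W_to_U @ v_W)  # [ 3. -1.]
```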


[1] Introduction to Linear Algebra, 4th edition, section 7.2
[2] Why is this list unique? Because given a basis U for a vector space V, every v\in V can be expressed uniquely as a linear combination of the vectors in U. The proof of this is very simple - just assume there are two different ways to express v - two alternative sets of components. Subtract one from the other and use the linear independence of the basis vectors to conclude that the two ways must be the same one.
[3] The matrix here has the basis vectors laid out in its columns. Since the basis vectors are independent, the matrix is invertible. In our small example, the matrix equation we're looking to solve is:

\[\begin{pmatrix} 2 & 4 \\ 3 & 5 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}=\begin{pmatrix} 2 \\ 4 \end{pmatrix}\]

[4] The example converts from the standard basis to another basis, but converting from a non-standard basis to another requires exactly the same steps: we try to find coefficients such that a combination of some set of basis vectors adds up to some components in another basis.
[5] For square matrices A and B, if AB=I then also BA=I.

Source: https://eli.thegreenplace.net/2015/change-of-basis-in-linear-algebra/

