Chapter 18

Using Optical Matrices.

\( \def\bmatrix#1{\begin{bmatrix}#1\end{bmatrix}} \)

In Chapter 17, the basics of using matrices in optics were laid out, and the matrices for a gap, a thin lens, and a refracting surface were worked out. Here we will put these matrices to use in analyzing various systems.

Matrix Patterns.

It's sometimes useful to be able to swap between a matrix equation and a set of equations. To remind you, the two ways of writing equations are shown below.

Matrix Style:
\[\bmatrix{h_2\\ s_2}=\bmatrix{A & B\\ C & D}\bmatrix{h_1\\ s_1}\]
Equation Style:
\[\begin{align} h_2 &= A h_1 + B s_1 \\ s_2 &= C h_1 + D s_1 \end{align}\]
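As a rough illustration of how the two styles line up, the sketch below applies a 2-by-2 matrix to a ray one row at a time. The function name `apply_matrix` and the numbers used (a thin lens of power 5, a ray 0.01 m above the axis) are assumptions made up for this example, not values from the text.

```python
def apply_matrix(m, ray):
    """Apply a 2x2 optical matrix ((A, B), (C, D)) to a ray (h1, s1)."""
    (a, b), (c, d) = m
    h1, s1 = ray
    h2 = a * h1 + b * s1  # first row of the matrix:  h2 = A*h1 + B*s1
    s2 = c * h1 + d * s1  # second row of the matrix: s2 = C*h1 + D*s1
    return (h2, s2)


# Illustrative values only: thin lens of power 5, ray parallel to the axis.
lens = ((1.0, 0.0), (-5.0, 1.0))
print(apply_matrix(lens, (0.01, 0.0)))
```

The matrix is stored as two rows, so each line of the function body is one line of the Equation Style.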

While all optical matrices are interesting, there are four matrices that are particularly so, because they have a zero in them somewhere. Each of these matrices describes a particular kind of optical system. The four matrices, and their Equation-Style equivalents, are shown below:

Matrix Style (first equation in each numbered row) and Equation Style (second):
1.
\[ \bmatrix{h_2\\ s_2}=\bmatrix{0 & B\\ C & D}\bmatrix{h_1\\ s_1} \]
\[\begin{align} h_2 &= B s_1 \\ s_2 &= C h_1 + D s_1 \end{align}\]
2.
\[ \bmatrix{h_2\\ s_2}=\bmatrix{A & 0\\ C & D}\bmatrix{h_1\\ s_1} \]
\[\begin{align} h_2 &= A h_1 \\ s_2 &= C h_1 + D s_1 \end{align}\]
3.
\[ \bmatrix{h_2\\ s_2}=\bmatrix{A & B\\ 0 & D}\bmatrix{h_1\\ s_1} \]
\[\begin{align} h_2 &= A h_1 + B s_1 \\ s_2 &= D s_1 \end{align}\]
4.
\[ \bmatrix{h_2\\ s_2}=\bmatrix{A & B\\ C & 0}\bmatrix{h_1\\ s_1} \]
\[\begin{align} h_2 &= A h_1 + B s_1 \\ s_2 &= C h_1 \end{align}\]

We've already come across a matrix like the one in row 2 of the table above, in Chapter 17. If a matrix has a zero in the top right corner, where \( B \) is, then it describes an optical system where \( h_1 \) is the height of a point on an object, \( h_2 \) is the height of the corresponding point on the image, and \( A \) is the linear magnification: how much bigger or smaller the image is compared to the object.

We will see the other three kinds of matrix in this Chapter.

A Lens and a Gap.

If we have a lens with power \( F \), and put an image screen at a distance of \( 1/F \) away, we have a system which creates images of distant objects on the screen, as in Figure 1(a). What does this system look like when written as an optical matrix?

fig1a
Figure 1(a). An optical system (shaded light blue) consists of a lens of power \( F \) followed by a gap of length \( 1/F \). The lens is centred on the optic axis (dotted line). At the end of the gap there is an image screen to catch the light.
fig1b
Figure 1(b). The matrix representing this system has a zero in the top left corner. That means the height \( h_2 \) at the end of the system (where the image screen is) depends only on the slope \( s_1 \) of rays entering the system (that is, hitting the lens). All rays with the same slope end up at the same height.
fig1c
Figure 1(c). Another bunch of rays with the same slope end up at the same height, but it is a different height from the previous image because the ray slopes are different.

There are two matrices involved. The first is the matrix for a lens of power \( F \), given in the first row of the table below. The second is for a gap of length \( 1/F \), in the second row. There is no optical matrix for an image screen, because the light simply stops when it hits the screen.

Matrix for a lens

\[ \begin{bmatrix} 1 & 0 \\ -F & 1 \end{bmatrix} \]

Matrix for a gap of size \( 1/F \)

\[ \begin{bmatrix} 1 & 1/F \\ 0 & 1 \end{bmatrix} \]

The matrix description of this system is given by the following matrix equation:

\[ \bmatrix{h_2\\ s_2} = \bmatrix{1 & 1/F\\ 0 & 1}\bmatrix{1&0\\ -F&1}\bmatrix{h_1\\ s_1} \]

Note that the lens and gap matrices are written in order from right-to-left. This equation can be simplified by multiplying the two matrices together to get a single matrix for the whole system:

\[ \bmatrix{1 & 1/F\\ 0 & 1}\bmatrix{1&0\\ -F&1} = \bmatrix{0&1/F\\ -F&1} \]

Thus the lens followed by gap system can be described by an equation with just a single matrix:

\[ \bmatrix{h_2\\ s_2} = \bmatrix{0&1/F\\ -F&1}\bmatrix{h_1\\ s_1} \]

The matrix for the whole system has a zero in the top left corner. That means that the height \( h_2 \) of the ray at the end of the system depends only on the slope \( s_1 \) of the ray at the start:

\[ h_2 = 0 h_1 + (1/F)s_1 = (1/F)s_1 \]

What this means is that all rays with the same slope at the start of this optical system \( s_1 \) end up having the same height \( h_2 \) at the end of the system, regardless of their start height \( h_1 \). This is shown in Figure 1(b) and Figure 1(c). A set of rays which all have the same slope \( s_1 \) is a bundle of parallel rays (with a vergence of zero). They all end up at the same point, with height \( h_2 \), at the end of the system.

We can conclude from this:

Any optical matrix with a zero in the top left corner describes a system where parallel rays of light are converged to the same point, forming an image.
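The lens-and-gap result can be checked numerically. The following is a minimal sketch, assuming a lens power of 4 and a handful of made-up ray heights; the helper names `matmul`, `lens` and `gap` are introduced here for illustration only.

```python
def matmul(m, n):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def lens(power):
    return ((1.0, 0.0), (-power, 1.0))


def gap(length):
    return ((1.0, length), (0.0, 1.0))


F = 4.0  # assumed lens power
# Light meets the lens first, so the lens matrix sits on the right.
system = matmul(gap(1.0 / F), lens(F))

s1 = 0.02  # one shared entry slope, three different entry heights
heights_out = [system[0][0] * h1 + system[0][1] * s1 for h1 in (-0.01, 0.0, 0.01)]
print(system)
print(heights_out)
```

Because the top left entry is zero, every entry in `heights_out` equals `s1 / F`, whatever the entry height was.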

What about the slopes? From the matrix equation above

\[ s_2 = -F h_1 + s_1 \]

When \( h_1=0 \), the ray is striking the centre of the lens, and in that case \( s_2=s_1 \). We've come across this before, as the Third Rule of ray-tracing: a ray of light hitting the centre of a lens (that is, \( h_1=0 \)) is undeviated (that is, \( s_2=s_1 \)).

A Refracting Surface and a Gap

A slightly more complicated case is an optical system consisting of a refracting surface with power \( F_{surface} \) followed by a gap of length \( n_{out}/F_{surface} \), which is the focal length of that surface. The matrices for the surface (assuming \( n_{in}=1 \)) and the gap are given in the table below:

Matrix for a surface

\[ \begin{bmatrix} 1 & 0 \\ -F_{surface}/n_{out} & 1/n_{out} \end{bmatrix} \]

Matrix for a gap of size \( n_{out}/F_{surface} \)

\[ \begin{bmatrix} 1 & n_{out}/F_{surface} \\ 0 & 1 \end{bmatrix} \]

Multiplying these two matrices together gives the matrix for the whole system:

\[ \begin{bmatrix} 1 & n_{out}/F_{surface} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -F_{surface}/n_{out} & 1/n_{out} \end{bmatrix} = \begin{bmatrix} 0 & 1/F_{surface} \\ -F_{surface}/n_{out} & 1/n_{out} \end{bmatrix} \]

This is pretty similar to the matrix for a thin lens and a gap, except that the bottom row of the matrix is divided by \( n_{out} \). There is a zero in the top left corner of the matrix, because this system also focuses parallel light to form an image.

However, if we look at the slopes from this matrix, we can see that

\[ s_2 = \frac{-F_{surface}}{n_{out}} h_1 + \frac{1}{n_{out}}s_1 \]

When \( h_1=0 \), and the ray hits the centre of the surface, the slope is \( s_2 = s_1/n_{out} \). This system doesn't follow Rule 3 of ray-tracing.
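A quick numerical check of the surface-and-gap system is sketched below. The surface power of 50 and the index \( n_{out} = 1.5 \) are assumed values chosen only for illustration, and the helper names are not from the text.

```python
def matmul(m, n):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def surface(power, n_in, n_out):
    return ((1.0, 0.0), (-power / n_out, n_in / n_out))


def gap(length):
    return ((1.0, length), (0.0, 1.0))


F_surface = 50.0  # assumed surface power
n_out = 1.5       # assumed refractive index after the surface (n_in = 1)

system = matmul(gap(n_out / F_surface), surface(F_surface, 1.0, n_out))

s1 = 0.03
# Ray through the centre of the surface (h1 = 0): only the D entry matters.
s2_central = system[1][0] * 0.0 + system[1][1] * s1
print(system)
print(s2_central, s1 / n_out)
```

The top left entry comes out as zero (up to floating-point rounding), and the central ray leaves with slope `s1 / n_out` rather than `s1`.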

A Gap and a Lens

A system consisting of a gap of length \( 1/F \) followed by a lens of power \( F \) is shown in Figure 2(a). This is similar to the lens and gap system discussed above, except that the order of the matrices is reversed, as follows:

\[ \bmatrix{h_2\\ s_2} = \bmatrix{1&0\\ -F&1}\bmatrix{1 & 1/F\\ 0 & 1}\bmatrix{h_1\\ s_1} \]
fig2a
Figure 2(a). An optical system (shown by a shaded box) consists of a gap of length \( 1/F \) followed by a lens with power \( F \). The reference plane at the start of the system is shown by the dotted vertical line. The lens is centred on the optic axis (dotted line).
fig2b
Figure 2(b). The matrix representing this system has a zero in the bottom right corner. That means the slope \( s_2 \) at the end of the system (where the light leaves the lens) depends only on the height \( h_1 \) of rays at the start of the system (that is, where the reference plane is). All rays starting with the same height end up with the same slope - that is, leave the lens in parallel.
fig2c
Figure 2(c). Another bunch of rays starting with the same height end up with the same slope, but the slope is different from that in the previous figure because the rays start at a different height.

The gap and lens matrices can be multiplied together to get a single matrix for the whole system. The result is:

\[ \bmatrix{h_2\\ s_2} = \bmatrix{1& 1/F\\ -F& 0}\bmatrix{h_1\\ s_1} \]

This matrix has a zero in the bottom right corner. That means the slope \( s_2 \) of the ray leaving the system depends only on the height \( h_1 \) of the ray at the start:

\[ s_2 = -F h_1 \]

That is, all rays with the same height \( h_1 \), no matter what slope they have, end up with the same slope \( s_2 \). A bundle of rays with the same slope is a bundle of parallel rays, so this matrix tells us that all rays from a single point emerge from the system in parallel. This is shown in Figure 2(b) and Figure 2(c).

We can conclude from this:

Any optical matrix with a zero in the bottom right corner describes an optical system where all light rays from a single point at the start of the system emerge in parallel.
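The same kind of check works for the gap-and-lens system. This is a sketch under assumed values (lens power 4, one start height, three made-up start slopes); the helper names are illustrative only.

```python
def matmul(m, n):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def lens(power):
    return ((1.0, 0.0), (-power, 1.0))


def gap(length):
    return ((1.0, length), (0.0, 1.0))


F = 4.0  # assumed lens power
# Light crosses the gap first, so the gap matrix sits on the right.
system = matmul(lens(F), gap(1.0 / F))

h1 = 0.01  # one shared start height, three different start slopes
slopes_out = [system[1][0] * h1 + system[1][1] * s1 for s1 in (-0.05, 0.0, 0.05)]
print(system)
print(slopes_out)
```

Every entry in `slopes_out` equals `-F * h1`, so rays leaving one point emerge parallel to each other.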

A Telescope

A telescope consists of an objective lens with power \( F_o \), a gap, and an eyepiece with power \( F_e \). The gap is the sum of the focal lengths \( 1/F_o+1/F_e \) of the objective and eyepiece (Figure 3(a)). The matrices for these three parts are given in the table below.

Matrix for objective

\[ \begin{bmatrix} 1 & 0 \\ -F_o & 1 \end{bmatrix} \]

Matrix for a gap of size \( 1/F_o+1/F_e \)

\[ \begin{bmatrix} 1 & (1/F_o+1/F_e) \\ 0 & 1 \end{bmatrix} \]

Matrix for eyepiece

\[ \begin{bmatrix} 1 & 0 \\ -F_e & 1 \end{bmatrix} \]

The matrix equation for the telescope is thus

\[ \bmatrix{h_2 \\ s_2} = \bmatrix{1 & 0 \\ -F_e & 1}\bmatrix{1 & (1/F_o+1/F_e) \\ 0 & 1}\bmatrix{1 & 0 \\ -F_o & 1} \bmatrix{h_1 \\ s_1} \]

As usual, the matrices are written from right to left in the order that the light ray hits them. We can reduce this to a simpler matrix equation by multiplying together the three matrices to get a single matrix:

\[ \begin{bmatrix} 1 & 0 \\ -F_e & 1 \end{bmatrix} \begin{bmatrix} 1 & 1/F_o+1/F_e \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -F_o & 1 \end{bmatrix} = \begin{bmatrix} -F_o/F_e & 1/F_o+1/F_e \\ 0 & -F_e/F_o \end{bmatrix} \]
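For readers who want to verify this product by hand, it can be taken in two steps. The shorthand \( L = 1/F_o + 1/F_e \) for the gap length is introduced here only to keep the working compact. First the gap times the objective:

\[ \bmatrix{1 & L \\ 0 & 1}\bmatrix{1 & 0 \\ -F_o & 1} = \bmatrix{1 - F_o L & L \\ -F_o & 1} = \bmatrix{-F_o/F_e & L \\ -F_o & 1} \]

since \( 1 - F_o L = 1 - 1 - F_o/F_e = -F_o/F_e \). Then the eyepiece times that result:

\[ \bmatrix{1 & 0 \\ -F_e & 1}\bmatrix{-F_o/F_e & L \\ -F_o & 1} = \bmatrix{-F_o/F_e & L \\ F_o - F_o & 1 - F_e L} = \bmatrix{-F_o/F_e & L \\ 0 & -F_e/F_o} \]

since \( 1 - F_e L = 1 - F_e/F_o - 1 = -F_e/F_o \).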

Then the matrix equation for a telescope looks like this:

\[ \bmatrix{h_2 \\ s_2} = \bmatrix{-F_o/F_e & (1/F_o+1/F_e) \\ 0 & -F_e/F_o}\bmatrix{h_1 \\ s_1} \]

If you recall, the magnification of the telescope is \( M=-F_e/F_o \), so the matrix equation really is this:

\[ \bmatrix{h_2 \\ s_2} = \bmatrix{1/M & \text{(length of scope)} \\ 0 & M}\bmatrix{h_1 \\ s_1} \]

Notice that this matrix has a zero in the lower left corner. That means that the slope \( s_2 \) at the end of the system depends only on the slope \( s_1 \) at the start, and not on the height \( h_1 \). That is:

\[ s_2 = M s_1 \]

What this is telling us is that a bundle of parallel rays (all with the same slope \( s_1 \)) comes out of the telescope still parallel, but with a new slope \( s_2 \). This is shown in Figure 3(b) and Figure 3(c).

fig3a
Figure 3(a). A telescope consists of an objective lens and an eyepiece separated by a gap. The gap is the sum of the focal lengths of the objective and eyepiece. Here, the eyepiece is a positive lens, so it is an astronomical telescope.
fig3b
Figure 3(b). A bundle of rays enters the telescope with the same slope \( s_1 \). Since they all have the same slope, the rays are parallel. The slope at the end of the telescope \( s_2 \) is just \( M s_1 \), so the rays all leave the telescope with the same slope, just \( M \) times bigger than the slope at the start. An astronomical telescope has a negative magnification. The rays entering the telescope have a positive slope (going upwards) but leave with a negative slope (going downwards).
fig3c
Figure 3(c). Another bundle of parallel rays with the same slope enters the telescope. They leave with the same slope, but it is again \( M \) times larger than the slope on entry.

We can conclude:

Any optical matrix with a zero in the bottom left corner is describing a system where angular magnification takes place (like a telescope).
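The telescope matrix can also be checked with exact arithmetic. The sketch below uses Python's `fractions` module so that the zero comes out exactly; the powers \( F_o = 2 \) and \( F_e = 20 \) are assumed values chosen for illustration, and the helper names are not from the text.

```python
from fractions import Fraction


def matmul(m, n):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def lens(power):
    return ((1, 0), (-power, 1))


def gap(length):
    return ((1, length), (0, 1))


F_o = Fraction(2)   # assumed objective power
F_e = Fraction(20)  # assumed eyepiece power
length = 1 / F_o + 1 / F_e  # gap = sum of the two focal lengths

# Right to left: objective, then gap, then eyepiece.
system = matmul(lens(F_e), matmul(gap(length), lens(F_o)))
M = -F_e / F_o

print(system)
print("M =", M, " 1/M =", 1 / M, " length =", length)
```

With these values the bottom left entry is exactly zero, the bottom right entry equals `M`, and the top left entry equals `1 / M`.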

A Model Eye

Every optical system can be boiled down to a single matrix with four numbers. We have so far done this with relatively simple optical systems, but now we will use matrix methods to analyze a very complicated optical system: the LeGrand Model eye (introduced in Chapter 8). The LeGrand Model eye is a simplified version of the human eye, useful for optical calculations. It consists of a (very long) series of surfaces and gaps (Figure 4).

fig4
Figure 4. The LeGrand model eye (from Chapter 8) is a complicated optical system consisting of four refracting surfaces and four gaps.

Every surface in the LeGrand eye can be written down as a matrix of the form

\[ \bmatrix{1& 0\\ \frac{-F_{surface}}{n_{out}} & \frac{n_{in}}{n_{out}}\\ } \]

where \( n_{in} \) and \( n_{out} \) are the refractive indices to the left and right of the surface, and \( F_{surface} \) is its power. Every gap in the LeGrand eye can be written down as a matrix of the form

\[ \bmatrix{1& d\\ 0 & 1\\ } \]

where \( d \) is the gap length in metres. These surfaces and gaps, with the appropriate values filled in, are given in the table below (to a large number of decimal places).

The front surface of the cornea:

\[ \bmatrix{1&0\\ -35.10722085988951 & 0.7261636772928618\\ } \]

The inside of the cornea:

\[ \bmatrix{1 & 0.00055 \\ 0 & 1\\ } \]

The back surface of the cornea:

\[ \bmatrix{1 & 0 \\ 4.566840367647913 & 1.029684462389711 } \]

The aqueous:

\[ \bmatrix{1 & 0.00305 \\ 0 & 1 } \]

The front surface of the lens:

\[ \bmatrix{1&0\cr -5.702844518088925&0.941830985915493\cr } \]

The inside of the lens:

\[ \bmatrix{1&0.004\cr 0&1\cr } \]

The back surface of the lens:

\[ \bmatrix{1&0\cr -10.47904191616765&1.062874251497006\cr } \]

The vitreous:

\[ \bmatrix{1&0.01659655247945186\cr 0&1\cr } \]

The matrix describing the entire eye can be obtained by multiplying all these matrices together in the right order. Obviously you wouldn't want to do this by hand, but luckily there are computer programs that can multiply matrices for us. If we use a computer program to multiply all of these matrices together, we get:

\[ \text{Le Grand} = \bmatrix{0 & 0.01668\\ -44.86559 & 0.67696 } \]

This is a remarkably simple result, given how many optical components went into it. The matrix for the LeGrand eye has a zero in the upper left corner, so it focuses parallel light to a single point, like a lens followed by a gap, or a surface followed by a gap.
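One way to carry out this multiplication is sketched below in plain Python. The eight matrices are copied from the table above; the helper `matmul` and the use of `functools.reduce` are choices made for this example, not the program the text refers to.

```python
from functools import reduce


def matmul(m, n):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


# Elements in the order the light meets them, copied from the table above.
elements = [
    ((1.0, 0.0), (-35.10722085988951, 0.7261636772928618)),  # cornea, front surface
    ((1.0, 0.00055), (0.0, 1.0)),                             # inside of the cornea
    ((1.0, 0.0), (4.566840367647913, 1.029684462389711)),     # cornea, back surface
    ((1.0, 0.00305), (0.0, 1.0)),                             # aqueous
    ((1.0, 0.0), (-5.702844518088925, 0.941830985915493)),    # lens, front surface
    ((1.0, 0.004), (0.0, 1.0)),                               # inside of the lens
    ((1.0, 0.0), (-10.47904191616765, 1.062874251497006)),    # lens, back surface
    ((1.0, 0.01659655247945186), (0.0, 1.0)),                 # vitreous
]

# Each later element multiplies on the left of the running product.
eye = reduce(lambda acc, m: matmul(m, acc), elements)

for row in eye:
    print([round(x, 5) for x in row])
```

Running it gives a top left entry that is zero to within rounding error and a bottom left entry of about -44.87.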

So now, if we ever want to find out where a ray of light ends up when it goes into the LeGrand eye, we can use the matrix equation

\[ \bmatrix{h_2\\ s_2} = \bmatrix{0 & 0.01668\\ -44.86559 & 0.67696 }\bmatrix{h_1\\ s_1} \]

In this equation, \( h_1 \) and \( s_1 \) are the height and slope of a ray when it hits the cornea, and \( h_2 \) and \( s_2 \) are the height and slope of the ray when it hits the retina at the end of the system. One of the most useful things about this equation is that it makes it very easy to work out the size of the retinal image from the slope of the rays hitting the eye. The zero in the upper left means that the height \( h_2 \) of the ray at the end (i.e. on the retina) depends only on the slope \( s_1 \) of the ray when it enters the eye:

\[ h_2 = 0.01668 s_1 \]

For example, if a ray (or bundle of rays) enters the eye with a slope of \( s_1=0.0875 \) (which is an angle of 5 degrees), this equation says that the height on the retina (above or below the centre of the retina) is \( h_2 = 0.0875 \times 0.01668 = 0.00146\text{ m} \), or about \( 1.5 \) mm. From that, we can say that a visual angle of \( 1 \) degree gives an image height of about \( 0.3 \) mm.
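The arithmetic in this example can be reproduced with a few lines of code. This is only a sketch: the function name `retinal_height_mm` is made up here, and the slope is taken as the tangent of the visual angle.

```python
import math

B = 0.01668  # top right entry of the LeGrand eye matrix, in metres


def retinal_height_mm(angle_deg):
    """Retinal image height in mm for rays entering at the given angle."""
    s1 = math.tan(math.radians(angle_deg))  # slope of the incoming rays
    return B * s1 * 1000.0                  # h2 = B * s1, converted from m to mm


for angle in (5.0, 1.0):
    print(angle, "degrees ->", round(retinal_height_mm(angle), 2), "mm")
```

For small angles the height is very nearly proportional to the angle, which is why the 1 degree figure is about one fifth of the 5 degree one.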

A Magnifying Lens and an Eye

We can use the LeGrand eye matrix to work out the size of an image formed by a magnifier. Suppose we hold a magnifying glass of power \( F \) at a distance of \( 1/F \) from the thing we want to magnify, and we have a gap of length \( d \) between the magnifier and the eye (Figure 5). We'll use the LeGrand matrix to represent the eye.

fig5
Figure 5. A magnifier. The object sits on the reference plane at far left (the vertical dashed line). There is a gap of width \( 1/F \), a magnifier of power \( F \), a gap of length \( d \), and finally an eye. Each of these components can be described by a matrix.

The matrices involved are:

Gap of length \( 1/F \)
\[ \bmatrix{1 & 1/F \\ 0 & 1} \]
A lens with power \( F \)
\[ \bmatrix{1 & 0 \\ -F & 1} \]
Gap of length d
\[ \bmatrix{1 & d \\ 0 & 1} \]
LeGrand eye
\[ \bmatrix{0&0.01668\\ -44.86559&0.67696 } \]

If we multiply these together in the right order, we get:

\[ \begin{align} \bmatrix{0&0.01668\\ -44.86559&0.67696 }\bmatrix{1 & d \\ 0 & 1}\bmatrix{1 & 0 \\ -F & 1}\bmatrix{1 & 1/F \\ 0 & 1} = \\ \\ \bmatrix{-0.01668F & 0 \\ -44.86559(1-d F)-0.67696 F & -44.86559/F} \end{align} \]

So the matrix equation for a ray \( \bmatrix{h_1\\ s_1} \) which starts on the object being magnified, goes through the magnifier, across the gap, and into the eye, ending up on the eye's retina, is:

\[ \bmatrix{h_2\\ s_2} = \bmatrix{-0.01668F & 0 \\ -44.86559(1-d F)-0.67696 F & -44.86559/F}\bmatrix{h_1\\ s_1} \]

This is the messiest matrix equation we have come across so far, but the interesting thing is the zero in the top right corner of the matrix. We've previously come across a zero in this place when proving the thin lens equation in Chapter 17.

If a matrix has a zero in the top right, then it describes a system where an object forms an image, and the height of the ray at the end of the system (which is the height of the ray on the retina) only depends on the height of the ray at the start (which is the height of the object). In this equation, the heights are related as:

\[ h_2 = -0.01668F\; h_1 \]

The gap \( d \) doesn't appear in this equation, so the size of the magnified image on the retina doesn't depend on how far the eye is from the magnifier.
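The independence from \( d \) can be checked numerically. The sketch below assumes a magnifier power of 8 and three made-up values of \( d \); the helper names are illustrative, and the eye matrix is the rounded one, with the sign of the bottom left entry matching the product written out above.

```python
def matmul(m, n):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def lens(power):
    return ((1.0, 0.0), (-power, 1.0))


def gap(length):
    return ((1.0, length), (0.0, 1.0))


# Rounded LeGrand eye matrix; the bottom left entry is negative.
EYE = ((0.0, 0.01668), (-44.86559, 0.67696))


def magnifier_and_eye(F, d):
    """Gap 1/F, then lens F, then gap d, then the eye, as one matrix."""
    system = gap(1.0 / F)
    for element in (lens(F), gap(d), EYE):
        system = matmul(element, system)  # later elements go on the left
    return system


F = 8.0  # assumed magnifier power
for d in (0.0, 0.02, 0.25):
    top_row = magnifier_and_eye(F, d)[0]
    print("d =", d, "top row =", top_row)
```

The top row is the same for every `d`: the first entry is `-0.01668 * F` and the second is zero.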

Full Circle.

We started in Chapter 1 by defining the vergence of a bundle of rays, and how that vergence is changed by a lens. In the matrix approach, we don't usually care about vergences, since we are following individual rays as they pass through the system. But we can connect the matrix approach to vergences, and we will do so here.

Imagine a bundle of rays diverging from a point on the optic axis, as in Figure 6(a). If we put a reference plane at a (positive) distance \( u \) from where the rays begin, then the vergence of the rays hitting the reference plane is

\[V = \frac{1}{-u}\]

Pick out one ray with height \( h \) (see Figure 6(b)). The slope of the ray is \( s=h/u \). Since \( V=1/(-u) \), the slope is also \( s = -h V \). Thus, the vergence of the ray is

\[V = \frac{-s}{h}\]
fig6a
Figure 6(a). A bundle of divergent rays crosses an imaginary reference plane (the vertical dashed line). The distance from where the rays start to the plane is \( u \), which is positive since we're measuring in the same direction as the rays travel. The vergence of the rays when they intersect the plane is \( V=1/(-u) \).
fig6b
Figure 6(b). If we look at just one ray, it has a height \( h \) above the optic axis when it intersects the plane. The slope of the ray is \( h/u \).

Thus we have two different ways of thinking about a ray:

Vergence approach: all rays have vergence \( V \).

Matrix approach: a ray has height and slope given by

\[\bmatrix{h \\ -h V}\]

Any optical system can be written as a 2-by-2 matrix \( \bmatrix{A & B \\ C & D} \). If a ray with height \( h_1 \) and vergence \( V_{in} \) hits this optical system, its slope is \( -h_1 V_{in} \) and the matrix equation is

\[ \bmatrix{h_2\\ s_2} = \bmatrix{A & B \\ C & D}\bmatrix{h_1 \\ -h_1 V_{in}} \]

From this, we have two equations

\[ \begin{align} h_2 &= A h_1 - B h_1 V_{in} = h_1(A-B V_{in})\\ s_2 &= C h_1 - D h_1 V_{in} = h_1(C-D V_{in}) \end{align} \]

The ray at the end of the system \( \bmatrix{h_2\\ s_2} \) has a vergence \( V_{out} \) given by \( V_{out}=-s_2/h_2 \). Thus

\[ V_{out} = \frac{-s_2}{h_2} = \frac{-h_1(C-D V_{in})}{h_1(A-B V_{in})}=\frac{D V_{in}-C}{A - B V_{in}} \]

That's a bit more complicated than the thin lens equation in Chapter 1, because this equation can deal with any optical system, not just a thin lens. But what if the optical system is just a thin lens? The matrix for a thin lens is

\[ \bmatrix{A & B \\ C & D} = \bmatrix{1 & 0 \\ -F & 1} \]

That is, \( A \) and \( D \) are both \( 1 \), \( C \) is \( -F \), and \( B \) is zero. Then the above vergence equation becomes

\[ V_{out} = \frac{D V_{in}-C}{A - B V_{in}}=\frac{V_{in}-(-F)}{1 - 0} = V_{in}+F \]

which is just the thin lens equation, again.
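The general vergence equation is easy to try out in code. The sketch below evaluates it directly and also recovers the same number by tracing a single ray, as a consistency check. The function names and the numerical values (a lens of power 5, a half-metre gap, an incoming vergence of -2) are assumptions made for this example.

```python
def vergence_out(m, v_in):
    """V_out = (D*V_in - C) / (A - B*V_in) for the matrix ((A, B), (C, D)).

    Raises ZeroDivisionError when A - B*V_in is zero, i.e. when the rays
    come to a point exactly at the end of the system.
    """
    (a, b), (c, d) = m
    return (d * v_in - c) / (a - b * v_in)


def traced_vergence(m, v_in, h1=0.001):
    """Same quantity, found by tracing one ray with slope -h1*V_in."""
    (a, b), (c, d) = m
    s1 = -h1 * v_in
    h2 = a * h1 + b * s1
    s2 = c * h1 + d * s1
    return -s2 / h2


thin_lens = ((1.0, 0.0), (-5.0, 1.0))       # assumed power F = 5
gap_half_metre = ((1.0, 0.5), (0.0, 1.0))   # assumed gap of 0.5 m

v_in = -2.0  # rays diverging from a point 0.5 m before the system
print(vergence_out(thin_lens, v_in), traced_vergence(thin_lens, v_in))
print(vergence_out(gap_half_metre, v_in), traced_vergence(gap_half_metre, v_in))
```

For the thin lens the result is `v_in + F`. For the gap it is the vergence of the same bundle half a metre further from its source.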