I’ve talked before on this blog about the propagation of electromagnetic waves, and that’s not stopping any time soon. This time, I’d like to go through from first principles and demonstrate how a lens forms an image in all the gory details. Beware, there be maths ahead…

**In the beginning there was Maxwell**

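Now, light is an electromagnetic wave, so to describe it all we need are the following four equations (written here for a source-free, non-magnetic dielectric):

$$\nabla \cdot \left(n^2 \mathbf{E}\right) = 0, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla \times \mathbf{B} = \frac{n^2}{c^2}\frac{\partial \mathbf{E}}{\partial t}$$

These are the equations for the six vector components of the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$, and they include dielectric materials through the factor $n^2$ (where $n$ is the refractive index). Light is a sinusoidal wave, so we can assume that the only time dependence of these fields is a factor $e^{i\omega t}$. Using this fact, one can show:

$$\nabla^2 \mathbf{E} + n^2 k_0^2\,\mathbf{E} = -\nabla\left(\mathbf{E} \cdot \nabla \ln n^2\right)$$

where $k_0 = \omega/c$ is the magnitude of the light wavevector in vacuum. If we further assume that gradients in refractive index are slow, we arrive at the Helmholtz equation:

$$\left(\nabla^2 + k^2\right)\mathbf{E} = 0$$

where the refractive index has been absorbed into the definition of the material-dependent wavevector $k = n k_0$. It is possible to show that if a $\delta$-function source is placed at the origin, the solution of the Helmholtz equation (the Green’s function) is:

$$G(r) = \frac{e^{-ikr}}{4\pi r}$$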

Furthermore, in this limit of slow refractive index variations, it can be shown that all the components of the fields vary in the same way. We therefore concentrate on only one component, of the electric field say, and this is then scalar diffraction theory. (Vector diffraction must be considered in other situations, for example when calculating the field distribution of tightly-focussed laser pulses.) Once the time factor is restored, the Green’s function simply tells us that the phase of the light wave, $\omega t - kr$, is constantly increasing with time at temporal frequency $\omega$, and decreasing with propagation distance at spatial frequency $k$.

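The intensity of the light wave we consider is given by the magnitude of the Poynting vector

$$\mathbf{S} = \frac{1}{\mu_0}\,\mathbf{E} \times \mathbf{B}, \qquad |\mathbf{S}| \propto |G|^2 = \frac{1}{(4\pi r)^2}$$

which scales with distance as $1/r^2$ as expected, and confirms that the Green’s function above represents the amplitude of an energy flux conserved over radial shells of area $4\pi r^2$.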
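
As a quick sanity check, the sketch below numerically tests the two claims above for an outgoing spherical wave: that it satisfies the Helmholtz equation away from the origin, and that its squared magnitude times the shell area $4\pi r^2$ does not depend on $r$. The form $e^{-ikr}/4\pi r$ (with an assumed time factor $e^{+i\omega t}$), the function names and the numerical values are all assumptions made for illustration.

```python
import cmath
import math


def green(r, k):
    """Outgoing spherical wave exp(-i k r) / (4 pi r); time factor exp(+i w t) assumed."""
    return cmath.exp(-1j * k * r) / (4 * math.pi * r)


def helmholtz_residual(r, k, h=1e-4):
    """Radial Laplacian (1/r) d^2(r G)/dr^2 plus k^2 G, using central differences."""
    def u(s):
        return s * green(s, k)

    d2u = (u(r + h) - 2 * u(r) + u(r - h)) / h**2
    return d2u / r + k**2 * green(r, k)


k = 2 * math.pi / 0.5  # assumed wavelength of 0.5 in arbitrary length units

for r in (1.0, 2.0, 5.0):
    g = green(r, k)
    shell_flux = abs(g) ** 2 * 4 * math.pi * r**2  # |G|^2 times shell area
    print(r, abs(helmholtz_residual(r, k)), shell_flux)
```

The residual should be tiny compared with $k^2 |G|$ (limited only by the finite-difference step), and the last column should be the same for every radius.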

**The setup**

Now that we’ve (rather rapidly) travelled from Maxwell’s equations to an expression for the wave emitted by a single point, we need to use this to figure out how a lens builds up an image. I’ll refer to the following conceptual setup in the derivation:

We have an object in plane $P_0$, with coordinates $(x_0, y_0)$. The light from this object travels a distance $d_1$ to the plane $P_1$, just before the lens. The lens of focal length $f$ applies some magic, and transforms the field $U_1$ just before it to the field $U_1'$ just after it. Finally, this transformed wave travels a further distance $d_2$ to form an image at plane $P_2$. Let’s get started.

**Paraxial facts**

From here on out we stipulate a further restriction: the light waves don’t travel too far from the optical axis, or $\rho \ll z$, where $\rho$ is the transverse size of the object/lens/image and $z$ is the propagation distance along the axis. This is the paraxial approximation, and it makes things much simpler. In the real world the paraxial approximation is too simple, and lens designs must instead take into account the fact that it isn’t perfectly valid. We’ll ignore this inconvenient fact here.

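Suppose we propagate a light ray from a point $(x, y)$ to $(x', y')$ over a longitudinal distance $z$. The total distance the ray travels is:

$$r = \sqrt{z^2 + (x'-x)^2 + (y'-y)^2} \approx z + \frac{(x'-x)^2 + (y'-y)^2}{2z}$$

Substituting this expression into the Green’s function, in the paraxial approximation we have:

$$G(x'-x,\, y'-y;\, z) \approx \frac{1}{4\pi z}\exp\left\{-\frac{ik}{2z}\left[(x'-x)^2 + (y'-y)^2\right]\right\}$$

We have omitted the phase factor $e^{-ikz}$ from the Green’s function as it is constant across the wave field, and when we consider physical intensities constant phase factors are irrelevant.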
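
To get a feel for when the paraxial expansion is good enough, the sketch below compares the exact path length with its paraxial approximation and converts the difference into a phase error. The function names, the 500 nm wavelength and the distances are assumed values chosen for illustration only.

```python
import math


def exact_path(dx, dy, z):
    """Exact distance between two points offset by (dx, dy) transversely and z along the axis."""
    return math.sqrt(z**2 + dx**2 + dy**2)


def paraxial_path(dx, dy, z):
    """First-order (paraxial) expansion of the same distance."""
    return z + (dx**2 + dy**2) / (2 * z)


def phase_error(dx, dy, z, wavelength):
    """Phase error in radians from using the paraxial path instead of the exact one."""
    k = 2 * math.pi / wavelength
    return k * (paraxial_path(dx, dy, z) - exact_path(dx, dy, z))


wavelength = 500e-9  # assumed: green light
z = 0.1              # assumed: 10 cm propagation distance

for offset in (1e-3, 1e-2):  # transverse offsets of 1 mm and 10 mm
    print(offset, phase_error(offset, 0.0, z, wavelength))
```

For these assumed numbers the error is a small fraction of a radian at 1 mm off axis but many radians at 10 mm, which is one way to see why real lens designs cannot rely on the paraxial picture alone.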

**Getting to the lens**

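In the object plane at $(x_0, y_0)$ the wave amplitude is given by $U_0(x_0, y_0)$. From the Green’s function, we know that this point contributes a term $U_0(x_0, y_0)\,G(x_1 - x_0,\, y_1 - y_0;\, d_1)$ to the point $(x_1, y_1)$ in the plane $P_1$, a distance $d_1$ away. Adding up all of the waves emitted by the object we arrive at our first expression for the propagated wave, in this case just before the lens:

$$U_1(x_1, y_1) \propto \frac{1}{d_1}\iint U_0(x_0, y_0)\exp\left\{-\frac{ik}{2d_1}\left[(x_1-x_0)^2 + (y_1-y_0)^2\right]\right\}dx_0\,dy_0$$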

**What the lens does**

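Let’s consider a simple lens, flat on one side and consisting of a spherical cap of radius of curvature $R$ on the other. The thickness of the lens as a function of position is then given by

$$\Delta(x, y) = \Delta_0 - R + \sqrt{R^2 - x^2 - y^2}$$

where $\Delta_0$ is the thickness at the centre. Assuming the radius of curvature is large, we may expand the square root to get $\Delta(x, y) \approx \Delta_0 - (x^2 + y^2)/2R$. If the refractive index of the lens is $n$ and $k$ is the wavevector in the surrounding air, the excess phase shift imposed by the lens is:

$$\phi(x, y) = \frac{k(n-1)}{2R}\left(x^2 + y^2\right)$$

where again we have omitted a constant phase shift. You might recognise from the lensmaker’s equation the definition of the focal length of the lens

$$\frac{1}{f} = \frac{n-1}{R}$$

and so

$$\phi(x, y) = \frac{k}{2f}\left(x^2 + y^2\right)$$

The action of the lens is then to apply a phase shift to the incoming light which increases with the square of the radius, and which is proportional to the inverse of the focal length. The light field immediately after the lens is thus:

$$U_1'(x_1, y_1) = U_1(x_1, y_1)\exp\left\{\frac{ik}{2f}\left(x_1^2 + y_1^2\right)\right\}$$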

We’ve assumed that the fields just before and just after the lens, $U_1$ and $U_1'$, occupy the same position, and so implicitly assume the lens is infinitely thin. Unsurprisingly this is known as the thin lens approximation, and is something else which doesn’t necessarily hold in the real world.

**Making an image**

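We’re almost there. We’ve made it to and through the lens; the last step is to form an image. To avoid writing too many integral signs, let’s define a transfer function $H$ which dictates how the input field transforms into the output field:

$$U_2(x_2, y_2) = \iint H(x_2, y_2;\, x_0, y_0)\,U_0(x_0, y_0)\,dx_0\,dy_0$$

We have most of the ingredients for $H$ above, we just need to apply a second propagation step over the distance $d_2$. Doing this we end up with the slightly scary expression:

$$H \propto \frac{1}{d_1 d_2}\exp\left\{-\frac{ik}{2d_2}\left(x_2^2+y_2^2\right)\right\}\exp\left\{-\frac{ik}{2d_1}\left(x_0^2+y_0^2\right)\right\}\iint \exp\left\{-\frac{ik}{2}\left(\frac{1}{d_1}+\frac{1}{d_2}-\frac{1}{f}\right)\left(x_1^2+y_1^2\right)\right\}\exp\left\{ik\left[x_1\left(\frac{x_0}{d_1}+\frac{x_2}{d_2}\right)+y_1\left(\frac{y_0}{d_1}+\frac{y_2}{d_2}\right)\right]\right\}dx_1\,dy_1$$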

Let’s look at this term-by-term:

- The first term is a pure phase term in the image coordinates which will disappear when we calculate the intensity, so it can go.
- The second term is second-order in the object coordinates, so becomes negligible for sufficiently small objects. We’re in the paraxial approximation, so neglecting this is OK.
- The third term is inside the integral, so does complicated things. However, we notice that if we make the stipulation $1/d_1 + 1/d_2 = 1/f$, then this term disappears. Of course, this is the relation in geometric optics relating the object and image distances.

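With these terms out of the way, the expression is much more manageable. We have an inkling about the geometric properties of the problem now, so define the image magnification $M = -d_2/d_1$:

$$H \propto \frac{1}{d_1 d_2}\iint \exp\left\{\frac{ik}{d_2}\left[x_1\left(x_2 - M x_0\right)+y_1\left(y_2 - M y_0\right)\right]\right\}dx_1\,dy_1$$

Let’s take the limit that $\lambda \to 0$ (equivalently $k \to \infty$) – the geometric optics limit. This is equivalent to scaling the integration variables such that the limits go to infinity, and each integral becomes a $\delta$-function:

$$H \propto \delta\left(x_2 - M x_0\right)\,\delta\left(y_2 - M y_0\right)$$

This transfer function tells us that the object and image planes are the same, as long as we change coordinates such that $x_2 = M x_0$ and $y_2 = M y_0$. We therefore have a new plane containing a perfect, inverted, scaled copy of the wave field at the object plane – also known as an image!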

If we had included effects like the finite lens aperture, rather than a $\delta$-function we would instead have an Airy pattern (for a circular aperture). The image would then be a convolution between the object and the Airy pattern, reducing the resolution below the theoretically perfect one. If the imaging distances don’t fulfil the condition above, the transfer function is additionally broadened in a complex way which reduces resolution further.
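
The imaging result can also be checked numerically. The sketch below is a one-dimensional version of the two propagation steps with the thin-lens phase in between, integrated over a finite lens aperture. It assumes the sign convention $e^{i(\omega t - kr)}$, drops all constant prefactors, and uses illustrative names and parameter values that are not from the derivation above. In 1D the aperture is a slit, so the peak is sinc-shaped rather than an Airy pattern.

```python
import cmath
import math


def image_distance(d1, f):
    """Solve the imaging condition 1/d1 + 1/d2 = 1/f for d2."""
    return 1.0 / (1.0 / f - 1.0 / d1)


def transfer_1d(x2, x0, d1, d2, f, k, half_aperture, samples=1000):
    """Midpoint-rule integral over the lens coordinate x1 (1D, constant prefactors dropped)."""
    dx = 2 * half_aperture / samples
    total = 0j
    for j in range(samples):
        x1 = -half_aperture + (j + 0.5) * dx
        phase = (
            -k * (x1 - x0) ** 2 / (2 * d1)    # object plane to lens
            + k * x1**2 / (2 * f)             # thin-lens phase
            - k * (x2 - x1) ** 2 / (2 * d2)   # lens to image plane
        )
        total += cmath.exp(1j * phase)
    return total * dx


wavelength = 0.5e-6              # assumed wavelength
k = 2 * math.pi / wavelength
f, d1, x0 = 0.1, 0.3, 1.0e-3     # assumed focal length, object distance, object point
d2 = image_distance(d1, f)
M = -d2 / d1

# Scan the image plane in 1 micrometre steps around where the image is expected.
x2_grid = [-0.55e-3 + j * 1.0e-6 for j in range(101)]
magnitudes = [abs(transfer_1d(x2, x0, d1, d2, f, k, 5.0e-3)) for x2 in x2_grid]
peak_x2 = x2_grid[magnitudes.index(max(magnitudes))]
print(d2, M, peak_x2)
```

With these assumed values the imaging condition gives an image distance of about 0.15 m and a magnification of about -0.5, so the magnitude should peak close to $x_2 = M x_0 = -0.5$ mm. Shrinking `half_aperture` widens the peak, which is the resolution loss described above.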

**The payoff**

Well done for making it this far. If you were eagerly anticipating a fancy graphic or something, I’m afraid you’re out of luck. However, don’t despair! Now I’ve set the groundwork, the next post is pretty much all pretty GIFs. I’ll see you then, assuming you haven’t already unsubscribed.