I was playing pool recently, rather badly, and remembered it is much easier to play on a computer, where one can look down on the table from above and see where the balls are more easily. Wouldn’t it be great if there were a way to see this in real time when playing pool? I haven’t done that here, yet, but let’s have a look at the steps which might be involved in a solution.
First things first: we need an image of a pool table taken from an angle, as it might be seen when standing back to take a look at which shot to go for. The following is simply from a Google image search, and is suitable for our purposes:
We want to take this image and generate a new one which visualises the table in top-down view. I therefore need to extract the position of the table surface from this image. The simplest method is to notice that pool tables are generally a bright green colour, so let’s visualise the image according to how much each pixel is dominated by its green channel:
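The post works in Matlab; as a rough illustration of this step, here is how the green-dominance image might be computed in Python with NumPy (the function name and the tiny demo array are mine, not from the post):

```python
import numpy as np

def green_dominance(img):
    # img is an (H, W, 3) RGB array; return, for each pixel, the
    # fraction of its total brightness contributed by the green
    # channel: ~1/3 for neutral greys, close to 1 for strong greens.
    img = img.astype(np.float64)
    brightness = img.sum(axis=2)
    # Guard against division by zero on pure-black pixels.
    return img[..., 1] / np.maximum(brightness, 1.0)

# Tiny synthetic example: a green pixel next to a grey pixel.
demo = np.array([[[10, 200, 30], [100, 100, 100]]], dtype=np.uint8)
print(green_dominance(demo))  # green pixel ≈ 0.83, grey pixel ≈ 0.33
```

Normalising by brightness, rather than just taking the green channel, is what makes the result robust to the shadows discussed below.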
This is better: the table surface is much more prominent, and because each pixel is normalised by its brightness, the shadows from the balls are lessened. However, there are still lots of other features we don’t need. Look at the histogram of this image, though:
There is a peak around 0.8, which corresponds to the table (the larger peak is due to the white background). If we take every pixel which lies under this peak and fill it white, with all other pixels black, we get the following:
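In code, this thresholding step might look as follows (a NumPy sketch; the peak centre of 0.8 comes from the histogram, but the band width is an assumption of mine):

```python
import numpy as np

def threshold_band(dominance, centre=0.8, width=0.1):
    # True (white) where the green-dominance value lies under the
    # histogram peak, False (black) elsewhere.
    return np.abs(dominance - centre) < width

dom = np.array([[0.82, 0.33, 0.78],
                [0.50, 0.85, 0.10]])
print(threshold_band(dom).astype(int))  # [[1 0 1]
                                        #  [0 1 0]]
```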
Getting better still, but there are some annoying holes where the balls were, along with some noise here and there. Here we can turn to a useful function in the Image Processing Toolbox of Matlab: bwareaopen. This removes all connected components of a binary image which have an area lower than a specified size. Applying this separately to light and dark components, we get the following:
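bwareaopen has no one-line NumPy equivalent, but the idea is simple: find the connected components and discard the small ones. A minimal flood-fill sketch (4-connectivity; the demo array and size threshold are illustrative):

```python
import numpy as np
from collections import deque

def area_open(mask, min_area):
    # Rough stand-in for Matlab's bwareaopen: delete 4-connected
    # components of True pixels whose area is below min_area.
    # Applying it to ~mask instead removes small dark holes.
    out = mask.astype(bool).copy()
    seen = np.zeros(mask.shape, dtype=bool)
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if out[i, j] and not seen[i, j]:
                # Flood-fill one component, collecting its pixels.
                comp, queue = [], deque([(i, j)])
                seen[i, j] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and out[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(comp) < min_area:
                    for y, x in comp:
                        out[y, x] = False
    return out

noisy = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 1],
                  [0, 0, 0, 0]], dtype=bool)
print(area_open(noisy, 2).astype(int))  # the lone pixel is removed
```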
Progress! We’re nearly at our bounding rectangle; let’s look at the edges of this image using the Canny edge-detection algorithm:
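As a stand-in for Canny (which adds Gaussian smoothing, non-maximum suppression and hysteresis thresholding on top), here is the core idea, a thresholded Sobel gradient magnitude, sketched in NumPy:

```python
import numpy as np

def gradient_edges(img, thresh=0.5):
    # Thresholded Sobel gradient magnitude: a much-simplified cousin
    # of Canny (no smoothing, thinning or hysteresis). img is a 2D
    # float array; borders are left unmarked.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    mag = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = img[i - 1:i + 2, j - 1:j + 2]
            mag[i, j] = np.hypot((patch * kx).sum(), (patch * ky).sum())
    return mag > thresh

# A white region on the right half lights up along its boundary.
img = np.zeros((5, 5))
img[:, 2:] = 1.0
print(gradient_edges(img).astype(int))
```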
There are still a few divots left by the pockets, but the dominant lines in the image are from the edges of the table. To extract these dominant edges, we turn to the Hough transform of this ‘edge’ image. This is a very useful transform, which iterates through the pixels of an image and determines which possible straight lines each pixel could lie on. Every possible line is parameterised by its angle θ, and distance of closest approach to the centre of the image, r. If lots of pixels lie upon a given line (θ, r), that point in the Hough transform will become significantly larger than its surroundings. If we look at the transform of the edge image below, we see 4 prominent peaks:
Each of these corresponds to a main edge of the table, with some smaller peaks nearby caused by the pockets. If we extract these 4 lines and plot them over the original image, we get something which looks like this:
As expected, the biggest 4 peaks are from the dominant edges of the image. Finally, we can extend these lines and find where they cross to obtain the bounding quadrilateral of the table:
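Each Hough peak (θ, r) defines a line x·cos θ + y·sin θ = r, so each corner of the quadrilateral is the solution of a 2×2 linear system, one row per line. A sketch (the example lines are made up):

```python
import numpy as np

def line_intersection(t1, r1, t2, r2):
    # Intersection of two lines given in Hough form
    # x * cos(t) + y * sin(t) = r. Intersecting adjacent pairs of the
    # four dominant lines yields the four table corners.
    A = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    b = np.array([r1, r2])
    return np.linalg.solve(A, b)  # raises LinAlgError for parallel lines

# The vertical line x = 3 meets the horizontal line y = 5:
print(line_intersection(0.0, 3.0, np.pi / 2, 5.0))  # ≈ [3, 5]
```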
Finally finally, we can apply a projective transformation (of which affine transformations are a special case) to warp the table into its flat top-down form. This assumes we know the real dimensions of the table surface:
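A sketch of this last step: the direct linear transform solves for the 3×3 homography mapping the four detected corners to a rectangle with the table’s real aspect ratio (roughly 2:1 for pool). The corner coordinates below are invented for illustration:

```python
import numpy as np

def homography(src, dst):
    # Direct linear transform: each of the four point correspondences
    # contributes two rows to an 8x8 linear system in the matrix
    # entries, with the bottom-right entry fixed at 1.
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b += [u, v]
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, p):
    # Apply homography H to a 2D point (homogeneous divide).
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return np.array([x / w, y / w])

# Map an illustrative trapezoid (the detected table corners) onto a
# 200 x 100 rectangle representing the flat table surface.
corners = [(100, 50), (300, 50), (350, 200), (50, 200)]
flat = [(0, 0), (200, 0), (200, 100), (0, 100)]
H = homography(corners, flat)
print(warp_point(H, (100, 50)))  # the first corner lands at ≈ [0, 0]
```

With H in hand, the whole image can be resampled by warping each output pixel back through the inverse transform.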
So there we go, an automated way to take an arbitrary image of a pool table and get a ‘flattened’ view of its surface. There is much more to be done here – given the transformation matrix we could extract the 3D positions of the balls and figure out where they would go if we hit the white ball in a certain direction. Perhaps then we could back-transform the resultant trajectories in real time and overlay them as an augmented-reality display for the dodgy pool player…
That’s for another day though, and another blog post.