A common problem when working with spatial data is the closest-pair-of-points problem.

Wikipedia defines the problem as: given n points in a metric space, find a pair of points with the smallest distance between them.

In a GIS for example, the 2-dimensional (planar) case is of great interest.

A naive solution is to just check each point against all other points and keep track of the shortest distance found so far.

We could call this the ‘brute-force’ solution and this is of course O(n^{2}).

Here’s a pseudocode implementation courtesy of Wikipedia:

minDist = infinity

for each p in P:

for each q in P:

if p ≠ q and dist(p, q) < minDist:

minDist = dist(p, q)

closestPair = (p, q)

return closestPair

It may surprise you to find out that we can do a lot better than this!

There is a divide and conquer strategy that recursively splits the space and solves the sub-problems first.

Take the 2D case for illustration purposes: If we split the plane along a vertical line, find the closest pair on each side of the vertical line, then the only problem left to solve is to find if there is a closer pair where the first item of the pair is on one side of our dividing line, and the second is on the other side.

At this point we are basically where we started – if we examine all the points on the left hand side and all the points on the right we are back at O(n^{2}).

The Wikipedia page has a nice explanation, but the core observation that lets us improve the complexity of the algorithm here is called the ‘sparse-box observation’.

Basically, we only need to check the points from each partition that are close to the dividing line! How close? well, the minimum distance between pairs on either side (because if a point is right on the dividing line, the other point can be that far away and still end up in the pair with the shortest distance).

With this observation in hand, we know that the pair we are looking for is either:

- The pair with the shortest distance on the left hand side
- The pair with the shortest distance on the right hand side
- The pair with the shortest distance where one member of the pair is on the right and one member is on the left

Check those distances and we’re done!

The running time of this algorithm is O(n log n).

You can probably imagine that this algorithm can be extended to higher dimensions and you would be right, we could split along a hyperplane instead of a line and go from there. I may tackle that in a future blog post.

I was surprised to see that there was no F# or Haskell solution on Rosetta Code.

Eran Leiserowitz has been doing a great series on computational geometry in Haskell and covered this problem back in November.

Inspired by the Haskell, I thought I’d try my hand at an F# implementation.

open System let dist (x1,y1) (x2,y2) = Math.Sqrt((x1-x2)*(x1-x2) + (y1-y2)*(y1-y2)) let closestBf points = let n = Seq.length points let list = points |> Seq.toList seq { for i in 0..n-2 do for j in i+1..n-1 do yield list.[i], list.[j] } |> Seq.minBy (fun (a, b) -> dist a b) let rec closestInternal points = match points with | _ when points |> Seq.length < 4 -> closestBf points | _ -> //partition points about a vertical line let sorted = points |> Seq.sortBy(fun (x,y) -> x) let left = sorted |> Seq.take((points |> Seq.length)/2) let right = sorted |> Seq.skip((points |> Seq.length)/2) //recurse each side of the vertical line let lMin = closestInternal left let rMin = closestInternal right //find minimum distance between closest pairs on each side of the line let lDist = match lMin with | (a,b) -> dist a b let rDist = match rMin with | (a,b) -> dist a b let minDist = Math.Min(lDist,rDist) let dividingX = left |> Seq.toList |> List.rev |> List.head |> fst //find close points on the right to the dividing line let closePoints = right |> Seq.takeWhile(fun (x,y) -> x - dividingX < minDist) |> Seq.sortBy(fun (x,y) -> y) //take the close points and merge them with the close points to the dividing line on the left hand side let pairs = left |> Seq.skipWhile(fun (x,y) -> dividingX - x > minDist) |> Seq.collect(fun (x,y) -> closePoints |> Seq.skipWhile(fun (x1,y1) -> y1 < y - minDist) |> Seq.takeWhile(fun (x2,y2) -> y2 < y + minDist) |> Seq.map(fun a -> ((x,y),a))) |> Seq.toList //return the closest pair of points from the three groups pairs |> List.append [lMin;rMin] |> List.sortBy(fun (a,b) -> dist a b) |> List.head

And.. it works (I think!):

let list = [(2.0,2.0);(0.0,0.0);(2.0,1.0);(5.0,5.0);(9.0,9.0);(10.0,10.0);(8.0,8.0);(4.5,6.0)] let closest = closestInternal list

val closest : (float * float) * (float * float) = ((2.0, 2.0), (2.0, 1.0))

It’s up on GitHub here, and at FSharp Snippets (which is ridiculously awesome and told me that the F# compiler produced errors when I first pasted in my snippet!) here.

[…] January 1, 2011 Martin Szarski Leave a comment Go to comments Looking at the previous post people unfamiliar with F# might have noticed the copious use […]

Thanks. Always looking for COmpGeom and F#.