Finding the Closest X to Y

Finding the distance between different locations is a programming task that I've done a number of times. The idea seems simple - the user wants to know the closest branch, store, or distributor relative to them. Ignoring some possible use cases (commuting routes, varying convenient starting locations, on-the-way errand stops) you end up with the relatively simple problem of comparing distances between a list of points to a single point.

Follow the Roads with Google Maps Distance Matrix

Following roads to get distance information

Following roads to get distance information

For the most accurate results I often default to a distance matrix call provided by an online service. The goto for this is Google Maps API. Of course, this is assuming that people will be measuring distance by driving (and not going as-the-crow-flies), which seems reasonable. All you need to do is pass in two sets of parameters: the list of possible destinations and the primary location. The API will return some solid information back, including the interpreted geolocation, level of accuracy, and distance.

  1. // let's say that a google employee wanted to go off-site for lunch

  2. $origin = '1600 Amphitheatre Parkway, Mountain View, CA 94043';

  3. $burger_joints = array(

  4. 'In-N-Out Burger, 1159 N Rengstorff Ave, Mountain View, CA',

  5. 'McDonalds, 1060 N Rengstorff Ave, Mountain View, CA',

  6. 'The Counter, 369 S California Ave, Mountain View, CA',

  7. 'Shoreline Grill, 1020 N Rengstorff Ave, Mountain View, CA');

  8. // all the values need to be safely encoded

  9. $origin = urlencode($origin);

  10. $burger_joints = implode('|', $burger_joints);

  11. $burger_joints = urlencode($burger_joints);

  12. // construct the request to api

  13. $request = '';

  14. $request .= 'http://maps.googleapis.com/maps/api/distancematrix/json';

  15. $request .= "?origins={$origin}";

  16. $request .= "&destinations={$burger_joints}";

  17. $request .= '&sensor=false';

  18. // make the request, pull the response

  19. $response = file_get_contents($request);

  20. $response = json_decode($response);

  21. // just in case something is not accepted

  22. if($response->status != 'OK')

  23. exit("Huh, there was an error. Response: {$response->status}");

  24. // loop through response

  25. // verify that address was accepted and store in sortable array

  26. $distance_container = array();

  27. foreach($response->rows[0]->elements as $index => $element)

  28. {

  29. if($element->status != 'OK')

  30. continue;

  31. $distance = $element->distance->value;

  32. $address = $response->destination_addresses[$index];

  33. $distance_container[$distance] = $address;

  34. }

  35. // sort by distance!

  36. ksort($distance_container);

Where this approach fails is when you have more than 100 possible locations. The free services are limited to around 100 locations per request (or, like Google's API, 100 locations per 10 seconds or 2500 locations per 24 hours (if you try to get tricky)). If your application could exceed any of these limits you'll either need to pay for the API or try a different solution.

Geocoding, because Address are Meh

Online services (like Google Maps) are nice because they will convert our everyday nomenclature (street addresses) into specific points (latitude and longitude). Once we start moving into custom distance formulas we'll need to use specific points. If you're dealing with street addresses you'll need to convert them into latitude/longitude with a geocoding services. And yes, you can use Google Maps API, although they will only convert so many per unit of time. You should store the coordinates to avoid geocoding on each application load.

  1. // let's find the position of the Googleplex

  2. $address = '1600 Amphitheatre Parkway, Mountain View, CA 94043';

  3. $address = urlencode($address);

  4. // construct the request to api

  5. $request = '';

  6. $request .= 'http://maps.googleapis.com/maps/api/geocode/json';

  7. $request .= "?address={$address}";

  8. $request .= '&sensor=false';

  9. // make the request, pull the response

  10. $response = file_get_contents($request);

  11. $response = json_decode($response);

  12. // just in case something is not accepted

  13. if($response->status != 'OK')

  14. exit("Huh, there was an error. Response: {$response->status}");

  15. // get the latitude and longitude, which will be in degree format

  16. $latitude = $response->results[0]->geometry->location->lat;

  17. $longitude = $response->results[0]->geometry->location->lng;

Gross Approximation: Welcome to Flatland

Distance as-the-crow-flies

Distance as-the-crow-flies

Earth is a big place. For short distances you can pretend that the earth is flat (Pythagoras is totally overrated) and do some simple maths (but we'll still use his formula). This gets increasingly inaccurate as the points move away from each other and/or they move towards the poles. If you're worried about such things (or are dealing with distances more than 20 km away) you may want to use a spherical earth projected to a plane.

Approximating a sphere on a flat plane

Approximating a sphere on a flat plane

  1. // if you want the approximate distance, not just relative

  2. $earth_radius = 6371; // km

  3. // Googleplex!

  4. $latitude1 = 37.422163;

  5. $longitude1 = -122.083059;

  6. // that Shoreline Grill sounded nommy

  7. $latitude2 = 37.419192;

  8. $longitude2 = -122.093386;

  9. // deg2rad is not really necessary if you are just comparing

  10. $latitude_difference = deg2rad($latitude1) - deg2rad($latitude2);

  11. $longitude_difference = deg2rad($longitude1) - deg2rad($longitude2);

  12. // flat earth, pythagorean style

  13. $distance = pow($latitude_difference, 2) + pow($longitude_difference, 2);

  14. $distance = sqrt($distance) * $earth_radius; // unnecessary for comparison

  15. // sphere projected on a plane

  16. // note: php cos requires the arg to be in radians

  17. $mean_latitude = ($latitude1 + $latitude2) / 2;

  18. $distance = pow($latitude_difference, 2);

  19. $distance += pow($longitude_difference * cos($mean_latitude), 2);

  20. $distance = sqrt($distance) * $earth_radius; // unnecessary for comparison

Where the flat projection approach really shines is the computation speed. You can simplify these formulas if you aren't truly interested in the distance, just in the relative closeness. After all, the distances may be inaccurate with this method but their relative closeness will be less off.

Spherical Approximation on Perfect Earth

If you are dealing with a lot of varying distances, especially in the latitudinal dimension, the next step for accuracy would use spherical trigonometry. This will get accuracy to about .5%, since Earth is not a perfect sphere (the bulge around the equator). There are two main methods here - tunnels (like, through Earth and stuff) and circles (more like planes flying around). Either one will get you a good idea of relative closeness, while the circle one will get you a better idea of as-the-crow-flies distances.

Tunnels and circles on an awesome sphere drawing

Tunnels and circles on an awesome sphere drawing

  1. // if you want the approximate distance, not just relative

  2. $earth_radius = 6371; // km

  3. // Googleplex!

  4. $latitude1 = 37.422163;

  5. $longitude1 = -122.083059;

  6. // still trying to find Shoreline...

  7. $latitude2 = 37.419192;

  8. $longitude2 = -122.093386;

  9. // for tunnels we'll need 'colatitude'

  10. $colatitude1 = 90 - $latitude1;

  11. $colatitude2 = 90 - $latitude2;

  12. // since we're using sine and cosine, everything needs to be converted

  13. $latitude1 = deg2rad($latitude1);

  14. $colatitude1 = deg2rad($colatitude1);

  15. $longitude1 = deg2rad($longitude1);

  16. $latitude2 = deg2rad($latitude2);

  17. $colatitude2 = deg2rad($colatitude2);

  18. $longitude2 = deg2rad($longitude2);

  19. // okay, tunnels first!

  20. $distance = 0;

  21. $distance += pow(cos($colatitude2) * cos($longitude2)

  22. - cos($colatitude1) * cos($longitude1), 2);

  23. $distance += pow(cos($colatitude2) * sin($longitude2)

  24. - cos($colatitude1) * sin($longitude1), 2);

  25. $distance += pow(sin($colatitude2) - sin($colatitude1), 2);

  26. // again, this is not necessary if you're just comparing

  27. $distance = sqrt($distance) * $earth_radius;

  28. // if tunnels aren't your thing, try as-the-crow-flies

  29. $distance = 0;

  30. $distance += sin($latitude1) * sin($latitude2);

  31. $distance += cos($latitude1) * cos($latitude2)

  32. * cos($longitude2 - $longitude1);

  33. $distance = acos($distance);

  34. // unneccessary for vanilla comparisons

  35. $chord *= $earth_radius;

Too Close to Perfection - Ellipsoids

The formulas necessary for trigonometry with ellipsoids are tough and computationally ridiculous. If you are worried about this level of accuracy you'd be better off worrying about how far off your distances are going to be with roads, topology, and lakes. Seriously. I'm not going to talk about ellipsoids.

Conclusion

These different methods can be summed up very nicely based off of the relative number of points you're dealing with and the speed that your application needs to return data by.

  • Less than 100 points - use online mapping service
  • More than 100 points - use spherical tunnels
  • Way more than 100 points - use spherical projected on a plane
  • Way, way more than 100 points that are really close - use Pythagorean