2014-04-21

Wonders of Google Maps API


As you can see, we have new Hyperbuses which must have been difficult to map, with all their jumps between stations, so we understand Google's delays ;)

All jokes aside, the transit data is still very valuable, even if not perfect. I realized I could use these directions, through the Google Maps API Web Services, to find out how long the commute would take. Also, I felt like procrastinating on the final paper that I must do for college.

So, as I read through the documentation, I come up with the following query:

https://maps.googleapis.com/maps/api/directions/json
?origin=46.774699,23.621780
&destination=46.76817,23.579759
&sensor=false
&mode=transit&arrival_time=1399356000
This returns a JSON of the available buses from my current location (the Economica 2 dorm) to my workplace. The arrival time is a random Tuesday at 9AM (EEST), and mode is set to "transit", to get transit directions, instead of ones for a personal vehicle.

Among the plethora of data returned, I find the most important piece:
"duration" : {
    "text" : "31 mins",
    "value" : 1883
},

That's exactly what I'm looking for. The text is 31 minutes, and the value looks like it's the duration in seconds, which is even more precise (though not necessarily more accurate): 31.383 minutes.

So, how to automate all this?

First, I check with a completely browser-unrelated tool that I don't need any login cookies set or anything.

wget "https://maps.googleapis.com/...

And it still works. Well, it seems it's very easy to use, and freely available to anyone. The usage limits say the limit is 2,500 directions requests per 24 hour period. That should be about enough for creating a tool which generates, say, 50x50 pixel map of duration via bus, in one day. But I could easily create a more detailed map, given more time.

This should be easy to implement using Python and matplotlib.

Getting data for one point

The first step is getting the distance from one point to my workplace using Python. This is done very easily, since Python comes pre-packaged with lots of stuff, such as urllib2 for fetching a resource at a URL, and json, for parsing, you guessed it, JSON data. Here's a Python snippet:

import urllib2
import json

response = urllib2.urlopen('http://maps.googleapis.com/maps/api/directions/json?origin=46.774699,23.621780&destination=46.76817,23.579759&sensor=false&mode=transit&arrival_time=1399356000')

data = json.load(response.read())

I paste the JSON into a collapsible viewer, JsonFormatter, so I can more easily understand its structure. The data I'm looking for, in my particular query, seems to be data['routes'][0]['legs'][0]['duration']['value'] .

I'm worried because there are a lot of 0's in my path, because those elements happened to be the first in a list. But what if there are more possible ways of reaching my destination? Are they sorted? In what order?

Thankfully, Google's documentation provides these details, and says that there is a leg for every destination or waypoint in my trip. Since I only have one destination and no waypoints, I don't need to worry, and my query is always going to be the one above. I get the minimum time, however, if you look at the code on GitHub.

Which points to get?

If you study what techniques geographers use when mapping, you'll find one called kriging. This is what an oil surveyor would use, to determine the points which would offer the most information. However, I'm a simple man, and the Google API allows me to bruteforce it, so I'll use simpler (haha, just go on reading and find out) maths.

I could use Google's API to determine if a point is in Cluj-Napoca. But I prefer a simpler way, which abuses their servers less. Here's where linear algebra comes into play. If you look at Cluj-Napoca, it kind of looks like a parallelogram:



If you read the following Stack Overflow response, a more mathematically inclined peer enlightens us. We're going to use part A of his answer. His words are as follows:
p = A + u*AB + v*AD 
This equation is able to generate any point in a parallelogram defined by A, B, and D, by tuning u and v from 0 to 1.

So, we choose u's and v's from 0 to 1, to cover our map uniformly. Actually, it's only going to be really uniform if the lengths of the sides are quite similar, otherwise, the points are going to be denser over one axis. To compensate, one would multiply the number of points on the longer axis by the ratio of the axes' lengths. But I don't bother. I just pick 11 points on each side, since the lengths are approximately equal.

Using numpy's linspace function, you can easily create a grid of n points - pairs of u's and v's covering the rectangle (0,0)-(1,0)-(1,1)-(0,1) uniformly. Here's how:
us = numpy.linspace(0, 1, int(numpy.sqrt(n)))
vs = us

for u in us:
    for v in vs:
        point = numpy.array([u, v])
(Edit: after this, I found out about numpy.mgrid. Facepalm.)

Actually, we plop those points into the equation, and we get our coordinates:
ab = b-a
ad = d-a

...
point = a + (u*ab) + (v*ad)
 

Fancy some style?

When we have all points, we group together their coordinates (into lists which I name X, Y, Z), and pass them to an interpolation function, to force them into a grid for creating a pretty contour plot.

xi = numpy.linspace(min(X), max(X), 100)
yi = numpy.linspace(min(Y), max(Y), 100)
zi = griddata(X,Y,Z,xi,yi,interp='nn')
CS = plt.contourf(xi,yi,zi,15,cmap=plt.cm.rainbow,
                  vmax=abs(zi).max(), vmin=-abs(zi).max())
plt.show()
I go on to make sure that no datapoint is ever lost, by using a Python dictionary to record them uniquely, and by loading and saving them to a file.

Also, I overlaid this map over Google's, so we can find out what exactly we're looking at. I tried plotting a few dozen times. First to get the coordinates right, and then to try out various styles.

I reached Google's query limit - but I want to thank them for offering any amount of this service for free. It's like a taxi worker losing a client because of telling you where something is, instead of taking you himself.

I ended up using complete transparency when the commute is quickest, and completely opaque purple when it's slowest, because purple is the least used in Google Maps. Also, I slightly darkened the datapoints' positions.

I'm proud of the result (click to see a larger view):



You can download the source code and play with it. Feel free to send me a pull request if you tweak it into something even more awesome.

commute-duration-map

2 comments:

  1. Wouldn't it have been easier to query only the points where there are bus stations, and do a linear interpolation between those points? With your walking speed.

    ReplyDelete
    Replies
    1. I would have needed something like a Distance transform:

      http://homepages.inf.ed.ac.uk/rbf/HIPR2/distance.htm

      http://wiki.blender.org/index.php/Doc:2.6/Manual/Textures/Types/Procedural/Voronoi

      Heh, Blender even has Manhattan distance :D

      However, it probably wouldn't have matched up with the streets anyway, and Google's guess is probably better than I could obtain.

      Delete

Share your insights!