This post was originally published by Ryan Hafen on his website, RyanHafen.com, and has been reprinted here with his permission.
There are many map plotting features in rBokeh that I haven’t been able to cover in detail in the documentation. This post will go into a few of those, including Google map types, custom map styles, and using different layer functions to plot on top of a map.
Bokeh has the ability to use a Google map as a backdrop for plotting. For example, here’s a map of Manhattan and surrounding area:
# install.packages("rbokeh", repos = "http://packages.tessera.io")
library(rbokeh)
gmap(lat = 40.73306, lng = -73.97351, zoom = 12, width = 680, height = 600)
This is what you typically expect to get with a Google map: you can pan and zoom (click the mouse zoom tool in the toolbar to zoom), etc. To get a map we simply call gmap() and specify the latitude and longitude of the center of the area we want to see and the level of zoom, ranging from 1 (world) to 20 (buildings).
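To get a feel for what a zoom level means on the ground, here is a rough sketch based on the standard Web Mercator tiling scheme (256-pixel tiles that double in resolution with each zoom level). The helper name meters_per_pixel is hypothetical and not part of rBokeh, and the result is only an approximation.

```r
# Approximate ground resolution for Web Mercator tiles. This is an assumption
# about the tiling scheme, not something rBokeh computes for you.
meters_per_pixel <- function(lat, zoom) {
  equator_res <- 156543.03392  # meters per pixel at zoom 0 on the equator
  equator_res * cos(lat * pi / 180) / 2^zoom
}

mpp <- meters_per_pixel(lat = 40.73306, zoom = 12)
width_km <- 680 * mpp / 1000  # approximate ground width of a 680 px wide map
round(c(meters_per_pixel = mpp, width_km = width_km), 1)
```

At zoom 12 this comes out to roughly 29 meters per pixel, so the 680 pixel wide map above spans on the order of 20 km.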
For the sake of having an example to use in this post, I downloaded the December 2015 NYC Citi Bike data from here and summarized the number of times a bike was picked up or dropped off at each of the stations. This summary is stored in a GitHub gist here.
Let’s start by reading it in.
bike <- read.csv("https://gist.githubusercontent.com/hafen/3d534ee95b964ef753ab/raw/dbe9f0cbe29d17151d852e8cc1c3466f7a7f02e9/201512_nycbike_summ.csv",
  stringsAsFactors = FALSE)
head(bike)
#            station      lat       lon n_start n_end
# 1 1 Ave & E 15 St 40.73222 -73.98166    4258  4272
# 2 1 Ave & E 18 St 40.73381 -73.98054    2854  2860
# 3 1 Ave & E 30 St 40.74144 -73.97536    3241  3251
# 4 1 Ave & E 44 St 40.75002 -73.96905    2020  2013
# 5 1 Ave & E 62 St 40.76123 -73.96094    2165  2156
# 6 1 Ave & E 68 St 40.76501 -73.95818    3533  3530
We have the names and geographic coordinates of each station and how many bikes started and ended at each station in December 2015.
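The summary was computed ahead of time, so the aggregation step is not shown here. As a rough sketch of how such per-station counts could be derived from raw trip records, assuming a hypothetical trips data frame with start_station and end_station columns (the real Citi Bike column names may differ):

```r
# Toy trip records; the column names are assumptions for illustration only
trips <- data.frame(
  start_station = c("A", "A", "B", "C", "A"),
  end_station   = c("B", "C", "A", "A", "B"),
  stringsAsFactors = FALSE
)

# Count how often each station appears as a trip start and as a trip end.
# Using a shared set of factor levels keeps both counts aligned by station.
stations <- sort(unique(c(trips$start_station, trips$end_station)))
summ <- data.frame(
  station = stations,
  n_start = as.integer(table(factor(trips$start_station, levels = stations))),
  n_end   = as.integer(table(factor(trips$end_station, levels = stations))),
  stringsAsFactors = FALSE
)
summ
```

Joining the station coordinates onto a table like this would give the same shape as the bike data frame above.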
To start, let’s overlay the station locations on the Google map. We do this just as we do with any rBokeh plot, by adding layers.
gmap(lat = 40.73306, lng = -73.97351, zoom = 12,
  width = 680, height = 600, map_type = "hybrid") %>%
  ly_points(lon, lat, data = bike, hover = c(station, n_start, n_end))
This simply shows the locations of the bike stations (using ly_points()) on top of a “hybrid” map, one of the few map_type options available with gmap(). When we hover over a station we get the station name and the number of arrivals and departures. We can zoom in to see where the stations are located (note: exactness of placement can be questionable due to lat/lon rounding in our data or slight misalignment of the tiles, but we will see that it does a pretty good job here).
Using different map types
There are four different map types that can be used: “hybrid”, “satellite”, “roadmap”, and “terrain”. We will use “roadmap” for the rest of the examples here but feel free to experiment with the others.
Here is how to make the map with “roadmap” map type.
gmap(lat = 40.73306, lng = -73.97351, zoom = 12,
  width = 680, height = 600, map_type = "roadmap") %>%
  ly_points(lon, lat, data = bike, hover = c(station, n_start, n_end))
Using different styles
You can specify different Google map styles for your plots using the map_style argument and retrieving one of the built-in map styles using gmap_style(). For example, below we create a map with the “blue_water” style.
gmap(lat = 40.73306, lng = -73.97351, zoom = 12,
  width = 680, height = 600, map_style = gmap_style("blue_water")) %>%
  ly_points(lon, lat, data = bike, hover = c(station, n_start, n_end))
See ?gmap_style for a list of styles that I have built into the package.
Using custom styles
If you do not like any of the built-in styles, you can use your own. For example, snazzymaps.com has several Google map styles.
You can simply copy the JSON and pass that as map_style. For example, we can take the JSON from the “Brooikline Blank” style shown in the image above and specify it as our map style.
style <- '[{"featureType":"administrative","elementType":"all","stylers":[{"hue":"#ffffff"},{"lightness":100},{"visibility":"off"}]},{"featureType":"landscape","elementType":"all","stylers":[{"hue":"#ffffff"},{"saturation":-100},{"lightness":100},{"visibility":"on"}]},{"featureType":"poi","elementType":"all","stylers":[{"hue":"#ffffff"},{"saturation":-100},{"lightness":100},{"visibility":"off"}]},{"featureType":"road","elementType":"geometry","stylers":[{"hue":"#000000"},{"saturation":-100},{"lightness":-100},{"visibility":"simplified"}]},{"featureType":"road","elementType":"labels","stylers":[{"hue":"#ffffff"},{"saturation":-100},{"lightness":100},{"visibility":"off"}]},{"featureType":"transit","elementType":"geometry","stylers":[{"hue":"#000000"},{"lightness":-100},{"visibility":"on"}]},{"featureType":"transit","elementType":"labels","stylers":[{"hue":"#ffffff"},{"lightness":100},{"visibility":"off"}]},{"featureType":"water","elementType":"all","stylers":[{"hue":"#ffffff"},{"saturation":-100},{"lightness":100},{"visibility":"on"}]}]'
gmap(lat = 40.73306, lng = -73.97351, zoom = 12,
  width = 680, height = 600, map_style = style) %>%
  ly_points(lon, lat, data = bike, hover = c(station, n_start, n_end))
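A style does not have to come from a gallery. The JSON above is an array of rules, each with a featureType, an elementType, and a list of stylers, so a small style can be written by hand. The sketch below is only an illustration: the name style_min and the particular colors are made up, and it assumes gmap() accepts any JSON string of this shape in the same way as the copied one.

```r
# A minimal hand-written style (hypothetical example): tint the water and
# hide points of interest. Built with paste0() so each rule sits on its own line.
style_min <- paste0(
  '[',
  '{"featureType":"water","elementType":"all","stylers":[{"color":"#a2daf2"}]},',
  '{"featureType":"poi","elementType":"all","stylers":[{"visibility":"off"}]}',
  ']'
)

# It would then be passed just like the copied style:
# gmap(lat = 40.73306, lng = -73.97351, zoom = 12,
#   width = 680, height = 600, map_style = style_min)
style_min
```

Starting from one or two rules like this and adding more as needed can be easier to reason about than editing a long copied style.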
A slightly more interesting plot
So far what we’ve been plotting is a bit boring. Let’s go a little further and make the point sizes relative to the difference between the number of times a station was a starting vs. ending destination. We’ll make the points green if there were more departures than arrivals and red if there were more arrivals than departures. Finally, let’s get even more fancy and add a set of polygons on top of the map that delineate the NYC community districts. I don’t know why this would be particularly useful for looking at bike stations, but it’s a nice illustration of the versatility you have in adding layers to a plot just like you do with normal rBokeh figures.
The polygon data for the community districts needs to be read in from a gist, and then we can make the plot.
cdist <- read.csv("https://gist.githubusercontent.com/hafen/a447521ff8b24ddefba5/raw/044e174d7b9e6a370fff429f9cda4d0903b4c0a6/communitydistricts.csv")
# positive diff means more departures than arrivals
bike$diff <- bike$n_start - bike$n_end
bike$color <- ifelse(bike$diff > 0, "#2CA02C", "#D62728")
gmap(lat = 40.73306, lng = -73.97351, zoom = 12,
  width = 680, height = 600, map_style = gmap_style("blue_water")) %>%
  # community district outlines
  ly_polygons(x, y, group = which, data = cdist,
    fill_alpha = 0.1, line_width = 2, color = "orange") %>%
  # stations sized by the absolute difference and colored by its sign
  ly_points(lon, lat, data = bike,
    hover = c(station, n_start, n_end, diff),
    fill_alpha = 0.8, size = abs(diff), color = color, legend = FALSE)
Yippee.
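To make the size and color encoding concrete, here is the same diff and color calculation applied to a tiny made-up data frame (the station names and counts are invented for illustration). One thing it shows is that a station with exactly as many departures as arrivals falls into the red group, since the test is diff > 0.

```r
# Made-up counts for three hypothetical stations
toy <- data.frame(
  station = c("s1", "s2", "s3"),
  n_start = c(120, 80, 50),
  n_end   = c(100, 95, 50),
  stringsAsFactors = FALSE
)

# Same logic as in the plot code: green when departures exceed arrivals,
# red otherwise (including ties)
toy$diff  <- toy$n_start - toy$n_end
toy$color <- ifelse(toy$diff > 0, "#2CA02C", "#D62728")
toy[, c("station", "diff", "color")]
```

If ties should be shown differently, a third color could be assigned to rows where diff is zero before plotting.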
Why use rBokeh for maps
While there are many R packages with map plotting capabilities, with these packages you often either get the map as the primary output and a limited, less R-like set of tools for plotting on top, or you get a lot of tools for plotting but the result is a static image. With rBokeh you get all of rBokeh’s plotting functionality with an interactive map. Of course I could be completely ignorant of features in other packages.
For the future of map plotting, Bokeh has a generic tile rendering capability which could open up a lot more possibilities, but I’m not sure how mature this capability is right now beyond Google maps.
Remember, you can install rBokeh from the R channel in Anaconda Cloud:
$ conda install -c r r-rbokeh