All these problems can now be overcome by using the RgoogleMaps package, which is available from the CRAN repository, so the process has become very simple. Finally, the map is generated like any R plot and can be saved as a PNG or PDF file for offline viewing.
Please see the code here:
    #
    # R program to show specific addresses on a Google Map
    #
    # the correct version was not available on CRAN
    install.packages("/home/hduser/Downloads/rjson_0.2.13.tar.gz", repos = NULL, type = "source")
    setwd("/home/xxxx/xxx/maps")
    library(rjson)
    library(ggmap)
    library(RgoogleMaps)
    library(png)
    #
    # input data scraped off the web by running the python program
    # https://github.com/prithwis/WebScraper/blob/master/SchoolDataScraper0.py
    # the actual tsv file used in this exercise can be downloaded at
    # https://github.com/prithwis/WebScraper/blob/master/CalcuttaSchools.tsv
    #
    Schools = read.csv(file = "CalcuttaSchools.tsv", header = FALSE, sep = "\t")
    # since there is a limit on the number of geocoding requests that can be made,
    # we work with only 5 schools
    Schools = Schools[sample(1:nrow(Schools), 5, replace = FALSE), ]
    colnames(Schools) = c("Name", "Address")
    # Lat, Lon is extracted along with address as understood by Google
    GeoLocations = geocode(as.character(Schools$Address), output = 'latlona')
    MapData = cbind(Schools, GeoLocations)
    # the fifth column is the address returned by the geocoder
    names(MapData)[5] = "GooglePlace"
    MapData = MapData[c("Name", "lon", "lat", "Address", "GooglePlace")]
    # sanity check whether Address is similar to GooglePlace.
    # If different, possible geolocation error
    print(MapData[c("Address", "GooglePlace", "Name")])
    # --------------------------------------------------------------------------
    # Map is defined in terms of centre and zoom level
    cent2 = c(mean(MapData$lat), mean(MapData$lon))
    zoom2 = min(MaxZoom(range(MapData$lat), range(MapData$lon)))
    # first get the map from Google as a png file
    SchoolMap = GetMap(center = cent2, zoom = zoom2, destfile = "MapSchools.png", maptype = "map")
    imgSchoolMap = readPNG("MapSchools.png")
    grid::grid.raster(imgSchoolMap)
    # Define set of long, lat to be plotted on map
    LatSet = MapData$lat
    LonSet = MapData$lon
    # Plot points on the map
    # to change plot symbols look at http://www.statmethods.net/advgraphs/parameters.html
    PlotOnStaticMap(SchoolMap, lat = LatSet, lon = LonSet, cex = 0.7, pch = 6, col = "red", FUN = points, NEWMAP = TRUE)
    # Name of the school, truncated to first 4 char, will be used to identify the points
    NameSet = substr(as.character(MapData$Name), 1, 4)
    # Location where name is printed, slightly different from the point plotted
    LonOffSet2 = 0.005 + LonSet
    # Write names
    PlotOnStaticMap(SchoolMap, lat = LatSet, lon = LonOffSet2, cex = 0.7, labels = NameSet, col = "black", FUN = text, add = TRUE)
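The read step in the program relies on a tab separator, and a small experiment shows why that matters for addresses. The sketch below is only an illustration: the two rows are invented and merely mimic the name/address layout of the real file, and `schools` and `fields_per_line` are hypothetical names.

```r
# Two made-up rows in the same shape as the input file: name <TAB> address.
tsv_lines <- c(
  "Example School A\t12, Park Street, Kolkata",
  "Example School B\t5/1, Lake Road, Kolkata"
)

# Reading with a tab separator keeps each address in a single column,
# even though the addresses themselves contain commas.
schools <- read.delim(text = tsv_lines, header = FALSE, sep = "\t",
                      quote = "", stringsAsFactors = FALSE,
                      col.names = c("Name", "Address"))
print(schools)

# Splitting the same lines on commas instead breaks every line
# into more than two fields, which is what would upset a CSV reader.
con <- textConnection(tsv_lines)
fields_per_line <- count.fields(con, sep = ",")
close(con)
print(fields_per_line)
```

With the tab separator each row yields exactly two columns, whereas a comma separator would cut every address into pieces.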
A couple of observations:
- The input to the program is a TSV file containing the names and addresses of 93 schools in Calcutta. The TSV format is used because addresses typically contain "," and this can break the reading process if the file is comma-separated. The actual file used in this demo can be downloaded from GitHub.
- This input file has been created with a Python program that "scrapes" data from a specific website. This Python program is also available on GitHub.
- Google sets some limits on the number of geocoding requests that can be sent. So during the testing process, we take a random sample of 5 schools from the list of school addresses that we have downloaded.
- Please also note that the Google geocoding process is not totally reliable for addresses in India. Given the variety of address formats, the Lat/Long retrieved is sometimes erroneous, and the Calcutta schools can be placed in Iran or Mozambique! Or even in other locations in Calcutta or West Bengal. Such things happen about 10% of the time. To spot and eliminate such obvious errors, we print the address supplied and the address returned by Google and compare the two. If they differ significantly, it is best to remove the record or hand-code the Lat/Long.
- Finally, putting text on a map is always tricky because text labels can overlap and cause a mess. In such cases it is far simpler to avoid text labels in R. Once the PNG file is generated, it is very easy to add the text using any image editing software like GIMP or Photoshop.
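The grossest geocoding errors mentioned above, such as a school landing in another country, can also be flagged automatically before the addresses are compared by eye. The following is a minimal sketch, not part of the original program: the coordinates are made up, the bounding box around Calcutta is a rough assumption, and `inside_box` is a hypothetical helper. It will not catch a school misplaced elsewhere within the city, so the manual comparison of addresses is still needed.

```r
# Hypothetical geocoding output; the names and coordinates are made up.
# The third row is deliberately placed far away (roughly Tehran).
map_data <- data.frame(
  Name = c("School A", "School B", "School C"),
  lon  = c(88.36, 88.40, 51.39),
  lat  = c(22.57, 22.52, 35.69),
  stringsAsFactors = FALSE
)

# Rough bounding box around Calcutta. These limits are an assumption
# and should be adjusted to the area actually being mapped.
calcutta_box <- list(lon = c(88.0, 88.7), lat = c(22.2, 22.9))

# TRUE when a point lies inside the box; missing coordinates count as outside.
inside_box <- function(lon, lat, box) {
  !is.na(lon) & !is.na(lat) &
    lon >= box$lon[1] & lon <= box$lon[2] &
    lat >= box$lat[1] & lat <= box$lat[2]
}

ok <- inside_box(map_data$lon, map_data$lat, calcutta_box)
suspect <- map_data[!ok, ]        # rows to inspect or hand-code
map_data_clean <- map_data[ok, ]  # rows safe to plot
print(suspect$Name)
```

Rows in `suspect` can then be dropped or given hand-coded coordinates, while `map_data_clean` would be passed on to the plotting step.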