This document describes how to use R to generate KML files.
Part 1: Introduction to Creating KML Files for Geospatial Visualizations
Keyhole Markup Language (KML) files are used to denote features in a geospatial context. The KML file format was created to be used for Google Earth but eventually was made into an open standard. Although usually associated with Google Earth, KML files can be used by many online web mapping applications.
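To make the format concrete, here is a minimal sketch of a KML file containing a single point, written from R using base functions only. The placemark name, the coordinates and the output filename are illustrative assumptions; the files produced later in this tutorial by plotKML are considerably richer.

```r
# Minimal KML document with one Placemark (illustrative values only).
# Note: KML lists coordinates as longitude,latitude[,altitude].
kml_lines <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<kml xmlns="http://www.opengis.net/kml/2.2">',
  '  <Document>',
  '    <Placemark>',
  '      <name>Toronto (example)</name>',
  '      <Point><coordinates>-79.4,43.7,0</coordinates></Point>',
  '    </Placemark>',
  '  </Document>',
  '</kml>'
)

# Write to a temporary folder; "minimal_example.kml" is a hypothetical name.
kml_path <- file.path(tempdir(), "minimal_example.kml")
writeLines(kml_lines, kml_path)
```

Opening the resulting file in Google Earth should show a single pin near Toronto.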
There are other R packages that can export results directly to Google Maps or Leaflet maps. However, creating a KML file that stands apart from the R code is useful for further geographic analysis, presentations, and online distribution.
The goal of this project walkthrough will be to demonstrate the following:
- Draw regional boundaries from shapefiles.
- Shade boundaries with different colours (i.e. a choropleth). An insurance example would be to use this to map territorial relativities.
- Draw points representing different geographic locations. The obvious example of using this would be to plot locations of claims and severity.
- View the KML files in Google Earth or using online mapping APIs.
This tutorial will rely on the following resources:
This tutorial was initially conceived before Google Earth Pro was made free (previously an annual subscription was required). Google Earth Pro can accomplish the same goals; however, it is not open source, and the approach in this tutorial integrates directly with results that users create in R.
Please ensure the following packages are installed:
- dplyr - Hadley Wickham's (new) data tools.
- plotKML - Package used to create the KML files.
- rgdal - Used to read the shapefile and to change the map projection.
- sp - A library of spatial functions.
install.packages(c("dplyr", "plotKML", "rgdal", "sp"))
Part 2: Drawing a Choropleth
A choropleth is a map that is shaded to represent a statistical value. For example, this choropleth shows migration patterns in China (source: Financial Times).

To build a choropleth, we first need a file that contains the geographic boundaries for each area. In this tutorial, we will be using the Canadian 2011 Census Boundary Files, which can be downloaded here. Specifically, we will use the English, ArcGIS ® (.shp), Forward Sortation Area, Cartographic Boundary File (about 64 MB). Each boundary will then be assigned a value, as described later in the tutorial.
- ArcGIS is Geographic Information System (GIS) software sold by Esri. It is regarded as the most established GIS software on the market and is used by most GIS professionals. Many geographic files are saved in its shapefile (.shp) format. When we extract the .zip file, notice that there are multiple files with the same filename but different extensions. They are all required, as together they describe the shapefile and its attributes (like a database) for each record. Typically the geometry is stored in the .shp file, the .shx file is an index into that geometry, and the attributes attached to each record are stored in the .dbf file.
- The Forward Sortation Area (FSA) is the first three characters of the postal code (the Canadian counterpart of the US zip code). It is used by Canada Post to help map mailing areas and is commonly used by insurers for rating geography.
- The Cartographic Boundary file is simpler and does not have the full detail for the coastal areas.
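Before reading the data, it can help to confirm that the main component files were extracted together. The sketch below is a hypothetical helper (not part of any package) that checks for the .shp, .shx and .dbf files sharing the base name used in this tutorial.

```r
# Hypothetical helper: report which core shapefile components exist in a
# directory for a given base name.
# .shp = geometry, .shx = index, .dbf = attribute table.
check_shapefile_parts <- function(base_name, dir = ".") {
  exts <- c("shp", "shx", "dbf")
  paths <- file.path(dir, paste0(base_name, ".", exts))
  found <- file.exists(paths)
  names(found) <- exts
  found
}

# Usage, assuming the zip contents were extracted to the working directory:
parts <- check_shapefile_parts("gfsa000b11a_e")
if (!all(parts)) {
  message("Missing components: ", paste(names(parts)[!parts], collapse = ", "))
}
```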
Let's begin by initializing the libraries:
library("dplyr")
library("plotKML")
library("rgdal")
library("sp")
Then we will reference two datasets that store colour palettes:
data(R_pal)
data(SAGA_pal)
We will need to define the working directory. This can be done either through the GUI or with the setwd function.
setwd("C:/Users/Username/Desktop/KMLR")
Make sure to place the contents of the zip file into the working directory. Now let's read the shapefile into R using the readOGR function. Note that it requires both the filename (dsn) and the name of the layer that contains the spatial information.
> shpfile <- readOGR(dsn="gfsa000b11a_e.shx", layer="gfsa000b11a_e")
OGR data source with driver: ESRI Shapefile
Source: "gfsa000b11a_e.shx", layer: "gfsa000b11a_e"
with 1621 features
It has 3 fields
We can view some of the properties of this file by printing the object, and see what the data attributes look like by applying head to its @data slot:
> shpfile
class : SpatialPolygonsDataFrame
features : 1621
extent : -141.0181, -52.61941, 41.68144, 83.1355 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0
variables : 3
names : CFSAUID, PRUID, PRNAME
min values : A0A, 10, Alberta
max values : Y1A, 62, Yukon
> head(shpfile@data)
CFSAUID PRUID PRNAME
0 R0G 46 Manitoba
1 R0H 46 Manitoba
2 R0J 46 Manitoba
3 R0K 46 Manitoba
4 R0L 46 Manitoba
5 R0M 46 Manitoba
The dataset contains the following columns:
- CFSAUID is the Forward Sortation Area.
- PRUID is the Province ID.
- PRNAME is the Province Name.
For this exercise, we can use some simple commands to keep only the records for Toronto.
shpfile.subset <- shpfile[shpfile$PRUID==35,] # Keep only Ontario
shpfile.subset <- shpfile.subset[substr(shpfile.subset$CFSAUID,1,1)=="M",] # Toronto only
The list of categories can be extracted to a CSV file for review and to append values.
write.csv(as.data.frame(shpfile.subset@data), file = "ShapefileChoroplethList.csv")
In order to use this shapefile with Google Earth, we will need to change its coordinate reference system to WGS 84.
shpfile.subset <- spTransform(shpfile.subset, CRS("+proj=longlat +datum=WGS84"))
Now we need to assign a value to each record in the shapefile. For this tutorial we will use random numbers, as this roughly emulates a case where the values come from an analysis in R. Some alternate code is also provided for the case where an input file is used.
#Use random numbers
shpfile.category <- as.data.frame(shpfile.subset@data)
set.seed(123) #Ensure reproducible results
shpfile.category$CATEGORY <- sample(1:100, nrow(shpfile.category), replace=TRUE)
shpfile.subset@data <- left_join(shpfile.subset@data, shpfile.category, by=c("CFSAUID","PRUID","PRNAME")) #Join on all shared columns
#Alternate method to use an input file
#shpfile.category <- read.csv("ShapefileChoropleth.csv",colClasses=c("character","numeric"))
#shpfile.subset@data <- left_join(shpfile.subset@data, shpfile.category, by = c("CFSAUID" = "FSA"))
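Because the attribute table must stay aligned row-for-row with the polygons, it is worth checking that a join neither reorders nor drops records. The sketch below uses toy data frames and base R match() rather than left_join so that it runs without any packages. The column names FSA and CATEGORY mirror the commented-out input-file code above, and the values are made up.

```r
# Toy stand-in for shpfile.subset@data (order matters: one row per polygon).
attrs <- data.frame(CFSAUID = c("M1B", "M1C", "M1E"),
                    stringsAsFactors = FALSE)

# Toy stand-in for the input file; note the different row order and the
# missing entry for M1E.
lookup <- data.frame(FSA = c("M1C", "M1B"),
                     CATEGORY = c(79, 29),
                     stringsAsFactors = FALSE)

# match() returns, for each polygon, the row of the lookup table to use,
# so the result keeps the original polygon order.
attrs$CATEGORY <- lookup$CATEGORY[match(attrs$CFSAUID, lookup$FSA)]

# Areas without a value end up as NA and should be reviewed before plotting.
unmatched <- attrs$CFSAUID[is.na(attrs$CATEGORY)]
```

Any FSA listed in unmatched has no value in the input file and would otherwise be drawn without a meaningful colour.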
We will create a label that will contain the FSA name and the imported category. This will show on the KML file when an area is clicked.
shpfile.subset$LABEL <- paste(shpfile.subset$CFSAUID,"(",shpfile.subset$CATEGORY,")",sep="") #Column names are case sensitive
> head(shpfile.subset@data)
CFSAUID PRUID PRNAME CATEGORY LABEL
1 M1B 35 Ontario 29 M1B(29)
2 M1C 35 Ontario 79 M1C(79)
3 M1E 35 Ontario 41 M1E(41)
4 M1G 35 Ontario 89 M1G(89)
5 M1H 35 Ontario 95 M1H(95)
6 M1J 35 Ontario 5 M1J(5)
To create the KML file, we will be using the kml function from the plotKML package, which has the following format and parameters:
- Function call
- kml(obj, folder.name, file.name, kmz, …)
- Parameters
- obj: object inheriting from the Spatial* or the Raster* classes
- folder.name: character; folder name in the KML file
- file.name: character; output KML file name
- kmz: logical; specify whether to compress the KML file
From using plotKML for a bit, I have noticed that some of the aesthetic parameters can be a little difficult to work with. They are better described in the tutorial. The ones we will be exploring are:
- colour: The variable that determines the colour to use. In this case we will use CATEGORY.
- colour_scale: The colour palette to use, which can be selected from the samples displayed here.
- altitude: This gives the choropleth a vertical dimension which can be defined by another dimension in the dataset. Useful for representing elevations or volumes.
- alpha: Alpha transparency factor, 0 is fully transparent.
- plot.labpt: An option to plot a label for each point.
- labels: The variable that contains the label.
- LabelScale: Control the size of the label.
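To give a feel for what colour and alpha amount to in the output, the sketch below maps numeric values onto a palette and converts the result to KML's aabbggrr hexadecimal colour notation. This is an illustration using base R palettes and a hypothetical helper, not plotKML's internal implementation.

```r
# Hypothetical helper: convert R colours plus an alpha in [0, 1] to KML
# colour strings. KML orders the channels alpha, blue, green, red.
to_kml_colour <- function(cols, alpha = 1) {
  rgb_vals <- grDevices::col2rgb(cols)   # 3 x n matrix: red, green, blue
  a <- round(alpha * 255)
  sprintf("%02x%02x%02x%02x",
          a, rgb_vals["blue", ], rgb_vals["green", ], rgb_vals["red", ])
}

# Map values in 1..100 onto a 10-step heat palette (equal-width bins).
values   <- c(5, 29, 95)
palette  <- grDevices::heat.colors(10)
bins     <- cut(values, breaks = seq(0, 100, by = 10), labels = FALSE)
kml_cols <- to_kml_colour(palette[bins], alpha = 0.75)

to_kml_colour("red", alpha = 0.75)   # "bf0000ff"
```

The reversed channel order is a common source of confusion when editing KML colours by hand: pure red in R ("#FF0000") ends in "ff" in KML rather than starting with it.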
After executing the kml function, a KML file will be written to your working directory. Some screenshots from Google Earth are shown below to illustrate what the KML file looks like.
Version without altitude parameter:
kml(obj=shpfile.subset, folder.name="", file.name="Shapefile Choropleth.kml", kmz=FALSE,
    colour=CATEGORY, colour_scale=R_pal[["heat_colors"]], alpha=0.75, altitude=0,
    plot.labpt=TRUE, labels=LABEL, LabelScale=0.5)

Version with altitude parameter:
kml(obj=shpfile.subset, folder.name="", file.name="Shapefile Choropleth Altitude.kml", kmz=FALSE,
    colour=CATEGORY, colour_scale=SAGA_pal[["SG_COLORS_WHITE_RED"]], alpha=0.25, altitude=CATEGORY*10,
    plot.labpt=FALSE, labels=LABEL, LabelScale=0.5)

Part 3: Drawing Points
Drawing points on the map is a simpler operation than a choropleth. First we would need to have a file with latitude and longitude coordinates. In practice, this would assume that there would be some geocoding already performed to map postal codes to coordinates. If the postal codes are not geocoded, there are some sources online with data that can be used to merge postal codes to coordinates.
Similar to the choropleth, we can use random values or import a file as input for the map. For this tutorial, we will use the random number method again but alternate code is also provided to read an input file. In addition to the coordinates, we can include a value to plot as well (e.g. a claim size).
#Use random numbers
set.seed(123) #Ensure reproducible results
LAT <- runif(100, 43.725, 43.872)
LONG <- runif(100, -79.579, -79.249)
MAGNITUDE <- sample(1:1000, length(LAT), replace=TRUE)
shppoints <- data.frame(LAT,LONG,MAGNITUDE)
#Alternate method to use an input file
#shppoints <- read.csv("Map Points.csv")
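If an input file is used instead, a quick sanity check on the coordinates can catch swapped columns or a missing minus sign before anything is plotted. The sketch below assumes the file has columns named LAT, LONG and MAGNITUDE, matching the random-number version. The helper name and the toy rows are hypothetical.

```r
# Hypothetical helper: flag rows whose coordinates are missing or outside
# the valid ranges (latitude in [-90, 90], longitude in [-180, 180]).
invalid_coords <- function(points) {
  is.na(points$LAT) | is.na(points$LONG) |
    abs(points$LAT) > 90 | abs(points$LONG) > 180
}

# Toy data: the second row has latitude and longitude swapped and the
# third row has a missing coordinate.
toy_points <- data.frame(LAT  = c(43.8, -79.4, NA),
                         LONG = c(-79.4, 43.8, -79.3),
                         MAGNITUDE = c(100, 250, 75))

bad_rows <- invalid_coords(toy_points)

# Swapped Toronto coordinates are still within the valid ranges, so also
# check the bounding box of the study area (limits taken from the code above).
in_study_area <- with(toy_points,
  !is.na(LAT) & !is.na(LONG) &
  LAT >= 43.725 & LAT <= 43.872 & LONG >= -79.579 & LONG <= -79.249)
```

Only the first toy row passes both checks; the swapped row is caught by the bounding box rather than by the range test.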
Note that for western hemisphere coordinates, the longitude should be negative, e.g. Toronto would be represented as latitude +43.7° and longitude -79.4°. Depending on how accurate the geocoding is, or how often events occur at the same place, it is advisable to "shake" the points slightly so that events at the same spot or in close proximity do not overlap. This can be done by adjusting the latitude and longitude by a small random amount. It is worthwhile to use a coordinate calculator to convert degrees to a distance in order to get a sense of how far a point has been shifted.
This can be calculated and visualized using the tool here. Toronto is centered around 43.7° N and 79.4° W. A shift of 0.0035° in both directions represents a displacement of approximately 500 meters.
randomDegrees <- 0.0035
set.seed(123) #Ensure reproducible results
shppoints$LAT <- shppoints$LAT + runif(length(shppoints$LAT),-randomDegrees,+randomDegrees)
shppoints$LONG <- shppoints$LONG + runif(length(shppoints$LONG),-randomDegrees,+randomDegrees)
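The conversion from degrees to distance can also be approximated directly in R. The sketch below uses the common rule of thumb that one degree of latitude is roughly 111.32 km and that a degree of longitude shrinks by the cosine of the latitude. It is a spherical approximation, adequate for judging the size of the shift but not for precise work, and the helper name is hypothetical.

```r
# Approximate metres per degree near a given latitude (spherical Earth).
metres_per_degree <- function(lat_deg) {
  m_per_deg_lat <- 111320                            # roughly constant
  m_per_deg_lon <- 111320 * cos(lat_deg * pi / 180)  # shrinks towards the poles
  c(lat = m_per_deg_lat, lon = m_per_deg_lon)
}

shift_deg <- 0.0035                          # same value as randomDegrees
m <- metres_per_degree(43.7) * shift_deg     # metres north-south, east-west
max_shift <- sqrt(sum(m^2))                  # largest possible diagonal shift

round(m)          # about 390 m north-south and 282 m east-west
round(max_shift)  # about 481 m
```

So a point shifted by the full amount in both directions moves a little under 500 meters.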
The points now need to be converted to a spatial dataset. As before, its coordinate reference system needs to be WGS 84; here it is simply declared, since the coordinates are already in latitude and longitude.
coordinates(shppoints) <- ~LONG+LAT #represents X+Y
proj4string(shppoints) <- CRS("+proj=longlat +datum=WGS84")
The points will need an icon assigned to each point. Usually this is referenced to an image that can be found on the web via a hyperlink. A sample of icons that could be used can be found here. If we use the white circle, the size and the colour of the circle can be adjusted. We can create a field called SIZEBUBBLE which can be set to different functions of MAGNITUDE. This could be useful to reflect the size of a claim as an example.
#shppoints$SIZEBUBBLE <- 1 #Same size
#shppoints$SIZEBUBBLE <- shppoints$MAGNITUDE #Proportional to magnitude
shppoints$SIZEBUBBLE <- log(shppoints$MAGNITUDE) #Proportional to the log of magnitude, for when the differences are too extreme; consider capping outliers as well
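One way to cap outliers before taking the log is to limit values at a chosen percentile. The sketch below is a hypothetical helper; the 95th percentile default is an arbitrary assumption and should be tuned to the data.

```r
# Hypothetical helper: cap values at an upper quantile, then log-scale.
# Values must be positive for log() to give a finite result.
bubble_size <- function(magnitude, cap_prob = 0.95) {
  cap <- stats::quantile(magnitude, probs = cap_prob, na.rm = TRUE, names = FALSE)
  log(pmin(magnitude, cap))
}

# Toy example: one extreme value among otherwise modest magnitudes.
toy_magnitude <- c(10, 20, 30, 40, 100000)
sizes <- bubble_size(toy_magnitude, cap_prob = 0.75)

# Possible use with the tutorial data:
# shppoints$SIZEBUBBLE <- bubble_size(shppoints$MAGNITUDE)
```

In the toy example the extreme value is pulled down to the cap of 40, so it is drawn at the same size as the next largest point instead of dwarfing the rest.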
Now we can plot the points using the kml function. The shape refers to an image hosted online by plotKML. The colour of the point can also be adjusted based on the magnitude. Notice that for large magnitudes, the circle is larger and darker.
kml(file.name="Shapefile Points.kml", shppoints,
    shape="http://plotkml.r-forge.r-project.org/circle.png", size=SIZEBUBBLE,
    colour=MAGNITUDE, colour_scale=SAGA_pal[["SG_COLORS_RED_BLACK"]],
    labels=MAGNITUDE, points_names="")

Part 4: Compressing KML files
It is advisable to compress the KML file into a KMZ file, which will reduce the file size and speed up the rendering of the maps. This can be done in any ZIP program: after the ZIP file is created, rename the extension to KMZ. There is also a zip parameter in the kml function; however, I was not able to get it to work with WinZip Command Line or gzip on Windows.
Another method to compress the KML file would be to open the file in Google Earth and save as a KMZ file.
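Compression can also be attempted from within R using utils::zip, which relies on an external zip program being available on the system path (an assumption that may not hold on a default Windows setup). The sketch below is a hypothetical helper that zips a KML file straight into an archive with a .kmz extension; it returns NULL when no zip program is found.

```r
# Hypothetical helper: compress a .kml file into a .kmz archive.
# Relies on an external "zip" executable; returns NULL if none is found.
kml_to_kmz <- function(kml_file) {
  stopifnot(file.exists(kml_file), grepl("\\.kml$", kml_file))
  if (!nzchar(Sys.which("zip"))) {
    message("No zip program found on the PATH; skipping compression.")
    return(invisible(NULL))
  }
  full_path <- normalizePath(kml_file)
  kmz_file  <- sub("\\.kml$", ".kmz", full_path)
  # Zip from inside the file's folder so the archive stores no directory path.
  old_wd <- setwd(dirname(full_path))
  on.exit(setwd(old_wd))
  utils::zip(zipfile = kmz_file, files = basename(full_path), flags = "-q9X")
  kmz_file
}

# Possible use with a file from this tutorial:
# kml_to_kmz("Shapefile Choropleth.kml")
```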
Part 5: Viewing KML Files
The easiest way to view the KML file is to open it in Google Earth, but this will not be discussed here. An alternative method is to upload the file to a public Dropbox folder and then display it using an online mapping API.
For the Google Maps API, there is a nice tool created by Chris Bell ("Doogal") which allows you to enter the location of a KML file and have it displayed on Google Maps. However, there are some bugs in how the KML files are displayed. Below are some links to the output files viewed through Google Maps.
- Choropleth (labels don't seem to give the correct information)
- Choropleth with altitude parameter (note that the altitude cannot be shown in Google Maps)
- Points (notice the circles are incorrectly all the same size)
An open source alternative would be to use the tool here, which uses OpenStreetMap. Unfortunately, from my testing it does not appear to like files hosted on Dropbox.