1  Introduction

1.1 Use of R

REMINDER : R TIPS
  1. Comment your code ! (# important information about the code)

  2. Check your R objects ! (plot(), print(), View() , …)

  3. Listen to R outputs ! (Errors AND Warnings)

  4. Get help ! (?name_of_function, internet, other users)

  5. Keep calm and take a break !
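
As an illustration of the first three tips, here is a minimal sketch. The object `temperatures` and its values are made up for the example; the functions used (`str()`, `summary()`, `tryCatch()`) are part of base R.

```r
# Tip 1: comment your code
# Hypothetical daily temperatures in degrees Celsius (one value is missing)
temperatures <- c(21.5, 19.8, NA, 23.1)

# Tip 2: check your R objects before using them
str(temperatures)      # structure: type and first values
summary(temperatures)  # quick summary, including the number of NA values

# Tip 3: listen to R outputs
# Converting text that is not a number raises a warning, not an error.
# Here the warning message is captured so that it can be read explicitly.
warning_text <- tryCatch(
  as.numeric("abc"),
  warning = function(w) conditionMessage(w)
)
print(warning_text)
```

A warning does not stop the script, which is exactly why it is easy to overlook: the code keeps running with a result that may not be the one you expect.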

1.1.1 Installation

Note

The installation part is based on the book “An Introduction to R”, written by Alex Douglas, Deon Roos, Francesca Mancini, Ana Couto & David Lusseau.

1.1.1.1 R

1.1.1.1.1 Windows users

For Windows users select the ‘Download R for Windows’ link and then click on the ‘base’ link and finally the download link ‘Download R 4.2.1 for Windows’. This will begin the download of the ‘.exe’ installation file. When the download has completed double click on the R executable file and follow the on-screen instructions. Full installation instructions can be found at the CRAN website.

1.1.1.1.2 Mac users

For Mac users select the ‘Download R for (Mac) OS X’ link. The binary can be downloaded by selecting the ‘R-4.2.1.pkg’. Once downloaded, double click on the file icon and follow the on-screen instructions to guide you through the necessary steps. See the ‘R for Mac OS X FAQ’ for further information on installation.

1.1.1.1.3 Linux users

For Linux users, the installation method will depend on which flavour of Linux you are using. There are reasonably comprehensive instructions here for Debian, Redhat, Suse and Ubuntu. In most cases you can just use your OS package manager to install R from the official repository. On Ubuntu, fire up a shell (Terminal) and use (you will need root permission to do this):

sudo apt update
sudo apt install r-base r-base-dev

which will install base R together with the headers and tools needed to build R packages (r-base-dev; you only need this if you want to compile R packages from source, but it doesn’t hurt to have it).

If you receive an error after running the code above, you may need to add an entry to your /etc/apt/sources.list file. To do this, open the terminal and enter:

sudo apt install -y --no-install-recommends software-properties-common dirmngr
# Add keys
wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | sudo tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc

sudo add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"

Once you have done this then re-run the apt commands above and you should be good to go.

Install the following packages to allow for future spatial data analysis:

sudo apt install -y libgdal-dev libproj-dev libgeos-dev libudunits2-dev libv8-dev libnode-dev libcairo2-dev libnetcdf-dev
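
Whatever your operating system, the R packages used in the rest of this manual (sf, terra and mapsf) must also be installed from within R. The following sketch only checks which of them are already available; the installation line is left commented out so that nothing is downloaded unless you decide to. The variable names are chosen for the example.

```r
# Packages used in this manual
pkgs <- c("sf", "terra", "mapsf")

# requireNamespace() returns TRUE when a package is installed and loadable
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# List the packages that are still missing
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment the next line to install them from CRAN
  # install.packages(missing_pkgs)
}
```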

1.1.1.2 RStudio

Whilst it’s eminently possible to just use the base installation of R (many people do), we will be using a popular Integrated Development Environment (IDE) called RStudio. RStudio can be thought of as an add-on to R which provides a more user-friendly interface, incorporating the R Console, a script editor and other useful functionality (like R Markdown and GitHub integration). You can find more information about RStudio here.

RStudio is freely available for Windows, Mac and Linux operating systems and can be downloaded from the RStudio site. You should select the ‘RStudio Desktop’ version. Note: you must install R before you install RStudio.

1.1.1.2.1 Windows and Mac users

For Windows and Mac users you should be presented with the appropriate link for downloading. Click on this link and once downloaded run the installer and follow the instructions. If you don’t see the link then scroll down to the ‘All Installers’ section and choose the link manually.

1.1.1.2.2 Linux users

For Linux users scroll down to the ‘All Installers’ section and choose the appropriate link to download the binary for your Linux operating system. RStudio for Ubuntu (and Debian) is available as a *.deb package.

To install the *.deb file, navigate to where you downloaded the file and then enter the following command with root permission:

sudo apt install ./rstudio-2022.07.2-576-amd64.deb

You can then start RStudio from the Console by simply typing:

rstudio

or you can create a shortcut on your Desktop for easy startup.

1.1.2 Help

The built-in R help is the quickest way to find out how a function is used:

?plot #displays the help page for the plot function
help("*") #for unconventional characters

Calling the help opens a page (the exact behavior depends on the operating system) with information and usage examples about the documented function(s) or operators.
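
When you do not remember the exact name of a function, or only want a quick look at its arguments, a few other base R helpers can be used. This is a small sketch; the search term "mean" is only an example.

```r
# Find the objects whose name contains "mean" in the attached packages
mean_functions <- apropos("mean")
print(head(mean_functions))

# Show the arguments of a function without opening its help page
print(args(sd))

# Run the examples given at the bottom of a help page
example("mean")
```

In an interactive session, `??"some keyword"` also searches the installed help pages by keyword.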

1.1.3 Functions

The basic syntax is:

afunction <- function(arg1, arg2){
  arg1 + arg2
}
afunction(10, 5)
[1] 15
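
Arguments can also have default values and can be passed by name. The function `rect_area()` below is a hypothetical helper written for this example:

```r
# Area of a rectangle; 'height' defaults to 1 when it is not supplied
rect_area <- function(width, height = 1) {
  width * height
}

area_default <- rect_area(4)                     # height takes its default value
area_named <- rect_area(4, height = 2.5)         # argument passed by name
area_swapped <- rect_area(height = 2, width = 3) # named arguments can come in any order

print(c(area_default, area_named, area_swapped))
```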

1.2 Spatial in R : History and evolutions

Historically, four packages made it possible to import, manipulate and transform spatial data:

  • The package rgdal (Bivand, Keitt, and Rowlingson 2022), an interface between R and the GDAL (GDAL/OGR contributors, n.d.) and PROJ (PROJ contributors 2021) libraries, allows you to import and export spatial data (shapefiles for example) and also to manage cartographic projections.
  • The package sp (E. J. Pebesma and Bivand 2005) provides classes and methods for vector spatial data in R. It allows displaying background maps, inspecting an attribute table, etc.
  • The package rgeos (Bivand and Rundel 2021) gives access to the GEOS spatial operations library and therefore makes classic GIS operations available: calculation of surfaces or perimeters, calculation of distances, spatial aggregations, buffer zones, intersections, etc.
  • The package raster (Hijmans 2022a) is dedicated to the import, manipulation and modeling of raster data.

Today, the main developments concerning vector data have moved away from the three older packages (sp, rgdal, rgeos) to rely mainly on the package sf (E. Pebesma 2018a, 2018b). In this manual we will rely exclusively on this package to manipulate vector data.

The packages stars (E. Pebesma 2021) and terra (Hijmans 2022b) have come to replace the package raster for processing raster data. Here we have chosen to use the package terra because of its similarity to the package raster.

1.3 The package sf

The package sf was released in late 2016 by Edzer Pebesma (also author of sp). Its goal is to combine the features of sp, rgeos and rgdal in a single, more ergonomic package. This package offers simple objects (following the simple feature standard) which are easier to manipulate. Particular attention has been paid to the compatibility of the package with the pipe syntax and the operators of the tidyverse.

sf directly uses the GDAL, GEOS and PROJ libraries.
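
To see which versions of these libraries your own installation of sf is linked to, you can call `sf_extSoftVersion()`. A short sketch, assuming sf is installed (the version numbers returned depend on your system):

```r
library(sf)

# Named character vector with the versions of the external libraries used by sf
lib_versions <- sf_extSoftVersion()
print(lib_versions[c("GEOS", "GDAL", "PROJ")])
```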

From r-spatial.org

Website of package sf : Simple Features for R

Much of the spatial data available on the internet is in shapefile format, which can be opened in the following way:

library(sf)
Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
district <- st_read("data_cambodia/district.shp")
Reading layer `district' from data source 
  `/home/lucas/Documents/ForgeIRD/rspatial-for-onehealth/data_cambodia/district.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 197 features and 10 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 211534.7 ymin: 1149105 xmax: 784612.1 ymax: 1625495
Projected CRS: WGS 84 / UTM zone 48N

Shapefile format limitations

Because of the many limitations of this format (multi-file, limited number of records…), we advise you to prefer another format such as the geopackage *.gpkg. All the good reasons not to use the shapefile are here.

A geopackage is a database: to load a layer, you must know its name. The function st_layers() lists the layers that it contains:

st_layers("data_cambodia/cambodia.gpkg")
Driver: GPKG 
Available layers:
  layer_name     geometry_type features fields              crs_name
1    country     Multi Polygon        1     10 WGS 84 / UTM zone 48N
2   district     Multi Polygon      197     10 WGS 84 / UTM zone 48N
3  education     Multi Polygon       25     19 WGS 84 / UTM zone 48N
4   hospital             Point      956     13 WGS 84 / UTM zone 48N
5      cases       Multi Point      972      2 WGS 84 / UTM zone 48N
6       road Multi Line String        6      9 WGS 84 / UTM zone 48N
road <- st_read("data_cambodia/cambodia.gpkg", layer = "road")
Reading layer `road' from data source 
  `/home/lucas/Documents/ForgeIRD/rspatial-for-onehealth/data_cambodia/cambodia.gpkg' 
  using driver `GPKG'
Simple feature collection with 6 features and 9 fields
Geometry type: MULTILINESTRING
Dimension:     XY
Bounding box:  xmin: 212377 ymin: 1152214 xmax: 784654.7 ymax: 1625281
Projected CRS: WGS 84 / UTM zone 48N
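
Writing a layer to a geopackage works in the same way, with `st_write()`. The sketch below is self-contained: it builds two hypothetical points (made-up names and coordinates), writes them to a temporary geopackage under an assumed layer name `sites`, and reads them back.

```r
library(sf)

# Hypothetical points given as longitude / latitude (WGS 84, EPSG:4326)
pts <- st_as_sf(
  data.frame(name = c("A", "B"), lon = c(104.9, 103.9), lat = c(11.6, 13.4)),
  coords = c("lon", "lat"),
  crs = 4326
)

# Write the layer to a temporary geopackage file
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "sites", quiet = TRUE)

# List the layers of the file, then read the layer back by its name
layers <- st_layers(gpkg)
print(layers)
sites <- st_read(gpkg, layer = "sites", quiet = TRUE)
print(sites)
```

Several layers, even with different geometry types, can be stored in the same *.gpkg file by calling `st_write()` again with another layer name.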

1.3.1 Format of spatial objects sf

sf objects are data.frame objects in which one of the columns contains the geometries. This column is of class sfc (simple feature column), and each element of the column is an sfg (simple feature geometry). This format is very practical insofar as the data and the geometries are intrinsically linked in the same object.

Thumbnail describing the simple feature format: Simple Features for R
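
The three levels (sfg, sfc, sf) can be seen by building a small object by hand. This is a minimal sketch: the coordinates are approximate and are only there for illustration.

```r
library(sf)

# sfg: a single geometry (approximate longitude / latitude of two cities)
p1 <- st_point(c(104.92, 11.56))
p2 <- st_point(c(103.86, 13.36))

# sfc: a geometry column, i.e. a list of geometries sharing one CRS
geom <- st_sfc(p1, p2, crs = 4326)

# sf: a data.frame with attribute columns plus the geometry column
cities <- st_sf(name = c("Phnom Penh", "Siem Reap"), geometry = geom)

print(class(p1))
print(class(geom))
print(class(cities))

# The geometries and the attribute table can be extracted separately
print(st_geometry(cities))
attributes_only <- st_drop_geometry(cities)
print(attributes_only)
```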

Tip

A benchmark of vector processing libraries is available here.

1.4 Package mapsf

The spatial ecosystem of the free software R is rich, dynamic and mature, and several packages make it possible to import, process and represent spatial data. The package mapsf (Giraud 2022) relies on this ecosystem to integrate the creation of quality thematic maps into processing chains with R.

Other packages can be used to make thematic maps. The package ggplot2 (Wickham 2016), in association with the package ggspatial (Dunnington 2021), can be used, for example, to display spatial objects and to make simple thematic maps. The package tmap (Tennekes 2018) is dedicated to the creation of thematic maps; it uses a syntax close to that of ggplot2 (a sequence of instructions combined with the ‘+’ sign). Documentation and tutorials for using these two packages are readily available on the web.

Here, we will mainly use the package mapsf, whose functionalities are quite complete and whose handling is rather simple. In addition, the package is relatively lightweight.

mapsf allows you to create most of the types of map usually used in statistical cartography (choropleth maps, typologies, proportional or graduated symbols, etc.). For each type of map, several parameters are used to customize the cartographic representation. These parameters are the same as those found in the usual GIS or cartography software (for example, the choice of discretizations and color palettes, the modification of the size of the symbols or the customization of the legends). Alongside the data representation functions, other functions are dedicated to the map layout (themes or graphic charters, legends, scales, orientation arrows, title, credits, annotations, etc.), the creation of insets or the export of maps.

mapsf is the successor of cartography (Giraud and Lambert 2016); it offers the same main functionalities while being lighter and more ergonomic.
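
As a first taste, here is a minimal sketch of a choropleth map. It does not use the data of this manual but the sample dataset shipped with mapsf (the municipalities of Martinique, loaded with `mf_get_mtq()`); the choice of variable, discretization, palette, titles and output file is only an example.

```r
library(mapsf)

# Sample dataset provided by the package
mtq <- mf_get_mtq()

# Draw the map into a temporary PNG file
map_file <- tempfile(fileext = ".png")
png(map_file, width = 800, height = 800)

# Choropleth map of the variable MED, in 5 classes built on quantiles
mf_map(
  x = mtq,
  var = "MED",
  type = "choro",
  breaks = "quantile",
  nbreaks = 5,
  pal = "Mint",
  leg_title = "Median income"
)

# Layout elements: title, credits, scale bar and north arrow
mf_layout(title = "Martinique", credits = "Data: sample dataset of mapsf")

dev.off()
```

In an interactive session, the calls to `png()` and `dev.off()` can be dropped to display the map directly in the plot window.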

To use this package several sources can be consulted:

  • The vignettes associated with the package show sample scripts,

  • The R Geomatics blog which provides resources and examples related to the package and more generally to the R spatial ecosystem.

1.5 The package terra

The package terra was released in early 2020 by Robert J. Hijmans (also author of raster). Its objective is to provide methods for processing and analysing raster data. This package is very similar to the package raster, but it has more features, is easier to use and is faster.
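
To give an idea of the syntax, the sketch below creates a small raster from scratch and applies a few basic operations. The dimensions, extent and cell values are arbitrary and chosen only for the example.

```r
library(terra)

# Empty raster of 10 x 10 cells with an arbitrary extent,
# declared in WGS 84 / UTM zone 48N (EPSG:32648)
r <- rast(
  nrows = 10, ncols = 10,
  xmin = 0, xmax = 10, ymin = 0, ymax = 10,
  crs = "EPSG:32648"
)

# Fill the cells with the values 1 to 100
values(r) <- 1:ncell(r)
print(r)

# Cell size and a global statistic computed over all cells
print(res(r))
r_mean <- global(r, "mean")
print(r_mean)

# Crop the raster to a smaller extent (xmin, xmax, ymin, ymax)
r_crop <- crop(r, ext(0, 5, 0, 5))
print(dim(r_crop))
```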

Website of package terra : Spatial Data Science with R and “terra”

Tip

A benchmark of raster processing libraries is available here.