Rasters in R

Introduction


   A 'raster' is a spatial (geographic) data structure that divides space into elements of equal size (in units of the coordinate reference system) called 'cells'. Cells have a square or rectangular shape and can store one or more values. Rasters are also sometimes referred to as 'grids' and cells as 'pixels'. Rasters typically represent continuous spatial data, such as elevation or temperature. Rasters are therefore in sharp contrast with the other main structure used to store and manipulate geographical data, 'vectors'. Vectors represent discrete (i.e. object-based) spatial data, such as points, lines and polygons.
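
   As a minimal sketch of this structure, the snippet below builds a small grid and inspects its cells. The grid dimensions and extent are arbitrary values chosen for illustration, and the functions used ('raster', 'res', 'ncell', 'xyFromCell', 'cellFromXY') come from the 'raster' package.

```r
library(raster)

# A 10 x 20 grid covering x in [0, 20] and y in [0, 10]
# (illustrative extent, expressed in units of the coordinate reference system)
r <- raster(nrows = 10, ncols = 20, xmn = 0, xmx = 20, ymn = 0, ymx = 10)

res(r)     # cell size in the x and y directions
ncell(r)   # total number of cells (rows * columns)

# Store one value per cell; cells are numbered row by row from the top-left
values(r) <- seq_len(ncell(r))

# Centre coordinates of the first cell, and the cell that contains a point
first_cell_xy <- xyFromCell(r, 1)
cell_at_point <- cellFromXY(r, cbind(0.5, 9.5))
```

   Because every cell has the same size, a cell number and a coordinate pair can be converted into one another by simple arithmetic, which is what the last two calls do.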

   The fundamental R package for working with spatial data in raster format is 'raster', originally developed by Robert J. Hijmans. The 'raster' package provides classes and functions to create, read, manipulate and write raster data. In this tutorial we will describe and experiment with many of these classes and functions. A notable feature of this package is that it can work with very large spatial datasets that do not fit in computer memory: its functions process such datasets in chunks, without attempting to load all values into memory at once. 'raster' has revolutionised the manipulation, geo-processing and analysis of raster data in R. More information on 'raster' can be found in the package vignette.
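
   The idea of chunked processing can be sketched as follows. This is only an illustration on a small in-memory raster, not the package's internal code: 'blockSize' suggests how to split the rows into blocks, and 'getValues' reads one block of rows at a time. The raster size and the running sum are assumptions made for the example.

```r
library(raster)

# Small example raster; in practice this would be a large file on disk
r <- raster(nrows = 100, ncols = 100)
values(r) <- seq_len(ncell(r))

# Would a function needing a few copies of the values fit in memory?
fits_in_memory <- canProcessInMemory(r, n = 4)

# Suggested blocks of rows: start rows (bs$row), rows per block (bs$nrows),
# and the number of blocks (bs$n)
bs <- blockSize(r)

# Accumulate a sum block by block instead of reading all values at once
total <- 0
for (i in seq_len(bs$n)) {
  v <- getValues(r, row = bs$row[i], nrows = bs$nrows[i])
  total <- total + sum(v, na.rm = TRUE)
}
```

   Only one block of values is held in 'v' at any time, so peak memory use depends on the block size rather than on the size of the whole raster.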

   The 'raster' package relies on the 'rgdal' package to read, write, and geo-process raster data. 'rgdal' provides bindings to the Geospatial Data Abstraction Library ('GDAL') and access to projection/transformation operations from the 'PROJ.4' library; both libraries are external to 'rgdal'. GDAL utilities can also be called directly from the command line in a terminal, or from within R (e.g. using the function 'system'). Many spatial software packages also use GDAL to read and write gridded spatial data (e.g. ArcGIS, QGIS, GRASS). More information on GDAL and PROJ.4 can be found in their respective project documentation.
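
   As a hedged example of calling GDAL from R with 'system', the snippet below asks the 'gdalinfo' command-line utility for its version. It assumes that the GDAL command-line tools are installed and on the system PATH, which is not guaranteed; if they are not found, the call is skipped.

```r
# Look for the gdalinfo executable on the PATH (empty string if absent)
gdalinfo_path <- Sys.which("gdalinfo")

if (nzchar(gdalinfo_path)) {
  # Run the utility in a shell and capture its output as a character vector
  gdal_version <- system("gdalinfo --version", intern = TRUE)
  print(gdal_version)
} else {
  # Assumption not met: GDAL command-line tools are not available
  gdal_version <- NA_character_
  message("gdalinfo was not found on the PATH; skipping the call")
}
```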


Raster Classes


   The 'raster' package creates and uses objects of several new classes. The main new classes provided by this package are: 'RasterLayer', 'RasterStack', and 'RasterBrick'. Objects in these three classes are collectively referred to as 'Raster*' objects.

  • 'RasterLayer': Object containing a single-layer raster.
  • 'RasterStack': Object containing a multi-layer (band) raster. It can 'virtually' connect several raster objects stored in different files or in memory, and/or a few layers of a single file.
  • 'RasterBrick': Object containing a multi-layer (band) raster. It is a truly multi-layered object. It can only be linked to a single multi-layer file (i.e. all data must be stored in a single file on disk), or it is itself a multi-layer object with its data loaded in memory. Typical examples of multi-layered raster files are multi-band satellite images and rasters containing time series (e.g. each layer contains values for a different day or month).
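
   The three classes listed above can be illustrated with small in-memory objects. This is a minimal sketch: the grid size and the random values are arbitrary, and the object names are made up for the example.

```r
library(raster)

# RasterLayer: a single layer of values
layer1 <- raster(nrows = 10, ncols = 10)
values(layer1) <- runif(ncell(layer1))

# A second layer on the same grid, derived from the first
layer2 <- layer1 * 2

# RasterStack: a collection of separate layers
s <- stack(layer1, layer2)

# RasterBrick: a single multi-layer object built from the stack
b <- brick(s)

class(layer1)
class(s)
class(b)
nlayers(s)   # number of layers in the stack
```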


   RasterStack and RasterBrick objects are quite similar. However, RasterBricks are usually faster to process, whereas RasterStacks are more flexible (e.g. a RasterStack can combine layers held in separate files or objects, and pixel-based calculations can be performed across those separate layers).
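
   The difference in flexibility can be sketched with temporary files. In this example, which uses made-up values and temporary paths in the package's native '.grd' format, a RasterStack ties two single-layer files together, while a RasterBrick is read from one multi-layer file.

```r
library(raster)

# Two layers on the same grid, with simple illustrative values
r1 <- raster(nrows = 10, ncols = 10)
values(r1) <- 1:100
r2 <- raster(nrows = 10, ncols = 10)
values(r2) <- 101:200

# Write each layer to its own temporary file
f1 <- tempfile(fileext = ".grd")
f2 <- tempfile(fileext = ".grd")
writeRaster(r1, filename = f1, overwrite = TRUE)
writeRaster(r2, filename = f2, overwrite = TRUE)

# RasterStack: connects the two separate files
s <- stack(c(f1, f2))

# RasterBrick: linked to a single multi-layer file
f3 <- tempfile(fileext = ".grd")
writeRaster(s, filename = f3, overwrite = TRUE)
b <- brick(f3)

# Pixel-based calculation across the separate layers of the stack
layer_sum <- s[[1]] + s[[2]]
```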


   In both multi-layered object classes, individual layers must have the same spatial extent and resolution. That is, individual layers must represent the same locations with the same level of detail.
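
   Whether two layers share the same grid can be checked before combining them. The sketch below uses 'compareRaster' from the 'raster' package on three empty example grids; the extents and dimensions are illustrative assumptions.

```r
library(raster)

# Two layers on an identical grid, and one with coarser cells
r_a <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
r_b <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
r_c <- raster(nrows = 5,  ncols = 5,  xmn = 0, xmx = 10, ymn = 0, ymx = 10)

# Compare extent, dimensions and resolution; return FALSE instead of stopping
same_ab <- compareRaster(r_a, r_b, extent = TRUE, rowcol = TRUE, res = TRUE,
                         stopiffalse = FALSE)
same_ac <- compareRaster(r_a, r_c, extent = TRUE, rowcol = TRUE, res = TRUE,
                         stopiffalse = FALSE)
```

   Layers that fail this check would first need to be brought onto a common grid (for instance by resampling) before they can be combined in a multi-layer object.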

   In all three classes the data can be held in memory or kept on disk, depending on the size of the grid(s). Raster objects are typically created from files, but any of them, including RasterBrick objects, can exist entirely in memory.
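
   Where the values of an object currently live can be queried with 'inMemory' and 'fromDisk'. The following is a minimal sketch using a small random raster and a temporary file in the native '.grd' format; both are assumptions made for illustration.

```r
library(raster)

# Values assigned in R are held in memory
r <- raster(nrows = 10, ncols = 10)
values(r) <- runif(ncell(r))
mem_created <- inMemory(r)

# Write to a temporary file and create an object linked to that file
f <- tempfile(fileext = ".grd")
writeRaster(r, filename = f, overwrite = TRUE)
r_disk <- raster(f)
mem_linked  <- inMemory(r_disk)   # are the values loaded?
disk_linked <- fromDisk(r_disk)   # is the object linked to a file?

# Explicitly load all values of the file-based object into memory
r_loaded <- readAll(r_disk)
mem_loaded <- inMemory(r_loaded)
```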