Obtaining AusPlots data: 'get_ausplots' function


The 'get_ausplots' function extracts and compiles AusPlots data.


Data of specific types, sites, geographical locations, and/or species can be requested via the function arguments.


DATA TYPES: Up to 8 different types of data can be obtained by setting the corresponding arguments to TRUE/FALSE. Each type of data is retrieved and compiled into a distinct data frame. 'get_ausplots' returns a list whose elements are the data frames for the requested types of data. The 8 types of data are:

* 'site_info': Site summary data. Includes (among others): plot and visit details, landform data, geographic coordinates, and notes. Included by default. Site summary data are stored in the 'site.info' data frame.
* 'structural_summaries': Site vegetation structural summaries. Site vegetation structural summary data are stored in the 'struct.summ' data frame.
* 'veg.vouchers': Complete set of species records for the plot, as determined by a herbarium, plus ID numbers for silica-dried tissue samples. Included by default. Vegetation vouchers data are stored in the 'veg.vouch' data frame.
* 'veg.PI': Point Intercept (PI) data. Includes data on substrate, plant species, growth form, height, etc. at each of (typically) 1010 points per plot. Included by default. Vegetation point intercept data are stored in the 'veg.PI' data frame.
* 'basal.wedge': Basal Wedge Data Raw Hits. These data are required for the calculation of Basal Area by Species by Plot. Basal wedge data are stored in the 'veg.basal' data frame.
* 'soil_subsites': Information on what soil and soil metagenomics samples were taken at nine locations across the plot and their identification barcode numbers. Soil and soil metagenomics data are stored in the 'soil.subsites' data frame.
* 'soil_bulk_density': Soil bulk density. Soil bulk density data are stored in the 'soil.bulk' data frame.
* 'soil_character': Soil characterisation and sample ID data at 10 cm increments to a depth of 1 m. Soil characterisation and sample ID data are stored in the 'soil.char' data frame.
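
Because several argument names differ from the names of the data frames they produce, it can help to keep the mapping in one place. The lookup vector below is only a convenience sketch transcribed from the list above; 'arg_to_element' is a hypothetical name and is not part of ausplotsR.

```r
# Argument name -> name of the resulting data frame in the returned list,
# transcribed from the list of data types above.
arg_to_element <- c(
  site_info            = "site.info",
  structural_summaries = "struct.summ",
  veg.vouchers         = "veg.vouch",
  veg.PI               = "veg.PI",
  basal.wedge          = "veg.basal",
  soil_subsites        = "soil.subsites",
  soil_bulk_density    = "soil.bulk",
  soil_character       = "soil.char"
)

# Which list element holds the data requested with basal.wedge = TRUE?
arg_to_element[["basal.wedge"]]

# Arguments whose name differs from the element they populate
names(arg_to_element)[names(arg_to_element) != arg_to_element]
```

Only 'veg.PI' uses the same name for the argument and the data frame.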


SPATIAL (PLOT & BOUNDING BOX) FILTERING: AusPlots data can be spatially subset via the 'get_ausplots' function arguments in two ways:

* 'my.Plot_IDs': Character vector with the plot IDs of specific AusPlots plots.
* 'bounding_box': Spatial filter for selecting AusPlots plots within a rectangular box, given in the format c(xmin, xmax, ymin, ymax). AusPlots spatial data are in longitude/latitude, so x is the longitude and y is the latitude of the box/extent (e.g. c(120, 140, -30, -10)).
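
The order of the four 'bounding_box' values is easy to get wrong, especially because southern latitudes are negative (so ymin is the more negative value). The sketch below shows what the box selects, using a hypothetical helper 'in_bbox' and made-up coordinates; it illustrates the documented format only and is not how ausplotsR performs the filtering internally.

```r
# Hypothetical helper: TRUE where a point lies inside the box.
# bbox follows the documented order c(xmin, xmax, ymin, ymax),
# i.e. c(lon_min, lon_max, lat_min, lat_max).
in_bbox <- function(lon, lat, bbox) {
  stopifnot(length(bbox) == 4, bbox[1] <= bbox[2], bbox[3] <= bbox[4])
  lon >= bbox[1] & lon <= bbox[2] & lat >= bbox[3] & lat <= bbox[4]
}

bbox <- c(120, 140, -30, -10)

# Made-up coordinates, for illustration only
pts <- data.frame(
  label     = c("inside", "too far east", "too far south"),
  longitude = c(130, 150, 130),
  latitude  = c(-20, -20, -40)
)

pts$selected <- in_bbox(pts$longitude, pts$latitude, bbox)
pts
```

The stopifnot() line rejects a box whose minimum and maximum have been swapped, which is the most likely mistake when typing latitudes.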



SPECIES FILTERING: AusPlots data can also be subset by a particular genus and/or species, or by sets of them (as determined for the herbarium voucher), using the argument 'species_name_search'. This optional argument takes a character string with the term to search for. Search terms are not case sensitive and do not require an exact taxonomic match (e.g. "Eucalyptus moderata", "Eucalyptus", and "euca" are all acceptable search terms).

Species filtering behaviour differs slightly among data types (i.e. among the resulting data frames):

* For 'veg.vouchers' and 'basal.wedge', when these arguments are set to TRUE, 'get_ausplots' returns data frames with the corresponding data (i.e. voucher records and raw basal wedge data respectively) that match 'species_name_search'.
* For the remaining data type arguments, when these arguments are set to TRUE, 'get_ausplots' returns data frames with the corresponding data (e.g. point intercept data) for all plots where 'species_name_search' occurs.
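
The difference between the two behaviours can be illustrated on a toy table. In the sketch below, grepl() with ignore.case = TRUE stands in for the case-insensitive partial matching described above. The plot names and records are made up, and this is not the package's internal implementation.

```r
# Toy voucher-like records (made-up plot names and rows)
vouch <- data.frame(
  site_location_name = c("PLOT_A", "PLOT_A", "PLOT_B", "PLOT_C"),
  herbarium_determination = c("Eucalyptus moderata", "Melaleuca pauperiflora",
                              "Acacia aneura", "Eucalyptus salubris"),
  stringsAsFactors = FALSE
)

# Case-insensitive partial match, standing in for species_name_search = "euca"
hit <- grepl("euca", vouch$herbarium_determination, ignore.case = TRUE)

# Record-level filtering: only the matching rows
# (the behaviour described for 'veg.vouchers' and 'basal.wedge')
vouch[hit, ]

# Plot-level filtering: every row from any plot with at least one match
# (the behaviour described for the remaining data types)
plots_with_match <- unique(vouch$site_location_name[hit])
vouch[vouch$site_location_name %in% plots_with_match, ]
```

Under plot-level filtering the Melaleuca record from PLOT_A is kept, because that plot also contains a matching Eucalyptus record.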


The R object returned by 'get_ausplots' is a list of data frames containing the requested AusPlots data. The list includes one data frame for each type of data requested (i.e. up to 8 data frames: 'site.info', 'struct.summ', ...) and an auto-generated citation for the data extracted. Please cite ausplotsR and the TERN AusPlots data you use. In each data frame the columns correspond to the variables supplied for that type of data, and the number of rows depends (directly or indirectly) on the sites retrieved (i.e. via 'my.Plot_IDs' or 'bounding_box', if subset) or the species retrieved (i.e. via 'species_name_search', if subset).
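
Since the returned list mixes data frames with a citation string, it can be convenient to separate the two before inspecting the data. The sketch below uses a small made-up list, 'AP.toy', as a stand-in for the object returned by 'get_ausplots'; its contents are placeholders, not real AusPlots data.

```r
# Stand-in for the object returned by get_ausplots(); contents are made up
AP.toy <- list(
  site.info = data.frame(site_unique = c("PLOT_A-1", "PLOT_B-2"),
                         latitude = c(-20, -25)),
  veg.PI    = data.frame(site_unique = rep(c("PLOT_A-1", "PLOT_B-2"), each = 3),
                         point_number = rep(0:2, times = 2)),
  citation  = "Placeholder citation text"
)

# Separate the data frames from the citation string
is_df <- vapply(AP.toy, is.data.frame, logical(1))
names(AP.toy)[is_df]

# Rows and columns of each data frame (one row of the result per data frame)
df_dims <- t(vapply(AP.toy[is_df], dim, integer(2)))
df_dims

# The citation is a plain character string
AP.toy$citation
```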


There are several variables common to all data frames. These include 'site_location_name', 'site_location_visit_id', and 'site_unique' (a combination of the previous two). These variables can be used to merge data frames. For example, the contents of two data frames can be combined using a common variable as the link (i.e. as the key that places the merged contents in the correct rows). The variable 'site_unique' is typically the best option to link data frames in a merge, as it is the most specific variable, representing a single visit to a particular site, and it should be used in most analyses. Otherwise, errors such as including data from the wrong visit to a site can occur.
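
The risk of linking on the wrong variable can be seen with base R's merge() on two toy data frames. All values below are made up; only the column names follow the AusPlots variables named above.

```r
# Toy stand-ins; one site ("PLOT_A") was visited twice
site.info <- data.frame(
  site_location_name     = c("PLOT_A", "PLOT_A", "PLOT_B"),
  site_location_visit_id = c(1, 2, 3),
  site_unique            = c("PLOT_A-1", "PLOT_A-2", "PLOT_B-3"),
  visit_start_date       = c("2012-01-01", "2016-01-01", "2013-01-01")
)
veg.PI <- data.frame(
  site_location_name = c("PLOT_A", "PLOT_A", "PLOT_B"),
  site_unique        = c("PLOT_A-1", "PLOT_A-2", "PLOT_B-3"),
  growth_form        = c("Shrub", "Tree", "Shrub")
)

# Linking on 'site_unique' attaches each record to its own visit
by_visit <- merge(veg.PI, site.info[, c("site_unique", "visit_start_date")],
                  by = "site_unique")
nrow(by_visit)  # 3

# Linking on 'site_location_name' pairs each PLOT_A record with both visits
by_site <- merge(veg.PI, site.info[, c("site_location_name", "visit_start_date")],
                 by = "site_location_name")
nrow(by_site)   # 5
```

The two extra rows in 'by_site' are PLOT_A records matched to the date of the other visit, which is the kind of error described above.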




EXAMPLES

To run the examples below, the 'ausplotsR' library must be installed and loaded in R, as shown in 'Installing and Loading ausplotsR' in the Step-by-Step Guide.

Code snippets are shown first, followed by their (text) outputs, whose lines are prefixed with '##'.



Example 1: All available data (i.e. all data types) for 3 plots


# Obtain the data ('site_info', 'veg.vouchers', and 'veg.PI' are retrieved by default)
AP.data = get_ausplots(my.Plot_IDs=c("SATFLB0004", "QDAMGD0022", "NTASTU0002"),
                       structural_summaries=TRUE, basal.wedge=TRUE,
                       soil_subsites=TRUE, soil_bulk_density=TRUE, soil_character=TRUE)


## User-supplied Plot_IDs located.


# Explore retrieved data
class(AP.data)


## [1] "list"


summary(AP.data)


##               Length Class      Mode     
## site.info     43     data.frame list     
## struct.summ   15     data.frame list     
## soil.subsites 12     data.frame list     
## soil.bulk     15     data.frame list     
## soil.char     34     data.frame list     
## veg.basal     10     data.frame list     
## veg.vouch     12     data.frame list     
## veg.PI        13     data.frame list     
## citation       1     -none-     character


str(AP.data)


## List of 9
##  $ site.info    :'data.frame':   4 obs. of  43 variables:
##   ..$ site_location_name        : chr [1:4] "QDAMGD0022" "SATFLB0004" "SATFLB0004" "NTASTU0002"
##   ..$ established_date          : chr [1:4] "2013-06-04T00:00:00" "2012-09-18T00:00:00" "2012-09-18T00:00:00" "2016-05-01T16:58:00"
##   ..$ description               : chr [1:4] "Mackunda Downs Station, 500m east of homestead.  26km west of Middleton." "Brachina Gorge Heysen Range Lower. 63km North North East of Adelaide" "Brachina Gorge Heysen Range Lower. 63km North North East of Adelaide" "Maryfield Station, 7.6km north north west of homestead. 27.5km south east of Larrimah"
##   ..$ bioregion_name            : chr [1:4] "MGD" "FLB" "FLB" "STU"
##   ..$ landform_pattern          : chr [1:4] "ALP" "MOU" "MOU" "PLA"
##   ..$ landform_element          : chr [1:4] "PLA" "HSL" "HSL" "PLA"
##   ..$ site_slope                : chr [1:4] "1" "17" "17" "0"
##   ..$ site_aspect               : chr [1:4] "180" "225" "225" NA
##   ..$ comments                  : chr [1:4] "Astrebla pectinata / Cenchrus ciliaris / Astrebla elymoides low open tussock grassland on alluvial plain adjoin"| __truncated__ "Grazing impact high- goat tracks and droppings. Rabbit droppings also. 
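
Note that the 'site.info' data frame above has 4 observations although only 3 plots were requested, because "SATFLB0004" appears twice, presumably once per visit. A quick way to check how many rows each site contributes is to tabulate 'site_location_name'. The sketch below re-creates that column from the printed output, so it runs without querying AusPlots.

```r
# site_location_name values as printed by str(AP.data) above
site_names <- c("QDAMGD0022", "SATFLB0004", "SATFLB0004", "NTASTU0002")

# Number of rows (visits) per site
visits_per_site <- table(site_names)
visits_per_site

# Sites with more than one row
names(visits_per_site)[visits_per_site > 1]
```

With the real object, the same check would be table(AP.data$site.info$site_location_name).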


Example 2: Default data for a particular Geographic Extent


# 'site_info', 'veg.vouchers', and 'veg.PI' data retrieved for Brisbane (27.4698S, 153.0251E) and its surrounding area
AP.data = get_ausplots(bounding_box=c(152.5, 153.5, -28, -27))

# Explore retrieved data
#class(AP.data)   # As in Example 1 (can run uncommented if curious) 
summary(AP.data)


##           Length Class      Mode     
## site.info 43     data.frame list     
## veg.vouch 12     data.frame list     
## veg.PI    13     data.frame list     
## citation   1     -none-     character


Example 3: 'Default data' + 'basal.wedge' + 'structural_summaries' for the genus Eucalyptus


# Default data frames ('site_info', 'veg.vouchers', and 'veg.PI') + 'basal.wedge' + 'structural_summaries' data frames for the genus Eucalyptus
AP.data = get_ausplots(basal.wedge=TRUE, structural_summaries=TRUE, species_name_search="Eucalyptus") 

# Explore retrieved data
#class(AP.data)   # As in Example 1 (can run uncommented if curious) 
summary(AP.data)


##             Length Class      Mode     
## site.info   43     data.frame list     
## struct.summ 15     data.frame list     
## veg.basal   10     data.frame list     
## veg.vouch   12     data.frame list     
## veg.PI      13     data.frame list     
## citation     1     -none-     character


#str(AP.data)   # Similar to Example 1 (can run uncommented if curious) 

# Explore species contained in each data frame
head(AP.data$veg.vouch) # Includes Records that match 'eucalyptus'


##   site_location_name veg_barcode             herbarium_determination
## 1         QDAMUL0003  QDA 001432 Eucalyptus crebra x e. melanophloia
## 2         SASMDD0002  SAS 000461                   Eucalyptus oleosa
## 3         SASMDD0002  SAS 000462                   Eucalyptus dumosa
## 4         SASMDD0002  SAS 000463 Eucalyptus socialis subsp. socialis
## 5         SASMDD0002  SAS 000038     Eucalyptus oleosa subsp. oleosa
## 6         SASMDD0002  SAS 000039                   Eucalyptus dumosa
##   is_uncertain_determination    visit_start_date site_location_visit_id
## 1                      FALSE 2013-04-26T00:00:00                  53595
## 2                      FALSE 2012-09-23T00:00:00                  53711
## 3                      FALSE 2012-09-23T00:00:00                  53711
## 4                      FALSE 2012-09-23T00:00:00                  53711
## 5                      FALSE 2012-09-23T00:00:00                  53711
## 6                      FALSE 2012-09-23T00:00:00                  53711
##   primary_gen_barcode secondary_gen_barcode_1 secondary_gen_barcode_2
## 1                <NA>                    <NA>                    <NA>
## 2       not collected                    <NA>                    <NA>
## 3       not collected                    <NA>                    <NA>
## 4       not collected                    <NA>                    <NA>
## 5         SAS  000521             SAS  000522             SAS  000523
## 6         SAS  000526             SAS  000528             SAS  000529
##   secondary_gen_barcode_3 secondary_gen_barcode_4      site_unique
## 1                    <NA>                    <NA> QDAMUL0003-53595
## 2                    <NA>                    <NA> SASMDD0002-53711
## 3                    <NA>                    <NA> SASMDD0002-53711
## 4                    <NA>                    <NA> SASMDD0002-53711
## 5             SAS  000524             SAS  000525 SASMDD0002-53711
## 6             SAS  000527                    <NA> SASMDD0002-53711
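
As a small follow-up, the voucher records can be summarised per site visit using 'site_unique'. The sketch below re-creates two columns of the six rows printed above (so it runs without querying AusPlots) and counts the distinct herbarium determinations per visit. It is only an illustration on the head() output, not a full species-richness calculation.

```r
# The six voucher rows printed above, reduced to two columns
vouch <- data.frame(
  site_unique = c("QDAMUL0003-53595", rep("SASMDD0002-53711", 5)),
  herbarium_determination = c(
    "Eucalyptus crebra x e. melanophloia",
    "Eucalyptus oleosa",
    "Eucalyptus dumosa",
    "Eucalyptus socialis subsp. socialis",
    "Eucalyptus oleosa subsp. oleosa",
    "Eucalyptus dumosa"
  )
)

# Number of distinct determinations recorded per site visit
n_taxa <- tapply(vouch$herbarium_determination, vouch$site_unique,
                 function(x) length(unique(x)))
n_taxa
```

"Eucalyptus dumosa" appears twice for SASMDD0002-53711 but is counted once.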


head(AP.data$veg.PI)  # Includes Plots where 'eucalyptus' occurs


##   site_location_name site_location_visit_id transect point_number
## 1         WAACOO0006                  53438    S1-N1            0
## 2         WAACOO0006                  53438    S1-N1            1
## 3         WAACOO0006                  53438    S1-N1            2
## 4         WAACOO0006                  53438    S1-N1            3
## 5         WAACOO0006                  53438    S1-N1            4
## 6         WAACOO0006                  53438    S1-N1            5
##   veg_barcode                  herbarium_determination substrate
## 1        <NA>                                     <NA>    Crypto
## 2        <NA>                                     <NA>    Crypto
## 3 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
## 4 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
## 5 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
## 6 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
##   in_canopy_sky  dead growth_form height hits_unique      site_unique
## 1            NA    NA        <NA>     NA     S1-N1 0 WAACOO0006-53438
## 2            NA    NA        <NA>     NA     S1-N1 1 WAACOO0006-53438
## 3         FALSE FALSE       Shrub    3.5     S1-N1 2 WAACOO0006-53438
## 4         FALSE FALSE       Shrub    3.5     S1-N1 3 WAACOO0006-53438
## 5         FALSE FALSE       Shrub    3.7     S1-N1 4 WAACOO0006-53438
## 6         FALSE FALSE       Shrub    3.5     S1-N1 5 WAACOO0006-53438


head(AP.data$veg.basal) # Includes Records that match 'eucalyptus'


##   site_location_name site_location_visit_id site_location_id point_id
## 1         WAACOO0006                  53438            59857       NW
## 2         WAACOO0006                  53438            59857       NW
## 3         WAACOO0006                  53438            59857       NW
## 4         WAACOO0006                  53438            59857        N
## 5         WAACOO0006                  53438            59857        N
## 6         WAACOO0006                  53438            59857        N
##   herbarium_determination veg_barcode hits basal_area_factor basal_area
## 1     Eucalyptus moderata WAA  001048    1               0.1       0.10
## 2     Eucalyptus salubris WAA  001083    5               0.1       0.50
## 3     Eucalyptus salubris WAA  001093    7               0.1       0.70
## 4     Eucalyptus moderata WAA  001048    0               0.1       0.00
## 5     Eucalyptus salubris WAA  001083    2               0.1       0.15
## 6     Eucalyptus salubris WAA  001093    4               0.1       0.40
##        site_unique
## 1 WAACOO0006-53438
## 2 WAACOO0006-53438
## 3 WAACOO0006-53438
## 4 WAACOO0006-53438
## 5 WAACOO0006-53438
## 6 WAACOO0006-53438


head(AP.data$struct.summ)  # Includes Plots where 'eucalyptus' occurs


##   site_location_name site_location_visit_id
## 1         QDAMUL0003                  53595
## 2         SASMDD0002                  53711
## 3         SASMDD0016                  57000
## 4         NSAMDD0005                  56969
## 5         QDAMUL0001                  53594
## 6         NTAGFU0032                  53679
##                                                                                                                                           phenology_comment
## 1  Mulga have just finished flowering but no fruit. Tussock grasses mostly dry. Dom hibiscus in ground layer has just finished fruiting throughout the site
## 2                                                                                                                                                      None


Example 4: 'site_info', 'veg.PI', and 'basal.wedge' data for all sites


# Retrieve data
start.time = Sys.time()
AP.data = get_ausplots(veg.vouchers=FALSE, basal.wedge=TRUE) 
end.time = Sys.time()
end.time - start.time


## Time difference of 1.126162 mins


# Explore 
#class(AP.data) # As in Example 1 (can run uncommented if curious) 
summary(AP.data)


##           Length Class      Mode     
## site.info 43     data.frame list     
## veg.basal 10     data.frame list     
## veg.PI    13     data.frame list     
## citation   1     -none-     character


# Explore 'site_info' data
dim(AP.data$site.info)


## [1] 662  43


names(AP.data$site.info)


##  [1] "site_location_name"         "established_date"          
##  [3] "description"                "bioregion_name"            
##  [5] "landform_pattern"           "landform_element"          
##  [7] "site_slope"                 "site_aspect"               
##  [9] "comments"                   "outcrop_lithology"         
## [11] "other_outcrop_lithology"    "plot_dimensions"           
## [13] "site_location_visit_id"     "visit_start_date"          
## [15] "visit_end_date"             "visit_notes"               
## [17] "location_description"       "erosion_type"              
## [19] "erosion_abundance"          "erosion_state"             
## [21] "microrelief"                "drainage_type"             
## [23] "disturbance"                "climatic_condition"        
## [25] "vegetation_condition"       "observer_veg"              
## [27] "observer_soil"              "described_by"              
## [29] "pit_marker_easting"         "pit_marker_northing"       
## [31] "pit_marker_mga_zones"       "pit_marker_datum"          
## [33] "pit_marker_location_method" "soil_observation_type"     
## [35] "a_s_c"                      "plot_is_100m_by_100m"      
## [37] "plot_is_aligned_to_grid"    "plot_is_permanently_marked"
## [39] "latitude"                   "longitude"                 
## [41] "point"                      "state"                     
## [43] "site_unique"


head(AP.data$site.info)


##   site_location_name    established_date
## 1         WAANUL0007 2014-09-06T15:24:41
## 2         NTAFIN0031 2012-10-25T00:00:00
## 3         QDAMUL0003 2013-04-26T00:00:00
## 4         NTAFIN0004 2011-10-06T00:00:00
## 5         NTAFIN0004 2011-10-06T00:00:00
## 6         SASMDD0002 2012-09-23T00:00:00
##                                                                           description
## 1           Great Victoria Desert Nature Reserve, 102.2km south east of Tjuntjuntjara
## 2 Umbeara Station 26.5km South East of Umbeara Homestead. 11km North of SA/Not Border


# Explore 'veg.PI' data
dim(AP.data$veg.PI)


## [1] 734464     13


names(AP.data$veg.PI)


##  [1] "site_location_name"      "site_location_visit_id" 
##  [3] "transect"                "point_number"           
##  [5] "veg_barcode"             "herbarium_determination"
##  [7] "substrate"               "in_canopy_sky"          
##  [9] "dead"                    "growth_form"            
## [11] "height"                  "hits_unique"            
## [13] "site_unique"


head(AP.data$veg.PI)


##   site_location_name site_location_visit_id transect point_number
## 1         WAACOO0006                  53438    S1-N1            0
## 2         WAACOO0006                  53438    S1-N1            1
## 3         WAACOO0006                  53438    S1-N1            2
## 4         WAACOO0006                  53438    S1-N1            3
## 5         WAACOO0006                  53438    S1-N1            4
## 6         WAACOO0006                  53438    S1-N1            5
##   veg_barcode                  herbarium_determination substrate
## 1        <NA>                                     <NA>    Crypto
## 2        <NA>                                     <NA>    Crypto
## 3 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
## 4 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
## 5 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
## 6 WAA  001053 Melaleuca pauperiflora subsp. fastigiata    Litter
##   in_canopy_sky  dead growth_form height hits_unique      site_unique
## 1            NA    NA        <NA>     NA     S1-N1 0 WAACOO0006-53438
## 2            NA    NA        <NA>     NA     S1-N1 1 WAACOO0006-53438
## 3         FALSE FALSE       Shrub    3.5     S1-N1 2 WAACOO0006-53438
## 4         FALSE FALSE       Shrub    3.5     S1-N1 3 WAACOO0006-53438
## 5         FALSE FALSE       Shrub    3.7     S1-N1 4 WAACOO0006-53438
## 6         FALSE FALSE       Shrub    3.5     S1-N1 5 WAACOO0006-53438
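
A common first look at point intercept data is to tabulate a column such as 'substrate'. The sketch below does this for the six rows printed above, re-created by hand so that it runs without querying AusPlots. With only six points it is purely illustrative and not a cover estimate.

```r
# 'substrate' values from the six rows printed above
substrate <- c("Crypto", "Crypto", "Litter", "Litter", "Litter", "Litter")

# Hits per substrate type, and the same as proportions of the points shown
counts <- table(substrate)
counts
round(prop.table(counts), 2)
```

With the real object, the same tabulation would be table(AP.data$veg.PI$substrate).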
