Plot-X Standard
A data exchange standard for ecological plot-based survey data.
Abstract
Plot-X is a data exchange standard for ecological plot-based survey data. Created to be the Australian national ecological survey data exchange standard, Plot-X extends the Observations and Measurements standard to describe core concepts for sites, site visits, domain features, measurements, observations, and samples. The preferred exchange format is CSV. Support for other popular exchange formats such as RDF or JSON may be added in the future. This document is a specification for users of Plot-X. It also provides background information and the rationale for the design decisions made while creating Plot-X.
Status of this document
This document is a draft specification for Plot-X. Plot-X is currently in active development by TERN and the Australian ecology community.
Participation in the development of this standard includes but is not limited to the following parties: Department of Agriculture, Water and the Environment (DAWE), Queensland Government, New South Wales Government, Northern Territory Government, Western Australian Government and Terrestrial Ecosystem Research Network (TERN).
While this specification is in the draft state, comments from users are encouraged, especially in sections with the review tag.
Table of contents
- 1 Abstract
- 2 Status of this document
- 3 Table of contents
- 4 Acronyms and definitions
- 5 Introduction
- 6 Background
- 7 Guiding principles
- 8 Definitions
- 9 Plot-X core concepts
- 9.1 Plot-X exchange ER diagram
- 9.1.1 Core classes description
- 9.2 Organisation
- 9.3 Person
- 9.4 Project
- 9.5 Site
- 9.6 Site visit
- 9.7 Feature
- 9.8 Observation
- 9.9 Sampling
- 9.10 Non-core attributes of entities
- 9.11 Controlled vocabularies
- 9.11.1 Domain feature types
- 9.11.2 Parameters
- 9.11.3 Methods
- 9.11.4 Attributes
- 9.11.5 Roles
- 10 Validation
- 11 Acknowledgements
- 12 References
Acronyms and definitions
Acronym | Definition |
---|---|
Plot-X | plot exchange standard |
O&M | Observations and Measurements |
CSV | comma-separated values |
RDF | Resource Description Framework |
JSON | JavaScript Object Notation |
DAWE | Australian Government Department of Agriculture, Water and the Environment |
TERN | Terrestrial Ecosystem Research Network |
NVIS | National Vegetation Information System |
W3C | World Wide Web Consortium |
SSN | Semantic Sensor Network |
SOSA | Sensor, Observation, Sample, and Actuator |
XML | Extensible Markup Language |
Veg-X | vegetation exchange standard |
ODM | Observations Data Model |
ODM2 | Observations Data Model 2 |
SKOS | Simple Knowledge Organisation System |
URI | Uniform Resource Identifier |
WKT | Well-known text |
PK | primary key |
FK | foreign key |
ID | identifier |
UOM | unit of measure |
CURIE | Compact URI |
BDR | Biodiversity Data Repository |
EPBC Act | Environment Protection and Biodiversity Conservation Act 1999 |
DEAP | Digital Environmental Assessment Program |
FAIR | Findable, Accessible, Interoperable and Reusable |
Introduction
The main impediment to the widespread sharing of ecological plot-based survey data within Australia is the lack of established standards for data representation and data exchange. In Australia, state and territory governments collect and store ecological data in siloed databases with application schemas designed for their business and reporting needs. The data stored in these databases are the result of plot-based surveys carried out across Australia. Although the data are of the same kind, they are often collected with different survey protocols and stored in varying formats and structures.
The unstandardised language and structure used across the different datasets make integration difficult for data aggregators such as the Australian Federal Government. Enormous amounts of time and effort are required to interpret and integrate cross-jurisdictional data to deliver timely reports such as environmental impact assessments for policymakers.
To fix this, TERN has developed Plot-X, a data exchange standard for plot-based survey data. Plot-X will enable lossless data exchange between organisations by utilising a unified conceptual model based on the Observations and Measurements standard and supported by community-driven controlled vocabularies.
Background
DAWE, TERN, and other aggregators of ecological data in Australia spend large amounts of human and computing effort integrating disparate datasets from across Australia. Integrating each new dataset into an aggregator's system requires extensive domain knowledge as well as technical expertise. The difficulty of exchanging this complex data has resulted in siloed data with very little value outside its original purpose of collection. Furthermore, following natural disasters such as the 2019-2020 bushfires across Australia, the Australian Government and many ecological researchers acknowledged and highlighted the need to deliver vegetation reports across government jurisdictions in a timely manner through data exchange.
Some frameworks exist within Australia to integrate vegetation data across state and territory governments. The National Vegetation Information System (NVIS) is a framework developed to make Australian vegetation data comparable at a high level. It contains some key essential variables, but most importantly, NVIS describes the structural formation of Australian vegetation in a standardised way.
TERN aggregates plot-based survey data from different sources as part of its new ecological integration platform named EcoPlots. TERN’s solution to harmonising and integrating ecological plot-based survey data employs a Semantic Web solution, largely based on the W3C’s SSN and SOSA ontologies. The Plot-X standard is mostly based on the pre-existing work of TERN’s EcoPlots platform.
Internationally, some frameworks and standards are available for vegetation and earth observations. One is Veg-X, an XML-based data exchange standard for vegetation data published in 2011. Veg-X is one of the few standards for ecological plot-based surveys that acknowledge the importance of representing the feature of interest (i.e. measured variable) accurately at the different granularity levels (e.g. measurements made at the plant individual level and measurements made at the aggregation of plant individuals level). It is difficult to assess whether Veg-X is still actively used or developed.
ODM2 is an information model that supports interoperability across multiple scientific domains.
Guiding principles
The Plot-X exchange standard uses the CSV (comma-separated values) format to represent data because it is easy to use for both technical and non-technical users. Because Plot-X is based on O&M and has a minimal set of core concepts, adopting Plot-X and its CSV exchange format should be straightforward.
The key guiding principle is the ease of use and ease of adoption with a lossless exchange format. There will be some sacrifices with normalisation, but the tradeoff will allow for wider adoption of the standard.
Controlled vocabularies encoded as SKOS (Simple Knowledge Organisation System) in RDF (Resource Description Framework) will be used for all controlled lists. Following Linked Data best practices, controlled vocabulary concepts will be identified with a persistent URI with support for content negotiation for both human and machine-readable consumption.
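As a rough illustration of content negotiation against a vocabulary URI, the sketch below only builds an HTTP request that asks for a machine-readable representation. The function name `vocabulary_request` and the choice of `text/turtle` as the media type are assumptions made for this example; which media types a given vocabulary server actually offers is not stated in this document.

```python
from urllib.request import Request

# Scheme URI taken from the Parameters vocabulary listed later in this document.
PARAMETERS_SCHEME = "http://linked.data.gov.au/def/tern-cv/5699eca7-9ef0-47a6-bcfb-9306e0e2b85e"


def vocabulary_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) a request asking for a given representation.

    The default media type is an assumption; a browser would typically send
    text/html for the same URI and receive a human-readable page instead.
    """
    return Request(uri, headers={"Accept": media_type})


request = vocabulary_request(PARAMETERS_SCHEME)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch; that step is left out here so the sketch does not depend on network access.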
Identifying the feature of interest for which the data was measured or recorded (e.g. the measured variable) is crucial to correctly model a dataset for exchange. Guidelines for common types of data at different granularity levels should be recorded and accepted by the community to maximise interoperability.
Definitions
The conventions used to define the Plot-X information model, its entities, and their attributes are described below.
Defining a core concept and its attributes
Attribute
The attribute of the concept.
Definition
A definition describing the attribute in the context of the concept.
Required
A field denoting whether the attribute is required or not.
Format
A comma-separated field denoting the format type of the value indicated by the following controlled fields:
URI - Uniform Resource Identifier, a globally unique identifier for the resource
Date - an ISO 8601 date string in UTC
Date time - an ISO 8601 date and time string in UTC
Email - a valid email address according
WKT - Well-known text representation of geometry
Decimal degrees - latitude and longitude values expressed as decimal degrees
PK - primary key
FK - foreign key
Example
A field providing an example in the context of the concept and its attribute.
Data type
A simple data type indicator with the following controlled fields:
String
Boolean
Number
Date
Date time
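The format types above could be checked with a few small helpers. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, the URI check is deliberately loose, and none of this is a normative part of Plot-X.

```python
from datetime import date, datetime
from urllib.parse import urlparse


def is_uri(value: str) -> bool:
    """Loose check: an absolute URI has a scheme followed by some content."""
    parts = urlparse(value)
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


def is_date(value: str) -> bool:
    """True for an ISO 8601 calendar date such as 2021-05-04."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def is_date_time(value: str) -> bool:
    """True for an ISO 8601 date-time; a trailing 'Z' is read as UTC."""
    if "T" not in value:
        return False
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def is_decimal_degrees(latitude: float, longitude: float) -> bool:
    """True when both values fall inside the valid latitude/longitude ranges."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0
```

These helpers only test the shape of a value. Whether a URI actually resolves, or whether a key refers to an existing record, needs separate checks.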
Plot-X core concepts
This section defines the core classes in the Plot-X information model. Each core class defines a set of core attributes with their definition and data type described.
Non-core attributes managed as controlled vocabularies are available to each of the core classes. See the Controlled Vocabularies section.
Plot-X exchange ER diagram
Figure: Plot-X ER diagram of the CSV schema.
Core classes description
Core classes | Description | Required |
---|---|---|
Organisation | Describes any organisation associated with collecting the data, funding the project, etc. | Yes |
Person | Represents all persons involved in conducting surveys and managing the project. | Yes |
Project | Represents a collaborative work package under which data was collected. Most often, a survey is conducted as part of a project. | Yes |
Site | A site is a geographically bounded location where sampling and observations happen. The Site class represents all attributes related to the site. | No |
Site Visit | Describes the details of a visit made as part of the sampling program. All observations happen during a site visit, assuming that most surveys are human assisted. | No |
Feature | Describes the domain features associated with the site; these are called features of interest. All observations and samplings are made on domain features. Feature attributes can be added based on the type of feature. The feature of interest is represented using controlled vocabularies to maximise reuse and reduce ambiguity of meaning. | Yes |
Observation | Describes an act of observation that results in estimating the value of the observed property of the feature of interest using a specific method or procedure. The Observation class has relationships to a site and holds details about the result and result type. An observation represents the observed property, the procedure (method) used to make the observation, the result, and the result time. All observed properties, methods and result types are controlled vocabularies. | Yes |
Sampling | Describes all attributes related to physical samples collected at the site. The samples collected may be part of the feature of interest. | No |
Organisation
An organisation such as a government agency, university, or research group.
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
orgID | The globally unique identifier for the organisation. | Yes | PK, URI | | String | |
name | The name of the organisation. | Yes | | | String | |
| | The email address of the organisation. | No | | | String | |
Person
A person who is part of an Organisation and is involved in one or more Projects.
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
personID | The globally unique identifier for the person. | Yes | PK, URI | | String | |
orgID | Relationship to the organisation to which the person belongs. | Yes | FK, URI | | String | |
givenName | Given name, generally the first name of a person. | Yes | | | String | |
middleName | Middle name of a person, or their other name. | No | | | String | |
familyName | Family name, generally the last name of a person. | Yes | | | String | |
| | The email of the person. | Yes | | | String | |
role | Reference to the role from a controlled vocabulary. | No | URI | | String | |
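
The FK relationship from Person to Organisation can be checked mechanically once both tables are exported as CSV. The sketch below assumes one CSV file per core class with the attribute names as column headers; the `https://example.org/...` identifiers, the names, and the helper functions are all made up for illustration. The email columns are left out because their column names are not given in the tables above.

```python
import csv
import io

# Hypothetical sample data: one organisation, two persons.
ORGANISATION_CSV = """orgID,name
https://example.org/org/1,Example Herbarium
"""

PERSON_CSV = """personID,orgID,givenName,familyName
https://example.org/person/1,https://example.org/org/1,Alex,Nguyen
https://example.org/person/2,https://example.org/org/9,Sam,Lee
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def dangling_foreign_keys(child_rows, fk_column, parent_rows, pk_column):
    """Return the child rows whose FK value matches no parent PK."""
    known = {row[pk_column] for row in parent_rows}
    return [row for row in child_rows if row[fk_column] not in known]


orphans = dangling_foreign_keys(
    read_rows(PERSON_CSV), "orgID", read_rows(ORGANISATION_CSV), "orgID"
)
print([row["personID"] for row in orphans])
```

Here the second person points at an organisation that is not in the organisation table, so it is the only row reported.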
Project
A survey project with the main goal to observe, measure, and sample plots.
review do we need to represent projects with sub-projects?
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
projectID | The globally unique identifier for the project. | Yes | PK, URI | | String | |
projectLocalID | The identifier of the project in the context of the source dataset. | Yes | | | String | |
projectStartDate | The date on which the project started. | Yes | Date | | Date | |
projectEndDate | The date on which the project ended. | No | Date | | Date | |
projectName | The name of the project. | No | | | String | review Should this core attribute be mandatory? Some sample data do not have a name for the project. |
projectOwnerPersonID | The owner or lead of the project. Relationship to a person. | Yes | FK, URI | | String | |
projectContactPointOrgID | The contact point of the project. Relationship to an organisation. | Yes | FK, URI | | String | |
projectDescription | The description or purpose of the project. | Yes | | | String | |
projectFunderOrgID | The funder of the project. Relationship to an organisation. | Yes | FK, URI | | String | |
Site
A spatially bounded location where observations, measurements, and samples are made.
review should site shape be a core attribute?
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
siteID | The globally unique identifier for the site. | Yes | PK, URI | | String | |
siteLocalID | The identifier of the site in the context of the source dataset. | No | | 24099 | String | |
projectID | Relationship to the project. | Yes | FK, URI | | String | |
parentSiteID | Relationship to the parent site. | No | FK, URI | | String | |
siteType | Reference to the site type from a controlled vocabulary. | No | URI | http://linked.data.gov.au/def/tern-cv/74aa68d3-28fd-468d-8ff5-7e791d9f7159 | String | |
dateCommissioned | The date on which the site was commissioned or established. | Yes | Date | | Date | |
dateDecommissioned | The date on which the site was decommissioned. | No | Date | | Date | |
locationDescription | A sentence or two describing the location of the site. | No | | | String | |
siteDescription | A sentence or two describing the state of the site when established. | No | | | String | review See whether this attribute is widely used or not. Otherwise it can be moved to a site attribute. |
dimension | Dimensions of the site. | No | | 100x100 | String | review Can and most likely will be problematic if this is represented as a string. |
geometry | Geometry of the site as a WKT literal. | No | | | String | |
swPoint | South-west point of the site. | No | WKT | POINT(145.95682361 -21.68109217) | String | review should this be moved to a non-core attribute? This is very specific as it’s asking for the south-west point. |
Site visit
A discrete time-bounded visit to a site, during which sampling, measurement, or observation activities are undertaken.
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
siteVisitID | The globally unique identifier for the site visit. | Yes | PK, URI | | String | |
siteVisitLocalID | The identifier of the site visit in the context of the source dataset. | No | | 24099 | String | |
siteID | Relationship to the site. | Yes | FK, URI | | String | |
siteVisitStartDate | The site visit start date. | Yes | Date | | Date | |
siteVisitEndDate | The site visit end date. | No | | | | |
locationDescription | A sentence or two describing the location of the site in the context of the site visit. | No | | | String | |
siteDescription | A sentence or two describing the site at the time of the site visit. | No | | | String | |
Feature
A feature is a feature of interest whose property is being estimated or calculated in the course of observation to arrive at a result, or which is being sampled or transformed in an act of sampling.
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
featureID | The globally unique identifier for the feature of interest. | Yes | PK, URI | | String | |
featureLocalID | The identifier of the feature in the context of the source dataset. | No | | | | Some datasets may not have identifiers for individual features of interest, or it may be infeasible to provide one. |
siteID | Relationship to the site. | No | FK, URI | | String | This core attribute is optional as there is a use case where some observations are made on features without an association to an established site. |
siteVisitID | Relationship to the site visit. | No | FK, URI | | String | This core attribute is optional as there is a use case where some observations are made on features without an association to a site visit. |
featureType | Reference to the feature type from a controlled vocabulary. | Yes | URI | http://linked.data.gov.au/def/tern-cv/68af3d25-c801-4089-afff-cf701e2bd61d | String | |
parentFeatureID | The relationship to the parent feature. | No | FK, URI | Relating observations made on a stratum (feature) and linking it to a plant population (related parent feature). | String | |
comment | A comment to provide context as to what the source was for this feature. | No | | BASAL_AREA table. | String | Useful to provide the developer or reviewer context as to which source table this feature was generated from. |
latitude | Latitude value of the feature. | No | Decimal degrees | -31.25188888888889 | Number | |
longitude | Longitude value of the feature. | No | Decimal degrees | 120.34244722222222 | Number | |
elevation | Elevation of the feature. | No | | | Number | |
altitude | Altitude of the feature. | No | | | Number | |
depth | Depth of the feature, if it is below ground. | No | | | Number | |
geometry | Geometry of the feature as a WKT literal. | No | | | String | |
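
Because parentFeatureID links one feature to another, features form a hierarchy (for example, a stratum whose parent is a plant population). A minimal sketch of walking that hierarchy is shown below; the short identifiers such as `feature:stratum-1` and the `ancestors` helper are hypothetical, and the cycle check is an extra safeguard rather than something this specification requires.

```python
def ancestors(feature_id, parent_of):
    """Follow parentFeatureID links upward and return the chain of parents.

    parent_of maps each featureID to its parentFeatureID (or None).
    Raises ValueError if the links loop back on themselves.
    """
    chain = []
    seen = {feature_id}
    current = parent_of.get(feature_id)
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in parentFeatureID links at {current}")
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain


# Hypothetical features: two strata that belong to one plant population.
PARENT_OF = {
    "feature:stratum-1": "feature:population-1",
    "feature:stratum-2": "feature:population-1",
    "feature:population-1": None,
}

print(ancestors("feature:stratum-1", PARENT_OF))
```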
Observation
An act of measuring or otherwise determining the value of a property.
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
observationID | The globally unique identifier for the observation. | Yes | URI | | String | |
observationLocalID | The identifier of the observation in the context of the source dataset. | No | | | String | Source datasets may not have an identifier for observations. |
siteID | Relationship to the site. | No | | | String | This core attribute is optional as there is a use case where some observations are made on features without an association to an established site. |
siteVisitID | Relationship to the site visit. | No | | | String | This core attribute is optional as there is a use case where some observations are made on features without an association to a site visit. |
featureID | Relationship to the feature. | Yes | URI | | String | |
resultTime | Time when the observation was completed. | Yes | Date time | | Date time | |
parameterLabel | A human-readable label of the parameter. | Yes | | | String | |
parameter | Reference to the parameter from a controlled vocabulary. | Yes | URI | http://linked.data.gov.au/def/tern-cv/5699eca7-9ef0-47a6-bcfb-9306e0e2b85e | String | |
result | The result value of the observation. | Yes | | | String | |
resultType | Reference to the result type from a controlled vocabulary. | Yes | URI | http://linked.data.gov.au/def/tern-cv/07020422-336e-47c7-8676-fde1e21bcdca | String | |
resultUOM | Indicate the unit of measure of the result, if applicable. | No | URI | | String | |
comment | A comment to provide context as to what the source was for this observation. | No | | BASAL_AREA table. | String | Useful to provide the developer or reviewer context as to which source table this observation was generated from. |
methodLabel | A human-readable label of the method. | No | | | String | |
method | Reference to the method from a controlled vocabulary. | No | URI | Point intercept method. | String | |
instrumentType | Reference to the instrument type from a controlled vocabulary. | No | URI | Basal wedge. http://linked.data.gov.au/def/tern-cv/a3088b5c-622d-4e25-8a75-4c4961b0dfe8 | String | |
observer | Relationship to the person who observed or measured this feature. | No | FK, URI | | String | |
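
Since result is always exchanged as a string and resultType says how to read it, a consumer will typically convert the string after loading. The sketch below shows one way to do that. The three result-type URIs are placeholders invented for this example (the real ones come from the result type controlled vocabulary), and `parse_result` is a hypothetical helper.

```python
from decimal import Decimal

# Placeholder result-type URIs, NOT the real vocabulary concepts.
FLOAT_TYPE = "https://example.org/result-type/float"
INTEGER_TYPE = "https://example.org/result-type/integer"
BOOLEAN_TYPE = "https://example.org/result-type/boolean"


def parse_result(result: str, result_type: str):
    """Convert the string in the result column according to resultType."""
    if result_type == FLOAT_TYPE:
        # Decimal keeps the value exactly as written in the CSV.
        return Decimal(result)
    if result_type == INTEGER_TYPE:
        return int(result)
    if result_type == BOOLEAN_TYPE:
        lowered = result.strip().lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"not a boolean: {result!r}")
        return lowered == "true"
    # Unknown or textual result types are passed through unchanged.
    return result
```

Keeping unknown result types as text means a consumer can still load a dataset that uses result types it does not recognise.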
Sampling
An act of sampling carries out a (sampling) procedure to create or transform one or more samples.
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
samplingID | The globally unique identifier for the sampling event. | Yes | PK, URI | | String | |
samplingLocal | The identifier of the sampling event in the context of the source dataset. | No | | | String | |
featureID | Relationship to the feature being sampled. | No | FK, URI | | String | This may not always be available or represented in source databases. |
sampleFeatureID | Relationship to the feature that was a result of the sampling event. | Yes | FK, URI | | String | |
resultTime | Time when the sampling event was completed. | Yes | Date time | | Date time | |
comment | A comment to provide context as to what the source was for this sampling event. | No | | POINT_INTERCEPT table. | String | Useful to provide the developer or reviewer context as to which source table this sampling event was generated from. |
methodLabel | A human-readable label of the method. | No | | | String | |
method | Reference to the method from a controlled vocabulary. | No | URI | Vegetation vouchering method. | String | |
instrumentType | Reference to the instrument type from a controlled vocabulary. | No | URI | Auger boring. http://linked.data.gov.au/def/tern-cv/a3088b5c-622d-4e25-8a75-4c4961b0dfe8 | String | |
observer | Relationship to the person who sampled this feature. | No | FK, URI | | String | |
Non-core attributes of entities
Each core entity in the Plot-X exchange schema has a one-to-many relationship to an Attribute entity. This entity is used to express the non-core attributes of a core entity. The basic shape of an Attribute entity is the same across all entities.
Attribute | Definition | Required | Format | Example | Data type | Comment |
---|---|---|---|---|---|---|
attributeID | The globally unique identifier for the non-core attribute. | Yes | PK, URI | | String | |
EntityID | The identifier of the entity to which this non-core attribute applies. Relationship to the core entity record. This attribute is renamed to match the PK of the entity. | Yes | FK, URI | Instead of EntityID, this attribute would be siteID if it is describing a non-core attribute of a site. | String | |
attributeLabel | A human-readable label for the controlled vocabulary attribute. | Yes | | | String | |
attribute | Reference to the attribute from a controlled vocabulary. | Yes | URI | http://linked.data.gov.au/def/tern-cv/dd085299-ae86-4371-ae15-61dfa432f924 | String | |
result | The result value of the attribute. | Yes | | | String | |
resultType | Indicate the data type of the result. | Yes | URI | http://linked.data.gov.au/def/tern-cv/07020422-336e-47c7-8676-fde1e21bcdca | String | |
resultUOM | Indicate the unit of measure of the result, if applicable. | No | URI | | String | |
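
To make the one-to-many shape concrete, the sketch below groups the non-core attributes of sites by siteID, which stands in for EntityID as described in the table above. Every identifier and vocabulary URI in the sample data is an invented `https://example.org/...` placeholder, and the assumption that site attributes travel in their own CSV table is made for illustration only.

```python
import csv
import io
from collections import defaultdict

# Hypothetical attribute rows for one site; EntityID appears as siteID.
SITE_ATTRIBUTE_CSV = """attributeID,siteID,attributeLabel,attribute,result,resultType,resultUOM
https://example.org/attr/1,https://example.org/site/1,slope,https://example.org/cv/slope,4,https://example.org/result-type/integer,https://example.org/uom/degree
https://example.org/attr/2,https://example.org/site/1,aspect,https://example.org/cv/aspect,270,https://example.org/result-type/integer,https://example.org/uom/degree
"""


def attributes_by_entity(text, entity_column):
    """Group attribute rows by their owning entity: {entityID: {label: result}}."""
    grouped = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        grouped[row[entity_column]][row["attributeLabel"]] = row["result"]
    return dict(grouped)


site_attributes = attributes_by_entity(SITE_ATTRIBUTE_CSV, "siteID")
print(site_attributes["https://example.org/site/1"])
```

The same function works for any core entity by passing the matching key column, for example `featureID` for feature attributes.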
Controlled vocabularies
review - this section is incomplete.
The representation of the core conceptual model of Plot-X and the ecological domain is done through the core concepts of O&M and the use of controlled vocabularies. Controlled vocabularies allow the ecological community to maximise reuse and establish a consistent and standardised set of concepts. In contrast to standards such as ANZSoilML or GeoSciML, where domain features are modelled as classes, Plot-X represents the types of domain features as a controlled vocabulary. Attributes of domain features and observations are also maintained as a controlled vocabulary similar to Darwin Core terms. Other concepts such as parameters, observable properties, methods, result types, unit of measure, instrument type, site type and any code lists are also managed as controlled vocabularies. This approach is in line with the ODM2 conceptual model, which expands its capabilities in modelling observational data in different domains through the use of controlled vocabularies.
Below are some of the SKOS controlled vocabularies used by TERN.
Domain feature types
URI of the scheme: http://linked.data.gov.au/def/tern-cv/68af3d25-c801-4089-afff-cf701e2bd61d
Parameters
URI of the scheme: http://linked.data.gov.au/def/tern-cv/5699eca7-9ef0-47a6-bcfb-9306e0e2b85e
Methods
Methods are dataset-specific.
URI of CORVEG methods: http://linked.data.gov.au/def/corveg-cv/2561a78f-bc77-4cb2-9b40-25dd7d20c614
URI of AusPlots Rangelands methods: http://linked.data.gov.au/def/ausplots-cv/3ee6feb4-20aa-4dd8-9f15-48bbbebbac7a
Attributes
URI of the scheme: http://linked.data.gov.au/def/tern-cv/dd085299-ae86-4371-ae15-61dfa432f924
Roles
URI of the scheme: http://registry.it.csiro.au/def/isotc211/CI_RoleCode
Validation
Use CSV Schema Language 1.1 to validate CSV files against the Plot-X exchange schema. See: https://digital-preservation.github.io/csv-schema/csv-schema-1.1.html
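For readers who want a quick check before setting up a CSV Schema validator, the sketch below approximates two basic rules in plain Python: every required column must be present, and no required cell may be empty. It is not CSV Schema Language and does not replace it. The `validate` helper and the sample identifiers are hypothetical, and the required columns are those marked "Yes" in the Site table.

```python
import csv
import io

SITE_REQUIRED = ("siteID", "projectID", "dateCommissioned")

# Hypothetical sample: the second data row has no projectID.
SITE_CSV = (
    "siteID,projectID,dateCommissioned\n"
    "https://example.org/site/1,https://example.org/project/1,2015-03-02\n"
    "https://example.org/site/2,,2016-07-19\n"
)


def validate(text, required_columns):
    """Return a list of human-readable problems; an empty list means none found."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    errors = [f"missing column: {name}" for name in required_columns if name not in header]
    if errors:
        return errors
    # The header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        for name in required_columns:
            if not (row[name] or "").strip():
                errors.append(f"line {line_number}: {name} is empty")
    return errors


print(validate(SITE_CSV, SITE_REQUIRED))
```

For the sample above, the only problem reported is the empty projectID on line 3.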
Acknowledgements
The work of Plot-X as a data exchange standard was started and funded by the Australian Department of Agriculture, Water and the Environment in collaboration with Australian state and territory governments, TERN and participating universities. Much of the groundwork and conceptual ideas were based on existing work such as Observations and Measurements, Veg-X, ODM2 and Semantic Sensor Network Ontology.
References
Strong- vs weak- typing for features
https://confluence.csiro.au/display/seegrid/Strong-+vs+weak-+typing+for+features accessed 2021-05-04
CSV Schema Language 1.1
https://digital-preservation.github.io/csv-schema/csv-schema-1.1.html accessed 2021-05-04
CURIE Syntax 1.0
https://www.w3.org/TR/curie/ accessed 2021-05-04
Guidelines for a biological survey and map data
Useful species observation data template.
https://www.environment.gov.au/about-us/environmental-information-data/information-policy/guidelines-for-biological-survey-mapped-data accessed 2021-05-05
Veg-X
Wiser, S.K., Spencer, N., De Cáceres, M., Kleikamp, M., Boyle, B. and Peet, R.K. (2011), Veg-X – an exchange standard for plot-based vegetation data. Journal of Vegetation Science, 22: 598-609. https://doi.org/10.1111/j.1654-1103.2010.01245.x
We at TERN acknowledge the Traditional Owners and Custodians throughout Australia, New Zealand and all nations.
We honour their profound connections to land, water, biodiversity and
culture and pay our respects to their Elders past, present and emerging.
TERN is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy, NCRIS.