Ecosystem Processes Data Management

Introduction

TERN Data Services and Analytics is creating an information model to accurately capture all artefacts associated with data, including the platform from which the data was collected, the instrument that captured it, and the people and organisations responsible for data collection. In our information model, a platform is an entity that hosts other entities to enable data collection; ecology sites, flux towers and remote sensing satellites are all platforms. An instrument, or sensor, is a device used to measure a phenomenon.

The information model will capture metadata-level information about the different platforms that host sensors. Currently, from the Ecosystem Processes (EP) perspective, the two upper-level platform types are Sites and Flux Towers. The model will also record ownership information, responsibilities based on roles (owner, principal investigator), deployment information, how a sensor has been configured, and a time series of the calibrations performed on a sensor. In addition, the information model will support tagging platforms with regional classifiers such as IBRA, subIBRA, Ecoregion, NRM regions, and states and territories. The information will be stored in a database, and a web application will be created to allow users to make changes to it.
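To make the entities above more concrete, the following is a minimal sketch of how they might relate to each other. It is not the actual TERN schema: every class name, field name and value here is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class Calibration:
    performed_on: date
    performed_by: str  # person responsible for the calibration


@dataclass
class Instrument:
    serial_number: str
    configuration: Dict[str, str] = field(default_factory=dict)
    calibrations: List[Calibration] = field(default_factory=list)  # time series


@dataclass
class Deployment:
    instrument: Instrument
    deployed_on: date


@dataclass
class Platform:
    name: str
    platform_type: str  # "Site" or "Flux Tower"
    roles: Dict[str, str] = field(default_factory=dict)    # role -> person or organisation
    regions: Dict[str, str] = field(default_factory=dict)  # classifier scheme -> region
    deployments: List[Deployment] = field(default_factory=list)


# Placeholder values only; none of these describe a real TERN asset.
sensor = Instrument(serial_number="SN-001", configuration={"sampling_rate": "10 Hz"})
sensor.calibrations.append(Calibration(date(2020, 3, 1), "Person A"))

tower = Platform(
    name="Tower B",
    platform_type="Flux Tower",
    roles={"owner": "Organisation A", "principal investigator": "Person A"},
    regions={"IBRA": "Region X", "state": "State 1"},
)
tower.deployments.append(Deployment(sensor, date(2020, 1, 15)))
```

In this sketch a platform hosts instruments through deployments, so the same instrument could later be redeployed elsewhere without losing its calibration history.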

All assets (sites, sensors, etc.) will be named with a unique uniform resource identifier (URI) so that any number of applications can refer to them. As an example, this is the URI that has been minted to identify TERN: https://w3id.org/tern/resources/a083902d-d821-41be-b663-1d7cb33eea66. We can now use this URI to refer to TERN whenever we need to (e.g. in the author field of a metadata record). Likewise, all sensors will be named with a URI. When publishing flux data, we can state which sensor was used to collect the data simply by referring to its URI. Anyone who resolves the URI in a web browser will get additional information, such as where the sensor is deployed and its calibration history. This will also make each piece of infrastructure a citable asset.
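As a rough illustration of the identifier scheme, the sketch below mints a URI under the same prefix as the TERN example and extracts the identifier part again. It assumes that identifiers are random (version 4) UUIDs, which is consistent with the example URI but is not confirmed by this document; the function names are hypothetical.

```python
import uuid

# Prefix taken from the example URI above.
BASE = "https://w3id.org/tern/resources/"
TERN_URI = BASE + "a083902d-d821-41be-b663-1d7cb33eea66"


def mint_uri() -> str:
    """Return a new URI under BASE (assumption: random version 4 UUIDs)."""
    return BASE + str(uuid.uuid4())


def resource_id(uri: str) -> uuid.UUID:
    """Extract and validate the UUID part of a resource URI."""
    if not uri.startswith(BASE):
        raise ValueError("not a resource URI under " + BASE)
    return uuid.UUID(uri[len(BASE):])


sensor_uri = mint_uri()  # could be stored alongside a sensor record
```

Any application that holds such a URI can store it as a plain string and still validate its structure with `resource_id`.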

Having site metadata in a unified database will allow TERN to share its sites with third parties such as ILTER and international aggregators. A web feed can be enabled to allow any third party to harvest TERN’s site information.

Tagging platforms with different classification schemes gives users more control when searching TERN’s data portal. They can find exactly what they want without downloading an entire dataset.
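The effect of such tags on searching can be sketched as a simple facet filter. The record layout, the `search` function and all tag values below are assumptions for illustration only; the data portal's actual search implementation is not described here.

```python
# Placeholder platform records, each tagged under several classification schemes.
platforms = [
    {"name": "Site A", "tags": {"IBRA": "Region X", "state": "State 1"}},
    {"name": "Site B", "tags": {"IBRA": "Region X", "state": "State 2"}},
    {"name": "Tower C", "tags": {"IBRA": "Region Y", "state": "State 1"}},
]


def search(records, **facets):
    """Return the records whose tags match every requested facet value."""
    return [
        r for r in records
        if all(r["tags"].get(scheme) == value for scheme, value in facets.items())
    ]


in_region_x = search(platforms, IBRA="Region X")
narrowed = search(platforms, IBRA="Region X", state="State 2")
```

Each additional facet narrows the result set, which is what lets a user reach a specific platform without retrieving everything first.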

The information model will underpin TERN’s first unified asset registry (a web application on top of the database) and encourage project managers of sites and flux towers to update the information routinely. The information will let users refine searches by the platform and instrument with which data was collected. It will also give the TERN project office an asset registry that supports tasks such as reviewing when instruments are due for replacement.

The information model records the serial number of every sensor within TERN and its calibration activities (when each calibration was performed and by whom). This enables the system to send an automatic reminder when the next calibration is due.
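One way such a reminder could be derived from the calibration history is sketched below. It assumes each sensor has a fixed calibration interval in days and that reminders go out a set number of days in advance; neither rule is stated in this document, and all names and values are hypothetical.

```python
from datetime import date, timedelta


def next_due(calibration_dates, interval_days):
    """Next calibration date, counted from the most recent calibration."""
    return max(calibration_dates) + timedelta(days=interval_days)


def needs_reminder(sensors, today, notice_days=14):
    """Serial numbers of sensors due within notice_days, or already overdue."""
    cutoff = today + timedelta(days=notice_days)
    return [
        s["serial_number"]
        for s in sensors
        if next_due(s["calibrations"], s["interval_days"]) <= cutoff
    ]


# Placeholder records; the interval is an assumed per-sensor setting.
sensors = [
    {"serial_number": "SN-001", "interval_days": 30,
     "calibrations": [date(2020, 1, 1)]},
    {"serial_number": "SN-002", "interval_days": 30,
     "calibrations": [date(2020, 1, 1), date(2020, 3, 1)]},
]

to_remind = needs_reminder(sensors, today=date(2020, 1, 20))
```

Because the full time series of calibrations is kept, the due date always follows the latest calibration rather than the first.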

The information model can be extended to support other TERN Ecosystem Processes business needs. If there is any useful information which should be recorded, then it can be added easily to the information model.

The information model allows for easy reporting to stakeholders and users. Examples include reporting the number of new sensors in 2020, the total number of sites within a region, or the number of instruments from manufacturer X, and retrieving the data collected by instrument X.
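Reports of this kind reduce to simple counts over the registry. The sketch below uses placeholder records with assumed field names (`acquired`, `manufacturer`, `region`); it illustrates the idea rather than the real reporting code.

```python
from collections import Counter

# Placeholder instrument records; all values are invented for illustration.
instruments = [
    {"serial_number": "SN-001", "manufacturer": "Manufacturer X",
     "acquired": 2020, "region": "Region X"},
    {"serial_number": "SN-002", "manufacturer": "Manufacturer Y",
     "acquired": 2019, "region": "Region X"},
    {"serial_number": "SN-003", "manufacturer": "Manufacturer X",
     "acquired": 2020, "region": "Region Y"},
]


def new_in_year(records, year):
    """Number of instruments first recorded in the given year."""
    return sum(1 for r in records if r["acquired"] == year)


def count_by(records, key):
    """Tally records by the value of one field, e.g. manufacturer or region."""
    return Counter(r[key] for r in records)


new_2020 = new_in_year(instruments, 2020)
per_manufacturer = count_by(instruments, "manufacturer")
```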

Summary

  • The database is built to better manage platform and instrument information so that it becomes the point of reference for any infrastructure-related data query.

  • The information model is compliant with the international standard Semantic Sensor Network (SSN) ontology.

  • A standard data exchange format for sharing site information with other parties, such as ILTER, will be built as part of the infrastructure.

  • Meets TERN’s reporting requirements in terms of an asset registry.

  • Delivers provenance information about the published data to end-users and site PIs.

  • Enables rich faceted search capabilities for TERN’s data portal.

  • The information framework is reusable in other TERN services such as SHaRED, for example to let users submit data and specify where it was collected (exact sensor and site).

  • Supports TERN Ecosystem Processes business needs.

  • Provides a unified TERN asset registry with detailed descriptions of deployment and calibration activities.

  • TERN assets are uniquely identified with a URI, and additional information about each asset is available upon dereferencing the URI.

Plan

Initially, we are testing the information model by sending it out for feedback in the form of an Excel spreadsheet. Once we are satisfied that the information model meets all the requirements, we will load the contents of the spreadsheet into a database.
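The move from spreadsheet to database could look roughly like the sketch below. It assumes the sheet has been exported to CSV and uses SQLite purely for illustration; the column names, table name and target database are all assumptions, as the real template and database are not specified here.

```python
import csv
import io
import sqlite3

# Stand-in for a CSV export of one sheet of the Excel template (placeholder data).
SHEET = """serial_number,manufacturer,platform
SN-001,Manufacturer X,Site A
SN-002,Manufacturer Y,Tower B
"""


def load_instruments(conn, csv_text):
    """Create the table if needed and insert one row per spreadsheet row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS instrument ("
        "serial_number TEXT PRIMARY KEY, manufacturer TEXT, platform TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.executemany(
        "INSERT INTO instrument VALUES (:serial_number, :manufacturer, :platform)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database for the example
loaded = load_instruments(conn, SHEET)
```

Using the serial number as the primary key means a duplicated spreadsheet row is rejected on load rather than silently stored twice.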

A web application will be created to interface with the database. This web application is called Duma, and it is already used within TERN Data Services and Analytics to manage the people, organisations and projects with which TERN is affiliated. Any data about a person or organisation is therefore likely to be in the system already, and we can simply link to it with a relationship, as shown in the diagrams below.

Conceptual Model

Logical Model (of the Excel template)
