Below is the overview of the alignment between TERN and BDBSA. The alignment will be broken by collection type, starting with plant records on the survey collection (appended with “SU” on tables). Individual data models and preprocessing steps will be displayed below.
SUSPECIES for plants
This single table contains all of the survey- and opportunistic-based measurements on plants and animals.
For filtering the records into survey and observation-based the following joins are needed, table name convention would be tern_survey_suspeciesY
JOIN bdbsa.supatch sp ON sv."PATCHID" = sp."PATCHID" join bdbsa.suspecies spp on sv."VISITNR" = spp."VISITNR"
For getting the full scientific name
--This are the columns needed concat_ws(' ', fl."GENUS",fl."SP",fl."ISP1RANK",fl."ISP1",fl."ISP2RANK", fl."ISP2",fl."ISP3RANK",fl."ISP3",fl."SPAUTHOR",fl."ISP1AUTHOR", fl."ISP2AUTHOR",fl."ISP3AUTHOR") as species_name, --and the join is made on the suspecies and flvnonsynnotren tables join bdbsa.flvnonsynnotren fl on fl."NSXCODE" = spp."NSXCODE"
Preingestion processing of targeted tables
The SUSPECIES table combines three different FOI’s and data types for plants:
survey based data for plant populations which are observed on plots
survey based data for plant individuals which are observed on plots
opportunistic samples of plant occurrences that are plot-less
All records need to be filtered by ISCERTAIN =
Y
Because of this and rows in the mapping have a 1:1 relationship with an FoI (i.e. a parameter cannot belong to more than one FoI), the data will be separated into three tables for processing: survey-plant population, survey-plant individual and opportunistic-plant occurrence.
NUMOBSERVED column
This column whenever present will be separated into two variables:
NUMOBSERVED: all numeric values >0 will be kept in NUMOBSERVED. 0 values and Non-numeric values will be transformed to NULL values
PRESENCE: This column will have two values “none detected“ and “present”. The “present” value will be all numeric values > 0 in the NUMOBSERVED column as well as the several
, present but not counted
and Present but not counted
values. “none detected” will be all the 0 and none detected
values in the NUMOBSERVED column.
Columns from other tables shared across FOI’s
suplantlfstage has to be join to the suspecies table
join bdbsa.suplantlfstage s on s."SPSEQNR"= spp."SPSEQNR" and s."VISITNR" = sv."VISITNR"
This table contains two parameters that are shared across plant opportunistic, population and individual FOI’s.
Notice that the following column need to be created for occurrence, individuals and populations
“LIFESTAGECODE”. The following conversion needs to be done for the “LIFESTAGECODE” included in the database
DE category has to be converted to D
N category has to be converted to V
SP needs to be converted to X
RE to V (Note while the RE category can become either Vegetative or Regenerating the plants listed as RE in the Database only contain many herbaceous annual species suggesting that they must be in a vegetative state, N of record under RE = 15)
Survey columns individuals
Select data for survey only
bdbsa.supatch."SITETYPE" = 'SU'
2. Select data for individuals using DBH as key
"DBH" is not null
3 . For individuals the following variables from the suspecies table will be ingested:
PLANTEDRELEASED_individual, CROWN_EXTENT_PERC, CROWN_EXTENT_SCORE, CROWN_DENSITY_PERC, CROWN_DENSITY_SCORE, DBH, NEW_TIP_GROWTH, EPICORMIC_GROWTH, EXTENT_REPRODUCTION, MISTLETOE_LOAD, LEAF_DIE_OFF, LEAF_DAMAGE, EXTENT_BARK_CRACKING
Notice that the following columns need to be created for both individuals and populations “PLANTEDRELEASED” and species_name columns into “PLANTEDRELEASED_tree” and “species_name_tree”and “PLANTEDRELEASED_population” and “species_name_population”
4. The scientific name of the species comes from matchin the NSXCODE column from suspecies to flvnonsynnotren table.
Opportunistic columns
Select data for opportunistic only
bdbsa.supatch."SITETYPE" = 'OP'
Opportunistic records may or may not have a site name (parent site). These records do not belong to a quadrat. All of these records have been assigned a “PATCHID” however a site sensu TERN data ontology does not exist. In terms of the ontology, these records are plot-less
2. For these FOI, a “tern_opportunistic_suspecies” data has to be created and the following columns from the suspecies table need to be included:
NUMOBSERVED,PLANTEDRELEASED_occurence and species_occurrence
3. LOCMETHODNR and RELIABNR are attributes of the coordinates. In the first case it is an instrument that varies by latitude,longitude combination. In the second case is a reliability value for the coordinates that comes in a lookuptable lureliab
4. Only two site attributes have enough information for this ingestion: GENLOC, LOCCOMM. PROPERTY and VISITCOMM may also be added but more than 76% are null.
suoverstorey table ingestion
This table describes the characteristics of overstorey stratum (woody species). For the ingetion a new column called stratum_category needs to be created and filled with the correct stratum, in this case “overstorey” value.
Plant community and vegetation disturbance ingestion
The dataset for plant communities is spread across couple of tables for ingesting this data the folloing merge is needed
select "PATCHID" , sv."VISITNR" , "VISITDATE" , "ISFIRESCARS" , "LASTFIREYEAR" , "ISFIREYRCERTAIN" , sv."STRFORMATIONNR", lstr."STRFORMATIONDESC", lstr."STRLIFEFORMNR" ,lstr."PROJFOLCOVERNR" from bdbsa.suvisit sv join bdbsa.lustrformation lstr ON lstr."STRFORMATIONNR" = sv."STRFORMATIONNR"
Only the following variables will be mapped:
LASTFIREYEAR (note that ?
values need to be remapped to Y)
ISFIREYRCERTAIN (note that ?
values need to be remapped to Y)
STRFORMATIONNR
Plant functional type ingestion
The data for plant functional type is on a single table, suplantassemb, with only two parameters. This table does not need pre-processing.
Voucher ingestion
Vouchered specimens are in the suvoucher sheet. The following is necessary:
Filter to only voucher containing AD in the museumherbmnr column
Generate unique record number (http://rs.tdwg.org/dwc/terms/recordNumber ) using prefix and vouchernr.
create specimen_holder column. Value is State Herbarium of South Australia
museumherbmnr corresponds to state catalogue number. ALA use also this field, but adds a space between alphabetic and numeric characters
Scientific name is the concatenation of following flsp columns fl."GENUS",fl."SP",
fl."ISP1RANK" , fl."ISP1" , fl."SPAUTHOR"Get values from DETERMINERNR1 for the determiner column. The values for observer are outside the database see excel sheet at bottom of this list
NOTICE that DETERMINEDATE is not the same as the collection date that happened during VISITDATE
See code below for guidance
select sp."SITEID", sp."QUADSITE", sp."PATCHID", sv."VISITNR", sv."VISITDATE", svo."PREFIX", svo."VOUCHERNR",--unique identifiers svo."MUSEUMHERBMNR", --used by ALA -- svo."VOUCHERTYPE",svo."SPSEQNR", spp."SPECIESNR", fl."GENUS",fl."SP", fl."ISP1RANK" , fl."ISP1" , fl."SPAUTHOR", svo."DETERMINERNR1" , --svo."DETERMINERNR2", determiner 2 is not used svo."DETERMINEDATE", --svo."ACCURACYCODE",--this is for the date svo."VOUCHERCOMM" from bdbsa.suvoucher svo join bdbsa.suspecies spp on spp."SPSEQNR" =svo."SPSEQNR" and spp."VISITNR" = svo."VISITNR" join bdbsa.flsp fl on spp."SPECIESNR" = fl."SPECIESNR" join bdbsa.suvisit sv on sv."VISITNR" =svo."VISITNR" join bdbsa.supatch sp on sp."PATCHID" = sv."PATCHID" where substring("MUSEUMHERBMNR" , '[A-Z]{1,}') = 'AD' order by "VOUCHERTYPE"
SUSPECIES for animals
The suspecies tables for animals will be divided into individuals, populations and occurrences per animal group.
NUMTRAP columns, columns with the following syntax NUMXXXXTRAPA and NUMXXXXTRAPB need to be added by trap type by row, e.g. NUMELLIOTTRAP AS sum(NUMELLIOTTRAPA, NUMELLIOTTRAPB)
.
These variables are stored at the site-visit level. However, they apply only to particular animal groups,i.e. not all NUMTRAP columns apply to all animal observations done during a site visit. Also create dipnetused variable where the value is YES when DIPNETHRS is not null.
All records have to be filtered by ISCERTAIN = Y
NUMOBSERVED column
This column whenever present will be separated into two variables:
NUMOBSERVED: all numeric values >0 will be kept in NUMOBSERVED. 0 values and Non-numeric values will be transformed to NULL values
PRESENCE: This column will have two values “none detected“ and “present”. The “present” value will be all numeric values > 0 in the NUMOBSERVED column as well as the several
, present but not counted
and Present but not counted
values. “none detected” will be all the 0 and none detected
values in the NUMOBSERVED column.
Individuals
Pivot length and lengthmethod columns
First, the length and length method have to be pivot to a wider format.
Expand (pivot wider) column LENGTHNR as column names and LENGTH column as values to fill these columns. The column names for the LENTHNR column are below, as well as an example of the query:
1 body length
10 snout to vent length (does not need to be ingested as it already exist as SNOUT)
12 tail Length
14 mantle Length
15 invertebrate carapace length (if species type I)
15 turtle carapace length (if species type R)
2 fork length
3 lower jaw Fork Length
5 PES Length
6 Head Length
7 Forearm Length
--CREATE TABLE bdbsa.analysis_length AS SELECT "VISITNR", "SPECIESTYPE","SPSEQNR","NSXCODE","LENGTHMETHOD", "LENGTH" from bdbsa.suspecies s where ("LENGTHMETHOD" is not null and "LENGTH" is not null); --drop TABLE if exists bdbsa.analysis_length; select distinct "LENGTHMETHOD" from bdbsa.analysis_length ORDER BY 1; SELECT concat("VISITNR", '_', "SPECIESTYPE",'_',"SPSEQNR",'_',"NSXCODE")as row_id,"VISITNR", "SPECIESTYPE","SPSEQNR","NSXCODE","LENGTHMETHOD", "LENGTH" FROM bdbsa.analysis_length ORDER BY 5; --CREATE EXTENSION IF NOT EXISTS tablefunc; --create table bdbsa.analyses_length_wide as SELECT * FROM crosstab( $$SELECT concat("VISITNR", '_', "SPECIESTYPE",'_',"SPSEQNR",'_',"NSXCODE") as row_id,"VISITNR", "SPECIESTYPE","SPSEQNR","NSXCODE","LENGTHMETHOD", "LENGTH" FROM bdbsa.analysis_length ORDER BY 5$$ , $$select distinct "LENGTHMETHOD" from bdbsa.analysis_length ORDER BY 1$$ ) AS ("row_id" text,"VISITNR" float8, "SPECIESTYPE" text,"SPSEQNR" float8, "NSXCODE" text, "total_length" float8, "svlength" float8, "tail_length" float8,"disc_length" float8, "mantlelength" float8, "carapace_length" float8, "fork" float8, "lowerjawfork" float8,"pes_length" float8,"head_length" float8, "forearm_length" float8); --drop TABLE if exists bdbsa.analyses_length_wide;
2. For individuals the following joins and logic are required, this step differs across species. Invertebrates and birds are excluded as they only comprise 3 records for this FOI.
For mammals :
join bdbsa.analyses_length_wide alw on alw."VISITNR"= spp."VISITNR" and alw."SPECIESTYPE"= spp."SPECIESTYPE" and alw."SPSEQNR" = spp."SPSEQNR" and alw."NSXCODE" =spp."NSXCODE" left join bdbsa.sutrap str on str."VISITNR" = spp."VISITNR" left join bdbsa.suvisit sv on sv."VISITNR" = spp."VISITNR" JOIN bdbsa.supatch sp ON sp."PATCHID" = sv."PATCHID" where (spp."SPECIESTYPE" = 'M')
TEATS column: Values in category 3 need to be lumped into category 2.
TESTES column: Values in category 3 need to be lumped into category 2.
Variables to be ingested are the following (note: NUMOBSERVED is not ingested as this FOI by definition is one individual): METHODNR, species_name, tail_length, pes_length, forearm_length, SPCOMM, SEX, TEATS, VAGINA, POUCH, TESTES, WEIGHT, PLANTEDRELEASED, NUMNIGHTS, NUMELLIOTTRAPA, NUMELLIOTNIGHTS, NUMPITTRAPA, NUMPITNIGHTS, NUMCAGETRAPA, NUMCAGENIGHTS, ISHARPTRAP, ISMISTNET, NUMELLIOTTRAPB, NUMPITTRAPB, NUMCAGETRAPB, ISHAIRTUBE, ISBATDETECTOR
For birds: there are no individuals.
For fish:
The following tables are needed:
from suspecies spp join bdbsa.analyses_length_wide alw on alw."VISITNR"= spp."VISITNR" and alw."SPECIESTYPE"= spp."SPECIESTYPE" and alw."SPSEQNR" = spp."SPSEQNR" and alw."NSXCODE" =spp."NSXCODE" left join bdbsa.sutrap str on str."VISITNR" = spp."VISITNR" left join bdbsa.suvisit sv on sv."VISITNR" = spp."VISITNR" JOIN bdbsa.supatch sp ON sp."PATCHID" = sv."PATCHID" where spp."SPECIESTYPE" = 'F';
Variables to be ingested are as follow:
species_name, METHODNR, SPCOMM, SEX, WEIGHT, ACTIVITY, PLANTEDRELEASED, total_length, disc_length, fork, lowerjawfork, FYKENETDUR, FYKENETNR, BAITTRAPHRS, BAITTRAPNUM
For reptiles:
The following tables are needed
from suspecies spp join bdbsa.analyses_length_wide alw on alw."VISITNR"= spp."VISITNR" and alw."SPECIESTYPE"= spp."SPECIESTYPE" and alw."SPSEQNR" = spp."SPSEQNR" and alw."NSXCODE" =spp."NSXCODE" left join bdbsa.sutrap str on str."VISITNR" = spp."VISITNR" left join bdbsa.suvisit sv on sv."VISITNR" = spp."VISITNR" JOIN bdbsa.supatch sp ON sp."PATCHID" = sv."PATCHID" where spp."SPECIESTYPE" = 'R';
Variables to be ingested are the following:
METHODNR, STRATANR, MACROHABNR, MICROHABNR, SPCOMM, SEX, WEIGHT, ISGRAVID, PLANTEDRELEASED, total_length, SNOUT, tail_length, turtle_carapace_length, NUMNIGHTS, NUMELLIOTTRAP, NUMELLIOTNIGHTS, NUMPITTRAP, NUMPITNIGHTS, NUMCAGETRAP, NUMCAGENIGHTS, ISHARPTRAP, ISMISTNET, FYKENETDUR, FYKENETNR
For amphibian:
The following tables are needed:
str."FYKENETHRS", str."BAITTRAPHRS", str."BAITTRAPNUM", str."DIPNETHRS" from suspecies spp join bdbsa.analyses_length_wide alw on alw."VISITNR"= spp."VISITNR" and alw."SPECIESTYPE"= spp."SPECIESTYPE" and alw."SPSEQNR" = spp."SPSEQNR" and alw."NSXCODE" =spp."NSXCODE" left join bdbsa.sutrap str on str."VISITNR" = spp."VISITNR" left join bdbsa.suvisit sv on sv."VISITNR" = spp."VISITNR" JOIN bdbsa.supatch sp ON sp."PATCHID" = sv."PATCHID" where spp."SPECIESTYPE" = 'A';
The following variables need to be ingested: METHODNR, SPCOMM, SEX, WEIGHT, PLANTEDRELEASED, total_length, svlength, fork, FYKENETDUR, FYKENETNR
Occurrence
For birds the following join is needed:
JOIN bdbsa.supatch sp ON sv."PATCHID" = sp."PATCHID" join bdbsa.suspecies spp on sv."VISITNR" = spp."VISITNR" join bdbsa.vssp vs on vs."NSXCODE" = spp."NSXCODE" full join bdbsa.sutrap str on str."VISITNR" = sv."VISITNR" where (spp."SPECIESTYPE" = 'B')
and the following columns will be ingested:
METHODNR, species_name, STRATANR, NUMOBSERVED, MICROHABNR, MACROHABNR, SPCOMM, SEX, ACTIVITY, PLANTEDRELEASED, NUMNIGHTS, NUMELLIOTTRAP, NUMELLIOTNIGHTS, NUMPITTRAP, NUMPITNIGHTS, NUMCAGETRAP, NUMCAGENIGHTS, ISHARPTRAP, ISMISTNET, ISHAIRTUBE, ISBATDETECTOR
Most of the variables after PLANTEDRELEASED
may be removed, this needs to be confirmed.
For mammals
For mammals, the following joins and logic are required
FROM bdbsa.suvisit sv JOIN bdbsa.supatch sp ON sv."PATCHID" = sp."PATCHID" join bdbsa.suspecies spp on sv."VISITNR" = spp."VISITNR" join bdbsa.vssp vs on vs."NSXCODE" = spp."NSXCODE" full join bdbsa.sutrap str on str."VISITNR" = sv."VISITNR" where (spp."SPECIESTYPE" = 'M' and "TEATS" is null and "VAGINA" is null and "POUCH" is null and "TESTES" is null and "WEIGHT" is null and "NUMPOUCH" is null and "LENGTH" is null)
Convert 'none detected' value in NUMOBSERVED to 0
The following variables are to be ingested
STRATANR, MACROHABNR, species_name, NUMOBSERVED, MICROHABNR, SPCOMM, SEX, ACTIVITY, PLANTEDRELEASED, NUMNIGHTS, NUMELLIOTTRAP, NUMELLIOTNIGHTS, NUMPITTRAP, NUMPITNIGHTS, NUMCAGETRAP, NUMCAGENIGHTS, ISHARPTRAP, ISMISTNET, ISHAIRTUBE, ISBATDETECTOR
For Fish
The following logic is required for fish
from suspecies spp left join bdbsa.sutrap str on str."VISITNR" = spp."VISITNR" left join bdbsa.suvisit sv on sv."VISITNR" = spp."VISITNR" left JOIN bdbsa.supatch sp ON sp."PATCHID" = sv."PATCHID" where (spp."SPECIESTYPE" = 'F' and "LENGTH" is null and "WEIGHT" is null);
The following variables will be ingested:
species_name, NUMOBSERVED, PRESENCE, SPCOMM, SEX, ACTIVITY, PLANTEDRELEASED, FYKENETDUR, FYKENETNR, BAITTRAPHRS, BAITTRAPNUM
For Reptiles
The following logic/joins are required
from suspecies spp left join bdbsa.sutrap str on str."VISITNR" = spp."VISITNR" left join bdbsa.suvisit sv on sv."VISITNR" = spp."VISITNR" JOIN bdbsa.supatch sp ON sp."PATCHID" = sv."PATCHID" where (spp."SPECIESTYPE" = 'R' and "LENGTH" is null and "WEIGHT" is null);
The following variables will be ingested:
STRATANR, METHODNR, MACROHABNR, MICROHABNR, MACROHABNR, species_name, NUMOBSERVED, PRESENCE, SPCOMM, SEX, ISGRAVID, PLANTEDRELEASED, NUMNIGHTS, NUMELLIOTTRAP, NUMELLIOTNIGHTS, NUMPITTRAP, NUMPITNIGHTS, NUMCAGETRAP, NUMCAGENIGHTS
For Amphibians
The following logic and joins are required
from suspecies spp left join bdbsa.sutrap str on str."VISITNR" = spp."VISITNR" left join bdbsa.suvisit sv on sv."VISITNR" = spp."VISITNR" left JOIN bdbsa.supatch sp ON sp."PATCHID" = sv."PATCHID" where (spp."SPECIESTYPE" = 'A' and "LENGTH" is null and "WEIGHT" is null);
The following variables will be ingested:
NUMOBSERVED, METHODNR, STRATANR, MACROHABNR, MICROHABNR, SPCOMM, PLANTEDRELEASED, NUMNIGHTS, NUMELLIOTTRAP, NUMELLIOTNIGHTS, NUMPITTRAP, NUMPITNIGHTS, NUMCAGETRAP, NUMCAGENIGHTS, FYKENETDUR, FYKENETNR, BAITTRAPHRS, BAITTRAPNUM, DIPNETHRS, dipnetused
0 Comments