...

As of 1/11/2021, with one full dataset ingested, the total number of fields is 290.
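One way a field count like this could be obtained is by walking an index mapping as returned by the get-mapping API. The sketch below is only an approximation: it counts every entry under `properties` plus every multi-field under `fields`. The `count_fields` helper and the sample mapping are hypothetical, and the page does not state how the figure of 290 was actually computed.

```python
def count_fields(mapping: dict) -> int:
    """Count fields in a mapping: each property, each multi-field, recursively."""
    total = 0
    for definition in mapping.get("properties", {}).values():
        total += 1                                  # the field (or object) itself
        total += len(definition.get("fields", {}))  # multi-fields such as "keyword"
        total += count_fields(definition)           # nested object properties
    return total


# Hypothetical miniature mapping, shaped like the ones further down this page.
sample_mapping = {
    "properties": {
        "site_id": {"type": "keyword"},
        "sites_hierarchy": {
            "properties": {
                "label": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                }
            }
        },
    }
}

print(count_fields(sample_mapping))
```

Tracking this number matters because Elasticsearch caps the number of fields per index (`index.mapping.total_fields.limit`, 1000 by default).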

Denormalise regions

Force Merge API

...

Code Block
...
"region_types": [
  "http://linked.data.gov.au/dataset/local-gov-areas-2011",
  "http://linked.data.gov.au/dataset/nrm-2017",
  "http://linked.data.gov.au/dataset/bioregion/IBRA7",
  "http://linked.data.gov.au/dataset/bioregion",
  "http://linked.data.gov.au/dataset/asgs2016/stateorterritory",
  "http://linked.data.gov.au/dataset/wwf-terr-ecoregions"
],
"region:local-gov-areas-2011": "http://linked.data.gov.au/dataset/local-gov-areas-2011/32250",
"region:nrm-2017": "http://linked.data.gov.au/dataset/nrm-2017/3080",
"region:bioregion/IBRA7": "http://linked.data.gov.au/dataset/bioregion/GUP",
"region:bioregion": "http://linked.data.gov.au/dataset/bioregion/GUP01",
"region:asgs2016/stateorterritory": "http://linked.data.gov.au/dataset/asgs2016/stateorterritory/3",
"region:wwf-terr-ecoregions": "http://linked.data.gov.au/dataset/wwf-terr-ecoregions/12945",
...
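The flattening shown above can be sketched as follows. This is an illustration only, assuming that each region arrives as a (region type URI, region URI) pair and that the field suffix is the region type URI with the `http://linked.data.gov.au/dataset/` prefix removed, which is what the sample document suggests. The `denormalise_regions` helper is hypothetical, not the indexer's actual code.

```python
from typing import Dict, Iterable, List, Tuple, Union

DATASET_PREFIX = "http://linked.data.gov.au/dataset/"


def denormalise_regions(
    regions: Iterable[Tuple[str, str]]
) -> Dict[str, Union[str, List[str]]]:
    """Flatten (region_type_uri, region_uri) pairs into top-level document fields."""
    region_types: List[str] = []
    fields: Dict[str, Union[str, List[str]]] = {}
    for region_type, region_uri in regions:
        region_types.append(region_type)
        # Derive the short name, e.g. "nrm-2017" or "bioregion/IBRA7".
        if region_type.startswith(DATASET_PREFIX):
            short_name = region_type[len(DATASET_PREFIX):]
        else:
            short_name = region_type
        fields["region:" + short_name] = region_uri
    fields["region_types"] = region_types
    return fields


doc_fields = denormalise_regions([
    ("http://linked.data.gov.au/dataset/nrm-2017",
     "http://linked.data.gov.au/dataset/nrm-2017/3080"),
    ("http://linked.data.gov.au/dataset/bioregion/IBRA7",
     "http://linked.data.gov.au/dataset/bioregion/GUP"),
])
print(doc_fields)
```

Taking the region type explicitly, rather than deriving it from the region URI, matters for cases such as `bioregion/IBRA7`, where the region URI does not contain the type path.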

ES document mapping:

Code Block
"region:asgs2016/stateorterritory" : {
  "type" : "keyword"
},
"region:bioregion" : {
  "type" : "keyword"
},
"region:bioregion/IBRA7" : {
  "type" : "keyword"
},
"region:capad-2018-terrestrial" : {
  "type" : "keyword"
},
"region:local-gov-areas-2011" : {
  "type" : "keyword"
},
"region:nrm-2017" : {
  "type" : "keyword"
},
"region:wwf-terr-ecoregions" : {
  "type" : "keyword"
},
"region_types" : {
  "type" : "keyword"
},
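Because each region field is a `keyword`, a region can be selected with an exact-match `term` filter. The sketch below builds such a request body; the `region_filter_query` helper is hypothetical and the queries actually used against these indices are not shown on this page.

```python
import json


def region_filter_query(region_type: str, region_uri: str) -> dict:
    """Build a search body that keeps only documents located in the given region."""
    return {
        "query": {
            "bool": {
                # Filters skip scoring, which suits exact keyword matches.
                "filter": [{"term": {"region:" + region_type: region_uri}}]
            }
        }
    }


body = region_filter_query(
    "nrm-2017", "http://linked.data.gov.au/dataset/nrm-2017/3080"
)
print(json.dumps(body, indent=2))
```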

The mapping is generated dynamically using dynamic templates (https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html):

Code Block
"mappings" : {
  "dynamic_templates" : [
    {
      "region_as_keyword" : {
        "match" : "region:*",
        "mapping" : {
          "type" : "keyword"
        }
      }
    },
    ...
  ]
}

Force Merge API

Index segments are merged after every indexing run.

Disable refresh during indexing

Disabling index refresh makes indexing notably faster (throughput: ~1000 every two seconds).

A single refresh is performed manually after indexing, and the index segments are then merged (force merge).
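The sequence described above could be expressed as the following ordered list of Elasticsearch REST calls. This is a sketch under assumptions: the `bulk_indexing_plan` helper is hypothetical, the use of the `_bulk` endpoint for the indexing step is assumed rather than stated on this page, and the plan is only built here, not sent to a cluster.

```python
from typing import List, Optional, Tuple

# (HTTP method, path, JSON body or None)
Request = Tuple[str, str, Optional[dict]]


def bulk_indexing_plan(index: str) -> List[Request]:
    """Return the REST calls for: disable refresh, index, refresh once, force merge."""
    return [
        # 1. Turn off periodic refresh while documents are being written.
        ("PUT", "/" + index + "/_settings", {"index": {"refresh_interval": "-1"}}),
        # 2. Index the documents (assumed: bulk API; the NDJSON payload is omitted).
        ("POST", "/" + index + "/_bulk", None),
        # 3. One manual refresh so the new documents become searchable.
        ("POST", "/" + index + "/_refresh", None),
        # 4. Merge the index segments.
        ("POST", "/" + index + "/_forcemerge", None),
    ]


for method, path, body in bulk_indexing_plan("my-index"):
    print(method, path, body)
```

The page does not say whether `refresh_interval` is restored afterwards; if the index later receives live updates, setting it back to a non-negative value would be a sensible extra step.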

Dynamic mapping

To ensure that each attribute value is stored in ES with the correct datatype, dynamic templates are applied during indexing according to the following rules:

Code Block
"mappings" : {
  "dynamic_templates" : [
    {
      "region_as_keyword" : {
        "match" : "region:*",
        "mapping" : {
          "type" : "keyword"
        }
      }
    },
    {
      "attribute_field" : {
        "path_match" : "*_attr_*.attribute",
        "mapping" : {
          "type" : "keyword"
        }
      }
    },
    {
      "id_field" : {
        "path_match" : "*_attr_*.id",
        "mapping" : {
          "type" : "keyword"
        }
      }
    },
    {
      "unit_field" : {
        "path_match" : "*_attr_*.unit_of_measure",
        "mapping" : {
          "type" : "keyword"
        }
      }
    },
    {
      "value_label_field" : {
        "path_match" : "*_attr_*.value.label",
        "mapping" : {
          "type" : "text"
        }
      }
    },
    {
      "value_type_field" : {
        "path_match" : "*_attr_*.value.type",
        "mapping" : {
          "type" : "keyword"
        }
      }
    },
    {
      "value_value_field" : {
        "path_match" : "*_attr_*.value.value_float",
        "mapping" : {
          "coerce" : true,
          "doc_values" : true,
          "ignore_malformed" : true,
          "type" : "float"
        }
      }
    },
    {
      "value_value_field" : {
        "path_match" : "*_attr_*.value.value_int",
        "mapping" : {
          "coerce" : true,
          "doc_values" : true,
          "ignore_malformed" : true,
          "type" : "integer"
        }
      }
    },
    {
      "value_value_field" : {
        "path_match" : "*_attr_*.value.value_bool",
        "mapping" : {
          "normalizer" : "lowercase_normalizer",
          "type" : "keyword"
        }
      }
    },
    {
      "value_value_field" : {
        "path_match" : "*_attr_*.value.value_datetime",
        "mapping" : {
          "format" : "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
          "ignore_malformed" : "true",
          "type" : "date"
        }
      }
    },
    {
      "value_value_field" : {
        "path_match" : "*_attr_*.value.value_date",
        "mapping" : {
          "format" : "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
          "ignore_malformed" : "true",
          "type" : "date"
        }
      }
    },
    {
      "value_value_field" : {
        "path_match" : "*_attr_*.value.value_uri",
        "mapping" : {
          "type" : "keyword"
        }
      }
    },
    {
      "value_value_field" : {
        "path_match" : "*_attr_*.value.value_string",
        "mapping" : {
          "type" : "keyword"
        }
      }
    }
  ],
  ...
}
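To make the rules above easier to reason about, the sketch below imitates how a new field could be resolved against them: templates are tried in order and the first one that matches wins, with `match` tested against the field name and `path_match` against the full dotted path. It is a simplified model, not Elasticsearch's implementation: only a subset of the templates is listed, `match_mapping_type` and the other mapping options are ignored, and the `simple_match` and `resolve_type` helpers are hypothetical.

```python
import re
from typing import Optional

# Subset of the templates above, reduced to pattern and resulting type.
TEMPLATES = [
    {"name": "region_as_keyword", "match": "region:*", "type": "keyword"},
    {"name": "attribute_field", "path_match": "*_attr_*.attribute", "type": "keyword"},
    {"name": "value_label_field", "path_match": "*_attr_*.value.label", "type": "text"},
    {"name": "value_value_field", "path_match": "*_attr_*.value.value_float", "type": "float"},
    {"name": "value_value_field", "path_match": "*_attr_*.value.value_string", "type": "keyword"},
]


def simple_match(pattern: str, text: str) -> bool:
    """Wildcard match where '*' stands for any run of characters, dots included."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


def resolve_type(path: str) -> Optional[str]:
    """Return the type of the first template matching the dotted field path."""
    field_name = path.rsplit(".", 1)[-1]
    for template in TEMPLATES:
        if "match" in template and not simple_match(template["match"], field_name):
            continue
        if "path_match" in template and not simple_match(template["path_match"], path):
            continue
        return template["type"]
    return None  # no template applies; default dynamic mapping would be used


for path in (
    "region:nrm-2017",
    "foi_attr_tern:56195246-ec5d-4050-a1c6-af786fbec715.value.value_string",
    "region_types",
):
    print(path, "->", resolve_type(path))
```

Note that `region_types` does not match `region:*`, so in this model it falls through all of the templates listed here.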

Clear cache API (Elasticsearch Guide 7.10)

Clear the cache after each test query so that the measured timings reflect real performance improvements rather than cached results.
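A timing loop that follows this advice might look like the sketch below. The query and the cache-clearing call are passed in as plain callables so the loop stays independent of any client library; `benchmark` and `clear_cache_request` are hypothetical helpers, and only the `POST /<index>/_cache/clear` endpoint itself comes from the Elasticsearch API.

```python
import time
from typing import Callable, List, Tuple


def clear_cache_request(index: str) -> Tuple[str, str]:
    """Return the REST call that clears the caches of one index."""
    return ("POST", "/" + index + "/_cache/clear")


def benchmark(
    run_query: Callable[[], object],
    clear_cache: Callable[[], None],
    repeats: int = 3,
) -> List[float]:
    """Time run_query `repeats` times, clearing the cache after every run."""
    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
        # Clear outside the timed section so it does not distort the measurement.
        clear_cache()
    return timings


# Stand-in callables; a real test would send the search and the cache-clear call.
timings = benchmark(lambda: sum(range(1000)), lambda: None, repeats=3)
print(len(timings), "timed runs")
```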

Data indices Mapping

landform
Code Block
"plotdata_ecoplots3-data-observations-ausplots2-landform-20211111005911":{
      "mappings":{
         "dynamic_templates":[
            {
               "region_as_keyword":{
                  "match":"region:*",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "attribute_field":{
                  "path_match":"*_attr_*.attribute",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "id_field":{
                  "path_match":"*_attr_*.id",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "unit_field":{
                  "path_match":"*_attr_*.unit_of_measure",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "value_label_field":{
                  "path_match":"*_attr_*.value.label",
                  "mapping":{
                     "type":"text"
                  }
               }
            },
            {
               "value_type_field":{
                  "path_match":"*_attr_*.value.type",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_float",
                  "mapping":{
                     "coerce":true,
                     "ignore_malformed":true,
                     "type":"float"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_int",
                  "mapping":{
                     "coerce":true,
                     "ignore_malformed":true,
                     "type":"integer"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_bool",
                  "mapping":{
                     "normalizer":"lowercase_normalizer",
                     "type":"keyword"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_datetime",
                  "mapping":{
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
                     "ignore_malformed":"true",
                     "type":"date"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_date",
                  "mapping":{
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
                     "ignore_malformed":"true",
                     "type":"date"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_uri",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_string",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            }
         ],
         "properties":{
            "dataset":{
               "type":"keyword"
            },
            "feature_class":{
               "type":"keyword"
            },
            "feature_id":{
               "type":"keyword"
            },
            "feature_type":{
               "type":"keyword"
            },
            "foi_attributes":{
               "type":"keyword"
            },
            "id":{
               "type":"keyword"
            },
            "instr_attributes":{
               "type":"keyword"
            },
            "instrument_class":{
               "type":"keyword"
            },
            "instrument_id":{
               "type":"keyword"
            },
            "instrument_type":{
               "type":"keyword"
            },
            "obs_attributes":{
               "type":"keyword"
            },
            "observation_class":{
               "type":"keyword"
            },
            "observed_property":{
               "type":"keyword"
            },
            "region:asgs2016/stateorterritory":{
               "type":"keyword"
            },
            "region:bioregion":{
               "type":"keyword"
            },
            "region:bioregion/IBRA7":{
               "type":"keyword"
            },
            "region:capad-2018-terrestrial":{
               "type":"keyword"
            },
            "region:local-gov-areas-2011":{
               "type":"keyword"
            },
            "region:nrm-2017":{
               "type":"keyword"
            },
            "region:wwf-terr-ecoregions":{
               "type":"keyword"
            },
            "region_types":{
               "type":"keyword"
            },
            "result_time":{
               "properties":{
                  "type":{
                     "type":"text"
                  },
                  "value":{
                     "type":"date",
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis"
                  }
               }
            },
            "result_value":{
               "properties":{
                  "label":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  },
                  "type":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  },
                  "value":{
                     "type":"text",
                     "fields":{
                        "boolean":{
                           "type":"keyword",
                           "normalizer":"lowercase_normalizer"
                        },
                        "date":{
                           "type":"date",
                           "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
                           "ignore_malformed":true
                        },
                        "float":{
                           "type":"float",
                           "ignore_malformed":true,
                           "coerce":true
                        },
                        "integer":{
                           "type":"integer",
                           "ignore_malformed":true,
                           "coerce":true
                        },
                        "keyword":{
                           "type":"keyword",
                           "normalizer":"lowercase_normalizer"
                        }
                     }
                  }
               }
            },
            "site_attr_count":{
               "type":"integer",
               "doc_values":false
            },
            "site_id":{
               "type":"keyword"
            },
            "site_visit_attr_count":{
               "type":"integer",
               "doc_values":false
            },
            "site_visit_date":{
               "properties":{
                  "type":{
                     "type":"text"
                  },
                  "value":{
                     "type":"date",
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis"
                  }
               }
            },
            "site_visit_id":{
               "type":"keyword"
            },
            "sites_hierarchy":{
               "properties":{
                  "depth":{
                     "type":"integer",
                     "doc_values":false
                  },
                  "label":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  },
                  "parent_site_id":{
                     "type":"keyword"
                  },
                  "site_id":{
                     "type":"keyword"
                  },
                  "site_id_label":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  }
               }
            },
            "unit_of_measure":{
               "type":"keyword"
            },
            "used_procedure":{
               "type":"keyword"
            }
         }
      }
   },
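The date fields in this mapping accept several formats, tried in the order given in the `format` string. The sketch below approximates that behaviour client-side, which can help when checking input values before indexing. It is an approximation under assumptions: the `strptime` patterns are close but not identical to the Java-style patterns used by Elasticsearch, the `parse_like_mapping` helper is hypothetical, and returning `None` only loosely mirrors `ignore_malformed`.

```python
from datetime import datetime, timezone
from typing import Optional

# Rough strptime equivalents of:
# yyyy-MM-dd'T'HH:mm:ss'Z' || yyyy-MM-dd HH:mm:ss || yyyy-MM-dd || d/MM/yyyy
STRPTIME_FORMATS = [
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
    "%d/%m/%Y",
]


def parse_like_mapping(value: str) -> Optional[datetime]:
    """Try each format in order; fall back to epoch_millis; None if nothing fits."""
    for fmt in STRPTIME_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    if value.isdigit():
        # epoch_millis: milliseconds since 1970-01-01T00:00:00Z, returned as naive UTC.
        moment = datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
        return moment.replace(tzinfo=None)
    return None


for raw in ("2021-11-11T00:59:11Z", "1/11/2021", "not a date"):
    print(repr(raw), "->", parse_like_mapping(raw))
```

In the mapping above, `result_time.value` and `site_visit_date.value` do not set `ignore_malformed`, so a value that fits none of the formats would be expected to cause the document to be rejected rather than silently skipped.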
plant-community
Code Block
"plotdata_ecoplots3-data-observations-ausplots2-plant-community-20211111024248":{
      "mappings":{
         "dynamic_templates":[
            {
               "region_as_keyword":{
                  "match":"region:*",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "attribute_field":{
                  "path_match":"*_attr_*.attribute",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "id_field":{
                  "path_match":"*_attr_*.id",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "unit_field":{
                  "path_match":"*_attr_*.unit_of_measure",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "value_label_field":{
                  "path_match":"*_attr_*.value.label",
                  "mapping":{
                     "type":"text"
                  }
               }
            },
            {
               "value_type_field":{
                  "path_match":"*_attr_*.value.type",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_float",
                  "mapping":{
                     "coerce":true,
                     "ignore_malformed":true,
                     "type":"float"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_int",
                  "mapping":{
                     "coerce":true,
                     "ignore_malformed":true,
                     "type":"integer"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_bool",
                  "mapping":{
                     "normalizer":"lowercase_normalizer",
                     "type":"keyword"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_datetime",
                  "mapping":{
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
                     "ignore_malformed":"true",
                     "type":"date"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_date",
                  "mapping":{
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
                     "ignore_malformed":"true",
                     "type":"date"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_uri",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            },
            {
               "value_value_field":{
                  "path_match":"*_attr_*.value.value_string",
                  "mapping":{
                     "type":"keyword"
                  }
               }
            }
         ],
         "properties":{
            "dataset":{
               "type":"keyword"
            },
            "feature_class":{
               "type":"keyword"
            },
            "feature_id":{
               "type":"keyword"
            },
            "feature_type":{
               "type":"keyword"
            },
            "foi_attr_tern:56195246-ec5d-4050-a1c6-af786fbec715":{
               "properties":{
                  "attribute":{
                     "type":"keyword"
                  },
                  "id":{
                     "type":"keyword"
                  },
                  "value":{
                     "properties":{
                        "type":{
                           "type":"keyword"
                        },
                        "value_string":{
                           "type":"keyword"
                        }
                     }
                  }
               }
            },
            "foi_attr_tern:5a13a61f-a43f-40cf-bc3f-3e0cc2e64ce1":{
               "properties":{
                  "attribute":{
                     "type":"keyword"
                  },
                  "id":{
                     "type":"keyword"
                  },
                  "value":{
                     "properties":{
                        "type":{
                           "type":"keyword"
                        },
                        "value_string":{
                           "type":"keyword"
                        }
                     }
                  }
               }
            },
            "foi_attr_tern:7455b778-fe96-4d3a-906f-3ed1faae8055":{
               "properties":{
                  "attribute":{
                     "type":"keyword"
                  },
                  "id":{
                     "type":"keyword"
                  },
                  "value":{
                     "properties":{
                        "type":{
                           "type":"keyword"
                        },
                        "value_string":{
                           "type":"keyword"
                        }
                     }
                  }
               }
            },
            "foi_attributes":{
               "type":"keyword"
            },
            "id":{
               "type":"keyword"
            },
            "instr_attributes":{
               "type":"keyword"
            },
            "instrument_class":{
               "type":"keyword"
            },
            "instrument_id":{
               "type":"keyword"
            },
            "instrument_type":{
               "type":"keyword"
            },
            "obs_attributes":{
               "type":"keyword"
            },
            "observation_class":{
               "type":"keyword"
            },
            "observed_property":{
               "type":"keyword"
            },
            "region:asgs2016/stateorterritory":{
               "type":"keyword"
            },
            "region:bioregion":{
               "type":"keyword"
            },
            "region:bioregion/IBRA7":{
               "type":"keyword"
            },
            "region:capad-2018-terrestrial":{
               "type":"keyword"
            },
            "region:local-gov-areas-2011":{
               "type":"keyword"
            },
            "region:nrm-2017":{
               "type":"keyword"
            },
            "region:wwf-terr-ecoregions":{
               "type":"keyword"
            },
            "region_types":{
               "type":"keyword"
            },
            "result_time":{
               "properties":{
                  "type":{
                     "type":"text"
                  },
                  "value":{
                     "type":"date",
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis"
                  }
               }
            },
            "result_value":{
               "properties":{
                  "label":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  },
                  "type":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  },
                  "value":{
                     "type":"text",
                     "fields":{
                        "boolean":{
                           "type":"keyword",
                           "normalizer":"lowercase_normalizer"
                        },
                        "date":{
                           "type":"date",
                           "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
                           "ignore_malformed":true
                        },
                        "float":{
                           "type":"float",
                           "ignore_malformed":true,
                           "coerce":true
                        },
                        "integer":{
                           "type":"integer",
                           "ignore_malformed":true,
                           "coerce":true
                        },
                        "keyword":{
                           "type":"keyword",
                           "normalizer":"lowercase_normalizer"
                        }
                     }
                  }
               }
            },
            "site_attr_count":{
               "type":"integer",
               "doc_values":false
            },
            "site_id":{
               "type":"keyword"
            },
            "site_visit_attr_count":{
               "type":"integer",
               "doc_values":false
            },
            "site_visit_date":{
               "properties":{
                  "type":{
                     "type":"text"
                  },
                  "value":{
                     "type":"date",
                     "format":"yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis"
                  }
               }
            },
            "site_visit_id":{
               "type":"keyword"
            },
            "sites_hierarchy":{
               "properties":{
                  "depth":{
                     "type":"integer",
                     "doc_values":false
                  },
                  "label":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  },
                  "parent_site_id":{
                     "type":"keyword"
                  },
                  "site_id":{
                     "type":"keyword"
                  },
                  "site_id_label":{
                     "type":"text",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  }
               }
            },
            "unit_of_measure":{
               "type":"keyword"
            },
            "used_procedure":{
               "type":"keyword"
            }
         }
      }
   },
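In both mappings, `result_value.value` is a `text` field with typed sub-fields (`boolean`, `date`, `float`, `integer`, `keyword`), so a query has to address the sub-field that fits the value being compared. The sketch below shows one way a client might pick it; the `result_value_field` and `min_value_query` helpers are hypothetical and the selection rule is an assumption, not something documented on this page.

```python
from datetime import date


def result_value_field(value: object) -> str:
    """Pick the result_value.value sub-field that suits the Python value's type."""
    base = "result_value.value"
    if isinstance(value, bool):      # check bool first: bool is a subclass of int
        return base + ".boolean"
    if isinstance(value, int):
        return base + ".integer"
    if isinstance(value, float):
        return base + ".float"
    if isinstance(value, date):      # also covers datetime, a subclass of date
        return base + ".date"
    return base + ".keyword"


def min_value_query(minimum: float) -> dict:
    """Build a range query for numeric results greater than or equal to `minimum`."""
    return {"query": {"range": {result_value_field(minimum): {"gte": minimum}}}}


print(min_value_query(12.5))
```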