...
As of 1/11/2021, with one full dataset ingested, the total number of fields is 290.
Denormalise regions
Force Merge API
...
Code Block |
---|
...
"region_types": [
"http://linked.data.gov.au/dataset/local-gov-areas-2011",
"http://linked.data.gov.au/dataset/nrm-2017",
"http://linked.data.gov.au/dataset/bioregion/IBRA7",
"http://linked.data.gov.au/dataset/bioregion",
"http://linked.data.gov.au/dataset/asgs2016/stateorterritory",
"http://linked.data.gov.au/dataset/wwf-terr-ecoregions"
],
"region:local-gov-areas-2011": "http://linked.data.gov.au/dataset/local-gov-areas-2011/32250",
"region:nrm-2017": "http://linked.data.gov.au/dataset/nrm-2017/3080",
"region:bioregion/IBRA7": "http://linked.data.gov.au/dataset/bioregion/GUP",
"region:bioregion": "http://linked.data.gov.au/dataset/bioregion/GUP01",
"region:asgs2016/stateorterritory": "http://linked.data.gov.au/dataset/asgs2016/stateorterritory/3",
"region:wwf-terr-ecoregions": "http://linked.data.gov.au/dataset/wwf-terr-ecoregions/12945",
... |
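The flattening shown in the document above can be sketched in Python as follows. This is a minimal illustration, not the actual ingestion code: the function name `denormalise_regions`, the `DATASET_PREFIX` constant and the input shape (a mapping from region-type URI to region URI) are assumptions. Only the resulting field names (`region_types` plus one `region:<type>` key per region type) are taken from the example.

```python
# Assumed common prefix of all region-type URIs (taken from the example values).
DATASET_PREFIX = "http://linked.data.gov.au/dataset/"


def denormalise_regions(regions):
    """Flatten {region_type_uri: region_uri} into top-level document fields."""
    fields = {"region_types": list(regions)}
    for type_uri, region_uri in regions.items():
        # Strip the shared prefix so the field name becomes e.g. "region:nrm-2017".
        if type_uri.startswith(DATASET_PREFIX):
            short_name = type_uri[len(DATASET_PREFIX):]
        else:
            short_name = type_uri
        fields[f"region:{short_name}"] = region_uri
    return fields


example = denormalise_regions({
    "http://linked.data.gov.au/dataset/nrm-2017":
        "http://linked.data.gov.au/dataset/nrm-2017/3080",
    "http://linked.data.gov.au/dataset/bioregion/IBRA7":
        "http://linked.data.gov.au/dataset/bioregion/GUP",
})
```

Each region type becomes its own top-level field, so a filter on a single region type only has to touch one keyword field.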
ES document mapping:
Code Block |
---|
"region:asgs2016/stateorterritory" : {
"type" : "keyword"
},
"region:bioregion" : {
"type" : "keyword"
},
"region:bioregion/IBRA7" : {
"type" : "keyword"
},
"region:capad-2018-terrestrial" : {
"type" : "keyword"
},
"region:local-gov-areas-2011" : {
"type" : "keyword"
},
"region:nrm-2017" : {
"type" : "keyword"
},
"region:wwf-terr-ecoregions" : {
"type" : "keyword"
},
"region_types" : {
"type" : "keyword"
}, |
The mapping is generated dynamically using dynamic templates (https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html):
Code Block |
---|
"mappings" : {
"dynamic_templates" : [
{
"region_as_keyword" : {
"match" : "region:*",
"mapping" : {
"type" : "keyword"
}
}
},
...
]
} |
Force Merge API
Index segments are merged after every indexing run.
Disable refresh during indexing
Disabling index refresh makes indexing notably faster (throughput: ~1000 every two seconds).
A single refresh is performed manually after indexing. The index segments are then merged (force merge).
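The sequence described above can be sketched with the Elasticsearch REST API. The endpoints (`_settings` with `refresh_interval`, `_refresh`, `_forcemerge`) are standard Elasticsearch calls. The helper names `indexing_plan` and `send`, the index name `my-index` and the use of `urllib` are assumptions made for illustration, and the bulk indexing step itself is not shown.

```python
import json
import urllib.request


def indexing_plan(index):
    """Return the ordered REST calls as (method, path, body) tuples."""
    return [
        # 1. Turn off automatic refresh while documents are being indexed.
        ("PUT", f"/{index}/_settings", {"index": {"refresh_interval": "-1"}}),
        # 2. Bulk indexing happens here (not shown).
        # 3. Make the indexed documents searchable with one manual refresh.
        ("POST", f"/{index}/_refresh", None),
        # 4. Merge the index segments.
        ("POST", f"/{index}/_forcemerge", None),
    ]


def send(base_url, method, path, body=None):
    """Send one call to Elasticsearch; requires a reachable cluster."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


plan = indexing_plan("my-index")  # "my-index" is a placeholder index name
```

Against a running cluster, each step could be issued with `send("http://localhost:9200", *step)`. Whether `refresh_interval` is restored afterwards is not stated in these notes, so the sketch leaves it disabled.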
Dynamic mapping
To ensure that the correct datatype is stored in ES for each attribute value, dynamic templates are applied during indexing according to the following rules:
Code Block |
---|
"mappings" : {
"dynamic_templates" : [
{
"region_as_keyword" : {
"match" : "region:*",
"mapping" : {
"type" : "keyword"
}
}
},
{
"attribute_field" : {
"path_match" : "*_attr_*.attribute",
"mapping" : {
"type" : "keyword"
}
}
},
{
"id_field" : {
"path_match" : "*_attr_*.id",
"mapping" : {
"type" : "keyword"
}
}
},
{
"unit_field" : {
"path_match" : "*_attr_*.unit_of_measure",
"mapping" : {
"type" : "keyword"
}
}
},
{
"value_label_field" : {
"path_match" : "*_attr_*.value.label",
"mapping" : {
"type" : "text"
}
}
},
{
"value_type_field" : {
"path_match" : "*_attr_*.value.type",
"mapping" : {
"type" : "keyword"
}
}
},
{
"value_value_field" : {
"path_match" : "*_attr_*.value.value_float",
"mapping" : {
"coerce" : true,
"doc_values" : true,
"ignore_malformed" : true,
"type" : "float"
}
}
},
{
"value_value_field" : {
"path_match" : "*_attr_*.value.value_int",
"mapping" : {
"coerce" : true,
"doc_values" : true,
"ignore_malformed" : true,
"type" : "integer"
}
}
},
{
"value_value_field" : {
"path_match" : "*_attr_*.value.value_bool",
"mapping" : {
"normalizer" : "lowercase_normalizer",
"type" : "keyword"
}
}
},
{
"value_value_field" : {
"path_match" : "*_attr_*.value.value_datetime",
"mapping" : {
"format" : "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
"ignore_malformed" : "true",
"type" : "date"
}
}
},
{
"value_value_field" : {
"path_match" : "*_attr_*.value.value_date",
"mapping" : {
"format" : "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis",
"ignore_malformed" : "true",
"type" : "date"
}
}
},
{
"value_value_field" : {
"path_match" : "*_attr_*.value.value_uri",
"mapping" : {
"type" : "keyword"
}
}
},
{
"value_value_field" : {
"path_match" : "*_attr_*.value.value_string",
"mapping" : {
"type" : "keyword"
}
}
}
],
...
} |
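To illustrate how these rules resolve a field to a datatype, the following Python sketch emulates `path_match` for a subset of the templates above. It assumes the documented Elasticsearch behaviour that templates are checked in order and the first match wins. The helper names and the sample field paths (such as `site_attr_height`) are hypothetical, and the wildcard matching is a local approximation rather than Elasticsearch's own implementation.

```python
import re

# Subset of the templates above, in the same order: (name, path_match, type).
TEMPLATES = [
    ("attribute_field", "*_attr_*.attribute", "keyword"),
    ("unit_field", "*_attr_*.unit_of_measure", "keyword"),
    ("value_label_field", "*_attr_*.value.label", "text"),
    ("value_value_field", "*_attr_*.value.value_float", "float"),
    ("value_value_field", "*_attr_*.value.value_int", "integer"),
    ("value_value_field", "*_attr_*.value.value_datetime", "date"),
]


def simple_match(pattern, path):
    """Wildcard match where '*' stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path) is not None


def resolve_type(path):
    """Return the type of the first template whose path_match fits, else None."""
    for _name, pattern, es_type in TEMPLATES:
        if simple_match(pattern, path):
            return es_type
    return None


float_type = resolve_type("site_attr_height.value.value_float")
label_type = resolve_type("site_attr_height.value.label")
unmatched = resolve_type("site_id")
```

A path that matches no template, such as `site_id` here, falls through to whatever the rest of the mapping defines for it.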
Clear cache API | Elasticsearch Guide [7.10] | Elastic
Clear the cache after each test query so that measured performance improvements are not skewed by cached results.
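A benchmark step along these lines could look like the sketch below. The `POST /<index>/_cache/clear` endpoint is the Elasticsearch clear cache API. The helper names, the base URL `http://localhost:9200` and the index name `my-index` are placeholders, and the query itself is left as a caller-supplied function.

```python
import time
import urllib.request


def clear_cache_request(base_url, index):
    """Build the POST /<index>/_cache/clear request (not sent here)."""
    return urllib.request.Request(f"{base_url}/{index}/_cache/clear", method="POST")


def timed(run_query):
    """Run a zero-argument callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = run_query()
    return result, time.perf_counter() - start


# Placeholder host and index name; sending requires a reachable cluster,
# e.g. urllib.request.urlopen(request).
request = clear_cache_request("http://localhost:9200", "my-index")
```

In a test loop, each `timed(...)` call on a query would be followed by sending the clear-cache request before the next measurement.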
Data indices Mapping