Ecoplots ES document structure


Information model introduction

Ecoplots uses an underlying common information model based on the SOSA ontology. All data is mapped to this information model, and feature types and observed properties are drawn from controlled vocabularies. In summary, an observer visits a site to make observations related to features of interest.

High-level information model


A plot-based site observation (the main object in our model) has the following fields:

  • Observation ID: unique identifier across all observations in the system. Not used as a filter.

  • Dataset: the obs. belongs to exactly 1 dataset. Used as a filter in the facets.

  • Site ID: the obs. was taken within an ecological site. Used as a filter in the facets.

  • Site Visit ID: the obs. was taken during a specific visit to the site. Used as a filter in the facets.

  • Site Visit Date: when the visit to the site happened. Used as a filter in the facets.

  • Feature of interest ID: the obs. belongs to a specific feature of interest (aka FOI). Not used as a filter.

  • Feature of interest type: the obs. belongs to a FOI, which has a type. Used as a filter in the facets.

  • Observed property: the observed/measured property. Used as a filter in the facets.

  • Result: the result of the observation / measured value (can be one of several data types). WILL be used to filter by its value.

  • Unit of measurement: unit of the measured result, if applicable. Not used as a filter.

  • Result time: when the observation was taken. Not used as a filter.

  • Used procedure: the method or procedure used to make the obs. Used as a filter in the facets.

  • Used instrument: the specific instrument used to make the obs., if applicable. Not used as a filter.

  • Regions: the site where the obs. was taken belongs to a geographical region within Australia. There are multiple region types (states, local government areas, bioregions, etc.), so an obs. can belong to 1 or many regions (but to exactly 1 region per region type). Used as a filter in the facets.

  • Site attributes: the site/plot can have different attributes (e.g. dimensions, shape, description…). A site has many observations, so every attribute is duplicated in every document (observation). WILL be used to filter by its value.

  • Site visit attributes: many observations are made during a visit to the site, so every attribute is duplicated in every document (observation). WILL be used to filter by its value.

  • FOI attributes: a FOI has many observations, so every attribute is duplicated in every document (observation). WILL be used to filter by its value.

  • Observation attributes: an observation can have multiple attributes. WILL be used to filter by its value.

  • Instrument attributes: an instrument can have multiple attributes. WILL be used to filter by its value.

An attribute has the following fields:

  • Attribute ID: unique identifier across all attributes in the system. Not used as a filter.

  • Attribute: the specific attribute (e.g. type of soil observation, plot dimensions, scientific species name…). Used as a filter in the facets.

  • Value: value of the attribute. WILL be used to filter by its value.

  • Unit of measurement: unit of the attribute value, if applicable. Not used as a filter.


    {
      "id": "http://linked.data.gov.au/dataset/ausplots/soil_characterisation-obs-colour_when_moist-119526",
      "dataset": "Ausplots Rangelands",
      "feature_id": "http://linked.data.gov.au/dataset/ausplots/id-119526",
      "feature_type": "soil profile",
      "foi_attributes": [
        {
          "attribute": "soil depth max",
          "id": "http://linked.data.gov.au/dataset/ausplots/soil_characterisation-attr-lower_depth-119526",
          "unit_of_measure": "http://qudt.org/vocab/unit/M",
          "value": 0.22
        },
        {
          "attribute": "soil depth min",
          "id": "http://linked.data.gov.au/dataset/ausplots/soil_characterisation-attr-upper_depth-119526",
          "unit_of_measure": "http://qudt.org/vocab/unit/M",
          "value": 0.07
        },
        {
          "attribute": "type of soil observation",
          "id": "http://linked.data.gov.au/dataset/ausplots/soil_characterisation-attr-soil_observation_type-119526",
          "unit_of_measure": null,
          "value": "http://linked.data.gov.au/def/tern-cv/e2505a19-b277-4f83-b146-bc9cd9c691a0"
        }
      ],
      "instr_attributes": [],
      "instrument_type": "munsell soil colour chart",
      "obs_attributes": [],
      "observed_property": "wet soil colour",
      "regions": [
        { "dataset": "States and territories", "label": "Northern Territory" },
        { "dataset": "Subregions", "label": "McArthur" },
        { "dataset": "Local government areas", "label": "Roper Gulf (S)" },
        { "dataset": "Bioregions", "label": "Gulf Fall and Uplands" },
        { "dataset": "WWF ecoregions", "label": "Carpentaria tropical savanna" },
        { "dataset": "Terrestrial CAPAD regions", "label": "Limmen" }
      ],
      "result_time": "2012-06-12T00:00:00Z",
      "result_value": "7.5YR56",
      "site_id": "NTAGFU0026",
      "site_visit_date": "2012-06-12T00:00:00Z",
      "site_visit_id": "53673",
      "unit_of_measure": null,
      "used_procedure": "Soil characterisation to 1 m+"
    }

All this information is shown in the Ecoplots UI (https://ecoplots-test.tern.org.au/search) as rows in a table. The information shown in this datagrid is pulled from ES through an API, using the filters selected by the user in the facets section.

Visual graph example

Draw.io Diagram

The above diagram would be translated into the following table (very simplified):

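| dataset  | site_id   | feature_id       | foi_attributes              | observation             | obs_attributes               |
|----------|-----------|------------------|-----------------------------|-------------------------|------------------------------|
| Ausplots | site_id-1 | plant-pop-123456 | [attr-species_name-123456-] | obs-hits-123456-2       | []                           |
| Ausplots | site_id-1 | plant-pop-123456 | [attr-species_name-123456-] | obs-basal_area-123456-2 | [attr-point_id-obs-123456-1] |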

Notice that the main object is the “observation”, which may have observation attributes (these attributes are specific and unique to one observation).
The rest of the items shown in the graph (dataset, site, site_visit, feature_id…) are embedded in the same document. This means that, for example, many documents share the same “feature_id” (e.g. plant-pop-123456), and all attributes of that feature_id instance are duplicated in every document connected to it.
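The duplication described above can be illustrated with a minimal Python sketch. The input shape (a feature carrying a list of `observations`) and the function name `denormalise` are assumptions made for illustration only; they are not the actual Ecoplots ingestion code.

```python
def denormalise(feature: dict) -> list:
    """Emit one flat document per observation, copying the shared fields."""
    docs = []
    for obs in feature["observations"]:
        docs.append({
            # Fields shared by every observation of this feature are
            # repeated in each document.
            "dataset": feature["dataset"],
            "site_id": feature["site_id"],
            "feature_id": feature["feature_id"],
            "foi_attributes": list(feature["foi_attributes"]),
            # Fields that are specific to this observation.
            "observation": obs["id"],
            "obs_attributes": list(obs.get("attributes", [])),
        })
    return docs


# Hypothetical input mirroring the simplified table above.
feature = {
    "dataset": "Ausplots",
    "site_id": "site_id-1",
    "feature_id": "plant-pop-123456",
    "foi_attributes": ["attr-species_name-123456-"],
    "observations": [
        {"id": "obs-hits-123456-2"},
        {"id": "obs-basal_area-123456-2",
         "attributes": ["attr-point_id-obs-123456-1"]},
    ],
}
docs = denormalise(feature)
```

Each resulting document repeats the FOI attributes, which is what makes single-index filtering possible at the cost of storage.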

Faceted search

Faceted search (basic functionality)

As introduced above, an observation has many fields, some of which we want to filter by through the faceted search in the left-hand menu.

  • Filter by region_type and regions:
    Allows the user to filter by one “region_type” at a time, and then by 1 to many regions of the selected region_type.

    Region types: currently 6-7 values, not expected to grow significantly (maybe a few more region_types will be added).

    Regions: around a hundred to a few hundred options in most region_types.

  • Filter by dataset → site → site_visit:
    The user first filters by a specific dataset. Once a dataset is selected, a new facet with all the available site_ids is displayed, and once a site is selected, the site_visit_id facet is shown.

    Datasets: currently 1 value, fewer than a hundred in the future.

    Each dataset may have tens of thousands of sites.

    Every site usually has 1-3 site visits, but potentially more.

  • Filter by Feature of Interest (FOI) → FOI attributes: a required future feature is to let the user filter by the “value” of the attributes.
    E.g. the user selects an attribute (e.g. reliability) and can then filter by its value. If the value is categorical (high, medium, low), we would show a new facet with the different options. If not, a new input would let the user enter the desired value (e.g. vegetation height > 1.5m).

    Fewer than a hundred options.

    A few hundred options across all future datasets.

  • Filter by parameter (aka observed property) → Observation attributes:
    Same expected behaviour as the feature type / attributes filter. We would like to let the user filter by attribute values.

    A few hundred options.

    Fewer than a hundred options across all future datasets.

  • Filter by site_visit_date:
    Allows the user to set a date range, filtering for observations whose visit date falls within that range.

Not implemented yet:
  • Filter by site_attributes

  • Filter by site_visit attributes

    Same expected behaviour as FOI and observation attributes.
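As a rough illustration of the site_visit_date filter, the sketch below builds an ES `range` clause on `site_visit_date.value` (the date field in the index mapping). The helper name `site_visit_date_filter` is hypothetical, and whether the API builds the clause exactly this way is an assumption.

```python
from typing import Optional


def site_visit_date_filter(start: Optional[str] = None,
                           end: Optional[str] = None) -> dict:
    """Build a range clause that keeps visits between start and end (inclusive)."""
    bounds = {}
    if start is not None:
        bounds["gte"] = start  # lower bound, inclusive
    if end is not None:
        bounds["lte"] = end    # upper bound, inclusive
    return {"range": {"site_visit_date.value": bounds}}


# The clause would be appended to the "filter" list of the bool query.
clause = site_visit_date_filter("2012-01-01", "2012-12-31")
```

Leaving one bound as `None` produces an open-ended range.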

Faceted search (extended functionality)

Nested sites

In the current version of the UI / ES document structure, each observation has only one site, but our initial data model allows nested sites (and in practice they do occur).

This means that we would want to allow the user to filter by all the levels of the site hierarchy:

  1. Firstly, we show all the options for top level sites (plot1 in the example).

  2. If a top level site is selected, then show a new facet with the next level (transect)

  3. And so on…

More details and proposed solution: https://ternaus.atlassian.net/wiki/spaces/EE/pages/2226520950


Filtering by attributes value (categorical values)

Filtering by attribute value is a required functionality to be implemented in the UI. This means:

  • Categorical values: once the user has selected a specific attribute, a new facet (combobox) with all the possible values/results must be shown in the UI. The user can then select a concrete value, narrowing the search to those observations that have that attribute with that value.
    E.g. the user selects an attribute (e.g. reliability) and can then filter by its value. If the value is categorical (high, medium, low), we would show a new facet with the different options.

Possible categorical values of the “type of soil observation” attr.

    {
      "query": {
        "bool": {
          "filter": [
            {
              "terms": {
                "feature_type.value": [
                  "http://linked.data.gov.au/def/tern-cv/80c39b95-0912-4267-bb66-2fa081683723"
                ]
              }
            }
          ]
        }
      },
      "aggs": {
        "nested_agg": {
          "nested": {
            "path": "foi_attributes"
          },
          "aggs": {
            "filtering": {
              "filter": {
                "term": {
                  "foi_attributes.attribute.value": "http://linked.data.gov.au/def/tern-cv/8e7dfefe-e3ee-40ac-9024-ede48922bee6"
                }
              },
              "aggs": {
                "value": {
                  "terms": {
                    "field": "foi_attributes.value.value.keyword",
                    "size": 1000
                  }
                }
              }
            }
          }
        }
      },
      "size": 0
    }

Result: (2 possible values, “soil pit” and “auger boring” in the example)

    {
      "took": 43,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 10000,
          "relation": "gte"
        },
        "max_score": null,
        "hits": []
      },
      "aggregations": {
        "nested_agg": {
          "doc_count": 120620,
          "filtering": {
            "doc_count": 26175,
            "value": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "http://linked.data.gov.au/def/tern-cv/e2505a19-b277-4f83-b146-bc9cd9c691a0",
                  "doc_count": 25735
                },
                {
                  "key": "http://linked.data.gov.au/def/tern-cv/2747e2d9-04b7-4115-8f4b-ca0264eb9ad2",
                  "doc_count": 440
                }
              ]
            }
          }
        }
      }
    }
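To turn a response like this into facet options, the UI only needs the bucket keys and counts. A minimal sketch, assuming the aggregation names `nested_agg`, `filtering` and `value` used in the query above (the function name `categorical_values` is hypothetical):

```python
def categorical_values(response: dict) -> list:
    """Return (value URI, document count) pairs from the nested aggregation."""
    buckets = response["aggregations"]["nested_agg"]["filtering"]["value"]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


# Trimmed copy of the response shown above (only the aggregation part).
response = {
    "aggregations": {
        "nested_agg": {
            "doc_count": 120620,
            "filtering": {
                "doc_count": 26175,
                "value": {
                    "buckets": [
                        {"key": "http://linked.data.gov.au/def/tern-cv/e2505a19-b277-4f83-b146-bc9cd9c691a0",
                         "doc_count": 25735},
                        {"key": "http://linked.data.gov.au/def/tern-cv/2747e2d9-04b7-4115-8f4b-ca0264eb9ad2",
                         "doc_count": 440},
                    ]
                },
            },
        }
    }
}
options = categorical_values(response)
```

The keys are URIs, so the UI would still resolve them to human-readable labels through its cached label-value mapping.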


  • Non-categorical values: if the selected attribute is not a CV, a new input would let the user enter the desired value (e.g. vegetation height > 1.5m). This could include, for example, range queries on numbers and full-text search on strings.
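One possible shape for such a numeric filter is sketched below. It assumes the `float` sub-field of `<path>.value.value` from the index mapping and uses a made-up attribute URI; the helper name `numeric_attribute_filter` is hypothetical, and this is not an implemented feature. Both conditions sit inside the same `nested` clause so that they must match the same attribute object.

```python
def numeric_attribute_filter(path: str, attribute_uri: str,
                             gt: float = None, lt: float = None) -> dict:
    """Nested clause: the given attribute must have a numeric value in range."""
    bounds = {}
    if gt is not None:
        bounds["gt"] = gt
    if lt is not None:
        bounds["lt"] = lt
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "filter": [
                        # Which attribute is being tested.
                        {"term": {f"{path}.attribute.value": attribute_uri}},
                        # Numeric comparison on the "float" sub-field.
                        {"range": {f"{path}.value.value.float": bounds}},
                    ]
                }
            },
        }
    }


# Hypothetical URI standing in for a "vegetation height" attribute.
clause = numeric_attribute_filter(
    "foi_attributes", "http://example.org/attr/vegetation-height", gt=1.5)
```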

ES implementation

Document examples can be easily extracted from ES:
https://es-test.tern.org.au/plotdata_ecoplots-data/_search

Document mapping

Most fields of an observation (dataset, site, site_visit, feature_type, etc.) are key-value pairs with two sub-fields, which we call “label” and “value”. The “label” is the human-readable label of an item; the “value” usually contains its URI.

All filters and aggregations made in the UI use the value field. A query to fetch the labels is executed only once, and its result is stored/cached for label-value mapping.


    {
      "plotdata_ecoplots-data-observations-ausplots-test-20210810095151": {
        "mappings": {
          "properties": {
            "dataset": {
              "properties": {
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": { "type": "keyword" }
              }
            },
            "feature_id": { "type": "keyword" },
            "feature_type": {
              "properties": {
                "key": { "type": "keyword" },
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": { "type": "keyword" }
              }
            },
            "foi_attributes": {
              "type": "nested",
              "properties": {
                "attribute": {
                  "properties": {
                    "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "value": { "type": "keyword" }
                  }
                },
                "id": { "type": "keyword" },
                "unit_of_measure": { "type": "keyword" },
                "value": {
                  "properties": {
                    "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "type": { "type": "keyword" },
                    "value": {
                      "type": "text",
                      "fields": {
                        "boolean": { "type": "keyword", "normalizer": "lowercase_normalizer" },
                        "date": { "type": "date", "ignore_malformed": true, "format": "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis" },
                        "float": { "type": "float", "ignore_malformed": true, "coerce": true },
                        "integer": { "type": "integer", "ignore_malformed": true, "coerce": true },
                        "keyword": { "type": "keyword", "normalizer": "lowercase_normalizer" }
                      }
                    }
                  }
                }
              }
            },
            "id": { "type": "keyword" },
            "instr_attributes": {
              "type": "nested",
              "properties": {
                "attribute": {
                  "properties": {
                    "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "value": { "type": "keyword" }
                  }
                },
                "id": { "type": "keyword" },
                "unit_of_measure": { "type": "keyword" },
                "value": {
                  "properties": {
                    "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "type": { "type": "keyword" },
                    "value": {
                      "type": "text",
                      "fields": {
                        "boolean": { "type": "keyword", "normalizer": "lowercase_normalizer" },
                        "date": { "type": "date", "ignore_malformed": true, "format": "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis" },
                        "float": { "type": "float", "ignore_malformed": true, "coerce": true },
                        "integer": { "type": "integer", "ignore_malformed": true, "coerce": true },
                        "keyword": { "type": "keyword", "normalizer": "lowercase_normalizer" }
                      }
                    }
                  }
                }
              }
            },
            "instrument_type": {
              "properties": {
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": { "type": "keyword" }
              }
            },
            "obs_attributes": {
              "type": "nested",
              "properties": {
                "attribute": {
                  "properties": {
                    "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "value": { "type": "keyword" }
                  }
                },
                "id": { "type": "keyword" },
                "unit_of_measure": { "type": "keyword" },
                "value": {
                  "properties": {
                    "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "type": { "type": "keyword" },
                    "value": {
                      "type": "text",
                      "fields": {
                        "boolean": { "type": "keyword", "normalizer": "lowercase_normalizer" },
                        "date": { "type": "date", "ignore_malformed": true, "format": "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis" },
                        "float": { "type": "float", "ignore_malformed": true, "coerce": true },
                        "integer": { "type": "integer", "ignore_malformed": true, "coerce": true },
                        "keyword": { "type": "keyword", "normalizer": "lowercase_normalizer" }
                      }
                    }
                  }
                }
              }
            },
            "observed_property": {
              "properties": {
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": { "type": "keyword" }
              }
            },
            "regions": {
              "type": "nested",
              "properties": {
                "dataset": {
                  "properties": {
                    "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "uri": { "type": "keyword" }
                  }
                },
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "uri": { "type": "keyword" }
              }
            },
            "result_time": {
              "properties": {
                "type": { "type": "text" },
                "value": { "type": "date", "ignore_malformed": false, "format": "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis" }
              }
            },
            "result_value": {
              "properties": {
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "type": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": {
                  "type": "text",
                  "fields": {
                    "boolean": { "type": "keyword", "normalizer": "lowercase_normalizer" },
                    "date": { "type": "date", "ignore_malformed": true, "format": "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis" },
                    "float": { "type": "float", "ignore_malformed": true, "coerce": true },
                    "integer": { "type": "integer", "ignore_malformed": true, "coerce": true },
                    "keyword": { "type": "keyword", "normalizer": "lowercase_normalizer" }
                  }
                }
              }
            },
            "site_id": {
              "properties": {
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": { "type": "keyword" }
              }
            },
            "site_visit_date": {
              "properties": {
                "type": { "type": "text" },
                "value": { "type": "date", "ignore_malformed": false, "format": "yyyy-MM-dd'T'HH:mm:ss'Z'||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||d/MM/yyyy||epoch_millis" }
              }
            },
            "site_visit_id": {
              "properties": {
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": { "type": "keyword" }
              }
            },
            "unit_of_measure": { "type": "keyword" },
            "used_procedure": {
              "properties": {
                "label": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                "value": { "type": "keyword" }
              }
            }
          }
        }
      }
    }

ES queries

Many query examples have been grouped into a Postman REST client collection (can also be imported into Insomnia REST client):

Data query

Example query for reading the observations (documents) that match the user’s selection.

    {
      "query": {
        "bool": {
          "filter": [
            {
              "nested": {
                "path": "regions",
                "query": {
                  "terms": {
                    "regions.uri": [
                      "http://linked.data.gov.au/dataset/asgs2016/stateorterritory/3"
                    ]
                  }
                }
              }
            },
            {
              "terms": {
                "dataset.value": [
                  "http://linked.data.gov.au/dataset/ausplots"
                ]
              }
            },
            {
              "terms": {
                "feature_type.value": [
                  "http://linked.data.gov.au/def/tern-cv/60d7edf8-98c6-43e9-841c-e176c334d270"
                ]
              }
            },
            {
              "terms": {
                "observed_property.value": [
                  "http://linked.data.gov.au/def/tern-cv/09296da0-c645-4165-950c-780c21b3c140"
                ]
              }
            }
          ]
        }
      },
      "from": 0,
      "size": 50,
      "track_total_hits": true
    }


FACETS queries / list of aggregations

  • Simple aggregations: dataset, site_id, site_visit_id, feature_type, observed_property/parameter…

  • Nested aggregations: region_type, regions, foi_attributes, obs_attributes (in the future, site_attributes, etc.)…

  • Composite aggregations: to retrieve the labels of more than 10,000 sites, we need to run a series of composite aggregations using the “after” keyword, as simple aggregations are limited to a maximum of 10,000 buckets.

Example query for aggregating the possible values of a specific facet (it also respects the user’s selection).
Query used in the /facet API endpoint (based on the user’s selection, it gets all the possible site_id values):

    {
      "query": {
        "bool": {
          "filter": [
            {
              "nested": {
                "path": "regions",
                "query": {
                  "terms": {
                    "regions.uri": [
                      "http://linked.data.gov.au/dataset/asgs2016/stateorterritory/3"
                    ]
                  }
                }
              }
            },
            {
              "terms": {
                "dataset.value": [
                  "http://linked.data.gov.au/dataset/ausplots"
                ]
              }
            },
            {
              "terms": {
                "feature_type.value": [
                  "http://linked.data.gov.au/def/tern-cv/60d7edf8-98c6-43e9-841c-e176c334d270"
                ]
              }
            },
            {
              "terms": {
                "observed_property.value": [
                  "http://linked.data.gov.au/def/tern-cv/09296da0-c645-4165-950c-780c21b3c140"
                ]
              }
            }
          ]
        }
      },
      "aggs": {
        "value": {
          "terms": {
            "field": "site_id.value",
            "size": 200
          }
        }
      },
      "size": 0
    }
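The common pattern in these facet queries (the user's selection as `terms` filters plus a `terms` aggregation on the facet field) can be sketched in Python as follows. The function names are hypothetical, and the sketch deliberately covers only the simple “label/value” fields; nested fields such as regions need a `nested` clause and are not handled here.

```python
def terms_filters(selection: dict) -> list:
    """One terms clause per selected field, e.g. {"dataset": [uri, ...]}."""
    return [
        {"terms": {f"{field}.value": values}}
        for field, values in selection.items()
        if values
    ]


def facet_query(selection: dict, facet_field: str, size: int = 200) -> dict:
    """Aggregate the possible values of facet_field under the current selection."""
    body = {
        "aggs": {"value": {"terms": {"field": f"{facet_field}.value", "size": size}}},
        "size": 0,  # only the aggregation is needed, no hits
    }
    filters = terms_filters(selection)
    if filters:
        body["query"] = {"bool": {"filter": filters}}
    return body


query = facet_query(
    {"dataset": ["http://linked.data.gov.au/dataset/ausplots"]}, "site_id")
```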

Example of aggregating a nested field (region_type).
Note that regions.dataset in ES corresponds to region_type:

    {
      "query": {
        "bool": {
          "filter": [
            {
              "nested": {
                "path": "regions",
                "query": {
                  "terms": {
                    "regions.uri": [
                      "http://linked.data.gov.au/dataset/asgs2016/stateorterritory/3"
                    ]
                  }
                }
              }
            },
            {
              "terms": {
                "dataset.value": [
                  "http://linked.data.gov.au/dataset/ausplots"
                ]
              }
            },
            {
              "terms": {
                "feature_type.value": [
                  "http://linked.data.gov.au/def/tern-cv/60d7edf8-98c6-43e9-841c-e176c334d270"
                ]
              }
            },
            {
              "terms": {
                "observed_property.value": [
                  "http://linked.data.gov.au/def/tern-cv/09296da0-c645-4165-950c-780c21b3c140"
                ]
              }
            }
          ]
        }
      },
      "aggs": {
        "nested_agg": {
          "nested": {
            "path": "regions"
          },
          "aggs": {
            "value": {
              "terms": {
                "field": "regions.dataset.uri",
                "size": 1000
              }
            }
          }
        }
      },
      "size": 0
    }

Region types:

    {
      "aggs": {
        "nested_agg": {
          "nested": { "path": "regions" },
          "aggs": {
            "value": {
              "terms": { "field": "regions.dataset.uri", "size": 1000 }
            }
          }
        }
      },
      "size": 0
    }

Regions of one region type:

    {
      "aggs": {
        "nested_agg": {
          "nested": { "path": "regions" },
          "aggs": {
            "filtering": {
              "filter": {
                "term": {
                  "regions.dataset.uri": "http://linked.data.gov.au/dataset/asgs2016/stateorterritory"
                }
              },
              "aggs": {
                "value": {
                  "terms": { "field": "regions.uri", "size": 1000 }
                }
              }
            }
          }
        }
      },
      "size": 0
    }

Datasets:

    {
      "aggs": {
        "value": {
          "terms": { "field": "dataset.value", "size": 200 }
        }
      },
      "size": 0
    }

Sites of the selected dataset:

    {
      "query": {
        "bool": {
          "filter": [
            { "terms": { "dataset.value": ["http://linked.data.gov.au/dataset/ausplots"] } }
          ]
        }
      },
      "aggs": {
        "value": {
          "terms": { "field": "site_id.value", "size": 200 }
        }
      },
      "size": 0
    }

Site visits of the selected dataset and site:

    {
      "query": {
        "bool": {
          "filter": [
            { "terms": { "dataset.value": ["http://linked.data.gov.au/dataset/ausplots"] } },
            { "terms": { "site_id.value": ["http://linked.data.gov.au/dataset/ausplots/site-nsabbs0006"] } }
          ]
        }
      },
      "aggs": {
        "value": {
          "terms": { "field": "site_visit_id.value", "size": 200 }
        }
      },
      "size": 0
    }

Feature types:

    {
      "aggs": {
        "value": {
          "terms": { "field": "feature_type.value", "size": 200 }
        }
      },
      "size": 0
    }

FOI attributes of the selected feature type:

    {
      "query": {
        "bool": {
          "filter": [
            { "terms": { "feature_type.value": ["http://linked.data.gov.au/def/tern-cv/60d7edf8-98c6-43e9-841c-e176c334d270"] } }
          ]
        }
      },
      "aggs": {
        "nested_agg": {
          "nested": { "path": "foi_attributes" },
          "aggs": {
            "value": {
              "terms": { "field": "foi_attributes.attribute.value", "size": 1000 }
            }
          }
        }
      },
      "size": 0
    }

Observed properties:

    {
      "aggs": {
        "value": {
          "terms": { "field": "observed_property.value", "size": 200 }
        }
      },
      "size": 0
    }

Observation attributes of the selected observed property:

    {
      "query": {
        "bool": {
          "filter": [
            { "terms": { "observed_property.value": ["http://linked.data.gov.au/def/tern-cv/0bbd7fcd-0782-4efc-96e6-1f0f7669c655"] } }
          ]
        }
      },
      "aggs": {
        "nested_agg": {
          "nested": { "path": "obs_attributes" },
          "aggs": {
            "value": {
              "terms": { "field": "obs_attributes.attribute.value", "size": 1000 }
            }
          }
        }
      },
      "size": 0
    }

Minimum and maximum site visit dates:

    {
      "aggs": {
        "min_date": {
          "min": { "field": "site_visit_date.value" }
        },
        "max_date": {
          "max": { "field": "site_visit_date.value" }
        }
      },
      "size": 0
    }
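The nested aggregations in this section follow two shapes: a plain `terms` aggregation inside the nested path, or the same aggregation wrapped in a `filter`. A minimal Python sketch that produces both is given below; the helper name `nested_terms_agg` and its `only` parameter are assumptions for illustration.

```python
def nested_terms_agg(path: str, field: str, size: int = 1000, only=None) -> dict:
    """Terms aggregation inside a nested path.

    only: optional (field, value) pair; when given, only nested objects
    matching that term are aggregated.
    """
    terms = {"value": {"terms": {"field": field, "size": size}}}
    if only is not None:
        filter_field, filter_value = only
        inner = {
            "filtering": {
                "filter": {"term": {filter_field: filter_value}},
                "aggs": terms,
            }
        }
    else:
        inner = terms
    return {
        "aggs": {"nested_agg": {"nested": {"path": path}, "aggs": inner}},
        "size": 0,
    }


region_types = nested_terms_agg("regions", "regions.dataset.uri")
states = nested_terms_agg(
    "regions", "regions.uri",
    only=("regions.dataset.uri",
          "http://linked.data.gov.au/dataset/asgs2016/stateorterritory"),
)
```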

Labels aggregations

Notice that “labels” queries are slow, but they are executed only once and their results are then kept in browser storage, so the performance problem does not lie with them.


Region type labels:

    {
      "aggs": {
        "nested_agg": {
          "nested": { "path": "regions" },
          "aggs": {
            "value": {
              "terms": { "field": "regions.dataset.uri", "size": 1500 },
              "aggs": {
                "label": {
                  "terms": { "field": "regions.dataset.label.keyword", "size": 1500 }
                }
              }
            }
          }
        }
      },
      "size": 0
    }

Site labels, first page (composite aggregation):

    {
      "aggs": {
        "composite_agg": {
          "composite": {
            "sources": [
              { "value": { "terms": { "field": "site_id.value" } } },
              { "label": { "terms": { "field": "site_id.label.keyword" } } }
            ],
            "size": 5000
          }
        }
      },
      "size": 0
    }

Site labels, following page (using “after”):

    {
      "aggs": {
        "composite_agg": {
          "composite": {
            "sources": [
              { "value": { "terms": { "field": "site_id.value" } } },
              { "label": { "terms": { "field": "site_id.label.keyword" } } }
            ],
            "size": 5000,
            "after": {
              "value": "http://linked.data.gov.au/dataset/ausplots/site-wagcoo0004",
              "label": "WAGCOO0004"
            }
          }
        }
      },
      "size": 0
    }
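Paging through the composite aggregation can be sketched as a loop that feeds the `after_key` of each response back in as `after`. In the sketch below, `search` is a hypothetical callable that sends a request body to ES and returns the parsed JSON response; the function name `fetch_site_labels` is likewise an assumption.

```python
def fetch_site_labels(search, page_size: int = 5000) -> dict:
    """Collect a value -> label mapping for all sites, one page at a time."""
    labels = {}
    after = None
    while True:
        composite = {
            "sources": [
                {"value": {"terms": {"field": "site_id.value"}}},
                {"label": {"terms": {"field": "site_id.label.keyword"}}},
            ],
            "size": page_size,
        }
        if after is not None:
            composite["after"] = after  # resume after the last bucket seen
        response = search({"aggs": {"composite_agg": {"composite": composite}},
                           "size": 0})
        agg = response["aggregations"]["composite_agg"]
        for bucket in agg["buckets"]:
            labels[bucket["key"]["value"]] = bucket["key"]["label"]
        after = agg.get("after_key")
        # Stop when ES returns no more buckets or no key to continue from.
        if not agg["buckets"] or after is None:
            break
    return labels
```

The resulting dictionary is the kind of label-value mapping the UI could cache after the first load.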

Ecoplots API

The Ecoplots UI runs new searches and populates the facets through the Ecoplots API (https://ecoplots-test.tern.org.au/api/v1.0/ui):

The core endpoints used through the UI are:

  • /data: based on the user’s selection in the facets menu, the UI requests a page of observations from the API (documents in ES, displayed as rows in the datagrid).

    {
      "sorting": [],
      "query": {
        "dataset": [
          "http://linked.data.gov.au/dataset/ausplots"
        ]
      },
      "page_size": 50,
      "page_num": 1
    }

  • /facet: based on the user’s selection in the facets menu, the UI requests the lists of options that match the filtering, to populate all the facets (combo-boxes in the UI).

    {
      "query": {
        "dataset": [
          "http://linked.data.gov.au/dataset/ausplots"
        ]
      }
    }

Notice that all API request queries use the “value” of the selected option, not the displayed label.

Using the above request parameters, the API generates and executes 1 or many ES queries and returns the ES response, which is then processed in the UI.

  • The data endpoint is converted into only 1 ES query:

    [2021-09-06 14:57:38,213] DEBUG in data: {'from': 0, 'size': 50, 'track_total_hits': True}
    127.0.0.1 - - [06/Sep/2021 14:57:38] "POST /api/v1.0/data HTTP/1.1" 200

  • The facet endpoint is converted into many ES queries:

    [2021-09-06 14:57:38,295] DEBUG in facet: {'aggs': {'nested_agg': {'nested': {'path': 'regions'}, 'aggs': {'value': {'terms': {'field': 'regions.dataset.uri', 'size': 1000}}}}}, 'size': 0}
    [2021-09-06 14:57:38,346] DEBUG in facet: {'aggs': {'value': {'terms': {'field': 'dataset.value', 'size': 200}}}, 'size': 0}
    [2021-09-06 14:57:38,403] DEBUG in facet: {'aggs': {'value': {'terms': {'field': 'feature_type.value', 'size': 200}}}, 'size': 0}
    [2021-09-06 14:57:38,452] DEBUG in facet: {'aggs': {'value': {'terms': {'field': 'observed_property.value', 'size': 200}}}, 'size': 0}
    [2021-09-06 14:57:38,509] DEBUG in facet: {'aggs': {'min_date': {'min': {'field': 'site_visit_date.value'}}, 'max_date': {'max': {'field': 'site_visit_date.value'}}}, 'size': 0}
    127.0.0.1 - - [06/Sep/2021 14:57:38] "POST /api/v1.0/facet HTTP/1.1" 200 -
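The translation from a /data request to the single ES query can be sketched as below. This is an approximation inferred from the request format and the logged query, not the API's actual code: it assumes `page_num` is 1-based (page 1 with size 50 was logged as `from: 0`), ignores `sorting`, and treats every selected field as a simple “value” field. The function name `data_query` is hypothetical.

```python
def data_query(request: dict) -> dict:
    """Convert a /data request body into an ES search body."""
    page_size = request["page_size"]
    body = {
        # page_num is assumed to start at 1, so page 1 maps to offset 0.
        "from": (request["page_num"] - 1) * page_size,
        "size": page_size,
        "track_total_hits": True,
    }
    filters = [
        {"terms": {f"{field}.value": values}}
        for field, values in request.get("query", {}).items()
        if values
    ]
    if filters:
        body["query"] = {"bool": {"filter": filters}}
    return body


# With no selection, this reproduces the logged query shown above.
empty = data_query({"sorting": [], "query": {}, "page_size": 50, "page_num": 1})
```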

The number of internal queries to ES increases when some facets are already selected. E.g. once the user has selected a dataset, a new request is triggered, and this time an aggregation for getting the site_ids is also executed.
This behaviour can be easily seen through the UI.

    [2021-09-06 15:03:17,240] DEBUG in facet: {'query': {'bool': {'filter': [{'terms': {'dataset.value': ['http://linked.data.gov.au/dataset/ausplots']}}]}}, 'aggs': {'nested_agg': {'nested': {'path': 'regions'}, 'aggs': {'value': {'terms': {'field': 'regions.dataset.uri', 'size': 1000}}}}}, 'size': 0}
    [2021-09-06 15:03:17,296] DEBUG in facet: {'aggs': {'value': {'terms': {'field': 'dataset.value', 'size': 200}}}, 'size': 0}
    [2021-09-06 15:03:17,346] DEBUG in facet: {'query': {'bool': {'filter': [{'terms': {'dataset.value': ['http://linked.data.gov.au/dataset/ausplots']}}]}}, 'aggs': {'value': {'terms': {'field': 'site_id.value', 'size': 200}}}, 'size': 0}
    [2021-09-06 15:03:17,440] DEBUG in facet: {'query': {'bool': {'filter': [{'terms': {'dataset.value': ['http://linked.data.gov.au/dataset/ausplots']}}]}}, 'aggs': {'value': {'terms': {'field': 'feature_type.value', 'size': 200}}}, 'size': 0}
    [2021-09-06 15:03:17,489] DEBUG in facet: {'query': {'bool': {'filter': [{'terms': {'dataset.value': ['http://linked.data.gov.au/dataset/ausplots']}}]}}, 'aggs': {'value': {'terms': {'field': 'observed_property.value', 'size': 200}}}, 'size': 0}
    [2021-09-06 15:03:17,537] DEBUG in facet: {'query': {'bool': {'filter': [{'terms': {'dataset.value': ['http://linked.data.gov.au/dataset/ausplots']}}]}}, 'aggs': {'min_date': {'min': {'field': 'site_visit_date.value'}}, 'max_date': {'max': {'field': 'site_visit_date.value'}}}, 'size': 0}
    127.0.0.1 - - [06/Sep/2021 15:03:17] "POST /api/v1.0/facet HTTP/1.1" 200 -
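The way the set of facet aggregations grows with the selection can be summarised in a small sketch. The dependency rules below are inferred from the two logs above and from the facet descriptions earlier on this page; the actual API logic may differ, and the function name `facets_to_aggregate` is hypothetical.

```python
def facets_to_aggregate(selection: dict) -> list:
    """List the facets whose options need an ES aggregation for this selection."""
    # Always requested, even with an empty selection (5 queries in the first log).
    facets = ["region_type", "dataset", "feature_type",
              "observed_property", "site_visit_date"]
    # Dependent facets only appear once their parent has a selected value.
    if selection.get("region_type"):
        facets.append("regions")
    if selection.get("dataset"):
        facets.append("site_id")
    if selection.get("site_id"):
        facets.append("site_visit_id")
    if selection.get("feature_type"):
        facets.append("foi_attributes")
    if selection.get("observed_property"):
        facets.append("obs_attributes")
    return facets


with_dataset = facets_to_aggregate(
    {"dataset": ["http://linked.data.gov.au/dataset/ausplots"]})
```

With only a dataset selected this yields six facets, matching the six ES queries in the second log.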