Following some of the recommendations presented on Monday 25th, here are some stats to help decide whether the changes improve ES performance.

https://docs.google.com/document/d/11rdMT-ZoFpOmJY4ND1ujb5kwyiSZCMro7_47QbkiY9U

Change the mapping for the “regions” field: drop the “nested” type, use a plain “keyword” field instead, and filter with a regex in aggregations.

Changes implemented

New mapping for regions:

"regions": {"type": "keyword"},
"region_types": {"type": "keyword"},

Data before:

"regions": [
      {
        "uri": "http://linked.data.gov.au/dataset/asgs2016/stateorterritory/5",
        "label": "Western Australia",
        "dataset": {
          "uri": "http://linked.data.gov.au/dataset/asgs2016/stateorterritory",
          "label": "States and territories"
        }
      },
      {
        "uri": "http://linked.data.gov.au/dataset/wwf-terr-ecoregions/14110",
        "label": "Southwest Australia savanna",
        "dataset": {
          "uri": "http://linked.data.gov.au/dataset/wwf-terr-ecoregions",
          "label": "WWF ecoregions"
        }
      },
      {
        "uri": "http://linked.data.gov.au/dataset/local-gov-areas-2011/56790",
        "label": "Northampton (S)",
        "dataset": {
          "uri": "http://linked.data.gov.au/dataset/local-gov-areas-2011",
          "label": "Local government areas"
        }
      },
      {
        "uri": "http://linked.data.gov.au/dataset/nrm-2017/5010",
        "label": "Northern Agricultural Region",
        "dataset": {
          "uri": "http://linked.data.gov.au/dataset/nrm-2017",
          "label": "NRM regions"
        }
      },
      {
        "uri": "http://linked.data.gov.au/dataset/capad-2018-terrestrial/BHA_26",
        "label": "Eurardy",
        "dataset": {
          "uri": "http://linked.data.gov.au/dataset/capad-2018-terrestrial",
          "label": "Terrestrial CAPAD regions"
        }
      },
      {
        "uri": "http://linked.data.gov.au/dataset/bioregion/GES01",
        "label": "Geraldton Hills",
        "dataset": {
          "uri": "http://linked.data.gov.au/dataset/bioregion",
          "label": "Subregions"
        }
      },
      {
        "uri": "http://linked.data.gov.au/dataset/bioregion/GES",
        "label": "Geraldton Sandplains",
        "dataset": {
          "uri": "http://linked.data.gov.au/dataset/bioregion/IBRA7",
          "label": "Bioregions"
        }
      }
],

Data after:

"regions": [
      "http://linked.data.gov.au/dataset/asgs2016/stateorterritory|http://linked.data.gov.au/dataset/asgs2016/stateorterritory/5",
      "http://linked.data.gov.au/dataset/wwf-terr-ecoregions|http://linked.data.gov.au/dataset/wwf-terr-ecoregions/14110",
      "http://linked.data.gov.au/dataset/local-gov-areas-2011|http://linked.data.gov.au/dataset/local-gov-areas-2011/56790",
      "http://linked.data.gov.au/dataset/nrm-2017|http://linked.data.gov.au/dataset/nrm-2017/5010",
      "http://linked.data.gov.au/dataset/capad-2018-terrestrial|http://linked.data.gov.au/dataset/capad-2018-terrestrial/BHA_26",
      "http://linked.data.gov.au/dataset/bioregion|http://linked.data.gov.au/dataset/bioregion/GES01",
      "http://linked.data.gov.au/dataset/bioregion/IBRA7|http://linked.data.gov.au/dataset/bioregion/GES"
],
"region_types": [
      "http://linked.data.gov.au/dataset/asgs2016/stateorterritory",
      "http://linked.data.gov.au/dataset/wwf-terr-ecoregions",
      "http://linked.data.gov.au/dataset/local-gov-areas-2011",
      "http://linked.data.gov.au/dataset/nrm-2017",
      "http://linked.data.gov.au/dataset/capad-2018-terrestrial",
      "http://linked.data.gov.au/dataset/bioregion",
      "http://linked.data.gov.au/dataset/bioregion/IBRA7"
]
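
The transformation from the nested objects to the flat values can be sketched as below. This is a minimal illustration, not the actual indexing code: the function name `flatten_regions` is hypothetical, and it assumes each keyword value is the dataset URI and the region URI joined with `|`, as in the sample data above. Skipping repeated dataset URIs in `region_types` is also an assumption.

```python
from typing import Any, Dict, List


def flatten_regions(regions: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Turn nested region objects into flat keyword values (hypothetical helper)."""
    flat: List[str] = []
    types: List[str] = []
    for region in regions:
        dataset_uri = region["dataset"]["uri"]
        # Assumed encoding: "<dataset uri>|<region uri>"
        flat.append(f"{dataset_uri}|{region['uri']}")
        # Assumption: each dataset URI is listed once, in first-seen order
        if dataset_uri not in types:
            types.append(dataset_uri)
    return {"regions": flat, "region_types": types}


sample = [
    {
        "uri": "http://linked.data.gov.au/dataset/bioregion/GES01",
        "label": "Geraldton Hills",
        "dataset": {
            "uri": "http://linked.data.gov.au/dataset/bioregion",
            "label": "Subregions",
        },
    },
    {
        "uri": "http://linked.data.gov.au/dataset/bioregion/GES",
        "label": "Geraldton Sandplains",
        "dataset": {
            "uri": "http://linked.data.gov.au/dataset/bioregion/IBRA7",
            "label": "Bioregions",
        },
    },
]

flat = flatten_regions(sample)
```

Note that the labels are dropped in this encoding, so they would have to be resolved from the URIs elsewhere.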

Index size stats

Approach    | No. of docs | No. of hidden docs | Docs increase
Nested docs | 2,563,630   | 27,284,158         | x10.64278
Keyword     | 2,563,630   | 10,292,406         | x4.014778

The new index is ~2.65 times smaller in terms of number of documents (10,292,406 vs 27,284,158).
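
The ratios above can be rechecked from the raw counts with a few lines of arithmetic. The variable names are illustrative only; the numbers are the ones reported in the stats.

```python
# Counts taken from the index size stats above
docs = 2_563_630                   # "No docs" column (same for both approaches)
nested_hidden_docs = 27_284_158    # "No hidden docs" column, nested mapping
keyword_hidden_docs = 10_292_406   # "No hidden docs" column, keyword mapping

nested_increase = nested_hidden_docs / docs
keyword_increase = keyword_hidden_docs / docs
# Ratio between the two approaches (nested count over keyword count)
nested_vs_keyword = nested_hidden_docs / keyword_hidden_docs

print(f"nested:  x{nested_increase:.5f}")
print(f"keyword: x{keyword_increase:.6f}")
print(f"nested / keyword: {nested_vs_keyword:.2f}")
```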

ES Queries

Old queries for regions aggregation:

{
    "aggs": {
        "nested_agg": {
            "nested": {
                "path": "regions"
            },
            "aggs": {
                "value": {
                    "terms": {
                        "field": "regions.dataset.uri",
                        "size": 1000
                    }
                }
            }
        }
    },
    "size": 0
}

{
    "aggs": {
        "nested_agg": {
            "nested": {
                "path": "regions"
            },
            "aggs": {
                "filtering": {
                    "filter": {
                        "term": {
                            "regions.dataset.uri": "http://linked.data.gov.au/dataset/asgs2016/stateorterritory"
                        }
                    },
                    "aggs": {
                        "value": {
                            "terms": {
                                "field": "regions.uri",
                                "size": 1000
                            }
                        }
                    }
                }
            }
        }
    },
    "size": 0
}

New queries:

{
    "aggs": {
        "regions": {
            "terms": {
                "field": "region_types"
            }
        }
    },
    "size": 0,
    "track_total_hits": true
}

{
    "aggs": {
        "regions": {
            "terms": {
                "field": "regions",
                "include": "http://linked.data.gov.au/dataset/bioregion\\|.*"
            }
        }
    },
    "size": 0,
    "track_total_hits": true
}
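
The second query can be generated for any dataset URI with a small helper, sketched below. Everything here is hypothetical (the names `escape_regexp`, `regions_agg_body` and `split_bucket_key` are not from the project), and the set of reserved characters is an assumption based on the Elasticsearch regexp syntax. Unlike the query above, the helper also escapes the dots in the URI: unescaped dots still match because `.` matches any character, but escaping makes the pattern stricter. The `size` parameter is optional; when omitted, a terms aggregation returns its default of 10 buckets.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed set of characters reserved in Elasticsearch regexp syntax
# (including the optional operators); each is escaped with a backslash.
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_regexp(text: str) -> str:
    """Backslash-escape reserved regexp characters in a literal string."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


def regions_agg_body(dataset_uri: str, size: Optional[int] = None) -> Dict[str, Any]:
    """Build a terms aggregation limited to regions of one dataset."""
    # Match values of the form "<dataset uri>|<anything>"
    pattern = escape_regexp(dataset_uri + "|") + ".*"
    terms: Dict[str, Any] = {"field": "regions", "include": pattern}
    if size is not None:
        terms["size"] = size
    return {
        "aggs": {"regions": {"terms": terms}},
        "size": 0,
        "track_total_hits": True,
    }


def split_bucket_key(key: str) -> Tuple[str, str]:
    """Split a bucket key back into (dataset URI, region URI)."""
    dataset_uri, region_uri = key.split("|", 1)
    return dataset_uri, region_uri


body = regions_agg_body("http://linked.data.gov.au/dataset/bioregion")
pattern = body["aggs"]["regions"]["terms"]["include"]
```

The trailing `|` in the pattern is what keeps `http://linked.data.gov.au/dataset/bioregion/IBRA7` values out of the `bioregion` buckets, since both dataset URIs share the same prefix.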

Request time stats
