
...

Uwe - because you have multiple indices, the queries run against them in parallel and the results are merged in the response
- Javi - We use the scroll API to retrieve around 20 million documents. Since the indices are smaller overall, that is slightly faster as well.
- Javi - for the UI we are using pagination
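
A minimal sketch of the scroll loop described above, not the project's actual code. It assumes a caller-supplied `post(path, body)` helper (hypothetical) that sends a JSON POST to the cluster and returns the parsed response; the index pattern, page size and keep-alive are placeholders as well.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: post(path, json_body) -> parsed JSON response.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def scroll_all(post: Post, index: str, query: Dict[str, Any],
               page_size: int = 1000, keep_alive: str = "1m") -> Iterator[Dict[str, Any]]:
    """Yield every hit matching `query`, one scroll page at a time."""
    # The first request opens the scroll context and returns the first page.
    page = post(f"/{index}/_search?scroll={keep_alive}",
                {"size": page_size, "query": query})
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Follow-up requests only need the scroll id from the previous page.
        page = post("/_search/scroll",
                    {"scroll": keep_alive, "scroll_id": page["_scroll_id"]})
```

Because the search runs over an index pattern, the cluster fans the request out to every matching index and merges the pages itself. A real client would also clear the scroll context once the loop ends.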
Uwe - is date range working too?
- Javi - yes, and the queries now are simpler too.
Javi - We no longer need to use nested aggregations either
- Uwe - Yes, just terms aggregations and nothing more. That's good. Really a great success in my opinion.
- Javi - Yes I am quite happy with the improvement
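
For illustration, a request body combining a date range filter with a single terms aggregation could look like the sketch below. The field names and bucket count are assumptions; the notes do not record the real mapping.

```python
from typing import Any, Dict


def terms_in_date_range(date_field: str, term_field: str,
                        start: str, end: str, buckets: int = 50) -> Dict[str, Any]:
    """Build a search body: filter by date range, then count documents per term."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"range": {date_field: {"gte": start, "lte": end}}},
        "aggs": {"by_term": {"terms": {"field": term_field, "size": buckets}}},
    }


# Hypothetical field names, for illustration only.
body = terms_in_date_range("observed_at", "site_id", "2020-01-01", "2020-12-31")
```

The aggregation sits directly under `aggs` with no `nested` wrapper around it, which is what makes the query simpler than before.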
Javi - Gerhard made a few changes to the cluster, so maybe he wants to update Uwe.
- Gerhard - I changed the heap settings to 25% of the maximum RAM and added a few more CPU cores and a bit more RAM.
- Uwe - Last time we said I would get SSH access to the machines. Is that still required? For now it seems fine.
- Gerhard - we are using the ReadonlyREST plugin for authorization
  - Uwe - be careful and disable scripting if you're not using the script API
    - I'm not sure whether the read-only restriction covers this, but if you're not using scripting, disabling it is safer.
  - Gerhard - it's not even read-only from the outside, and it will eventually be completely blocked from the outside
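
As a small worked example of the heap change, the sketch below derives the two JVM heap options from a machine's RAM. The 64 GB figure is an assumption (the notes do not say how much RAM the nodes have), and the function name is hypothetical.

```python
from typing import List


def heap_jvm_options(total_ram_gb: int, fraction: float = 0.25) -> List[str]:
    """Return -Xms/-Xmx lines giving the JVM `fraction` of the machine's RAM."""
    heap_gb = max(1, int(total_ram_gb * fraction))  # never go below 1 GB
    # Minimum and maximum heap are set to the same value so the heap is not resized.
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


# Assumed 64 GB node, for illustration only.
print(heap_jvm_options(64))
```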
Uwe - have you thought about using OpenSearch?
- Gerhard - we thought about it but haven't tried it
- Uwe - I think most of it is the same. The Elasticsearch parts are identical, but authorization handling is different.
Uwe - you're using Postman and not Kibana?
- Javi - I use postman in my day-to-day work and it can save queries
Uwe - How many code changes were needed in the UI and API?
- Javi - Project is split into 3
  - API
  - UI
  - Indexer
- Javi - Most of the changes were happening in the indexer
- Javi - The API project was just changing the queries.
- Javi - the UI now looks up the labels
- Uwe - Oh, I thought the API would look for the labels.
Uwe - Will the API be used by external users?
- Javi - Yes
- Uwe - will they also have to use URIs?
- Javi - For now yes
- Gerhard - we can also have R or Python wrappers around the APIs to improve the UX.
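
A rough sketch of what such a wrapper might do: let users pass a human-readable label and translate it to the URI the API expects. Every name here is hypothetical (the endpoint path, the `parameter` query key, and the label table); none of it comes from the actual API.

```python
from typing import Dict
from urllib.parse import urlencode

# Hypothetical label -> URI table; real values would come from the vocabulary.
LABEL_TO_URI: Dict[str, str] = {
    "air temperature": "https://example.org/vocab/air-temperature",
}


def build_search_url(base_url: str, label: str,
                     labels: Dict[str, str] = LABEL_TO_URI) -> str:
    """Translate a label to its URI and build the (hypothetical) search URL."""
    uri = labels[label.lower()]  # raises KeyError for unknown labels
    return f"{base_url}/search?{urlencode({'parameter': uri})}"
```

Keeping the lookup in the wrapper means external users never have to type URIs, while the API itself stays URI-based.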
Javi - We have some issues with the map and were wondering if you can help us, Uwe.
- Javi - We are doing geo aggregations.
- Uwe - So are the coordinates saved here in the documents?
- Javi - (shows Kibana)
- Javi - we are using geo_point
- Uwe - unfortunately I don't have much experience with the geo aggregations
- Javi - shows the clustering of sites and how each zoom in or out recalculates the aggregations, which return clusters of sites.
- Javi - we will have tens of thousands of sites in the coming datasets
  - Javi - we like square data grids like the ones DataOne uses (shows the DataOne portal)
  - Javi - Looking at the geotile API in Elasticsearch
  - Uwe - unfortunately I have no idea and haven't tried this before
  - Javi - currently we are using the geohash API but I want to use the geotile API.
- Uwe - At PANGAEA we assign region names, but we have nothing like this map with rectangular clustering
- Uwe - I prefer full-text search instead of arbitrary maps broken into clusters of things
- Uwe - I prefer the current clustering that you have instead of the rectangular one.
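
To make the geohash versus geotile comparison concrete, here is a sketch of a request body for either grid aggregation, restricted to the visible map area. The field name `location` and the bounding box are assumptions. `geohash_grid` takes a precision from 1 to 12, while `geotile_grid` takes a zoom level from 0 to 29 and returns bucket keys of the form `zoom/x/y`.

```python
from typing import Any, Dict, Tuple

LatLon = Tuple[float, float]


def grid_aggregation(kind: str, field: str, precision: int,
                     top_left: LatLon, bottom_right: LatLon) -> Dict[str, Any]:
    """Build a search body that buckets geo_point values into grid cells."""
    if kind not in ("geohash_grid", "geotile_grid"):
        raise ValueError(f"unsupported grid aggregation: {kind}")
    return {
        "size": 0,
        # Only aggregate the sites inside the current map viewport.
        "query": {"geo_bounding_box": {field: {
            "top_left": {"lat": top_left[0], "lon": top_left[1]},
            "bottom_right": {"lat": bottom_right[0], "lon": bottom_right[1]},
        }}},
        "aggs": {"cells": {kind: {"field": field, "precision": precision}}},
    }


# Hypothetical field name and viewport, for illustration only.
current = grid_aggregation("geohash_grid", "location", 4, (55.0, 5.0), (47.0, 15.0))
square = grid_aggregation("geotile_grid", "location", 6, (55.0, 5.0), (47.0, 15.0))
```

Switching between the two is a one-word change in the request, so both could be tried against the same map before committing to one.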
Uwe - probably worthwhile to think about utilising full-text search in combination with the current search options.
Guru - Javi is currently doing aggregations in real time, which causes some slowness. Is there a better way to do this, such as calculating them in advance at indexing time?
- Uwe - A bit strange why it's slow for around 800 sites.
- Javi - Every time you zoom in or out, it performs a new aggregation. So it's not really slow, just inefficient because it's performing a lot of aggregations.
- Uwe - At PANGAEA we add full-text search and everything else at indexing time, and we store the data as a grid and index it.
- Gerhard - it's not really slow currently, but the zoom in and out animation is blocking the UI.
- Uwe - I don't think there's much to do right now, as the aggregations are fast. You can precalculate them in the future.
- Javi - I read many people are moving to Elasticsearch for geo instead of using something like PostGIS
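
If precalculation becomes necessary later, one possible approach is to compute each site's grid cell when it is indexed and count sites per cell up front. The sketch below uses the standard Web Mercator tile formula to produce `zoom/x/y` keys; the function names are hypothetical, and whether the keys match Elasticsearch's own cells exactly at tile edges has not been checked.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def geotile_key(lat: float, lon: float, zoom: int) -> str:
    """Return the Web Mercator tile containing (lat, lon) as 'zoom/x/y'."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge or near the poles stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return f"{zoom}/{x}/{y}"


def cell_counts(sites: Iterable[Tuple[float, float]], zoom: int) -> Counter:
    """Count sites per tile; the result could be stored at indexing time."""
    return Counter(geotile_key(lat, lon, zoom) for lat, lon in sites)
```

With counts stored per zoom level, the map could read them directly on zoom instead of running a fresh aggregation each time.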
Uwe - https://wiki.pangaea.de/wiki/Topic

Version	Old Version 3	New Version Current
Changes made by	Edmond Chuc (Unlicensed)	Edmond Chuc (Unlicensed)
Saved on	16 Nov 2021	16 Nov 2021

Action items

Decisions