...
Uwe - Because you have multiple indices, the queries run in parallel and the results are merged in the response.
Javi - We are using the scroll API to fetch around 20 million documents; since the indices are smaller overall, there is a slight improvement there as well.
Javi - For the UI we are using pagination.
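The scroll-based export described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `post` is a hypothetical transport function (path and JSON body in, parsed JSON out), and the page size and keep-alive values are assumptions.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical transport: post(path, body) -> parsed JSON response.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def scroll_all(post: Post, index: str, page_size: int = 1000,
               keep_alive: str = "2m") -> Iterator[Dict[str, Any]]:
    """Yield every hit from `index` by following scroll pages until one is empty."""
    # The first request opens the scroll context. Sorting by _doc is the
    # cheapest order when relevance ranking is not needed.
    resp = post(f"/{index}/_search?scroll={keep_alive}",
                {"size": page_size, "sort": ["_doc"],
                 "query": {"match_all": {}}})
    while True:
        hits: List[Dict[str, Any]] = resp["hits"]["hits"]
        if not hits:
            break
        yield from hits
        # Follow-up requests only send the scroll id from the previous page.
        resp = post("/_search/scroll",
                    {"scroll": keep_alive, "scroll_id": resp["_scroll_id"]})
```

A real exporter would probably also clear the scroll context when it finishes, and the UI would keep using ordinary pagination rather than this loop.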
Uwe - Is the date range working too?
Javi - Yes, and the queries are now simpler too.
Javi - We no longer need nested aggregations either.
Uwe - Yes, just terms aggregations and nothing more. That's good. Really a great success in my opinion.
Javi - Yes, I am quite happy with the improvement.
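As an illustration of the simpler query shape discussed above (a date range filter plus a plain terms aggregation, with no nested aggregation), a request body could be built along these lines. The field names in the usage line are hypothetical; the notes do not say which fields the project aggregates on.

```python
from typing import Any, Dict


def terms_with_date_range(field: str, date_field: str,
                          start: str, end: str,
                          size: int = 10) -> Dict[str, Any]:
    """Build a search body: one date range filter plus one terms aggregation."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"range": {date_field: {"gte": start, "lte": end}}}
                ]
            }
        },
        "aggs": {
            f"by_{field}": {"terms": {"field": field, "size": size}}
        },
    }


# Hypothetical field names, for illustration only.
body = terms_with_date_range("variable", "observed_at",
                             "2020-01-01", "2020-12-31")
```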
Javi - Gerhard made a few changes to the cluster, so maybe he wants to update Uwe.
Gerhard - I changed the heap settings to 25% of the maximum RAM and added a few more CPU cores and a bit more RAM.
Uwe - Last time we said I would get access to the machines via SSH; is that still required? For now, it seems fine.
Gerhard - We are using the ReadonlyREST extension for authorization.
Uwe - Be careful and disable scripting if you're not using the script API. I'm not really sure whether the read-only setup affects this, but if you're not using scripting, disable it to be safer.
Gerhard - it's not even read-only from the outside, and it will eventually be completely blocked from the outside
Uwe - Have you thought about using OpenSearch?
Gerhard - We thought about it but haven't tried it.
Uwe - I think most of it is the same; the Elasticsearch functionality is identical, but authorization handling is different.
Uwe - You're using Postman and not Kibana?
Javi - I use Postman in my day-to-day work, and it can save queries.
Uwe - How many code changes were needed in the UI and API?
Javi - The project is split into 3 parts: API, UI, and Indexer.
Javi - Most of the changes were in the indexer.
Javi - In the API project, only the queries changed.
Javi - The UI now looks up the labels.
Uwe - Oh, I thought the API would look for the labels.
Uwe - Will the API be used by external users?
Javi - Yes
Uwe - will they also have to use URIs?
Javi - For now yes
Gerhard - we can also have R or Python wrappers around the APIs to improve the UX.
Javi - We have some issues with the map and were wondering if you could help us, Uwe.
Javi - We are doing geo aggregations.
Uwe - So are the coordinates saved here in the documents?
Javi - (shows Kibana)
Javi - We are using the geo_point field type.
Uwe - unfortunately I don't have much experience with the geo aggregations
Javi - (shows the clustering of sites and how zooming in and out recalculates the aggregations, so that the response is a set of site clusters)
Javi - We will have tens of thousands of sites in the coming datasets.
Javi - We like square data grids like the ones DataOne uses. (shows the DataOne portal)
Javi - I am looking at the geotile API in Elasticsearch.
Uwe - unfortunately I have no idea and haven't tried this before
Javi - Currently we are using the geohash API, but I want to use the geotile API.
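To make the geohash versus geotile comparison concrete, here is a hedged sketch of how the two request bodies might differ. It assumes the coordinates live in a `geo_point` field (named `location` in the usage lines, which is a hypothetical name) and that the visible map area is sent as a `geo_bounding_box` filter; neither detail is confirmed in the notes.

```python
from typing import Any, Dict

# Documented precision ranges: geohash_grid 1-12, geotile_grid 0-29 (zoom level).
_PRECISION = {"geohash_grid": (1, 12), "geotile_grid": (0, 29)}


def grid_search_body(kind: str, field: str, precision: int,
                     top_left: Dict[str, float],
                     bottom_right: Dict[str, float]) -> Dict[str, Any]:
    """Build a search body that buckets points in the viewport into grid cells."""
    if kind not in _PRECISION:
        raise ValueError(f"unsupported grid aggregation: {kind}")
    low, high = _PRECISION[kind]
    if not low <= precision <= high:
        raise ValueError(f"{kind} precision must be in [{low}, {high}]")
    return {
        "size": 0,  # only the grid buckets are needed
        "query": {
            "bool": {
                "filter": [
                    {"geo_bounding_box": {
                        field: {"top_left": top_left,
                                "bottom_right": bottom_right}
                    }}
                ]
            }
        },
        "aggs": {"grid": {kind: {"field": field, "precision": precision}}},
    }


# Hypothetical viewport and field name, for illustration only.
viewport = ({"lat": 55.0, "lon": 5.0}, {"lat": 47.0, "lon": 15.0})
geohash_body = grid_search_body("geohash_grid", "location", 4, *viewport)
geotile_body = grid_search_body("geotile_grid", "location", 6, *viewport)
```

Only the aggregation name and the meaning of `precision` change between the two. The geotile buckets are keyed as `zoom/x/y` and follow the map tile grid, which is what makes them a candidate for the square grid display.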
Uwe - At PANGAEA we assign region names, but we have nothing like this map with rectangular clustering.
Uwe - I prefer full-text search instead of arbitrary maps broken into clusters of things
Uwe - I prefer the current clustering that you have instead of the rectangular one.
Uwe - probably worthwhile to think about utilising full-text search in combination with the current search options.
Guru - Javi is currently doing aggregations in real time, which causes some slowness. Is there a better way to do this, such as indexing it in advance?
Uwe - It's a bit strange that it's slow for around 800 sites.
Javi - Every time you zoom in or out, it performs a new aggregation. So it's not really slow, just inefficient because it's performing a lot of aggregations.
Uwe - At PANGAEA we add the full-text search data and everything else at indexing time; we compute it as a grid and index it.
Gerhard - It's not really slow currently; the zoom in/out animation is what blocks the UI.
Uwe - I don't think there's much to do right now, as the aggregations are fast. You can precalculate them in the future.
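One possible shape for the precalculation mentioned above is to compute a grid cell key per zoom level when a site is indexed, so that the map only needs a plain terms aggregation on a keyword field. The sketch below uses the standard Web Mercator tile formula; the `tile_z<zoom>` field names and the zoom range are assumptions, and the keys have not been checked against Elasticsearch's own geotile output.

```python
import math
from typing import Dict, Iterable

MAX_LAT = 85.05112878  # Web Mercator is undefined at the poles


def tile_key(lat: float, lon: float, zoom: int) -> str:
    """Return the "zoom/x/y" Web Mercator tile that contains (lat, lon)."""
    n = 2 ** zoom
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Keep edge cases (lon == 180, clamped latitudes) inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return f"{zoom}/{x}/{y}"


def precompute_tiles(lat: float, lon: float,
                     zooms: Iterable[int] = range(0, 9)) -> Dict[str, str]:
    """Build hypothetical per-zoom keyword fields to store with a site document."""
    return {f"tile_z{z}": tile_key(lat, lon, z) for z in zooms}
```

With around 800 sites this is unlikely to matter yet, which matches the conclusion above; it may become more relevant with tens of thousands of sites.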
Javi - I read that many people are moving to Elasticsearch for geo instead of using something like PostGIS.