I'm trying to implement a flexible searcher for my frontend application.
Document structure
The structure of my documents looks like this:
{
"student": {
"id": 1,
"name": "Joe"
},
"city": {
"id": 102,
"name": "London"
},
"tags": [
{ "id": 33, "name": "football" },
{ "id": 34, "name": "basketball" },
{ "id": 35, "name": "music" },
...
],
"skills": [
{ "id": 302, "name": "Active listening" },
{ "id": 23, "name": "Collaboration" },
{ "id": 34, "name": "Communication" },
...
]
}
I upload multiple documents with the same structure but with different data. Various students might have different tags but also certain tag can be assigned to multiple users. The same relation applies to skills.
What I want to achieve?
Let's imagine that I have three <Selects />. The first one is used to select city, the second one to select tags and the third one to select skills. When I choose London from the first select the I want to get the list of all tags for students live in London. I would like to have a possibility to filter out tags (tags names) based on the text I enter to <Select />. What's more - that list should be paginated.
Mapping I currently use
{
"mappings": {
"properties": {
"student": {
"type": "nested",
"properties": {
"id": { "type": "integer" },
"name": {
"type": "keyword"
}
}
},
"city": {
"type": "nested",
"properties": {
"id": { "type": "integer" },
"name": {
"type": "keyword"
}
}
},
"tags": {
"type": "nested",
"properties": {
"id": { "type": "integer" },
"name": { "type": "keyword" }
},
"skills": {
"type": "nested",
"properties": {
"id": { "type": "integer" },
"name": { "type": "keyword" }
}
My attempts
This is my first attempt to solve my case. It allows pagination but it doesn't support filtering out results based on text provided in <Select />
{
"aggs": {
"aggregatorField": {
"nested": {
"path": "tags"
},
"aggs": {
"aggregator": {
"composite": {
"size": 11,
"sources": [
{
"aggregator": {
"terms": {
"field": "tags.id"
}
}
}
]
},
"aggs": {
"item": {
"top_hits": {
"size": 1,
"sort": [
{
"tags.id": {
"order": "desc"
}
}
],
"_source": {
"include": [
"tags"
]
}
}
}
}
}
}
}
},
"size": 0
}
To solve add filtration I added normalizer to my mapping and applied it to tags and skills
"settings": {
"analysis": {
"normalizer": {
"lowercase_normalizer": {
"type": "custom",
"char_filter": [],
"filter": ["lowercase", "asciifolding"]
}
}
}
},
Now my search body looks like this:
{
"aggs": {
"aggregatorField": {
"nested": {
"path": "tags"
},
"aggs": {
"aggregator": {
"terms": {
"field": "tags.name",
"include": ".*ball.*",
"size": 10000
},
"aggs": {
"item": {
"top_hits": {
"size": 1,
"sort": [
{
"tags.id": {
"order": "desc"
}
}
],
"_source": {
"include": [
"tags"
]
}
}
},
"bucks": {
"bucket_sort": {
"from": 0,
"size": 11
}
}
}
}
}
}
},
"size": 0
}
It supports searching by text but doesn't support pagination. Additionally, it doesn't look like an elegant query to me so I assume that maybe there is a better way to handle that case. Maybe there is a way to extract all tags and skills to a separate index and make something similar to sql join?
Thanks for the help!
Your Way
I shortened your mapping
created some documents with my tags
and shortened your aggregation query
This query paginates tags
"from": 0,"size": 2"from": 2,"size": 2but sort tags only by name (or document count)
Separate Index
I don't know a way to extract all tags to a separate index only inside Elasticsearch. This way found a new problem - synchronization