I am using Spring Data Elasticsearch to query my Elasticsearch documents. My Elasticsearch entity class:
// all the annotation things i.e lombok, de/serializer etc
@Document(indexName = "project", type = "project")
@EqualsAndHashCode
public class ProjectEntity extends CommonProperty implements Serializable {
    @Id
    private String id;
    private String projectName;
    private String description;
    private String parentProjectId;
    private Long projectOwner;
    private String projectOwnerName;
    private Long projectManager;
    private String projectManagerName;
    private String departmentId;
    private String status;
    private String organizationId;
    @Field(type = FieldType.Nested)
    private List<ActionStatusEntity> actionStatusList = new ArrayList<>();
    @Field(type = FieldType.Nested)
    private List<TeamMember> teamMemberList;
    @Field(type = FieldType.Nested)
    private List<UserDefineProperty> riskList;
}
I have also set up the other pieces, such as the repositories; they are omitted here for brevity. Data search:
projectRepository.findByOrganizationIdAndProjectName(userEntity.getOrganizationId().toString(), projectRequest.getProjectName().trim());
// userEntity.getOrganizationId().toString() = "28", projectName = "Team Test"
Query generated by Spring for the above call:
{
  "from": 0,
  "size": 10000,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "28",
            "fields": [
              "organizationId^1.0"
            ],
            "type": "best_fields",
            "default_operator": "and",
            "max_determinized_states": 10000,
            "enable_position_increments": true,
            "fuzziness": "AUTO",
            "fuzzy_prefix_length": 0,
            "fuzzy_max_expansions": 50,
            "phrase_slop": 0,
            "escape": false,
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1
          }
        },
        {
          "query_string": {
            "query": "Team Test",
            "fields": [
              "projectName^1.0"
            ],
            "type": "best_fields",
            "default_operator": "and",
            "max_determinized_states": 10000,
            "enable_position_increments": true,
            "fuzziness": "AUTO",
            "fuzzy_prefix_length": 0,
            "fuzzy_max_expansions": 50,
            "phrase_slop": 0,
            "escape": false,
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "version": true
}
Query Result:
{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 4.1767306,
    "hits" : [
      {
        "_index" : "project",
        "_type" : "project",
        "_id" : "215",
        "_version" : 2,
        "_score" : 4.1767306,
        "_source" : {
          "projectName" : "team member only test",
          "description" : "team member only test",
          "projectOwner" : 50,
          "projectOwnerName" : "***",
          "departmentId" : "team member only test",
          "organizationId" : "28"
        }
      },
      {
        "_index" : "project",
        "_type" : "project",
        "_id" : "408",
        "_version" : 17,
        "_score" : 4.1767306,
        "_source" : {
          "projectName" : "Category & Team adding test",
          "description" : "Category & Team adding test",
          "projectOwner" : 50,
          "projectOwnerName" : "***",
          "projectManager" : 50,
          "projectManagerName" : "***",
          "departmentId" : "cat",
          "organizationId" : "28"
        }
      },
      {
        "_index" : "project",
        "_type" : "project",
        "_id" : "452",
        "_version" : 4,
        "_score" : 3.4388955,
        "_source" : {
          "projectName" : "team member not in system test",
          "description" : "id-452",
          "projectOwner" : 53,
          "projectOwnerName" : "***",
          "projectManager" : 202,
          "projectManagerName" : "***",
          "departmentId" : "abc",
          "organizationId" : "28"
        }
      }
    ]
  }
}
Look at the result set: the projectName field value was matched as if by a contains method! It was not compared against the full given parameter. Why is this happening, and how can I solve it?
Additional note: the organizationId and projectName fields were set with fieldData=true.
The query that Spring Data Elasticsearch derives from the method name is an Elasticsearch query_string query with the given arguments, as you noticed. For such a query, Elasticsearch analyzes and parses the input into terms and then searches for documents that contain all of these terms (the generated query sets default_operator to and); the terms do not have to be adjacent.
Your query with "Team Test" has two terms, "team" and "test", and all the documents you show have these terms in the project name, so they are returned.
If you had a document with "Team Test" and no other terms between these two, this would be returned with a higher score.
This implementation was chosen because it is what is normally expected when searching in Elasticsearch. Imagine having an index with names: with exact matching, a search for "Harry Miller" would not find a document with "Harry B. Miller".
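To make the difference concrete, here is a small, self-contained Java sketch. It is only a simplified model under stated assumptions: the analyze method is a hypothetical stand-in for the standard analyzer (lowercasing and splitting on non-alphanumeric characters), scoring is ignored, and none of the class or method names come from Elasticsearch or Spring Data. It compares "all terms present" matching, phrase matching, and exact keyword matching against the project names from the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Simplified model of the matching behaviour; NOT Elasticsearch's real analyzer or scorer.
final class TermMatchSketch {

    // Rough stand-in for the standard analyzer: lowercase, then split on non-alphanumerics.
    static List<String> analyze(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Models query_string with default_operator "and": every query term must occur in the field.
    static boolean matchesAllTerms(String fieldValue, String query) {
        return analyze(fieldValue).containsAll(analyze(query));
    }

    // Models match_phrase with slop 0: the query terms must be adjacent and in order.
    static boolean matchesPhrase(String fieldValue, String query) {
        return Collections.indexOfSubList(analyze(fieldValue), analyze(query)) >= 0;
    }

    // Models an exact lookup on a keyword field: the unanalyzed value must be identical.
    static boolean matchesKeyword(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    public static void main(String[] args) {
        String query = "Team Test";
        List<String> projectNames = Arrays.asList(
                "team member only test",
                "Category & Team adding test",
                "team member not in system test",
                "Team Test");
        for (String name : projectNames) {
            System.out.printf("%-32s allTerms=%b phrase=%b keyword=%b%n",
                    name,
                    matchesAllTerms(name, query),
                    matchesPhrase(name, query),
                    matchesKeyword(name, query));
        }
    }
}
```

In this model, the three project names from the question satisfy only the "all terms" check, which mirrors why they were returned; a phrase or keyword check would reject them.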
You can write a custom repository method that builds a query fulfilling your needs and use that instead. Or, if you always want to do exact searches on this field, you could define it as a keyword field to prevent parsing and analyzing. You could also use a match_phrase query through a custom repository method definition. For simplicity, such a sample would use only one parameter; you would need to add the organization id as well, but then the resulting query would be too complex for a small code sample.
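A minimal sketch of what such a repository method might look like, assuming the @Query annotation from Spring Data Elasticsearch and a repository interface named ProjectRepository. The interface name and the method name findByProjectNamePhrase are illustrative assumptions, not taken from the question. The fragment needs Spring Data Elasticsearch and the ProjectEntity class on the classpath to compile.

```java
import java.util.List;

import org.springframework.data.elasticsearch.annotations.Query;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// Hypothetical repository interface; the interface and method names are illustrative.
public interface ProjectRepository extends ElasticsearchRepository<ProjectEntity, String> {

    // ?0 is replaced with the first method argument. match_phrase requires the
    // analyzed terms to appear next to each other and in the given order.
    @Query("{\"match_phrase\": {\"projectName\": \"?0\"}}")
    List<ProjectEntity> findByProjectNamePhrase(String projectName);
}
```

Note that a phrase match would still find a longer name that contains "Team Test" as a phrase. If the whole field value must be equal to the argument, the keyword field approach mentioned above is the better fit.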