Solr Group By Field Tokens & Count

I'm using Solr 6.3.0 to store a full tree hierarchy with 3 levels. Each document is a node and its path in the tree is stored in a field, e.g. treePath:>522>12>7 for a level 3 node or treePath:>522>12 for a level 2 node.

Counting the children of a particular level 2 node is easy: I can run a regex query on treePath:/>522>12>.*/. I can also count all the level 3 nodes with a regex query like />[0-9]+>[0-9]+>.+/.
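To make the two regex queries concrete, here is a minimal Python sketch that checks them against a handful of made-up treePath values. The sample paths and helper names are hypothetical, and Python's re module is only a stand-in for Solr's Lucene regex syntax; re.fullmatch is used because a Solr regex has to match the whole term. The second path segment is written as [0-9]+ so that multi-digit ids such as 12 match.

```python
import re

# Hypothetical sample of indexed treePath values (not from the original post).
paths = [">522", ">522>12", ">522>12>7", ">522>12>8", ">522>13", ">522>13>1"]

# Patterns as they would appear between the slashes of a Solr regex query.
CHILDREN_OF_522_12 = r">522>12>.*"
ALL_LEVEL_3 = r">[0-9]+>[0-9]+>.+"


def count_matches(pattern, values):
    """Count the values that the pattern matches in full."""
    return sum(1 for v in values if re.fullmatch(pattern, v))


def solr_regex_query(field, pattern):
    """Wrap a pattern in Solr's field:/regex/ query syntax."""
    return f"{field}:/{pattern}/"


print(count_matches(CHILDREN_OF_522_12, paths))  # 2
print(count_matches(ALL_LEVEL_3, paths))         # 3
print(solr_regex_query("treePath", ALL_LEVEL_3))
```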

I'm interested in getting the average branching factor at level 2. I think this should be possible using a faceted query that would group by the prefix of treePath.

The tricky part as I see it is grouping documents that share the prefix of a given field without specifying the actual prefix and letting Solr match them.

Any help is most welcome :)

Thanks!


Edit:

I figured out that I can simply count the level 3 nodes and divide that by the number of level 2 nodes to get the average branching factor, but I'm still interested in finding out whether there's a way of grouping the documents by field prefix.
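A rough sketch of that count-and-divide approach in Python, assuming the two counts come from the numFound value of two count-only queries (rows=0). The helper names and the example counts are made up for illustration; only the treePath field comes from the setup described above.

```python
# Regexes for the two levels, assuming numeric ids separated by '>'.
LEVEL_2 = ">[0-9]+>[0-9]+"
LEVEL_3 = ">[0-9]+>[0-9]+>[0-9]+"


def count_params(regex):
    """Request parameters for a count-only query: rows=0 returns numFound without documents."""
    return {"q": f"treePath:/{regex}/", "rows": 0, "wt": "json"}


def average_branching_factor(level3_count, level2_count):
    """Average number of level 3 children per level 2 node."""
    if level2_count == 0:
        return 0.0  # no level 2 nodes, so there is nothing to average over
    return level3_count / level2_count


# Hypothetical counts: 6 level 3 nodes spread over 4 level 2 nodes.
print(count_params(LEVEL_2))
print(average_branching_factor(6, 4))  # 1.5
```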

1 answer

Hugo Zaragoza (best answer)

A possible solution would be to store the level 2 and level 3 parts of the path in two different fields. Faceting on the level 2 field will then give you every level 2 value with its count. Summing these counts and dividing by the number of distinct level 2 values gives you the average branching factor.

The advantage of this solution over yours is that it still works when the query restricts which trees you want to consider.
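The faceting idea could look roughly like the Python sketch below. It assumes two hypothetical fields, level2 (the level 2 prefix, e.g. >522>12) and level3 (present only on level 3 nodes), and it assumes Solr's default JSON layout for field facets, a flat list that alternates values and counts. The sample response is invented for illustration.

```python
def facet_params(extra_filters=()):
    """Parameters for a facet-only request on the hypothetical level2 field."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("wt", "json"),
        ("fq", "level3:[* TO *]"),   # keep only level 3 nodes, so counts are children
        ("facet", "true"),
        ("facet.field", "level2"),
        ("facet.limit", "-1"),       # return every level 2 value
        ("facet.mincount", "1"),
    ]
    # Extra filter queries restrict which trees are considered.
    params += [("fq", f) for f in extra_filters]
    return params


def branching_from_facets(response):
    """Average the per-value counts of the level2 facet."""
    flat = response["facet_counts"]["facet_fields"]["level2"]
    counts = flat[1::2]  # flat list is [value, count, value, count, ...]
    return sum(counts) / len(counts) if counts else 0.0


# Invented response: >522>12 has 2 children, >522>13 has 1.
sample = {"facet_counts": {"facet_fields": {"level2": [">522>12", 2, ">522>13", 1]}}}
print(branching_from_facets(sample))  # 1.5
```

Note that with the level 3 filter and facet.mincount=1, level 2 nodes without children never show up as facet values, so this averages over level 2 nodes that have at least one child.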