Raven DB filter on subset of array items and sort on the cheapest of the filter results items

368 views Asked by At

Assuming i have an parent class that I filter on various properties, one of which is a property that is an array of items . Now say that i want to only return the parent item if my array of items as above a min value and below a max value ...that's fine i can work that bit out; What if i then want to then sort on the filtered result set of those items

I made a c# fiddle example to show what im trying to achieve : https://dotnetfiddle.net/mV4d28 (note that foo2 is returned first even though foo1 has items in its array that are less that those in foo2)

As i need to do this using a index i need the index to be able to compute the order by based on the filter criteria used in my query.

I know elasticsearch has an inner hits function that dose this and mongo has pipelines which also dose this so im sure Raven must have a way of doing this too ?

I was hoping using just index and a transform with prams i could achieve this so I tried it:

my index and transform look like this

public class familyTransfrom : AbstractTransformerCreationTask<ParentItem>
{
    public class Result : ParentItem{
        public double[] ChildItemValuesFiltered { get; set; }
    }
    public familyTransfrom(){
        TransformResults = parents => from parent in parents
        let filterMinValue = Convert.ToDouble(ParameterOrDefault("FilterMinValue", Convert.ToDouble(0)).Value<double>())
        let filterMaxValue = Convert.ToDouble(ParameterOrDefault("FilterMaxValue", Convert.ToDouble(9999)).Value<double>())
        select new Result{
            ParentItemId = parent.ParentItemId,
            ParentItemName = parent.ParentItemName,
            ParentItemValue = parent.ParentItemValue,
            //ChildItemValuesFiltered = parent.ChildItems.Where(p => p.ChildItemValues.Any(y => Convert.ToDouble(y) >= Convert.ToDouble(filterMinValue) && Convert.ToDouble(y) <= Convert.ToDouble(filterMaxValue))).SelectMany(t => t.ChildItemValues).ToArray<double>(),
            ChildItemValuesFiltered = parent.ChildItems.SelectMany(p => p.ChildItemValues.Where(y => Convert.ToDouble(y) >= Convert.ToDouble(filterMinValue) && Convert.ToDouble(y) <= Convert.ToDouble(filterMaxValue))).ToArray<double>(),
            ChildItems = Recurse(parent, x => x.ChildItems).Select(y => y).ToArray()    
        };
    }
}
public class familyIndex : AbstractIndexCreationTask<ParentItem>{
        public class Result : ParentItem {
                public double[] ChildItemValues { get; set; }
        }             
        public familyIndex(){
            Map = parents => from parent in parents
                select new Result{
                    ParentItemId = parent.ParentItemId,
                    ParentItemName = parent.ParentItemName,
                    ParentItemValue = parent.ParentItemValue,
                    ChildItemValues = parent.ChildItems.SelectMany(p => p.ChildItemValues.Select(y => y)).ToArray(),  
                    ChildItems = Recurse(parent, x => x.ChildItems).Select(y => y).ToArray()  
                };                                                                              
                Index("ParentItemId", FieldIndexing.Analyzed);
                Index("ParentItemName", FieldIndexing.Analyzed);
                Index("ParentItemValue", FieldIndexing.Analyzed);
        Index("ChildItemValues", FieldIndexing.Analyzed);
        Index("ChildItems", FieldIndexing.Analyzed);
             }
}

my query is as follows , (this is using the live raven playground so this should just work out of the box it you want to use it)

using (IDocumentStore store = new DocumentStore { Url = "http://live-test.ravendb.net/", DefaultDatabase = "altha" })
{
    store.Initialize(); 
    using (IDocumentSession session = store.OpenSession()) 
    {
        if(1 == 2){         
            //foreach (ParentItem element in data.OfType<ParentItem>()) {
            //  session.Store((ParentItem)element);
            //  session.SaveChanges();
            //}
            new familyIndex().Execute(store);
            new familyTransfrom().Execute(store);
        }else{
            double filterMinValue = 3.0;
            double filterMaxValue = 4.0;
            var results =  session
                .Advanced
                .DocumentQuery<familyIndex.Result,familyIndex>()
                .WhereBetweenOrEqual("ChildItemValues", filterMinValue, filterMaxValue)
                .SetResultTransformer<familyTransfrom, familyTransfrom.Result>()
                .SetTransformerParameters(new Dictionary<string, RavenJToken> {
                    { "FilterMinValue", filterMinValue },
                    { "FilterMaxValue", filterMaxValue } })
                .OrderBy("ChildItemValues")
                .OfType<ParentItem>().ToList(); 
                results.Dump();                         
    }}
}

What i found was i cant use "ChildItemValuesFiltered" from the transform result as its not index. So unless i can order by the result of a transform ? i couldn't get this to work as although it filters it dosnt order correctly. Is there another to achieve what i want using projections or intersection or rank or reduce try method ?

I was thinking if i had to perhaps i could use https://ravendb.net/docs/article-page/3.5/csharp/indexes/querying/sorting#custom-sorting

and do something like this:

public class SortByNumberOfCharactersFromEnd : IndexEntriesToComparablesGenerator
{
    private readonly double filterMinValue;
    private readonly double filterMinValue;

    public SortByNumberOfCharactersFromEnd(IndexQuery indexQuery)
        : base(indexQuery)
    {
        filterMinValue = IndexQuery.TransformerParameters["FilterMinValue"].Value<double>();     // using transformer parameters to pass the length explicitly
        filterMaxValue = IndexQuery.TransformerParameters["FilterMaxValue"].Value<double>();
    }

    public override IComparable Generate(IndexReader reader, int doc)
    {
        var document = reader.Document(doc);
        double[] childItemValues = (double[])document.GetValues("ChildItemValuesFiltered").Select(double.Parse).ToArray();          // this field is stored in index
        return childItemValues.Where(x => x >= min && x <= max).Min();
    }
}

then do a where filter and order by clause using index and transform passing in the same prams that i use in the where filter . however im not sure if this would work ? More importantly im not sure how i go about getting the sort dll into the plugins ie what name space should the class go under, what name spaces dose it need to import, what assembly name dose it need to use etc According to https://ravendb.net/docs/article-page/3.5/csharp/server/plugins/what-are-plugins i just need to drop the dll in and raven will this this up , however i cant seem to find what name space i need to reference for IndexEntriesToComparablesGenerator ?

im using linqpad 5 to test my stuff ...so in order to use the custom order i have to reference the class

any tips or advice or how to guild/examples welcome

1

There are 1 answers

1
Rawdon Hume On BEST ANSWER

so it didn't occur to me that i could do the filtering in the transform

TransformResults = parents => from parent in parents
        let filterMinValue = Convert.ToDouble(ParameterOrDefault("FilterMinValue", Convert.ToDouble(0)).Value<double>())
        let filterMaxValue = Convert.ToDouble(ParameterOrDefault("FilterMaxValue", Convert.ToDouble(9999)).Value<double>())
        select new {
            ParentItemId = parent.ParentItemId,
            ParentItemName = parent.ParentItemName,
            ParentItemValue = parent.ParentItemValue,
            //ChildItemValuesFiltered = parent.ChildItems.Where(p => p.ChildItemValues.Any(y => Convert.ToDouble(y) >= Convert.ToDouble(filterMinValue) && Convert.ToDouble(y) <= Convert.ToDouble(filterMaxValue))).SelectMany(t => t.ChildItemValues).ToArray<double>(),
            ChildItemValuesFiltered = parent.ChildItems.SelectMany(p => p.ChildItemValues.Where(y => Convert.ToDouble(y) >= Convert.ToDouble(filterMinValue) && Convert.ToDouble(y) <= Convert.ToDouble(filterMaxValue))).ToArray<double>(),
            ChildItems = Recurse(parent, x => x.ChildItems).Select(y => y).ToArray()    
        } into r
        where r.ChildItemValuesFiltered.Length > 0
        orderby r.ChildItemValuesFiltered.Min()
        select r;

This gives me what i wanted, here are the sample query:

http://live-test.ravendb.net/databases/altha/indexes/familyIndex?start=0&pageSize=25&resultsTransformer=familyTransfrom&tp-FilterMinValue=3&tp-FilterMaxValue=4

i cant take credit for this as guys at raven helped me but sharing the knowledge for others