Compare multi values against multi values, maintaining performance

68 views Asked by At

I have two forms of records or data. One being Inventory Record and the other a Product.

Both the Inventory Record and Product Record have Code values stored in separate tables. Each record has a value record and is foreign keyed to the main table.

I am trying NOT to write this code below because i know from a performance standpoint it will run for days.

 public void test(product _Product)
        {
            DBDataContext db = new DBDataContext();


            var ProductCodes = db.tbl_ProdCodeValues.Where(x=>x.productID == _Product.productID);

            foreach (var code in ProductCodes)
            {

                var matches = db.InventoryCodeValues.Any(x => x.InventoryValue.ToLower().Contains(code.ProdCodeValue));
            }

        }

What is the best way to compare a collection of values against another collection of values. Please let me know if you think a Task or Multi Threading Approach would also improve performance.

Each product could have a approx 1-1000 values and the Inventory will ALWAYS have over 1 million values and still to grow. So performance is key because the math will tell you there is a ton of comparisons being performed. I can also move the Query from LINQ to SQL to T-SQL Stored procedures if needed.

2

There are 2 answers

5
usr On BEST ANSWER

Your foreach is like a Select.

var ProductCodes = db.tbl_ProdCodeValues.Where(x=>x.productID == _Product.productID);
var withMatches =
 ProductCodes
 .Select(code => new {
  code,
  matches = db.InventoryCodeValues.Any(x => x.InventoryValue.ToLower().Contains(code.ProdCodeValue))
 });

And now all of this remotes to the database. Look at the query plan to see whether this is acceptable already or whether we need to tune this. Because of that non-SARGable predicate this is probably a nasty cross product plus filter.

What format are the InventoryCodeValues? Is this a delimited list? If you split the list into rows then you can use a simple == predicate and make this query run in linear time instead of quadratic time.

0
Graffito On

What is the best way to compare a collection of values against another collection of values.

The proposed method consist in creating a list merging both types of records and sort the merged list according to matching criteria.

Once sorted, you will identify and process the matches in sequence. In the below code, I assumed that the compared Code values were string.

public class Member 
{ 
  internal string Key ; 
  internal int    Source ; 
  internal object DataObject ;
  internal Member(string key,string source,object dataobject)
  { // source identifies the source, e.g "P" for Prod and "I" for Inventory
    Key = key ;
    Source = Source ; 
    DataObject = dataobject 
  }
}

// create and fill the merged list  
List<Member> list = new List<member>();
for (int i=0;i<db.tbl_ProdCodeValues.Count;i++) 
{
  string prodcodevalue = ... ; // set the value here
  object prodcodeobject= ... ; // set the record object here
  list.Add(new Member(prodcodevalue,1,prodcodeobject) ;
}
for (int i=0;i<db.tbl_InventoryCodeValues.Count;i++) 
{
  string inventorycodevalue= ... ; // set the value here
  object inventorycodeobject= ... ; // set the record object here
  list.Add(new Member(inventorycodevalue,2,inventorycodeobject) ;
}

// sort the merged list 
list.Sort(delegate(Member x, Member y) { return (x.Key+" "+x.Source).CompareTo(y.Key+" "+y.Source); });

// Process the merged list
// we assume that a key cannot be empty
list.Add(new Member("",0,null) ; // just for proper termination of next loop
string CurKey="" ;
int starti=-1 ; int endi=-1 ; 
int startp=-1 ; int endp=-1 ;  
for (int n=0;n<list.Count;n++)
{
  if (list[n].Key==CurKey) { if (list[n].Source="I") endi=n ; if (list[n].Source="P") endp=n ;
  else 
  {
    if (CurKey!="" ) 
    { // -------- Process the CurKey for matches ---------
      // The Prod      records corresponding to CurKey are given by list[p].dataobject whith p from startp to endp  
      // The Inventory records corresponding to CurKey are given by list[i].dataobject whith i from starti to endi 
      // if either starti or startp is negative, there is no match
      ... // insert your business code there
    }
    if (list[n].Source="I") { starti=endi=n  ; startp=endp=-1 ; }
    if (list[n].Source="P") { starti=endi=-1 ; startp=endp=n  ; }
}