Create a LINQ query over a List<> to fetch records based on conditional parameters

I have multiple files placed in a folder structure, and I have to load the file-level details for all of them into a List<> collection.

Folder Structure :

/Servicer
    /Mortgage
        /008_Mortgage_00179C
            /008_Mortgage_00179C_1.csv
            /008_Mortgage_00179C_2.csv
        /009_Mortgage_00180C     
            /009_Mortgage_00180C_1.csv
            /009_Mortgage_00180C_2.csv
    /Note
        /006_Note_00194D
            /006_Note_00194D_1.csv
            /006_Note_00194D_2.csv
        /007_Note_00194E
            /007_Note_00194E_1.csv
        /007_Note_00193F
            /007_Note_00193F_1.csv
    /IRS 1040
        /005_IRS 1040_00872F
            /005_IRS 1040_00872F_1.csv
            /005_IRS 1040_00872F_2.csv
    /Bank Statement
        /0008_Bank Statement_0084H
            /0008_Bank Statement_0084H_1.csv
            /0008_Bank Statement_0084H_2.csv
        /0008_Bank Statement_0084G
            /0008_Bank Statement_0084G_1.csv
            /0008_Bank Statement_0084G_2.csv

The list type and its structure look like this:

List<Mstr_Batch_Data> lstFiles = new List<Mstr_Batch_Data>();

public class Mstr_Batch_Data
{
    public string docType { get; set; }
    public string docId { get; set; }
    public string pcFilePath { get; set; }
    public string ocrFilePath { get; set; }
    public string status { get; set; }
}

Now, after loading the complete data set into the master list, I have to fetch records (perhaps into a different list) based on some configurable values.

There are only two configurations :

  • BatchType - can be either SAME or DIFFERENT
  • BatchSize - can be 1, n, or ALL

I did the first part using LINQ, and I am able to load the complete set:

lstFiles = (from f in Directory.GetFiles(configuration.ocrFolderPath, "*", SearchOption.AllDirectories)
            let docFolder = Path.GetDirectoryName(f)                           // .../<docType>/<docId>
            let docId = Path.GetFileName(docFolder)                            // name of the docId folder
            let docType = Path.GetFileName(Path.GetDirectoryName(docFolder))   // name of the docType folder
            select new Mstr_Batch_Data
            {
                docType = docType,
                docId = docId,
                ocrFilePath = f,
                // Mirror the same docType/docId layout under the PC folder
                pcFilePath = Path.Combine(configuration.pcFolderPath, docType, docId,
                                          Path.GetFileNameWithoutExtension(f) + "_classifier.csv"),
                status = "Active"
            }).ToList();

Now, going to the next step of the process:

If BatchType = SAME and BatchSize = 2, it should return all the records in the following structure (all files from the same docType folder and its first 2 docId folders go in one batch, and so on):

/Batch 1
   /008_Mortgage_00179C_1.csv
   /008_Mortgage_00179C_2.csv
   /009_Mortgage_00180C_1.csv
   /009_Mortgage_00180C_2.csv
/Batch 2
   /006_Note_00194D_1.csv
   /006_Note_00194D_2.csv
   /007_Note_00194E_1.csv
/Batch 3
   /005_IRS 1040_00872F_1.csv
   /005_IRS 1040_00872F_2.csv
/Batch 4
   /0008_Bank Statement_0084H_1.csv
   /0008_Bank Statement_0084H_2.csv
   /0008_Bank Statement_0084G_1.csv
   /0008_Bank Statement_0084G_2.csv
/Batch 5
   /007_Note_00193F_1.csv
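
One way to read the SAME rule illustrated above: number the docId folders inside each docType, put every BatchSize consecutive folders of one docType into a chunk, and emit the chunks round by round. The following is only a minimal sketch under that reading. `SameTypeBatcher`, `Build`, and the trimmed copy of `Mstr_Batch_Data` are illustrative names (assumptions, not part of the original code), and the batch order follows the order in which docTypes first appear in the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copy of the record type (only the fields the batching needs).
public class Mstr_Batch_Data
{
    public string docType { get; set; }
    public string docId { get; set; }
    public string ocrFilePath { get; set; }
}

// Hypothetical helper, named here for illustration only.
public static class SameTypeBatcher
{
    // batchSize = number of docId folders allowed in one batch.
    public static List<List<Mstr_Batch_Data>> Build(
        IEnumerable<Mstr_Batch_Data> files, int batchSize)
    {
        return files
            .GroupBy(f => f.docType)                  // one group per docType, first-seen order
            .SelectMany((typeGroup, typeIndex) => typeGroup
                .GroupBy(f => f.docId)                // one group per docId folder
                .Select((folder, folderIndex) => new
                {
                    TypeIndex = typeIndex,
                    Chunk = folderIndex / batchSize,  // 0 for the first batchSize folders, then 1, ...
                    Files = folder
                }))
            .GroupBy(x => new { x.Chunk, x.TypeIndex })   // a batch = one chunk of one docType
            .OrderBy(g => g.Key.Chunk)                    // first chunks of every docType come first
            .ThenBy(g => g.Key.TypeIndex)
            .Select(g => g.SelectMany(x => x.Files).ToList())
            .ToList();
    }
}
```

With the folder listing from the question and `batchSize = 2`, this grouping should reproduce the five batches listed above. For BatchSize = ALL, passing `int.MaxValue` as `batchSize` would place every folder of a docType in a single batch.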

If BatchType = DIFFERENT and BatchSize = 2, each batch should contain all files from 2 docId folders, with each folder taken from a different docType:

/Batch 1
   /008_Mortgage_00179C_1.csv
   /008_Mortgage_00179C_2.csv
   /006_Note_00194D_1.csv
   /006_Note_00194D_2.csv
/Batch 2
   /005_IRS 1040_00872F_1.csv
   /005_IRS 1040_00872F_2.csv
   /0008_Bank Statement_0084H_1.csv
   /0008_Bank Statement_0084H_2.csv
/Batch 3
   /009_Mortgage_00180C_1.csv
   /009_Mortgage_00180C_2.csv
   /007_Note_00194E_1.csv
/Batch 4
   /0008_Bank Statement_0084G_1.csv
   /0008_Bank Statement_0084G_2.csv
   /007_Note_00193F_1.csv
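
The DIFFERENT example above is consistent with a round-robin reading: take the first docId folder of every docType, then the second of every docType, and so on, and cut that sequence into groups of BatchSize folders. The sketch below assumes that reading. `MixedTypeBatcher`, `Build`, and the trimmed `Mstr_Batch_Data` are illustrative names, not part of the original code. Note that this approach does not by itself guarantee distinct docTypes in a batch when BatchSize exceeds the number of docTypes or when only one docType still has folders left.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copy of the record type (only the fields the batching needs).
public class Mstr_Batch_Data
{
    public string docType { get; set; }
    public string docId { get; set; }
    public string ocrFilePath { get; set; }
}

// Hypothetical helper, named here for illustration only.
public static class MixedTypeBatcher
{
    // batchSize = number of docId folders allowed in one batch.
    public static List<List<Mstr_Batch_Data>> Build(
        IEnumerable<Mstr_Batch_Data> files, int batchSize)
    {
        return files
            .GroupBy(f => f.docType)                  // one group per docType, first-seen order
            .SelectMany((typeGroup, typeIndex) => typeGroup
                .GroupBy(f => f.docId)                // one group per docId folder
                .Select((folder, round) => new
                {
                    Round = round,                    // position of the folder inside its docType
                    TypeIndex = typeIndex,
                    Files = folder
                }))
            .OrderBy(x => x.Round)                    // round-robin across docTypes
            .ThenBy(x => x.TypeIndex)
            .Select((x, position) => new { Batch = position / batchSize, x.Files })
            .GroupBy(x => x.Batch)                    // every batchSize folders form one batch
            .Select(g => g.SelectMany(x => x.Files).ToList())
            .ToList();
    }
}
```

With the folder listing from the question and `batchSize = 2`, the folder sequence becomes Mortgage, Note, IRS 1040, Bank Statement, Mortgage, Note, Bank Statement, Note, which should yield the four batches listed above.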

So, one approach is a LINQ query on the master list that takes the 2 configurations and returns the matching records (it will be complex, but probably more efficient). I am not sure how to write this query. The second approach would be to use for loops; I am trying this but am not there yet.

// Distinct docId folders, materialised as a list
lstBatchFiles = lstFiles.Select(o => o.docId).Distinct().ToList();
bool flag = true;
for (int i = 0; i < lstBatchFiles.Count; i++)
{
    if (getConfig.batchType == "SAME")
    {
        // All files that belong to the current docId folder
        lstFilePath = (from x in lstFiles
                       where x.docId.Equals(lstBatchFiles[i])
                       select new Mstr_Batch_Data
                       {
                           docId = x.docId,
                           docType = x.docType,
                           ocrFilePath = x.ocrFilePath,
                           pcFilePath = x.pcFilePath,
                           status = x.status
                       }).ToList();

        // Every file in this folder shares one docType, so take it from the first entry
        var currentDocType = lstFilePath[0].docType;
        if (tempDocType == currentDocType || tempDocType == null)
        {
            // check the no of docs configuration and increment
            // create request object here
        }
        if (flag)
        {
            tempDocType = currentDocType;
            flag = false;
        }
    }
    else
    {
        // logic for Different configuration
    }
}

I understand this is a complex problem to explain, so please feel free to comment, ask, or update. Thanks.

There is 1 answer

SKG On

I think this will work, but I haven't tested it...

// Add a DocFolderId property to the Mstr_Batch_Data class for the doc folder number
// (maintain it as an auto number 1, 2, 3, ... while populating the list).

// Define an enumeration: enum Filters { Same, Different }

var size = 2;               // provide size
var filter = Filters.Same;  // provide filter
//var docTypes = lstFiles.GroupBy(f => f.docType).Select(g => g.Key);
// Bucket every `size` consecutive folder numbers together; for Filters.Same,
// additionally split each bucket by docType.
var batches = lstFiles
    .GroupBy(f => new
    {
        Bucket = (f.DocFolderId - 1) / size,
        DocType = filter == Filters.Same ? f.docType : null
    })
    .ToList();