I have multiple files in a nested folder structure, and I need to load the file-level details of every file into a List<> collection.
Folder structure:

    /Servicer
        /Mortgage
            /008_Mortgage_00179C
                /008_Mortgage_00179C_1.csv
                /008_Mortgage_00179C_2.csv
            /009_Mortgage_00180C
                /009_Mortgage_00180C_1.csv
                /009_Mortgage_00180C_2.csv
        /Note
            /006_Note_00194D
                /006_Note_00194D_1.csv
                /006_Note_00194D_2.csv
            /007_Note_00194E
                /007_Note_00194E_1.csv
            /007_Note_00193F
                /007_Note_00193F_1.csv
        /IRS 1040
            /005_IRS 1040_00872F
                /005_IRS 1040_00872F_1.csv
                /005_IRS 1040_00872F_2.csv
        /Bank Statement
            /0008_Bank Statement_0084H
                /0008_Bank Statement_0084H_1.csv
                /0008_Bank Statement_0084H_2.csv
            /0008_Bank Statement_0084G
                /0008_Bank Statement_0084G_1.csv
                /0008_Bank Statement_0084G_2.csv
The list and its element type look like this:

    List<Mstr_Batch_Data> lstFiles = new List<Mstr_Batch_Data>();

    public class Mstr_Batch_Data
    {
        public string docType { get; set; }
        public string docId { get; set; }
        public string pcFilePath { get; set; }
        public string ocrFilePath { get; set; }
        public string status { get; set; }
    }
After loading the complete set of data into the master list, I have to fetch the records (into a different list, maybe) based on some configurable values.
There are only two configuration values:
- BatchType: either SAME or DIFFERENT
- BatchSize: 1, n, or ALL
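The two settings could be held in a small typed object so the batching code does not compare raw strings. This is only a sketch: `BatchType`, `BatchConfig`, and `Parse` are hypothetical names that do not appear in the original code, and representing ALL as a `null` size is an assumption.

```csharp
using System;

public enum BatchType { Same, Different }

// Hypothetical holder for the two configuration values.
public sealed class BatchConfig
{
    public BatchType Type { get; private set; }

    // Number of DocId folders per batch; null stands for ALL (assumption).
    public int? Size { get; private set; }

    public BatchConfig(BatchType type, int? size)
    {
        Type = type;
        Size = size;
    }

    // Converts the raw strings ("SAME"/"DIFFERENT" and "1", "n" or "ALL").
    public static BatchConfig Parse(string batchType, string batchSize)
    {
        BatchType type;
        switch (batchType.Trim().ToUpperInvariant())
        {
            case "SAME": type = BatchType.Same; break;
            case "DIFFERENT": type = BatchType.Different; break;
            default:
                throw new ArgumentException("BatchType must be SAME or DIFFERENT.", "batchType");
        }

        string sizeText = batchSize.Trim();
        if (string.Equals(sizeText, "ALL", StringComparison.OrdinalIgnoreCase))
            return new BatchConfig(type, null);

        int size;
        if (!int.TryParse(sizeText, out size) || size < 1)
            throw new ArgumentException("BatchSize must be a positive integer or ALL.", "batchSize");

        return new BatchConfig(type, size);
    }
}
```

For example, `BatchConfig.Parse("SAME", "2")` would give `Type == BatchType.Same` and `Size == 2`, while `BatchConfig.Parse("different", "ALL")` would give `Size == null`.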
I did the first part using LINQ and can load the complete set:

    lstFiles = (from f in Directory.GetFiles(configuration.ocrFolderPath, "*", SearchOption.AllDirectories)
                let docId = Path.GetFileName(Path.GetDirectoryName(f))                            // parent folder name
                let docType = Path.GetFileName(Path.GetDirectoryName(Path.GetDirectoryName(f)))   // grandparent folder name
                select new Mstr_Batch_Data
                {
                    docType = docType,
                    docId = docId,
                    ocrFilePath = f,
                    pcFilePath = Path.Combine(configuration.pcFolderPath, docType, docId,
                                              Path.GetFileNameWithoutExtension(f) + "_classifier.csv"),
                    status = "Active"
                }).ToList();
Now, for the next step of the process:
If BatchType = SAME and BatchSize = 2, it should return all the records in the following structure (each batch holds all files from up to 2 DocId folders of the same docType folder, and so on):

    /Batch 1
        /008_Mortgage_00179C_1.csv
        /008_Mortgage_00179C_2.csv
        /009_Mortgage_00180C_1.csv
        /009_Mortgage_00180C_2.csv
    /Batch 2
        /006_Note_00194D_1.csv
        /006_Note_00194D_2.csv
        /007_Note_00194E_1.csv
    /Batch 3
        /005_IRS 1040_00872F_1.csv
        /005_IRS 1040_00872F_2.csv
    /Batch 4
        /0008_Bank Statement_0084H_1.csv
        /0008_Bank Statement_0084H_2.csv
        /0008_Bank Statement_0084G_1.csv
        /0008_Bank Statement_0084G_2.csv
    /Batch 5
        /007_Note_00193F_1.csv
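
One way the SAME case might be expressed in LINQ is sketched below; it is untested against real data. It groups the master list by docType, numbers the DocId folders inside each docType, and puts every run of `batchSize` folders into one batch. `SameTypeBatcher` and `Build` are hypothetical names, `Mstr_Batch_Data` is copied from the question only so the sketch compiles on its own, and treating a `null` size as ALL (one batch per docType) is an assumption. The batch order also assumes the master list is in the order of the folder listing, which `Directory.GetFiles` does not guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Copy of the class from the question so the sketch is self-contained.
public class Mstr_Batch_Data
{
    public string docType { get; set; }
    public string docId { get; set; }
    public string pcFilePath { get; set; }
    public string ocrFilePath { get; set; }
    public string status { get; set; }
}

public static class SameTypeBatcher
{
    // batchSize = number of DocId folders per batch; null stands for ALL (assumption).
    public static List<List<Mstr_Batch_Data>> Build(IEnumerable<Mstr_Batch_Data> files, int? batchSize)
    {
        if (batchSize.HasValue && batchSize.Value < 1)
            throw new ArgumentOutOfRangeException("batchSize");

        return files
            .GroupBy(f => f.docType)                       // one group per docType, in order of first appearance
            .SelectMany(typeGroup => typeGroup
                .GroupBy(f => f.docId)                     // one group per DocId folder
                .Select((docGroup, index) => new
                {
                    // Which run of batchSize folders this DocId falls into within its docType
                    Round = batchSize.HasValue ? index / batchSize.Value : 0,
                    DocType = typeGroup.Key,
                    Files = docGroup
                }))
            .GroupBy(x => new { x.Round, x.DocType })      // same docType and same run => same batch
            .OrderBy(g => g.Key.Round)                     // stable sort: leftover runs move to the end
            .Select(g => g.SelectMany(x => x.Files).ToList())
            .ToList();
    }
}
```

With the files from the folder listing in the order shown and `SameTypeBatcher.Build(lstFiles, 2)`, this should give five batches: Mortgage (4 files), Note (3 files), IRS 1040 (2 files), Bank Statement (4 files), and finally the leftover Note folder 007_Note_00193F (1 file).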
And if BatchType = DIFFERENT and BatchSize = 2, each batch should hold all files from only 2 DocId folders, taken from different docType folders:

    /Batch 1
        /008_Mortgage_00179C_1.csv
        /008_Mortgage_00179C_2.csv
        /006_Note_00194D_1.csv
        /006_Note_00194D_2.csv
    /Batch 2
        /005_IRS 1040_00872F_1.csv
        /005_IRS 1040_00872F_2.csv
        /0008_Bank Statement_0084H_1.csv
        /0008_Bank Statement_0084H_2.csv
    /Batch 3
        /009_Mortgage_00180C_1.csv
        /009_Mortgage_00180C_2.csv
        /007_Note_00194E_1.csv
    /Batch 4
        /0008_Bank Statement_0084G_1.csv
        /0008_Bank Statement_0084G_2.csv
        /007_Note_00193F_1.csv
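
The DIFFERENT case could be approached as a round-robin over the docTypes: take the first DocId folder of every docType, then the second of every docType, and so on, and cut that sequence into runs of `batchSize` folders. The sketch below is untested against real data. `MixedTypeBatcher` and `Build` are hypothetical names, `Mstr_Batch_Data` is copied from the question only so the sketch compiles on its own, and treating a `null` size as ALL (everything in one batch) is an assumption. It also assumes the master list is in the order of the folder listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Copy of the class from the question so the sketch is self-contained.
public class Mstr_Batch_Data
{
    public string docType { get; set; }
    public string docId { get; set; }
    public string pcFilePath { get; set; }
    public string ocrFilePath { get; set; }
    public string status { get; set; }
}

public static class MixedTypeBatcher
{
    // batchSize = number of DocId folders per batch; null stands for ALL (assumption).
    public static List<List<Mstr_Batch_Data>> Build(IEnumerable<Mstr_Batch_Data> files, int? batchSize)
    {
        if (batchSize.HasValue && batchSize.Value < 1)
            throw new ArgumentOutOfRangeException("batchSize");

        // Round-robin order: position of each DocId folder inside its docType decides the round.
        var docs = files
            .GroupBy(f => f.docType)
            .SelectMany(typeGroup => typeGroup
                .GroupBy(f => f.docId)
                .Select((docGroup, round) => new { Round = round, Files = docGroup }))
            .OrderBy(x => x.Round)      // stable sort keeps the docType order inside a round
            .ToList();

        int size = batchSize ?? Math.Max(docs.Count, 1);   // ALL => a single batch

        // Cut the interleaved sequence into runs of 'size' DocId folders.
        return docs
            .Select((doc, position) => new { doc.Files, Batch = position / size })
            .GroupBy(x => x.Batch)
            .Select(g => g.SelectMany(x => x.Files).ToList())
            .ToList();
    }
}
```

With the files from the folder listing in the order shown and `MixedTypeBatcher.Build(lstFiles, 2)`, this should reproduce the four batches above. Note that the round-robin only mixes docTypes as long as several docTypes still have folders left; if one docType has many more DocId folders than the others, the last batches would contain folders of that docType only, so a stricter rule would need extra handling.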
One approach would be a LINQ query on the master list that takes the two configuration values and returns the matching records (it will be complex but more efficient), but I am not sure how to write it. The second approach would be to use for loops; I am trying this but am not there yet:

    // Distinct() returns an IEnumerable<string>; casting it to List<string> would throw at run time
    lstBatchFiles = lstFiles.Select(o => o.docId).Distinct().ToList();
    bool flag = true;
    for (int i = 0; i < lstBatchFiles.Count; i++)
    {
        if (getConfig.batchType == "SAME")
        {
            // All master-list entries that belong to the current docId folder
            lstFilePath = lstFiles.Where(x => x.docId.Equals(lstBatchFiles[i])).ToList();
            // Every entry here shares one docType, so read it from the first element
            // (indexing lstFilePath with i goes out of range for folders with few files)
            string currentDocType = lstFilePath[0].docType;
            if (tempDocType == currentDocType || tempDocType == null)
            {
                // check the no of docs configuration and increment
                // create request object here
            }
            if (flag)
            {
                tempDocType = currentDocType;
                flag = false;
            }
        }
        else
        {
            // logic for Different configuration
        }
    }

I understand this is a complex problem to explain, so please feel free to comment, ask, or update. Thanks.
I think this will work but haven't tested it...