How to give more weight to recent rows of a table when training with an ML.NET trainer


I have a basic question about ML.NET. Without showing a lot of code: is it possible to give recent data more weight (importance) during training?

For example, if our data table consists of rows that each carry a date, as in the file below, I would like to give the recent rows more weight, so that a training algorithm such as "FastTree" or "FastForest" treats those samples as more important.

a) I know I could duplicate rows to give them more importance, but I would like to avoid that if possible because it means longer training time.
b) I would also like to avoid a sliding-window approach, where the model is trained on a subset of the data that moves forward in time, for this question.

Is there any method to give actual weights for each row in the table?

In the example table further down (datafile.csv), I show a weight between 1 and 10 for each row, purely to illustrate what I mean:

Sample code:

using Microsoft.ML;
using Microsoft.ML.Data;

var context = new MLContext(seed: 0);

// Load the data
var data = context.Data.LoadFromTextFile<Input>("C:/datafile.csv", hasHeader: true, separatorChar: ',');


var trainTestData = context.Data.TrainTestSplit(data, testFraction: 0.2, seed: 0);
var trainData = trainTestData.TrainSet;
var testData = trainTestData.TestSet;

// Define the data preprocessing pipeline. Concatenate the features to use.
var pipeline = context.Transforms.Concatenate("Features", "feature1", "feature2")
    .Append(context.Transforms.Conversion.ConvertType("Features", "Features", DataKind.Single))
    .Append(context.Transforms.NormalizeMinMax("Features"))
    // Note: FastTree reads its target from a column named "Label" by default.
    .Append(context.Regression.Trainers.FastTree());

var model = pipeline.Fit(trainData);

public class Input
{
    [LoadColumn(2)] public float feature1;
    [LoadColumn(3)] public float feature2;
}
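
For reference, the `FastTree` and `FastForest` trainer factories in ML.NET take an optional `exampleWeightColumnName` argument, which looks like the per-row weighting asked about here. The following is a minimal sketch under stated assumptions, not a verified answer. It assumes the `weight` column at index 1 of datafile.csv is loaded as a `float`. Because the sample file has no separate target column, it uses `feature2` as a stand-in label purely so the sketch can run against that file. `WeightedInput` is a hypothetical class name, and `minimumExampleCountPerLeaf: 1` is only set because the sample has ten rows. The Microsoft.ML and Microsoft.ML.FastTree NuGet packages are required.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

var context = new MLContext(seed: 0);

// Load the file, this time including the weight column (index 1).
var data = context.Data.LoadFromTextFile<WeightedInput>(
    "C:/datafile.csv", hasHeader: true, separatorChar: ',');

var pipeline = context.Transforms.Concatenate("Features", "feature1")
    .Append(context.Transforms.NormalizeMinMax("Features"))
    .Append(context.Regression.Trainers.FastTree(
        labelColumnName: "Label",
        featureColumnName: "Features",
        // Per-row weight column: rows with larger values count more in training.
        exampleWeightColumnName: "Weight",
        // Only because the sample file is tiny (default is 10 rows per leaf).
        minimumExampleCountPerLeaf: 1));

var model = pipeline.Fit(data);

// Hypothetical input class for this sketch.
public class WeightedInput
{
    [LoadColumn(1)] public float Weight;
    [LoadColumn(2)] public float feature1;

    // ASSUMPTION: feature2 stands in as the target because the sample file
    // has no label column. Map the real target column here instead.
    [LoadColumn(3), ColumnName("Label")] public float feature2;
}
```

`context.Regression.Trainers.FastForest(...)` accepts the same `exampleWeightColumnName` argument, so the trainer could be swapped without changing how the weight column is loaded.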

datafile.csv

Date,weight,feature1,feature2
06/30/2023,1,50,42
07/03/2023,2,52,45
07/05/2023,3,50,47
07/06/2023,4,54,43
07/07/2023,5,55,49
07/10/2023,6,57,44
07/11/2023,7,53,47
07/12/2023,8,52,45
07/13/2023,9,57,44
07/14/2023,10,53,42
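
The 1 to 10 weights in the table are hand-written for illustration. As a sketch of how such a column might instead be derived from the `Date` column, the code below uses exponential decay: the newest row gets weight 1.0 and the weight halves for every `halfLifeDays` of age. The class name `RecencyWeights`, the method `FromDates`, and the 7-day half-life are all assumptions made for this example and are not part of ML.NET.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class RecencyWeights
{
    // Exponential decay by age: the newest date gets weight 1.0,
    // and the weight halves for every `halfLifeDays` days before it.
    public static double[] FromDates(DateTime[] dates, double halfLifeDays)
    {
        DateTime newest = dates.Max();
        return dates
            .Select(d => Math.Pow(0.5, (newest - d).TotalDays / halfLifeDays))
            .ToArray();
    }

    public static void Main()
    {
        // Three of the dates from datafile.csv, in its MM/dd/yyyy format.
        string[] raw = { "06/30/2023", "07/07/2023", "07/14/2023" };
        DateTime[] dates = raw
            .Select(s => DateTime.ParseExact(s, "MM/dd/yyyy", CultureInfo.InvariantCulture))
            .ToArray();

        double[] weights = FromDates(dates, halfLifeDays: 7.0);
        for (int i = 0; i < raw.Length; i++)
        {
            string w = weights[i].ToString("F2", CultureInfo.InvariantCulture);
            Console.WriteLine($"{raw[i]} -> {w}");
        }
        // Prints:
        // 06/30/2023 -> 0.25
        // 07/07/2023 -> 0.50
        // 07/14/2023 -> 1.00
    }
}
```

The resulting values would be cast to `float` and written into the weight column before training. A shorter half-life concentrates the weight on the newest rows, while a longer one approaches uniform weighting.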