Using training_window in featuretools dfs on the NASA turbofan example returns an empty feature matrix

I am running some experiments with the Remaining Useful Life prediction example on NASA's Turbofan Engine Degradation Simulation Data Set. I want to build features from only a small number of data points before each cut-off time, so I am passing training_window="50m" to the featuretools.dfs function. This value should be valid because I generated the time column of the dataframe with frequency=600s (10 minutes), which means the training window should select 5 values for each cut-off time. However, with this parameter set, dfs returns an empty feature matrix, and so far I have not been able to figure out why. I am using the same code as in this notebook, with the following changes:

  • I used the CidCe primitive from the advanced notebook.
  • I used the following code to search for labels, which also selects duplicate entries:
label_times = lm.search(
    data.sort_values('time'),
    num_examples_per_instance=5,
    minimum_data=100,
    drop_empty=False,
    gap=10,
    verbose=True,
)
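
The window arithmetic in the question can be checked without featuretools. The following is only a sketch under stated assumptions: the start date, the number of rows, and the chosen cut-off are made up for illustration, and the window is assumed to be half-open, (cutoff - window, cutoff], with the cut-off landing exactly on a recorded timestamp. If the library treats the window boundaries differently, the count would change by one.

```python
from datetime import datetime, timedelta

# Hypothetical time column: one reading every 600 s (10 minutes).
start = datetime(2000, 1, 1)
step = timedelta(seconds=600)
timestamps = [start + i * step for i in range(120)]

# Hypothetical cut-off that coincides with a recorded timestamp.
cutoff = timestamps[100]
window = timedelta(minutes=50)  # the "50m" training window

# Assumption: the window excludes its start and includes the cut-off.
selected = [t for t in timestamps if cutoff - window < t <= cutoff]

print(len(selected))  # 5 readings fall inside the window
```

So the expectation of 5 values per cut-off time is consistent with a 50-minute window over 10-minute data, which suggests the empty result comes from somewhere other than the window length.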

There is 1 answer

Pranav Prakash

The mistake was mine. The documentation mentions that you need to add the last time indexes yourself, although, in my defense, I never got the warning that the documentation says should appear. I fixed it by calling es.add_last_time_indexes().
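
Why a missing last time index can empty the result is easier to see with a toy model. The sketch below is not featuretools code. It is a simplified illustration, assuming that a parent row (one engine) carries a time index equal to its first recording and that, without a last time index, the training window can only be compared against that first timestamp. The function names and data are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical engine with one recording every 600 s.
start = datetime(2000, 1, 1)
recordings = [start + timedelta(seconds=600 * i) for i in range(200)]

engine_time_index = recordings[0]        # first time the engine appears
engine_last_time_index = recordings[-1]  # last time any recording refers to it

cutoff = recordings[150]
window_start = cutoff - timedelta(minutes=50)


def in_window_without_lti(time_index):
    # Only the time index is known: the row must begin inside the window.
    return window_start < time_index <= cutoff


def in_window_with_lti(time_index, last_time_index):
    # With a last time index: the row must exist by the cut-off
    # and still be active after the window start.
    return time_index <= cutoff and last_time_index > window_start


print(in_window_without_lti(engine_time_index))                       # False
print(in_window_with_lti(engine_time_index, engine_last_time_index))  # True
```

In this simplified model the engine is dropped whenever its first recording predates the window, which is nearly always the case with a short window, and it is kept once the last time index is available.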