My team is interested in a feature store solution that enables rapid experimentation with features, probably using feature versioning. In the Feast Slack history, I found @Benjamin Tan’s post explaining their Feast workflow, including FeatureView versioning:
# Other FeatureView arguments (name, entities, source, ...) are omitted here.
insights_v1 = FeatureView(
    features=[
        Feature(name="insight_type", dtype=ValueType.STRING)
    ]
)
insights_v2 = FeatureView(
    features=[
        Feature(name="customer_id", dtype=ValueType.STRING),
        Feature(name="insight_type", dtype=ValueType.STRING)
    ]
)
Is this the recommended best practice for FeatureView versioning? It looks like Features do not have a version field. Is there a recommended strategy for Feature versioning? Creating a new column for each Feature version is one approach:
driver_rating_v1
driver_rating_v2
But that could get unwieldy if we want to experiment with dozens of permutations of the same Feature. Featureform appears to have support for feature versions through the "variant" field, but their documentation is a bit unclear.
Adding additional clarity on Featureform:
Variant is analogous to version. You'd supply a string, which then becomes an immutable identifier for the version of the transformation, source, etc. Variant is one of the common metadata fields provided in the Featureform API.
Using the example of an ecommerce dataset & Spark, here's an example of using the variant field to version a source (a parquet file in this case):
You can set the variant variable ahead of time:
And you can create versions or variants of the transformations -- here I'm taking a dataframe called total_paid_per_customer_per_day and aggregating it.
There are some more details on the Featureform CLI here: https://docs.featureform.com/getting-started/interact-with-the-cli
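To make the variant idea concrete, here is a minimal plain-Python sketch of a registry keyed by (name, variant). It is not the Featureform API: the Registry and Resource classes and the definition strings are hypothetical, made up for illustration. It only shows how one feature name can carry many immutable variants instead of needing one column name per version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A registered source, transformation, or feature definition (hypothetical)."""
    name: str
    variant: str
    definition: str  # free-form here, e.g. a file path or a query string


class Registry:
    """Toy registry keyed by (name, variant); an illustration, not Featureform."""

    def __init__(self):
        self._resources = {}

    def register(self, name, variant, definition):
        key = (name, variant)
        existing = self._resources.get(key)
        if existing is not None and existing.definition != definition:
            # A variant is immutable: a changed definition needs a new variant string.
            raise ValueError(
                f"{name}:{variant} is already registered with a different definition"
            )
        resource = Resource(name, variant, definition)
        self._resources[key] = resource
        return resource

    def get(self, name, variant):
        return self._resources[(name, variant)]

    def variants(self, name):
        # All variant strings registered under one resource name.
        return sorted(v for (n, v) in self._resources if n == name)


registry = Registry()
# The definitions below are placeholders, not real transformations.
registry.register("driver_rating", "v1", "AVG(rating) over 7 days")
registry.register("driver_rating", "v2", "AVG(rating) over 30 days")
print(registry.variants("driver_rating"))  # ['v1', 'v2']
```

With this shape, an experiment picks a feature by name plus variant, e.g. registry.get("driver_rating", "v2"), and the name driver_rating stays stable across dozens of variants.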