Feature extraction for real-time prediction in SageMaker

Question

171 views Asked by CEFA RAD At 06 May 2021 at 16:37

I want to deploy a real-time machine learning model for fraud detection using SageMaker.

I used a SageMaker Jupyter notebook instance to:

- load my training data (transactions) from S3
- preprocess the data and engineer features (I use category_encoders to encode the categorical values)
- train the model and configure the endpoint

For the inference step, I use a Lambda function that invokes my endpoint to get a prediction for each real-time transaction.

Should I calculate all the features again for these real-time transactions in the Lambda function?

For the features where I use category_encoders with the fit_transform() function to convert categorical features to numerical ones, what should I do, given that the result will not be the same as on the training set?

Is there another method that avoids redoing the feature calculation in the inference step?

Original Q&A

There is 1 answer

**codez0mb1e** · Answer 1 · 2021-09-07T15:45:03+00:00

"Should I calculate all the features again for this real-time transaction in the Lambda function?"

Yes. When you run inference with a trained model (that is, predict on real-time data), you should pass exactly the same list of features that you used to train the model. If you calculated some features during training (e.g. part of the day from a timestamp), you must also calculate them at inference time.
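
One way to keep training and inference features identical is to put the feature logic in a single function and call it from both the notebook and the Lambda function. The sketch below is only an illustration: the field names `timestamp` and `amount`, the helper names, and the part-of-day boundaries are assumptions, not something taken from the question.

```python
from datetime import datetime


def part_of_day(timestamp: datetime) -> str:
    """Map a timestamp to a coarse bucket (boundaries are illustrative)."""
    hour = timestamp.hour
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"


def build_features(transaction: dict) -> dict:
    """Derive features from one raw transaction.

    Hypothetical input fields: an ISO 8601 "timestamp" and an "amount".
    Call this same function during training and at inference time.
    """
    ts = datetime.fromisoformat(transaction["timestamp"])
    return {
        "amount": float(transaction["amount"]),
        "part_of_day": part_of_day(ts),
    }


features = build_features({"timestamp": "2021-05-06T16:37:00", "amount": "42.5"})
```

Packaging such a module with both the training code and the Lambda deployment means a change to a feature definition cannot silently apply to only one side.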

"For the features where I use category_encoders with the fit_transform() function to convert categorical features to numerical ones, what should I do, given that the result will not be the same as on the training set?"

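You should store every transformation that you fitted while training the model: numeric scalers, categorical encoders, etc. At inference time, load the fitted objects and call only transform(), not fit_transform(), so that new data is encoded exactly as the training data was.

In Python it looks like this: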

import joblib  # for dumping and loading fitted transformers
import category_encoders as ce

# 1. While training the model
# fit the encoder on historical data
encoder = ce.OneHotEncoder(cols=[...])
encoder.fit(X, y)
# and dump it to disk
joblib.dump(encoder, 'filename.joblib')

# 2. While running inference with the trained model
# load the fitted encoder
encoder = joblib.load('filename.joblib')
# and apply the same transformation to new data
X_new_encoded = encoder.transform(X_new)
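
To see why refitting at inference time gives different results, here is a minimal standard-library sketch. It uses a hand-rolled integer mapping as a stand-in for a fitted encoder; it is not the category_encoders implementation, and the names `fit_mapping` and `transform` and the sample country codes are made up for illustration.

```python
import json


def fit_mapping(values):
    """Assign an integer code to each category in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def transform(values, mapping, unknown=-1):
    """Encode values with an already fitted mapping; unseen categories get `unknown`."""
    return [mapping.get(value, unknown) for value in values]


# Training time: fit once and persist the fitted state.
train_countries = ["FR", "US", "FR", "DE"]
mapping = fit_mapping(train_countries)
stored = json.dumps(mapping)

# Inference time: load the stored state and only transform.
new_countries = ["US", "BR"]
consistent = transform(new_countries, json.loads(stored))

# Refitting on the new data instead assigns codes the model never saw in training.
refit = transform(new_countries, fit_mapping(new_countries))

print(consistent)  # [1, -1]
print(refit)       # [0, 1]
```

With the stored mapping, "US" keeps the code it had during training and the unseen "BR" is flagged explicitly. With a refit, "US" becomes 0, which the model would interpret as "FR".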
