How to use Isolation Forest in Python

Question

785 views Asked by nicc96 At 08 December 2020 at 16:38

I'm working on detecting outliers in my unlabeled dataset (data are not labeled as inliers/outliers) and I'm using Isolation Forest in Python (scikit-learn library).
I want to get the anomaly score of the data in my dataset and so I'm using the following code:

from sklearn.ensemble import IsolationForest

if_model = IsolationForest(max_samples=100)
if_model.fit(dataset)
anomaly_score = if_model.score_samples(dataset)

However I have some questions:

Is the previous procedure correct, or should I split my dataset into two parts, fit on one part and get the anomaly score on the other?
What is the purpose of the predict method? How should I use it?

There is 1 answer

**anonymous** · Accepted Answer · 2020-12-08T19:44:44+00:00

To answer your first question: you do not need to split the dataset. Held-out test sets matter for supervised algorithms, where each row has an expected result and you compare the model's output against it to measure how well the model performs. Those rows must be kept out of fitting, otherwise the model could fit them well while fitting other data poorly, and you would not know. Isolation Forest, however, is an unsupervised algorithm. You have no list of known anomalous rows to compare its results against, so there is nothing to gain from holding back data to verify that the model works.
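As a minimal sketch of fitting and scoring on the same rows, the snippet below reuses the question's calls on a synthetic stand-in for the dataset. The random data, the `random_state` values, and the name `most_anomalous` are assumptions made for illustration; they are not part of the original question.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for the unlabeled dataset (assumption: two numeric features).
rng = np.random.RandomState(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
scattered = rng.uniform(low=-8.0, high=8.0, size=(10, 2))
dataset = np.vstack([inliers, scattered])

# Fit and score on the same rows, with no train/test split.
if_model = IsolationForest(max_samples=100, random_state=0)
if_model.fit(dataset)
anomaly_score = if_model.score_samples(dataset)

# Lower scores mean more anomalous, so an ascending sort
# puts the most suspicious rows first.
most_anomalous = np.argsort(anomaly_score)[:10]
```

Sorting the scores like this gives a ranking of rows to inspect by hand, which is often the practical substitute for a test set when no labels exist.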
To answer the second question: predict returns an array with one label per row, 1 for inliers and -1 for outliers, so it tells you whether each row is considered anomalous. score_samples returns a number representing how anomalous each row is (the lower the score, the more abnormal), but it does not by itself tell you whether the row is anomalous. See the scikit-learn documentation.
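The relationship between the two methods can be sketched as follows. This is an illustrative example on made-up data; the `contamination=0.01` setting, the `random_state` values, and the variable names are assumptions rather than anything from the question. It also shows `decision_function`, which in scikit-learn is `score_samples` shifted by the fitted `offset_`, so that negative values correspond to rows that `predict` labels as -1.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Made-up data: a Gaussian cloud plus two far-away points (assumption).
rng = np.random.RandomState(42)
dataset = np.vstack([rng.normal(size=(300, 2)), [[6.0, 6.0], [-7.0, 5.0]]])

# contamination is the assumed share of outliers; it sets the threshold.
if_model = IsolationForest(max_samples=100, contamination=0.01, random_state=42)
if_model.fit(dataset)

labels = if_model.predict(dataset)             # 1 = inlier, -1 = outlier
scores = if_model.score_samples(dataset)       # lower = more anomalous
shifted = if_model.decision_function(dataset)  # score_samples minus offset_

# predict is a thresholded view of the scores:
# rows with a negative decision_function value get the label -1.
flagged_rows = np.where(labels == -1)[0]
```

In short, use score_samples when you want a ranking and intend to choose a cutoff yourself, and use predict when the threshold implied by `contamination` is acceptable.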
