Trying to write the code for the iforest algorithm in python

Asked by Theresa3443 on 29 December 2024 at 02:59 (88 views)

I am trying to find outliers in several datasets from the UCI repository (thyroid, diabetes, and lymphography). I am currently working on the code for the iForest (Isolation Forest) algorithm and I cannot get it to work. What am I doing wrong, and what can I do to fix it?

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import LabelEncoder
data = pd.read_csv(r'C:\Users\There\Desktop\thyroid-disease.csv')

label_encoder = LabelEncoder()
data['sex'] = label_encoder.fit_transform(data['sex'])
data['on_thyroxine'] = label_encoder.fit_transform(data['on_thyroxine'])
data['query_on_thyroxine'] = label_encoder.fit_transform(data['query_on_thyroxine'])
data.dropna(inplace=True)
data = data.astype(float)
selected_features = ['age', 'TSH']
X = data[selected_features]
clf = IsolationForest(contamination=0.1, random_state=42)
outliers = clf.fit_predict(X)

plt.scatter(X.iloc[:, 0], X.iloc[:, 1], color='k', s=3., label='Data points')
plt.scatter(X.iloc[outliers == -1, 0], X.iloc[outliers == -1, 1], color='r', s=30., label='Outliers')
plt.legend(loc='best')
plt.title('Isolation Forest Outlier Detection')
plt.xlabel('Age')
plt.ylabel('TSH')
plt.show()
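
The post does not include the error message, so the actual cause is unknown. One plausible culprit (an assumption, not confirmed by the post) is `data.astype(float)` being applied to the whole frame: it raises if any remaining column still holds strings, even though only `age` and `TSH` are used by the model. The sketch below shows one possible way to restructure the preparation step so that only the selected columns are converted. The helper names `prepare_features` and `flag_outliers`, the `'?'` missing-value marker, and the synthetic `demo` frame are all hypothetical and stand in for the real CSV.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest


def prepare_features(data, features):
    """Return a clean numeric copy of the selected columns (hypothetical helper)."""
    # Select only the columns the model needs, so unrelated string
    # columns cannot break the numeric conversion.
    X = data[features].copy()
    # Assumption: missing values may be stored as strings such as '?'.
    # errors="coerce" turns anything non-numeric into NaN.
    X = X.apply(pd.to_numeric, errors="coerce")
    # Drop incomplete rows before fitting.
    return X.dropna()


def flag_outliers(X, contamination=0.1, random_state=42):
    """Fit an Isolation Forest and return a boolean mask of flagged rows."""
    clf = IsolationForest(contamination=contamination, random_state=random_state)
    labels = clf.fit_predict(X)  # -1 marks outliers, 1 marks inliers
    return labels == -1


# Synthetic stand-in for the CSV, only to make the sketch runnable.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "age": rng.integers(18, 90, size=200).astype(object),
    "TSH": rng.normal(2.0, 1.0, size=200).astype(object),
    "sex": rng.choice(["F", "M"], size=200),
})
demo.loc[0, "TSH"] = "?"    # simulated missing value
demo.loc[1, "TSH"] = 150.0  # simulated extreme value

X = prepare_features(demo, ["age", "TSH"])
is_outlier = flag_outliers(X)
print(X[is_outlier].head())
```

With the real file, `demo` would be replaced by the frame returned from `pd.read_csv`, and the boolean mask `is_outlier` can be used in the scatter plot in place of `outliers == -1`.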

There are 0 answers
