Looking for a better way to discretize float values in a pandas column


Problem description: I have a pandas DataFrame with a "statistical" column that contains probabilities expressed as values ranging between 0 and 1. I would like to translate these values into discrete outputs as follows: {0.0 < x < 0.3 => 0, 0.7 < x < 1.0 => 1, 0.3 < x < 0.7 => 9}.

This is the piece of code I have so far. It does the job, but I believe it could be written in a better way.

df_predictions["discrete"] = df_predictions["statistical"]
df_predictions.loc[df_predictions["statistical"] < 0.3, "discrete"] = 0
df_predictions.loc[df_predictions["statistical"] > 0.7, "discrete"] = 1
df_predictions.loc[df_predictions["statistical"].between(0.3, 0.7), "discrete"] = 9

Intuitively, it seems like the pandas Series.map method could be appropriate for this task, but I'm having trouble getting it to work. Maybe it's not suitable.

df_predictions["discrete"] = df_predictions["discrete"].map([{x < 0.3:0, x > 0.7:1, 0.3 < x < 0.7:9} for x in df_predictions["discrete"]])
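For comparison, Series.map accepts a function, a dict, or a Series, not a list of dicts, which is probably why the attempt above fails. A minimal sketch of a function-based variant follows. The helper name `discretize` and the sample values are made up for illustration, and the exact boundaries 0.3 and 0.7 are assumed to belong to the middle band, since the problem description leaves them open:

```python
import pandas as pd


def discretize(x):
    """Map a probability to 0 (low), 1 (high) or 9 (middle band)."""
    if x < 0.3:
        return 0
    if x > 0.7:
        return 1
    return 9  # everything from 0.3 to 0.7, boundaries included (assumption)


# Hypothetical sample data; only the column names come from the question.
df_predictions = pd.DataFrame({"statistical": [0.1, 0.3, 0.5, 0.7, 0.9]})

# map() calls discretize once per element of the column.
df_predictions["discrete"] = df_predictions["statistical"].map(discretize)
```

This is readable but runs a Python function per row, so it is likely slower than a vectorized approach on large frames.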

I also tried the numpy.select method, but without much success.

df_predictions["discrete"] = np.select([df_predictions["discrete"] < 0.3, df_predictions["discrete"] > 0.7, 0.3 < df_predictions["discrete"] < 0.7], [0, 1, 9])
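One likely cause of the failure is the chained comparison `0.3 < series < 0.7`: Python evaluates it as `(0.3 < series) and (series < 0.7)`, and pandas refuses to reduce a Series to a single truth value. A minimal sketch of a variant that avoids the chained comparison is shown below. It assumes the conditions should be built from the "statistical" column and that anything not below 0.3 or above 0.7 (including the exact boundaries) should become 9 through the `default` argument; the sample values are made up:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_predictions = pd.DataFrame(
    {"statistical": [0.05, 0.25, 0.5, 0.69, 0.75, 0.99]}
)

s = df_predictions["statistical"]

# Conditions are checked in order; rows matching none of them get `default`.
df_predictions["discrete"] = np.select([s < 0.3, s > 0.7], [0, 1], default=9)
```

If the middle band has to be written out explicitly, `(s > 0.3) & (s < 0.7)` with parentheses around each comparison is the element-wise equivalent of the chained form.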

I would appreciate any assistance with this. I'm looking for a simple solution, preferably a one-liner.
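One candidate for such a one-liner is pandas.cut, sketched here under a few assumptions: the values really do lie in [0, 1] with no missing entries, and right-closed bins are acceptable, so an exact 0.3 maps to 0 and an exact 0.7 maps to 9. The sample values are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_predictions = pd.DataFrame({"statistical": [0.0, 0.1, 0.5, 0.8, 1.0]})

# Bins are right-closed by default: [0, 0.3], (0.3, 0.7], (0.7, 1.0].
# include_lowest=True keeps an exact 0.0 inside the first bin.
# The result is categorical, so it is cast back to plain integers.
df_predictions["discrete"] = pd.cut(
    df_predictions["statistical"],
    bins=[0.0, 0.3, 0.7, 1.0],
    labels=[0, 9, 1],
    include_lowest=True,
).astype(int)
```

Values outside the bin edges would come back as NaN and make the integer cast fail, so this only fits data that is guaranteed to stay within [0, 1].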
