Working with Json Logic on Pandas Dataframe

How can I apply manually defined logic for feature aggregation to large dataframes, for example by using JSON Logic (I am open to other solutions as well)?

For example if I have this dataframe (in reality it's a large DF):

pie_df

       temp  pie_filling
    0   100  "apple"
    1   400  "apple"
    2    70  "cherry"

and this logic (stored, for example, in a JSON file; in reality the logic file will have multiple aggregations at different nesting levels):

rules = { "and" : [
    {"<" : [ { "var" : "temp" }, 110 ]},
    {"==" : [ { "var" : "pie_filling" }, "apple" ] }
] }

I want the answer to be:

   pie_ready
0  true
1  false
2  false

The logic file should be generic and readable. I can convert the dataframe to json but I am worried this won't be computationally efficient.

I did find this package: https://github.com/nadirizr/json-logic-py but it does not mention applying the logic to dataframes.

This line doesn't work:

jsonLogic(rules, pie_df.to_json())

I get this error:

{TypeError}'dict_keys' object is not subscriptable

There is 1 answer

Laurent On

json-logic-py is no longer maintained; use this fork instead: pip install json-logic-qubit

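Then evaluate the rules row by row, passing each row to jsonLogic as a dict (pie_df.to_json() returns a single JSON string, not a per-row record):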

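import pandas as pd
from json_logic import jsonLogic

rules = {
    "and": [{"<": [{"var": "temp"}, 110]}, {"==": [{"var": "pie_filling"}, "apple"]}]
}

# Evaluate the rules once per row; each record is a plain dict such as
# {"temp": 100, "pie_filling": "apple"}.
print(
    pd.DataFrame(
        {"pie_ready": [jsonLogic(rules, row) for row in pie_df.to_dict(orient="records")]}
    )
)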

# Output
   pie_ready
0       True
1      False
2      False
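
If the row-by-row loop turns out to be too slow on a large dataframe, one possible alternative is to translate the rule tree into vectorized pandas operations instead of calling jsonLogic once per record. The following is only a minimal sketch under stated assumptions: the evaluate function and the two operator tables are hypothetical helpers, not part of json-logic-qubit, and they cover only a handful of two-argument comparisons plus "and"/"or". It also uses pandas' strict, element-wise comparisons, so it does not reproduce JSON Logic's loose "==" coercion or the truthy-value semantics of "and"/"or".

```python
import operator
from functools import reduce

import pandas as pd

# Hypothetical lookup tables: JSON Logic operator name -> vectorized Python operator.
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}
COMBINERS = {"and": operator.and_, "or": operator.or_}


def evaluate(rule, frame):
    """Evaluate a (small subset of a) JSON Logic rule on whole columns of `frame`."""
    if not isinstance(rule, dict):
        return rule  # a literal such as 110 or "apple"
    ((op, args),) = rule.items()  # each rule node has exactly one operator key
    if op == "var":
        # JSON Logic allows both {"var": "temp"} and {"var": ["temp"]}
        name = args[0] if isinstance(args, list) else args
        return frame[name]
    values = [evaluate(arg, frame) for arg in args]
    if op in COMBINERS:
        # Combine boolean Series element-wise with & or |
        return reduce(COMBINERS[op], values)
    if op in COMPARISONS:
        left, right = values  # only the two-argument form is handled here
        return COMPARISONS[op](left, right)
    raise NotImplementedError(f"unsupported operator: {op}")


pie_df = pd.DataFrame(
    {"temp": [100, 400, 70], "pie_filling": ["apple", "apple", "cherry"]}
)
rules = {
    "and": [{"<": [{"var": "temp"}, 110]}, {"==": [{"var": "pie_filling"}, "apple"]}]
}

pie_ready = evaluate(rules, pie_df).rename("pie_ready").to_frame()
print(pie_ready["pie_ready"].tolist())  # [True, False, False]
```

The rules stay in the same generic JSON structure, so the logic file does not need to change; any operator used in the real file but missing from the two tables would have to be added, and anything that cannot be expressed column-wise could still fall back to the per-row jsonLogic call.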