Function with many return values

I have a sample dataframe that looks like this:

import numpy as np
import pandas as pd

data = {"ID": [1,2,3],
        "A": ["", "", ""],
        "B": [2,3,1],
        "C": [1,2,0],
        "var_i3": [0,0,0],
        "var_i4": [0,0,0],
        "var_i5": [0,0,0],
        "var_i6": [0,0,0]
        }
df = pd.DataFrame(data)
df

I would like to assign specific values to the different "var_i#" columns based on specific conditions.

Example:

If A is null and B is equal to 2 or 3, then "var_i3" should be 0, "var_i4" should be 1, and "var_i5" should be 0.

I have tried the following:

def process(row):
    if pd.isnull(row["A"]) and row["B"] in [2,3]:
        return row["var_i3"] == 0 & row["var_i4"] == 1 & row["var_i5"] == 0
    elif row["C"] == 0:
        return row["var_i6"] == 13
    else:
        if row["C"] >= 1:
            return row["var_i6"] == 0
        
    return row

df = df.apply(process, axis=1)
df

I'm not sure what the syntax is for setting several column values as the output of one condition.

I also tried to use np.where:

def process(row):
    np.where(df["A"].isnull & df["B"] in [2,3], df["var_i3"] == 0 & df["var_i4"] == 1 & df["var_i5"] == 0, 
             np.where(df["C"] == 0, df["var_i6"] == 13, np.where(df["C"] == 1, df["var_i6"] == 0, row)))
df = df.apply(process)
df

Could you provide any feedback on what is wrong in my code?

There is 1 answer

Code-Apprentice

if A is null and B is equal to 2 or 3 then "var_i3" should be 0

You can do this directly with an assignment. It looks something like this:

df.loc[df.A.isnull() & df.B.isin([2, 3]), "var_i3"] = 0

Similarly for any of the other columns. You can even save the condition in a variable to reuse it:

condition = df.A.isnull() & df.B.isin([2, 3])
df.loc[condition, "var_i3"] = 0
df.loc[condition, "var_i4"] = 1
df.loc[condition, "var_i5"] = 0
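
For reference, here is a self-contained sketch of the same idea run against the sample data from the question. It uses `.loc` with a boolean mask, calls `isnull()` as a method, and uses `isin` for the membership test, since Python's `in` does not work element-wise on a Series. Two things here are assumptions rather than part of the original post: the names `a_missing` and `mask` are only illustrative, and the empty strings in column A are treated as "null" (with the sample data, `isnull()` alone would match no rows). The `var_i6` rules follow the `elif`/`else` order of the question, so they only touch rows that did not match the first condition.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3],
    "A": ["", "", ""],
    "B": [2, 3, 1],
    "C": [1, 2, 0],
    "var_i3": [0, 0, 0],
    "var_i4": [0, 0, 0],
    "var_i5": [0, 0, 0],
    "var_i6": [0, 0, 0],
})

# Assumption: an empty string in A counts as missing, as well as NaN/None.
a_missing = df["A"].isnull() | df["A"].eq("")

# Element-wise condition: A missing AND B is 2 or 3.
mask = a_missing & df["B"].isin([2, 3])

# Set several columns at once for the matching rows.
df.loc[mask, ["var_i3", "var_i4", "var_i5"]] = [0, 1, 0]

# Remaining rules apply only to rows that did not match the first condition.
df.loc[~mask & df["C"].eq(0), "var_i6"] = 13
df.loc[~mask & df["C"].ge(1), "var_i6"] = 0

print(df)
```

Because each assignment works on whole columns, no `apply` call or per-row function is needed.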

Warning: This code is untested, but should illustrate the general principle. I recommend that you read more about boolean indexing and "broadcasting" in pandas to understand this better.
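
As for what goes wrong in the row-wise attempt from the question: `==` compares rather than assigns, and `&` binds more tightly than `==`, so each branch returns the result of a comparison instead of a modified row. If a row-wise function is still preferred, a minimal sketch of a corrected `process` could look like the following. It again assumes that an empty string in A should count as null, which is a guess based on the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3],
    "A": ["", "", ""],
    "B": [2, 3, 1],
    "C": [1, 2, 0],
    "var_i3": [0, 0, 0],
    "var_i4": [0, 0, 0],
    "var_i5": [0, 0, 0],
    "var_i6": [0, 0, 0],
})

def process(row):
    # Work on a copy so the row handed in by apply is not mutated.
    row = row.copy()
    # Assumption: an empty string in A counts as missing.
    a_missing = pd.isnull(row["A"]) or row["A"] == ""
    if a_missing and row["B"] in [2, 3]:
        # Use "=" to assign; "==" would only compare.
        row["var_i3"] = 0
        row["var_i4"] = 1
        row["var_i5"] = 0
    elif row["C"] == 0:
        row["var_i6"] = 13
    elif row["C"] >= 1:
        row["var_i6"] = 0
    # Always return the (possibly modified) row.
    return row

out = df.apply(process, axis=1)
print(out)
```

The function returns the whole row in every branch, so `apply(..., axis=1)` rebuilds a DataFrame with the same columns. On large frames this is usually slower than mask-based assignment.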