How to remove values from columns on a condition, or check first and then add data to a DataFrame?


I am trying to remove the values from column2 and column3 on the condition that column1 has a value (this has to be checked for each row). I've tried many things, but every attempt removed all data from column2 and column3. Could you help me solve this problem?

There is also column0 where I have all data that is needed for my function.

Now I think it would be better and faster to check at the beginning whether there are None values in column1, and only then add data to column2 and column3.

Something like:

  1. check all rows in column1 whether they have None values

  2. if yes, then

    data[column0].apply(lambda i: do_something(i))
    
  3. if row has data do nothing

But I don't know how to do that. So I was trying to remove values at the end, but as I said - it removes everything from column 2 and 3.

So for example I've tried something like this:

    if data['column1'].empty:
        data['column2'] = data['column0'].apply(lambda i: do_something(i))
        data['column3'] = data['column0'].apply(lambda i: do_something(i))

my dataframe looks like:

    column1  column2  column3
    ABCDEFG  ABCDEFG  EFG
    ABCDEFG  ABCDEFG  EFG
    ABC
    ABCDEFG  ABCDEFG  EFG
             ABCDEFG  EFG

and I wanna get:

    column1  column2  column3
    ABCDEFG
    ABCDEFG
    ABC
    ABCDEFG
             ABCDEFG  EFG

There is 1 answer

BERA On

You can use .loc:

    import pandas as pd

    data = [["ABCDEFG", "ABCDEFG", "EFG"],
            ["ABCDEFG", "ABCDEFG", "EFG"],
            ["ABC",     None,      None],
            ["ABCDEFG", "ABCDEFG", "EFG"],
            [None,      "ABCDEFG", "EFG"]]

    df = pd.DataFrame.from_records(data=data, columns=["column1", "column2", "column3"])
    # column1  column2 column3
    # ABCDEFG  ABCDEFG     EFG
    # ABCDEFG  ABCDEFG     EFG
    #     ABC     None    None
    # ABCDEFG  ABCDEFG     EFG
    #    None  ABCDEFG     EFG

    # Where column1 is not None, set column2 and column3 to None
    df.loc[df["column1"].notna(), "column2"] = None
    df.loc[df["column1"].notna(), "column3"] = None

    # column1  column2 column3
    # ABCDEFG     None    None
    # ABCDEFG     None    None
    #     ABC     None    None
    # ABCDEFG     None    None
    #    None  ABCDEFG     EFG
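
The other approach described in the question, checking column1 first and computing column2 and column3 only for the rows where it is missing, might look roughly like the sketch below. The body of `do_something` and the contents of `column0` are not given in the question, so both are placeholders invented for illustration.

```python
import pandas as pd


def do_something(value):
    # Hypothetical stand-in: the real function is not shown in the question.
    return value.lower()


# Assumed sample data: column0 holds the input for do_something,
# column1 is missing (None) only in the last row.
data = pd.DataFrame({
    "column0": ["ABCDEFG", "ABCDEFG", "ABC", "ABCDEFG", "ABCDEFG"],
    "column1": ["ABCDEFG", "ABCDEFG", "ABC", "ABCDEFG", None],
})

# Start with empty target columns.
data["column2"] = None
data["column3"] = None

# Row-wise boolean mask: True where column1 has no value.
missing = data["column1"].isna()

# Call do_something only for the masked rows; all other rows stay None.
data.loc[missing, "column2"] = data.loc[missing, "column0"].apply(do_something)
data.loc[missing, "column3"] = data.loc[missing, "column0"].apply(do_something)
```

A boolean mask is used here because `Series.empty` is `True` only when the Series has no elements at all, so `if data['column1'].empty:` tests the column as a whole rather than each row.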