Forward fill cols where equal to value - pandas

161 views Asked by At

I'm trying to forward fill specific columns in a df where a equal to a specific value. Using the df below, I want to fill 'Code','Val1','Val2','Val3' where code is equal to item.

The following works fine on this dummy data but when I apply to my actual data it's returning an error:

ValueError: Location based indexing can only have [labels (MUST BE IN THE INDEX), slices of labels (BOTH endpoints included! Can be slices of integers if the index is integers), listlike of labels, boolean] types

The function only works on my dataset when I drop null values prior to executing the update function. However, this is pointless as the df won't be filled.

import pandas as pd
import numpy as np

df = pd.DataFrame({   
        'X' : ['X',np.nan,np.nan,'Y',np.nan,'Z',np.nan,np.nan,np.nan], 
        'Val1' : ['B',np.nan,np.nan,'A',np.nan,'C',np.nan,np.nan,np.nan],            
        'Val2' : ['B',np.nan,np.nan,'A',np.nan,'C',np.nan,np.nan,np.nan],   
        'Val3' : ['A',np.nan,np.nan,'C',np.nan,'C',np.nan,np.nan,np.nan],                 
        'Code' : ['No',np.nan,np.nan,'item',np.nan,'Held',np.nan,np.nan,np.nan],                  
        })

# This function works for this dummy df
df.update(df.loc[df['Code'].str.contains('item').ffill(), ['Code','Val1','Val2','Val3']].ffill())

Intended output:

   Col FULLNAME PERSON_ID STATISTIC_CODE  Helper
0    X        B         B              A      No
1  NaN      NaN       NaN            NaN     NaN
2  NaN      NaN       NaN            NaN     NaN
3    Y        A         A              C  Assign
4  NaN        A         A              C     NaN
5    Z        C         C              C    Held
6  NaN      NaN       NaN            NaN     NaN
7  NaN      NaN       NaN            NaN     NaN
8  NaN      NaN       NaN            NaN     NaN
1

There are 1 answers

3
caiolopes On BEST ANSWER

I think this can do what you want... It is not very elegant, but, you get the idea:

cols = ['Val1', 'Val2', 'Val3', 'Code']
len_df = len(df)

indexes = [i for i, x in enumerate(df['Code'].str.contains('item')) if x is True]

for i in indexes:
    item_row = df.loc[i, cols]

    j = i+1
    current_code = df.loc[j, 'Code']

    while current_code is np.nan:
        df.loc[j, cols] = item_row
        j += 1
        if j < len_df:
            current_code = df.loc[j, 'Code']
        else:
            break

Example (I modified a little bit your example):

Input:

    X       Val1    Val2    Val3    Code
0   X       B       B       A       No
1   NaN     NaN     NaN     NaN     NaN
2   NaN     NaN     NaN     NaN     NaN
3   Y       A       A       C       item
4   NaN     NaN     NaN     NaN     NaN
5   NaN     NaN     NaN     NaN     NaN
6   Z       C       C       C       item
7   NaN     NaN     NaN     NaN     NaN
8   K       T       P       X       Held
9   NaN     NaN     NaN     NaN     NaN

Result:

    X    Val1   Val2    Val3    Code
0   X    B      B       A       No
1   NaN  NaN    NaN     NaN     NaN
2   NaN  NaN    NaN     NaN     NaN
3   Y    A      A       C       item
4   NaN  A      A       C       item
5   NaN  A      A       C       item
6   Z    C      C       C       item
7   NaN  C      C       C       item
8   K    T      P       X       Held
9   NaN  NaN    NaN     NaN     NaN