Derive a new pandas column based on the length of strings in other columns

I want to count, for each row, how many columns have a value, and store that number in a new column. For example, if I have 6 columns and two of the columns whose names start with "a" have a value in a given row, then the new column for that row should be 2.

import pandas as pd

df = pd.DataFrame({'ID':['1','2','3'],'ID2':['11','12','13'], 'J1': ['a','ab',''],'J2':['22','','33'],'a1': ['a11','','ab1'],'a2':['22','1','33']})
print(df)

The output should be like:

  ID  J1  J2   a1  a2  Count_J_cols_have_values  count_a_cols_have_values
0  1   a  22  a11  22                         2                         2
1  2  ab            1                         1                         1
2  3      33  ab1  33                         1                         2

2 Answers

2
Sandeep Kadapa On

Use DataFrame.filter to select the columns by name, then DataFrame.ne and DataFrame.sum along axis 1:

df['Count_J_cols_have_values'] = df.filter(regex='^J').ne('').sum(axis=1)
df['count_a_cols_have_values'] = df.filter(regex='^a').ne('').sum(axis=1)

print(df)
  ID ID2  J1  J2   a1  a2  Count_J_cols_have_values  count_a_cols_have_values
0  1  11   a  22  a11  22                         2                         2
1  2  12  ab            1                         1                         1
2  3  13      33  ab1  33                         1                         2
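
If the one-liner is hard to follow, it may help to split the chain into its three steps. This is only a sketch on the question's sample frame; the intermediate names (j_cols, j_mask, j_counts) are mine and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'ID2': ['11', '12', '13'],
                   'J1': ['a', 'ab', ''], 'J2': ['22', '', '33'],
                   'a1': ['a11', '', 'ab1'], 'a2': ['22', '1', '33']})

# Step 1: keep only the columns whose name starts with "J".
j_cols = df.filter(regex='^J')

# Step 2: boolean mask, True where the cell is not the empty string.
j_mask = j_cols.ne('')

# Step 3: sum the booleans across each row (True counts as 1).
j_counts = j_mask.sum(axis=1)

print(list(j_cols.columns))  # ['J1', 'J2']
print(j_counts.tolist())     # [2, 1, 1]
```

The same three steps with regex='^a' give the count for the "a" columns.
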
1
U9-Forward On

Or use filter, replace and count:

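import numpy as np

# Turn empty strings into NaN, then count the non-NaN cells in each row.
df['Count_J_cols_have_values'] = df.filter(regex='^J').replace('',np.nan).count(axis=1)
df['count_a_cols_have_values'] = df.filter(regex='^a').replace('',np.nan).count(axis=1)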
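
One thing to keep in mind when choosing between the two answers: they agree on the question's data, but they can differ if a cell holds a real missing value (NaN) rather than an empty string. The small frame below is hypothetical (it is not the asker's data) and is built with dtype=object so the cells are plain Python strings and NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: J2 has a genuinely missing value (NaN) in row 0.
df = pd.DataFrame({'J1': ['a', '', 'x'], 'J2': [np.nan, '', 'y']},
                  dtype=object)

j_cols = df.filter(regex='^J')

# ne('') treats NaN as "not empty", so row 0 counts both cells.
via_ne = j_cols.ne('').sum(axis=1)

# replace + count treats '' and NaN alike, so row 0 counts only 'a'.
via_count = j_cols.replace('', np.nan).count(axis=1)

print(via_ne.tolist())     # [2, 0, 2]
print(via_count.tolist())  # [1, 0, 2]
```

So if NaN should not count as "having a value", the replace/count version is probably the safer pick.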