Python Dataframe: For Each Row Pick One Value from Five Different Columns And Place In New Column Based on if Value is in List

Question

Python Dataframe: For Each Row Pick One Value from Five Different Columns And Place In New Column Based on if Value is in List

52 views Asked by PineNuts0 At 17 November 2023 at 01:09

Let's say I'm starting with the following input table below:

date	id	val1	val2	val3	val4	val5
2023-08-29	B3241	496C
2023-09-08	A3290	349C	078F	274F
2023-09-12	D2903	349C	072F	307C	170F	201D
2023-09-14	I13490	497C	0349	303F	101A

The code to create the initial input table is below:

import pandas as pd
df = pd.DataFrame({'date':["2023-08-29","2023-09-08","2023-09-12", "2023-09-14"],'id':["B3241","A3290","D2903", "I13490"],'val1':["496C","349C","349C", "497C"], 'val2':["","078F","072F", "0349"], 'val3':["","274F","307C", "303F"], 'val4':["","","170F", "101A"], 'val5':["","","201D",""]})

I want to look at columns "val1" through "val5" and see which rows contain a value from my code list. I want to populate the "val_final" column accordingly (if the value is in the list).

code_list = ['349C', '303F', '201D', '497C']

If multiple columns contain values from code_list I want to pick the one that's on the right most column.

Given the above logic, my desired output table would look like this:

date	id	val1	val2	val3	val4	val5	val_final
2023-08-29	B3241	496C
2023-09-08	A3290	349C	078F	274F			349C
2023-09-12	D2903	349C	072F	307C	170F	201D	201D
2023-09-14	I13490	497C	349C	303F	101A		303F

I searched all over StackOverflow to try and solve this problem but to no avail. I assume doing if then else is an option starting with column "val5" and going down to column "val1" but I want to learn a more efficient way to do this.

Original Q&A

There are 1 answers

**Suraj Shourie** · Accepted Answer · 2023-11-17T02:50:08+00:00

IIUC you can use apply for each row with a function that returns the first value that exists in the code_list starting with the right-most columns:

# returns the first match
def lambda_func(row, code_list):
  for x in row:
    if x in code_list:
      return x  
  return None

# define your list of "val" columns
val_cols = [x for x in df.columns if x.startswith('val')][::-1]

df['val_final'] = df.apply(lambda row: lambda_func(row[val_cols], code_list), axis=1)
print(df)

Output:

         date      id  val1  val2  val3  val4  val5 val_final
0  2023-08-29   B3241  496C                              None
1  2023-09-08   A3290  349C  078F  274F                  349C
2  2023-09-12   D2903  349C  072F  307C  170F  201D      201D
3  2023-09-14  I13490  497C  0349  303F  101A            303F

TechQA.

Python Dataframe: For Each Row Pick One Value from Five Different Columns And Place In New Column Based on if Value is in List

There are 1 answers

Related Questions in PYTHON

Related Questions in DATAFRAME

Related Questions in LIST

Related Questions in FILTER

Related Questions in ISIN

Popular Questions

Popular Tags

Trending Questions