How to scan characters in strings to match to another string in different column

192 views Asked by At

I have 2 columns of strings and I'd like to match the strings based on the first 3 characters in each string. Basically code that goes over every character of column 1 row 1 and compares it with rows in column 2 to find the best match.

IE: Row 1 Column 1 scans "p""a""s" and looks in Col2 for strings starting with "p""a""s" and so on for Row 2 Column 1.

I'm fairly new to python; my apologies.

Original Table (unsorted):

+-------------+---------+----------+
|  Row Index  |  Col1   |  Col2    |
+-------------+---------+----------+
| 1           | pasta   | sauce    |
| 2           | sauce   | orange   |
| 3           | orange  | pasta    |
+-------------+---------+----------+

Expected Table (after matching)

+-------------+---------+----------+
|  Row Index  |  Col1   |  Col2    |
+-------------+---------+----------+
| 1           | pasta   | pasta    |
| 2           | sauce   | sauce    |
| 3           | orange  | orange   |
+-------------+---------+----------+

I don't have any code to show as I'm not sure how to start this. Thanks.

1

There are 1 answers

2
Darina On

Probably not the fastest and cleanest solution, but will return what you're asking for:

df['Col3'] = df.Col1.apply(lambda x: [i for i in df.Col2 if i.startswith(x[:3])][0])