This is somewhat similar to this question, but a little more complex.
Assume I have this table:
import pandas as pd

data = {'column1': ['A1', 'A1', 'A1', 'A1'],
        'column2': ['A9', 'A1', 'A8', 'A1'],
        'column3': ['D1', 'D1', 'D1', 'D1'],
        'column4': ['A6', 'A2', 'A3', 'A4'],
        'column5': ['H1', 'H1', 'H1', 'H1'],
        'column6': ['A4', '', '', 'A3'],
        'column7': ['A5', '', '', 'A9']}
df = pd.DataFrame(data)
+---------+---------+---------+---------+---------+---------+---------+
| column1 | column2 | column3 | column4 | column5 | column6 | column7 |
+---------+---------+---------+---------+---------+---------+---------+
| A1      | A9      | D1      | A6      | H1      | A4      | A5      |
| A1      | A1      | D1      | A2      | H1      |         |         |
| A1      | A8      | D1      | A3      | H1      |         |         |
| A1      | A1      | D1      | A4      | H1      | A3      | A9      |
+---------+---------+---------+---------+---------+---------+---------+
My goal is to reset the numeric part of every value containing "A", per row, starting with A1. If "A1" re-occurs in the same row, leave it as is and move on to the next cell. Values that do not contain "A", and blanks, should be ignored. Expected output:
+---------+---------+---------+---------+---------+---------+---------+
| column1 | column2 | column3 | column4 | column5 | column6 | column7 |
+---------+---------+---------+---------+---------+---------+---------+
| A1      | A2      | D1      | A3      | H1      | A4      | A5      |
| A1      | A1      | D1      | A2      | H1      |         |         |
| A1      | A2      | D1      | A3      | H1      |         |         |
| A1      | A1      | D1      | A2      | H1      | A3      | A4      |
+---------+---------+---------+---------+---------+---------+---------+
If I remember correctly, you had a similar question. You can use similar cumsum logic with a mask.
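The answer's original code is not shown here, so the following is only a sketch of one way the cumsum-with-a-mask idea could look, not necessarily the answerer's exact code. It assumes that every cell equal to "A1" is left untouched and that every other cell starting with "A" is renumbered from A2 upward within its row; the names is_a, to_renumber, new_labels and out are illustrative.

```python
import pandas as pd

data = {'column1': ['A1', 'A1', 'A1', 'A1'],
        'column2': ['A9', 'A1', 'A8', 'A1'],
        'column3': ['D1', 'D1', 'D1', 'D1'],
        'column4': ['A6', 'A2', 'A3', 'A4'],
        'column5': ['H1', 'H1', 'H1', 'H1'],
        'column6': ['A4', '', '', 'A3'],
        'column7': ['A5', '', '', 'A9']}
df = pd.DataFrame(data)

# True for cells that start with "A"; blanks, D1 and H1 are False.
is_a = df.apply(lambda col: col.str.startswith('A'))

# Of those, only cells that are not already "A1" get a new number
# (assumption: every "A1" is kept as is).
to_renumber = is_a & df.ne('A1')

# Row-wise running count of the cells to renumber, shifted by 1 so the
# first renumbered cell in a row becomes A2.
new_labels = 'A' + to_renumber.astype(int).cumsum(axis=1).add(1).astype(str)

# Replace only the masked cells; everything else keeps its original value.
out = df.mask(to_renumber, new_labels)
print(out)
```

For the sample data, out matches the expected table in the question. Note that under this assumption a row without any "A1" would start its numbering at A2, so that case would need separate handling.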