I encountered the following problem: I have a pandas dataframe that looks like this.
| id_tranc | sum | bid |
|---|---|---|
| 1 | 4000 | 2.3% |
| 1 | 20000 | 3.5% |
| 2 | 100000 | if >=100 000 - 1.6%, if < 100 000 - 100$ |
| 3 | 30000 | if >=100 000 - 1.6%, if < 100 000 - 100$ |
| 1 | 60000 | 500$ |
Code to create the dataset:
import pandas as pd

dataframe = pd.DataFrame({
    'id_tranc': [1, 1, 2, 3, 1],
    'sum': [4000, 20000, 100000, 30000, 60000],
    'bid': ['2.3%', '3.5%', 'if >=100 000 - 1.6%, if < 100 000 - 100$',
            'if >=100 000 - 1.6%, if < 100 000 - 100$', '500$']})
I need to calculate a 'commission' column that depends on the 'sum' and 'bid' columns. The final dataframe should look like this:
| id_tranc | sum | bid | commission |
|---|---|---|---|
| 1 | 4000 | 2.3% | 92 |
| 1 | 20000 | 3.5% | 700 |
| 2 | 100000 | if >=100 000 - 1.6%, if < 100 000 - 100$ | 1600 |
| 3 | 30000 | if >=100 000 - 1.6%, if < 100 000 - 100$ | 100 |
| 1 | 60000 | 500$ | 500 |
If I compute it with df['commission'] = df['sum'] * df['bid'], I only get a result for the first two records. How can I do this correctly?
I would write a small parser based on a regex and `operator`:
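A minimal sketch of such a parser, assuming the bid strings only take the three shapes shown in the question (a percentage, a flat dollar amount, or a comma-separated list of `if <comparison> <threshold> - <rate>` rules). The names `OPS`, `RULE`, and `commission` are hypothetical and chosen for illustration; the pattern is one possible regex, not necessarily the one from the original answer.

```python
import operator
import re

# Comparison symbols that may appear in a conditional bid, mapped to functions.
OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}

# One rule: an optional "if <op> <threshold> - " prefix, then a number and a unit.
RULE = re.compile(
    r"(?:if\s*(?P<op>>=|<=|>|<)\s*(?P<limit>\d[\d ]*?)\s*-\s*)?"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[%$])"
)


def commission(total, bid):
    """Return the commission for `total` under the first rule in `bid` that applies."""
    for rule in RULE.finditer(bid):
        op, limit = rule["op"], rule["limit"]
        # Thresholds are written with spaces ("100 000"), so strip them first.
        if op is not None and not OPS[op](total, int(limit.replace(" ", ""))):
            continue  # condition not met, try the next rule
        value = float(rule["value"])
        # "%" means a share of the sum, "$" means a flat fee.
        amount = total * value / 100 if rule["unit"] == "%" else value
        return round(amount, 2)
    return None  # no rule matched (assumption: such rows are left empty)


sums = [4000, 20000, 100000, 30000, 60000]
bids = ['2.3%', '3.5%', 'if >=100 000 - 1.6%, if < 100 000 - 100$',
        'if >=100 000 - 1.6%, if < 100 000 - 100$', '500$']
result = [commission(s, b) for s, b in zip(sums, bids)]
print(result)

# With the question's dataframe, the same function could be applied row by row:
# dataframe['commission'] = [commission(s, b)
#                            for s, b in zip(dataframe['sum'], dataframe['bid'])]
```

On the sample data this reproduces the commission values from the expected table in the question. Rows whose bid matches none of the assumed shapes get `None`, which makes unparsed formats easy to spot.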