Calculate Difference between columns where criteria in another column is met

33 views Asked by At

I have a dataset as follows:

import pandas as pd
df = pd.DataFrame(
    {"Time": [1,2,3,1,2,3],
     "Prop1": ["A","B","A","A","B","B"],
     "Prop2": [4, 5, 1, 5, 4, 3]})
Time Prop1 Prop2
1 A 4
2 B 5
3 A 1
1 A 5
2 B 4
3 B 3

What I want is the differences...

  1. At each time step
  2. For Prop1 value
  3. between Prop2

From this dataset I'd like to receive something like...

Time Prop1 Prop2 Prop2Diff
1 A 4 NaN
2 B 5 NaN
3 A 1 NaN
1 A 5 (5-4)= 1
2 B 4 (4-5)= -1
3 B 3 NaN

I know there is .diff() function, but I am uncertain how to apply the type of conditions I'm wanting.

This is in an effort to remove deep loops from my scripts, I have already done:

time = df["Time"].unique()
for cTime in time:
    for prop1 in df[df["Time"] == cTime,"Prop1"].unique():
        df.loc[(df["Time"]==cTime) & (df["Prop1"] == prop1] = cDF["Prop2"][1] - cDF["Prop2"][0]

And other variations of the loops.

This is operating on very large datasets, so every loop I can remove tremendously helps.

1

There are 1 answers

0
Panda Kim On

Code

df['Prop2Diff'] = df.groupby(['Time', 'Prop1'])['Prop2'].diff()

df

   Time Prop1  Prop2  Prop2Diff
0     1     A      4        NaN
1     2     B      5        NaN
2     3     A      1        NaN
3     1     A      5        1.0
4     2     B      4       -1.0
5     3     B      3        NaN