Data Frame 1
Person Work_order email
P1 111 [email protected]
P2 222 [email protected]
P3 111 [email protected]
P4 -1 [email protected]
P5 444 [email protected]
val person = Seq(
("P1", "111", "[email protected]"),
("P2", "222", "[email protected]"),
("P3", "111", "[email protected]"),
("P4", "-1", "[email protected]"),
("P5", "444", "[email protected]")).toDF("person", "work_order_person", "email_person")
Data Frame 2
Work_order email
111 [email protected]
222 [email protected]
444 [email protected]
val workOrder = Seq(
("111", "[email protected]"),
("222", "[email protected]"),
("444", "[email protected]")).toDF("work_order", "email")
With the above two dataset[Work order and Person] I have to build this below report. Any idea how this will be done.
Report
Work_order email Total_count_aggregation_on_the_email Count_excluding_the_self_work_order_id
111 [email protected] 4 2
222 [email protected] 4 3
444 [email protected] 1 0
Below one is not part of the report. it is just the explanation of the report counts
Total_count_aggregation_on_the_email
4 = [[email protected],[email protected],[email protected],[email protected]]
4 = [[email protected],[email protected],[email protected],[email protected]]
1 = [[email protected]]
Count_excluding_the_self_work_order_id
2 = [222 [email protected] ,-1 [email protected]]
3 = [111 [email protected], 111 [email protected], -1 [email protected] ]
0 = []
Is this possible with the dataframes to create such reports. Appreciate your help.