I have some data like below:
This is not the actual data, but the actual data is similar. The data comes in a file with 2 spaces between each field. No database is involved in the input or output; I am using a table format just to make it easier to read.
Name Number code
+---------------------+
Albert 122234 xcc
Robert 565435 rtd
Robert 776567 iuy
Robert 452890 yyt
Stuart 776565 ter
In the file, the data looks like this:
Albert 122234 xcc
Robert 565435 rtd
Robert 776567 iuy
Robert 452890 yyt
Stuart 776565 ter
Now, I need to eliminate the duplicates using SYNCSORT. I can do this using XSUM, but I would get the following data:
Name Number code
+---------------------+
Albert 122234 xcc
Robert 565435 rtd
Stuart 776565 ter
But I need:
Name Number code
+---------------------+
Albert 122234 xcc
Robert 452890 yyt
Stuart 776565 ter
The last set of data has the last occurrence of Robert in the output, while the former set has the first occurrence.
So, is there any way to achieve this using XSUM?
It looks like you want to keep the LAST record of a set of records that have the same sort key.
If you have a recent release of SyncSort, use DUPKEYS with LASTDUP, together with EQUALS, as mentioned in other answers.
It has been a while since I've used SyncSort, but if I remember correctly, it is possible to code an exit routine that has access to the sortkeys and can accept or reject records. The exit routine is entered for each record and so it is possible to keep prior sortkeys for comparison.
Also, I like writing exits in assembler (BAL), but this could be done with COBOL code.
So, if SyncSort supports a command that does what you want, then by all means use it! If not, exits are relatively easy to code...
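To illustrate the compare-with-the-previous-key logic such an exit would carry out, here is a minimal sketch in Python rather than BAL or COBOL. It is not SyncSort code and does not use any SyncSort interface. It assumes the records are already sorted on the name field with their original relative order preserved, and that fields are separated by 2 spaces as described in the question. The function name `keep_last_per_key` is made up for this example.

```python
def keep_last_per_key(lines, sep="  "):
    """Keep only the last record of each run of records sharing a key.

    Assumes `lines` is already sorted on the first field and that records
    with equal keys are still in their original input order.
    `sep` is the 2-space field separator described in the question.
    """
    kept = []
    prev_key = None
    for line in lines:
        key = line.split(sep, 1)[0]
        if kept and key == prev_key:
            # Same key as the previous record: the newer record replaces it.
            kept[-1] = line
        else:
            # New key: start a new group.
            kept.append(line)
        prev_key = key
    return kept


# Sample records taken from the question (fields separated by 2 spaces).
records = [
    "Albert  122234  xcc",
    "Robert  565435  rtd",
    "Robert  776567  iuy",
    "Robert  452890  yyt",
    "Stuart  776565  ter",
]

for record in keep_last_per_key(records):
    print(record)
```

With the sample data this keeps the Albert record, the last Robert record (452890 yyt), and the Stuart record. A real exit would see one record at a time instead of a list, so it would have to hold back the previous record and release it only when the key changes or the input ends.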