I am working with JCL, and there is a program called ICEMAN which is invoked when the IBM sort utility DFSORT is used. DFSORT can be used to SORT, COPY or MERGE files, amongst other things. In the example below the output is from a SORT. My question is how many sort work files (//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,30)) are needed. The number seems to vary whenever I see them in JCL. Is there a formula for working out how many SORTWKnn datasets are needed, and how large they should be?
JCL Code:
//STEP5 EXEC PGM=ICEMAN,COND=(4,LT)
//SYSOUT DD SYSOUT=1
//SYSIN DD DSN=CDP.PARMLIB(cardnumberhere),DISP=SHR
//SORTIN DD DSN=filename,DISP=SHR
//SORTOUT DD DSN=filename,DISP=(OLD,KEEP),
// DCB=(LRECL=5000,RECFM=FB),
// SPACE=(CYL,30)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,30)
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,30)
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,30)
//SORTWK04 DD UNIT=SYSDA,SPACE=(CYL,30)
PGM=ICEMAN and PGM=SORT will give you the same result. They are ALIASes of each other, and the same program is executed whichever PGM= is specified.
As cschneid has indicated well, SORTWKnn are "sort work datasets", and the tendency to copy JCL without reference to existing "standard" datasets leads to a lot of overallocation of work dataset space.
Workspace for SORT can be specified in two ways, either manually (putting in the SORTWKnn files, and the maximum number is far in excess of 15) or dynamically using DYNALLOC.
DYNALLOC is the recommended approach, as workspace will be allocated based on what SORT understands to be needed. Look up the associated installation options, and the overrides available on the OPTION statement, as well.
Typically, there will be default DYNALLOC values which will deal with the majority of SORT steps, and then specific OPTION parameters will be provided for exceptionally large SORTs.
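As a minimal sketch of what dynamic allocation might look like for the step in the question, the SORTWKnn DD statements are dropped and DYNALLOC is requested on the OPTION statement. The control cards are shown inline for illustration (the question keeps them in a PARMLIB member), and the sort key and the maximum of 4 work datasets are assumed values, not taken from the original job.

```jcl
//* SORTWK01-SORTWK04 removed; the other DD statements are unchanged
//* Sort key (1,10,CH,A) and the value 4 are assumptions for illustration
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
  OPTION DYNALLOC=(SYSDA,4)
/*
```

If the installation defaults already enable dynamic allocation, the OPTION card may not be needed at all; removing the SORTWKnn statements would then be enough.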
Manual definition of SORTWKnn datasets in a jobstep will "turn off" any dynamic allocation for that step.
Specific definition of SORTWKnn datasets is sometimes convenient, but not often. The total work space needed is probably closer to 1.2 times the size of the input file these days. You can check the SYSOUT from a typical run of a particular jobstep and see how much space was actually used, then adjust the primary SORTWKnn space or the number of datasets to a better fit if there is over- or under-allocation.
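To make the rule of thumb concrete, here is a rough sketch under an assumed input size. The 50-cylinder figure is hypothetical (the question does not say how large the input is), and splitting the total evenly across three datasets is just one reasonable choice.

```jcl
//* Assumed input size: about 50 cylinders (hypothetical figure)
//* Work space target:  1.2 x 50 = 60 cylinders, split three ways
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,20)
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,20)
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,20)
```

The SYSOUT from a real run remains the better guide; these numbers are only a starting point.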
It is often a good idea to specify additional information (average record-length, estimated number of records) when DYNALLOC is used for a SORT invoked by a programming language. This is because SORT may not be able to "see" the input dataset, so does not have much information for estimating the workspace required.
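One way this might be done is through a DFSPARM DD statement in the step that runs the application program. In the sketch below the program name MYPROG and all of the numeric values are hypothetical; FILSZ=En gives an estimated record count, and AVGRLEN gives an average record length, which only applies to variable-length records.

```jcl
//* MYPROG is a hypothetical program that invokes SORT internally
//* Record count and average length below are illustrative estimates
//STEP6    EXEC PGM=MYPROG
//SYSOUT   DD SYSOUT=*
//DFSPARM  DD *
  OPTION DYNALLOC=(SYSDA,4),FILSZ=E500000,AVGRLEN=200
/*
```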
Separately, it is best to leave all DCB information off output files. SORT will provide the correct DCB information from the input dataset, taking into account any manipulation of the data by the SORT control cards. If you leave DCB information (the LRECL and RECFM) in the JCL, you have two places to change it whenever the file changes, rather than one.
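For illustration, an output DD statement without DCB information could look like the following. It assumes a newly created dataset, which differs from the DISP=(OLD,KEEP) in the question, and the space values are placeholders.

```jcl
//* LRECL and RECFM omitted: SORT supplies them from SORTIN,
//* adjusted for any reformatting in the control cards
//* DISP and SPACE values are assumptions for a new output dataset
//SORTOUT  DD DSN=filename,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(30,10),RLSE)
```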
In your actual example, over 100 cylinders of space are allocated unnecessarily whilst the step is running. This type of thing, when applied to many JOBs, can lead to failures in other JOBs, and even to buying or being charged for additional DASD (disk space) which is not needed.