I have 8736 nc4 files (30-minute rainfall from 1 Jun - 31 Dec 2000) downloaded from https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGHH_06/summary?keywords=IMERG with the following naming convention:
3B-HHR.MS.MRG.3IMERG.20000601-S000000-E002959.0000.V06B.HDF5.nc4
3B-HHR.MS.MRG.3IMERG.20000601-S003000-E005959.0030.V06B.HDF5.nc4
Start Date/Time: All files in GPM will be named using the start date/time of the temporal period of the data contained in the product. The field has two subfields separated by a hyphen.
Start Date: YYYYMMDD
Start Time: Begins with a capital S, followed by HHMMSS
End Time: Begins with a capital E, followed by HHMMSS
Hours are presented in a 24-hour time format, with ‘00’ indicating midnight. All times in GPM will be in Coordinated Universal Time (UTC).
The half-hour sequence starts at 0000, and increments by 30 for each half hour of the day.
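As a small illustration of the convention above, here is a sketch that pulls the start date, start time, end time and half-hour sequence number out of one of the example filenames using only POSIX shell features. The variable names (`stamp`, `seq_min`, and so on) are my own and not part of any GPM tooling.

```shell
fname="3B-HHR.MS.MRG.3IMERG.20000601-S003000-E005959.0030.V06B.HDF5.nc4"

# Dot-separated field 5 holds the date plus start/end times;
# field 6 is the half-hour sequence number (0000, 0030, ...).
stamp=$(printf '%s\n' "$fname" | cut -d. -f5)    # 20000601-S003000-E005959
seq_min=$(printf '%s\n' "$fname" | cut -d. -f6)  # 0030

start_date=${stamp%%-*}        # drop everything from the first hyphen
start_time=${stamp#*-S}        # drop everything up to and including "-S"
start_time=${start_time%%-*}   # drop the trailing "-E..." part
end_time=${stamp##*-E}         # keep only what follows "-E"

echo "$start_date $start_time $end_time $seq_min"
```

Because the start date and time are zero-padded and come first, a plain alphabetical sort of the filenames is also a chronological sort.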
I would like to merge all the files into a single nc4 file. The reason is that I would like to do further processing, i.e. calculate a rolling sum to get 6- or 12-hour rainfall accumulation, and other analysis.
I followed suggestions from other similar topics and tried:
cdo mergetime file*.nc4 output.nc4
and
ncecat file*.nc4 output.nc4
But both failed with the error "argument list too long".
Following the suggestion in the answer below to split the files into separate lists (by month), I used the following script:
for i in $(seq -f "%02g" 1 12); do mkdir -p "Month$i"; mv 3B-HHR.MS.MRG.3IMERG.????$i*.nc4 "Month$i"; done
I also increased the limit; ulimit -s on my Mac now returns 65536.
Then I tried ncecat file*.nc4 output.nc4 again in a folder with 1440 files, and it worked.
But I just realized that the result has an UNLIMITED record dimension and time = 1.
When I open output.nc4 in Panoply, Record = 1440 and Time has only one value: 1 Jun 2000.
This is new to me as a new user. I expected output similar to what I get with daily or monthly data, where the time dimension is the UNLIMITED one.
Any suggestions on how to solve this? Is there a step I am missing?
Sounds like a shell limitation (possibly Windows?) to me.
ncecat keeps at most 3 files open at one time. The NCO Users Guide describes multiple workarounds for handling arbitrarily long lists of input files. At least one of these methods will work for you. HINT: Try the -n option combined with symbolic links as shown in the manual.
Edit in response to comment, 2020-10-22: Here is how the manual demonstrates creating nicely named symbolic links to a million files:
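A minimal sketch of that approach, written as a plain shell loop rather than the manual's own one-liner, so treat it as an approximation and check the NCO Users Guide for the original. `make_links` is a hypothetical helper name, and the `ncecat -n` call in the trailing comment assumes the usual `-n file_count,digit_count,increment` form.

```shell
# Create zero-padded symbolic links (000001.nc, 000002.nc, ...) in the
# current directory, one per file matching the given glob pattern.
# The glob is expanded by the shell itself inside the for loop, so no
# single command ever receives the full file list as arguments.
make_links() {
    pattern=$1
    idx=1
    for f in $pattern; do            # unquoted on purpose: expand the glob
        [ -e "$f" ] || continue      # skip if the pattern matched nothing
        ln -s "$f" "$(printf '%06d.nc' "$idx")"
        idx=$((idx + 1))
    done
    echo $((idx - 1))                # print the number of links created
}

# Example (not run here):
#   n=$(make_links '3B-HHR.MS.MRG.3IMERG.*.nc4')
#   ncecat -n "$n",6,1 000001.nc output.nc4
#   rm ??????.nc                     # remove the links when finished
```

The links are numbered in the shell's sorted glob order, which for these filenames is chronological.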
You can shorten the number of arguments piped to /bin/ls by constraining the list with a pattern, so the shell stops complaining, then repeat until all your files have a link. Then you execute the single ncecat command shown in the example, with one filename, and you are done.
Edit in response to newest question, 20201101:
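It seems like you used ncecat when what you really need is ncrcat. Their difference is a bit subtle: ncecat stacks each input file as one slice of a new record dimension (hence Record = 1440 with a single time value), whereas ncrcat concatenates along the record dimension that already exists in the files. Now that you solved the shell limit, the easiest way to solve the issue is just to re-do the command with ncrcat instead of ncecat: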
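For one of the monthly folders from the question, a sketch might look like the following. It assumes time is already the record (UNLIMITED) dimension in each input file; if it is not, `ncks --mk_rec_dmn time` can convert it first. `concat_time` is a hypothetical wrapper name of my own, not an NCO command.

```shell
# Concatenate all half-hourly files in one directory along their existing
# record dimension (assumed to be time), instead of adding a new "record"
# dimension the way ncecat does.
concat_time() {
    dir=$1    # folder holding the input files, e.g. Month06
    out=$2    # output path, interpreted relative to $dir
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && ncrcat 3B-HHR.*.nc4 "$out" )
}

# Example (not run here):
#   concat_time Month06 ../Month06.nc4
```

The resulting monthly files can then be joined with one more ncrcat call, which only has a handful of arguments.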