- Subject: [netCDF #UAU-670796]: Rechunking of a huge NetCDF file
- Date: Wed, 12 Feb 2014 11:12:07 -0700
> Hi Russ,
>
> >> I did make some interesting observations. I had previously overlooked the
> >> “-u” flag (its documentation is somewhat confusing…?). The time
> >> coordinate has been unlimited in my files. On my Macbook Air:
> >>
> >> nccopy -w -c time/99351,lat/1,lon/1 small.nc test1.nc 11.59s user 0.07s
> >> system 99% cpu 11.723 total
> >>
> >> nccopy -u small.nc small_u.nc
> >>
> >> nccopy -w -c time/99351,lon/1,lat/1 small_u.nc test2.nc 0.07s user 0.04s
> >> system 84% cpu 0.127 total
> >>
> >> That’s amazing!
> >
> > It's because we use the same default chunk length of 1 as HDF5
> > does for unlimited dimensions. But when you use -u, it makes
> > all dimensions fixed, and then the default chunk length is larger.
>
> But both small.nc and small_u.nc are classic netCDF files. So HDF5 isn't
> involved at all…?
Oops, you're right, it has nothing to do with HDF5, but everything to do
with the format for record variables in netCDF classic format files:
http://www.unidata.ucar.edu/netcdf/workshops/2012/performance/ClassicPerf.html
Accessing all the data from a record variable along the unlimited
dimension can require one disk access per value, whereas using the
contiguous storage of fixed-size variables accesses data very
efficiently. On the other hand, if you need to access data from
multiple variables one record-at-a-time, record variables can be
the best layout for data.
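To make the workflow concrete, here's a rough sketch using the names from
your example (nothing here is specific to your data):

    # rewrite the record variables with fixed-size, contiguous storage
    nccopy -u small.nc small_u.nc

    # check the header: "time" should no longer be marked UNLIMITED
    ncdump -h small_u.nc

    # rechunk the fixed-dimension copy for time-series access
    nccopy -w -c time/99351,lat/1,lon/1 small_u.nc test2.nc

The first step is what lets the rechunking pass read long contiguous runs
of each variable instead of one record at a time.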
> >> However, when I ran a similar test with a bigger (11GB) subset of my
> >> actual data, this time on a cluster (under SLURM), there was no difference
> >> between the two files. Maybe my small.nc is simply too small to reveal
> >> actual differences and everything is hidden behind overheads?
> >
> > That's possible, but you also need to take cache effects into account.
> > Sometimes when you run a timing test, a small file is read into memory
> > buffers, and subsequent timings are faster because the data is just
> > read from memory instead of disk, and similarly for writing. With 11GB
> > files, you might not see any in-memory caching, because the system disk
> > caches aren't large enough to hold the file, or even consecutive chunks
> > of a variable.
>
> My non-Python timings were naively from the “time” command, which runs the
> command just once. So I don’t think there can be any cache effects here.
Hmmm, not sure how to explain that.
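One way to check for cache effects is to run the identical copy twice in a
row and compare the two timings, for example (file names are just
placeholders):

    time nccopy -w -c time/99351,lon/1,lat/1 small_u.nc out1.nc
    time nccopy -w -c time/99351,lon/1,lat/1 small_u.nc out2.nc

If the second run is noticeably faster, the first run left the input in the
operating system's page cache. On Linux you can also drop that cache between
runs (as root, with "echo 3 > /proc/sys/vm/drop_caches"), but for a quick
comparison just repeating the command is usually enough.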
> I’m not sure what I did differently previously with the 11GB test file (maybe
> a cluster with hundreds of users is not the best for performance comparison).
> Anyways, I do think that the -u flag solved my problem. I got fed up with
> queuing for resources on the cluster and decided to go with a normal desktop
> machine with 16GB of memory. So I stripped a single variable from the huge
> file and did the -u operation on the resulting 43GB file, and then ran this:
>
> nccopy -m 10G -c time/10000,lat/10,lon/10 shortwave_u.nc shortwave_u_T10.nc
>
> It took only 15 minutes! Without the -u operation the command processed only
> a few GB in 1 hour (after which I cancelled it).
>
> 2/5 variables done now. If no further technical problems arise, I should have
> the data ready for their actual purpose tomorrow. :)
Excellent, and your experience may be useful to other users. I'll add use
of "-u" to my future performance advice.
> Thank you for your help! I will acknowledge you/Unidata in my paper (any
> preference?).
Feel free to acknowledge me. Thanks!
--Russ
> - Henri
>
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu
Ticket Details
===================
Ticket ID: UAU-670796
Department: Support netCDF
Priority: High
Status: Closed