- Subject: [netCDF #AHZ-822837]: [netcdfgroup] performance issues
- Date: Wed, 18 Jan 2012 06:43:07 -0700
Hi Karsten,
> I'm working on a 1D ocean model that stores its output in NetCDF.
>
> With NetCDF4 I've experienced some performance issues as shown in the
> timings below for a specific model run:
>
> NetCDF4:
> real 0m42.940s
> user 0m3.630s
> sys 0m0.110s
>
> NetCDF 3.6.3:
> real 0m1.838s
> user 0m1.820s
> sys 0m0.000s
>
> It's run on the same system; only the executable is recompiled to link
> either NetCDF 3 or 4.
>
> Disk performance is not an issue:
> /dev/sda:
> Timing buffered disk reads: 344 MB in 3.02 seconds = 114.06 MB/sec
>
> as the resulting file is only 10MB.
>
> I'm using the 4.2 release candidate configured as follows:
>
> C:
> CPPFLAGS=-I/opt/hdf5/include LDFLAGS=-L/opt/hdf5/lib ./configure
> --prefix=/opt/netcdf-${netcdf_ver}_IFORT
>
> Fortran:
> FFLAGS="-O3" FC=ifort F77=ifort CPPFLAGS=-I/opt/hdf5/include CFLAGS="-O3"
> LDFLAGS=-L/opt/hdf5/lib ./configure --prefix=/opt/netcdf-${netcdf_ver}_IFORT
>
> i.e. no specific optimisation.
>
> Netcdf 3.6.3 is also compiled without specific optimisation.
>
> Any ideas anyone?
Does your model only write netCDF files, or does it also open and read data from
netCDF files? One problem area with netCDF-4 is time to open a large number of
files and only read a small amount of data from each. That would typically be
slower with netCDF-4 than netCDF-3, but not typically by as much as your timings
indicate.
Would it be convenient for you to compile the same model against the previous
netCDF version 4.1.3 for comparison of times with the 4.2 release candidate?
That would tell us whether the performance problems you're seeing are specific
to a change made for the 4.2 release or just a version 4 vs. version 3 problem.
It's hard to determine the cause of the slowdown from just the timings provided.
It would be interesting to see a profile that showed where the CPU was spending
most of its time. It's possible to get such profiles with the "perf" tools on
Linux (see the perf(1) man page).
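For example, here's a minimal sketch of how that might look; "ocean_model" is
just a placeholder for whatever your executable is actually called:

    perf record -g ./ocean_model   # run the model, sampling CPU time with call graphs
    perf report                    # interactive summary of where the time went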
It's just a guess, but are you using an unlimited dimension with most of your
variables, causing use of "chunking"? Maybe the default chunk sizes are bad for
your data. We might be able to determine that if you could send the output of
ncdump -s -h
on the data files you're writing, because that output would include chunk sizes
used for each variable.
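If bad default chunk sizes do turn out to be the problem, they can be
overridden when each variable is defined. Below is a minimal sketch using the
netCDF-4 C API; the file name, dimension names, sizes, and chunk sizes are
made-up placeholders rather than recommendations for your data:

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    /* Stop with a message on any netCDF error. */
    static void check(int status)
    {
        if (status != NC_NOERR) {
            fprintf(stderr, "netCDF error: %s\n", nc_strerror(status));
            exit(EXIT_FAILURE);
        }
    }

    int main(void)
    {
        int ncid, time_dim, z_dim, varid;
        int dimids[2];
        /* Hypothetical chunk sizes: 1024 records along the unlimited
           time axis, all 100 levels along z. */
        size_t chunks[2] = {1024, 100};

        check(nc_create("out.nc", NC_NETCDF4 | NC_CLOBBER, &ncid));
        check(nc_def_dim(ncid, "time", NC_UNLIMITED, &time_dim));
        check(nc_def_dim(ncid, "z", 100, &z_dim));

        dimids[0] = time_dim;
        dimids[1] = z_dim;
        check(nc_def_var(ncid, "temp", NC_DOUBLE, 2, dimids, &varid));

        /* Override the library's default chunk sizes for this variable. */
        check(nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunks));

        check(nc_enddef(ncid));
        /* ... write data with nc_put_vara_double() ... */
        check(nc_close(ncid));
        return 0;
    }

The nccopy utility can also rechunk an existing file from the command line
with its -c option, which may be a quicker way to experiment with different
chunk sizes than rebuilding the model.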
--Russ
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu
Ticket Details
===================
Ticket ID: AHZ-822837
Department: Support netCDF
Priority: Normal
Status: Closed