
[netCDF #UPO-861727]: getVar crashed with compressed file



Use of the imap argument is very rare. I am curious why you were
using it at all.
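
On the 100x100 / 2x2 question quoted below: going by the imap documentation quoted there, imap only describes where each element lands in memory, so a buffer of 4 elements should be enough and no special mapping should be needed. The following is a minimal sketch of that reading, not netCDF code. It simulates the mapped read in plain C++, and the names contiguousImap, mappedRead2D and readTwoByTwo, as well as the start position {10, 20}, are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of netCDF: the imap for a densely packed,
// row-major buffer that holds exactly count[0] * count[1] * ... elements.
std::vector<std::ptrdiff_t> contiguousImap(const std::vector<std::size_t>& count) {
    std::vector<std::ptrdiff_t> imap(count.size(), 1);  // fastest dimension: distance 1
    for (std::size_t i = count.size(); i-- > 1; )
        imap[i - 1] = imap[i] * static_cast<std::ptrdiff_t>(count[i]);
    return imap;
}

// Pure C++ stand-in for a mapped read of a 2-D variable (no netCDF calls).
// `var` plays the role of the stored variable, row-major with `ncols` columns.
void mappedRead2D(const std::vector<short>& var, std::size_t ncols,
                  const std::vector<std::size_t>& start,
                  const std::vector<std::size_t>& count,
                  const std::vector<std::ptrdiff_t>& stride,
                  const std::vector<std::ptrdiff_t>& imap,
                  short* dataValues) {
    assert(start.size() == 2 && count.size() == 2 &&
           stride.size() == 2 && imap.size() == 2);
    for (std::size_t i = 0; i < count[0]; ++i) {
        for (std::size_t j = 0; j < count[1]; ++j) {
            // Position in the variable is driven by start and stride ...
            const std::size_t row = start[0] + i * static_cast<std::size_t>(stride[0]);
            const std::size_t col = start[1] + j * static_cast<std::size_t>(stride[1]);
            // ... while the position in memory is driven by imap alone.
            const std::ptrdiff_t dst = static_cast<std::ptrdiff_t>(i) * imap[0] +
                                       static_cast<std::ptrdiff_t>(j) * imap[1];
            dataValues[dst] = var[row * ncols + col];
        }
    }
}

// The 100x100 / 2x2 case from the question: four elements are enough.
std::vector<short> readTwoByTwo(const std::vector<short>& var) {
    const std::vector<std::size_t> start{10, 20}, count{2, 2};
    const std::vector<std::ptrdiff_t> stride{1, 1};
    std::vector<short> buf(4);
    mappedRead2D(var, 100, start, count, stride, contiguousImap(count), buf.data());
    return buf;
}
```

For count = {2, 2} the contiguous imap comes out as {2, 1}, which is just the ordinary row-major layout of a 4-element buffer. With the real library, the getVar(start, count, dataValues) overload quoted below fills the buffer in that same contiguous order without any imap, so a non-default imap would only be needed for a transposed or otherwise non-contiguous in-memory layout.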

> The function is only slow when we are working with a compressed netCDF file
> (produced by nccopy). If we read the uncompressed file, the performance
> seems to be OK.
>
> I did not personally write that part of the code, but looking at the
> documentation:
> 
> Imap
> Vector of integers that specifies the mapping between the dimensions of a 
> netCDF variable and the in-memory structure of the internal data array. 
> imap[0] gives the distance between elements of the internal array 
> corresponding to the most slowly varying dimension of the netCDF variable. 
> imap[n-1] (where n is the rank of the netCDF variable) gives the distance 
> between elements of the internal array corresponding to the most rapidly 
> varying dimension of the netCDF variable. Intervening imap elements 
> correspond to other dimensions of the netCDF variable in the obvious way. 
> Distances between elements are specified in type-independent units of 
> elements (the distance between internal elements that occupy adjacent memory 
> locations is 1 and not the element's byte-length as in netCDF 2).
> 
> 
> Does it matter if the size of dataValues does not match the size of the
> variable I am trying to read?
>
> For example, suppose the variable is 100x100 and I only need a 2x2 subset
> of it. If I allocate a data array of size 4, do I need to use imap in this
> case?
> 
> 
> Thank you very much,
> 
> Norman
> 
> -----Original Message-----
> From: Unidata netCDF Support <address@hidden>
> Sent: Friday, August 17, 2018 4:11 PM
> To: Lo, Norman <address@hidden>
> Cc: address@hidden; Lo, Norman <address@hidden>
> Subject: [netCDF #UPO-861727]: getVar crashed with compressed file
> 
> Now I understand; you were using the getvarm functions of the netcdf-c API.
> These are known to be slow for large reads.
> What led you to use the version of getVar with the imap argument?
> Did you read a piece of documentation that said to use it?
> 
> >
> > Originally we were using the function:
> >
> > void getVar (const std::vector< size_t > &start,
> >              const std::vector< size_t > &count,
> >              const std::vector< ptrdiff_t > &stride,
> >              const std::vector< ptrdiff_t > &imap,
> >              double *dataValues) const
> >
> > to get a subset of a variable.
> >
> > After we changed to the following function:
> >
> > void getVar (const std::vector< size_t > &start,
> >              const std::vector< size_t > &count,
> >              double *dataValues) const
> >
> > everything seems to be normal now.
> >
> > Thanks,
> >
> > Norman
> >
> >
> >
> > -----Original Message-----
> > From: Unidata netCDF Support <address@hidden>
> > Sent: Friday, August 3, 2018 3:43 PM
> > To: Lo, Norman <address@hidden>
> > Cc: address@hidden; Lo, Norman <address@hidden>
> > Subject: [netCDF #UPO-861727]: getVar crashed with compressed file
> >
> > The netcdf library uses the HDF5 dynamically loaded filter mechanism.
> > Using it is somewhat complicated. A slightly simpler possibility is to use
> > szip compression.
> > From our FAQ:
> >
> > > *Optionally*, you can also build netCDF-4 with the szip library (a.k.a.
> > > szlib). If building with szlib, get szip 2.0 or later. Technically,
> > > we mean that the HDF5 library is built with szip support. The netcdf
> > > build will then inherit szip support from the HDF5 library.
> > > If you intend to write files with szip compression, then we suggest
> > > that you use [libaec](https://gitlab.dkrz.de/k202009/libaec.git)
> > > to avoid patent problems. That library can be used as a drop-in
> > > replacement for the standard szip library.
> >
> > > I have tried different compression levels, but the result is about the
> > > same.
> > >
> > > Is there any other compression tool I can try?
> > >
> > > Thanks,
> > >
> > > Norman
> > >
> > > -----Original Message-----
> > > From: Unidata netCDF Support <address@hidden>
> > > Sent: Thursday, August 2, 2018 11:49 AM
> > > To: Lo, Norman <address@hidden>
> > > Cc: address@hidden; Lo, Norman <address@hidden>
> > > Subject: [netCDF #UPO-861727]: getVar crashed with compressed file
> > >
> > > Baffling. The only thing I can think of is that for some reason the
> > > dataset you have does not compress well with libz.
> > > Perhaps you might try different compression levels to see whether that
> > > makes things slower or faster.
> > >
> > > >
> > > > I would like to add that the getVar operation is just very slow. It
> > > > will eventually return after 10-15 minutes on a file compressed with
> > > > nccopy. With uncompressed files, or with other already-compressed
> > > > files, the function returns well within a second.
> > > >
> > > > Thanks,
> > > >
> > > > Norman
> > > >
> > > > -----Original Message-----
> > > > From: Unidata netCDF Support <address@hidden>
> > > > Sent: Monday, July 30, 2018 12:51 PM
> > > > To: Lo, Norman <address@hidden>
> > > > Cc: address@hidden; Lo, Norman
> > > > <address@hidden>
> > > > Subject: [netCDF #UPO-861727]: getVar crashed with compressed file
> > > >
> > > > Ok, it sounds like the netcdf-c library per-se is ok (else nccopy would 
> > > > fail). So, the problem would seem to be in either your program or in 
> > > > getVar().
> > > >
> > > > What I would like to see is the arguments to getVar that are failing 
> > > > plus the ncdump output of that specific variable. I would be looking to 
> > > > make sure that the arguments are consistent with the actual variable 
> > > > definition. You could try doing some print statements before the 
> > > > getVar, or if you are on Linux, you could use gdb to put a breakpoint 
> > > > inside getVar.
> > > >
> > > > > Correct. I can read the file from step (3), but not the file from
> > > > > step (2).
> > > > >
> > > > > The program is proprietary. Is there something you can suggest I
> > > > > try?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Norman
> > > > >
> > > > > -----Original Message-----
> > > > > From: Unidata netCDF Support <address@hidden>
> > > > > Sent: Sunday, July 29, 2018 5:48 PM
> > > > > To: Lo, Norman <address@hidden>
> > > > > Cc: address@hidden; Lo, Norman
> > > > > <address@hidden>
> > > > > Subject: [netCDF #UPO-861727]: getVar crashed with compressed
> > > > > file
> > > > >
> > > > > So I read your response to mean that nccopy can read b.nc (step 3) 
> > > > > but your program cannot read b.nc using getVar().
> > > > > Is the program small enough (and non-proprietary) so that you can 
> > > > > send it to me to try out?
> > > > >
> > > > > > The steps are correct. When I try to read the c.nc file again, the
> > > > > > program does not actually crash (sorry, I described the situation
> > > > > > incorrectly); instead, it just freezes at the call for a long time
> > > > > > (CPU is running at 100%), so I have to break it eventually.
> > > > > >
> > > > > > I tried to do the following:
> > > > > >
> > > > > > (1) original file: a.nc
> > > > > > (2) compress a.nc to become b.nc  (using nccopy -d 1)
> > > > > > (3) decompress b.nc to become c.nc. (using nccopy -d 0)
> > > > > >
> > > > > > I can read the c.nc file fine.
> > > > > >
> > > > > >
> > > > > > Thank you very much for your time,
> > > > > >
> > > > > > Norman
> > > > > >
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Unidata netCDF Support <address@hidden>
> > > > > > Sent: Thursday, July 26, 2018 10:30 PM
> > > > > > To: Lo, Norman <address@hidden>
> > > > > > Cc: address@hidden
> > > > > > Subject: [netCDF #UPO-861727]: getVar crashed with compressed
> > > > > > file
> > > > > >
> > > > > > Let me make sure I understand.
> > > > > > 1. You have some uncompressed file, call it u.nc.
> > > > > > 2. You compress it using nccopy -d 1 u.nc c.nc to produce the
> > > > > > file c.nc.
> > > > > > 3. You then try to read the c.nc file using a program that calls
> > > > > > getVar and when you do so, your program crashes.
> > > > > > Do I have that right?
> > > > > >
> > > > > > >
> > > > > > > We have a program that uses the netcdf-cxx-4.2 interface to
> > > > > > > manipulate data stored in netcdf files. As far as I can tell,
> > > > > > > everything works fine if the netcdf files are not compressed.
> > > > > > >
> > > > > > > However, if I use "nccopy -d 1" to compress the files, the
> > > > > > > function getVar crashes.
> > > > > > >
> > > > > > >
> > > > > > > The function we are using is:
> > > > > > >
> > > > > > >
> > > > > > > void netCDF::NcVar::getVar (
> > > > > > >     const std::vector< size_t > & start,
> > > > > > >     const std::vector< size_t > & count,
> > > > > > >     const std::vector< ptrdiff_t > & stride,
> > > > > > >     const std::vector< ptrdiff_t > & imap,
> > > > > > >     short * dataValues
> > > > > > > ) const
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Could you let me know if I did something wrong? Thank you
> > > > > > > very much,
> > > > > > >
> > > > > > > Norman
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > =Dennis Heimbigner
> > > > > > Unidata
> > > > > >
> > > > > >
> > > > > > Ticket Details
> > > > > > ===================
> > > > > > Ticket ID: UPO-861727
> > > > > > Department: Support netCDF
> > > > > > Priority: High
> > > > > > Status: Open
> > > > > > ===================
> > > > > > NOTE: All email exchanges with Unidata User Support are recorded in 
> > > > > > the Unidata inquiry tracking system and then made publicly 
> > > > > > available through the web.  If you do not want to have your 
> > > > > > interactions made available in this way, you must let us know in 
> > > > > > each email you send to us.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > =Dennis Heimbigner
> > > > > Unidata
> > > > >
> > > > >
> > > >
> > > > =Dennis Heimbigner
> > > > Unidata
> > > >
> > > >
> > >
> > > =Dennis Heimbigner
> > > Unidata
> > >
> > >
> >
> > =Dennis Heimbigner
> > Unidata
> >
> >
> 
> =Dennis Heimbigner
> Unidata
> 
> 

=Dennis Heimbigner
  Unidata

