
[netCDFJava #GSU-352271]: OutOfMemory Error



NetCDF-4 files (based on the HDF5 format) divide each variable
into chunks of data: basically n-dimensional rectangles.
The error indicates that the chunk size for one or more
of the variables in that dataset is too large to process
in memory. As a rule, at least one whole chunk has to be read into
memory at some point, which is why reading only a small section
can fail in the same way as reading the whole variable.

I think the first thing to do is to see how big the chunks actually are.
You should be able to use the NcDumpW program (Java) or the ncdump program (C)
to print out the chunk sizes for each variable; with ncdump, the -hs flags
show them as the _ChunkSizes attribute.
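
Once you have a chunk shape, the uncompressed size of one chunk is roughly
the product of the chunk dimensions times the element size. Here is a minimal
Java sketch of that arithmetic. The chunk shape and the 8-byte element size
are made-up values for illustration, not taken from your file:

```java
public class ChunkFootprint {
    // Uncompressed size in bytes of one chunk: the product of the chunk
    // dimensions times the size of one element.
    static long chunkBytes(int[] chunkShape, int elementSizeBytes) {
        long bytes = elementSizeBytes;
        for (int n : chunkShape) {
            // multiplyExact throws ArithmeticException on long overflow
            bytes = Math.multiplyExact(bytes, (long) n);
        }
        return bytes;
    }

    public static void main(String[] args) {
        // Hypothetical chunk shape, holding 8-byte doubles.
        int[] chunkShape = {1, 4096, 4096};
        long bytes = chunkBytes(chunkShape, 8);
        System.out.println(bytes + " bytes (" + bytes / (1024 * 1024) + " MiB)");
    }
}
```

Comparing that number against the heap you can give the JVM should tell you
whether a bigger -Xmx is realistic or whether the file needs rechunking.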

If you can find a machine with enough memory to handle the given chunk
sizes, then you should be able to read the file. Otherwise, you are going
to need to preprocess the file to change its chunk size to something
smaller (for example with nccopy -c, as the error message suggests).
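
As a rough illustration of picking a smaller chunk shape, here is a sketch
that halves the largest chunk dimension until one uncompressed chunk fits
within a memory budget, then formats the result as a chunkspec for nccopy -c.
The dimension names, the 16 MiB budget, the file names, and the halving
heuristic itself are all assumptions for illustration; a good chunk shape
really depends on how the data will be accessed.

```java
public class ChunkShrink {
    // Uncompressed bytes in one chunk of the given shape.
    static long bytes(int[] shape, int elementSizeBytes) {
        long total = elementSizeBytes;
        for (int n : shape) {
            total *= n;
        }
        return total;
    }

    // Halve the largest chunk dimension (rounding up) until the chunk fits
    // within maxBytes or every dimension is already 1.
    static int[] shrinkToFit(int[] chunkShape, int elementSizeBytes, long maxBytes) {
        int[] shape = chunkShape.clone();
        while (shape.length > 0 && bytes(shape, elementSizeBytes) > maxBytes) {
            int largest = 0;
            for (int i = 1; i < shape.length; i++) {
                if (shape[i] > shape[largest]) {
                    largest = i;
                }
            }
            if (shape[largest] == 1) {
                break; // cannot shrink any further
            }
            shape[largest] = (shape[largest] + 1) / 2;
        }
        return shape;
    }

    // Format "dim/size,dim/size,..." as accepted by nccopy -c.
    static String chunkSpec(String[] dimNames, int[] shape) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < shape.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(dimNames[i]).append('/').append(shape[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical dimension names and original chunk shape (8-byte doubles).
        String[] dims = {"time", "lat", "lon"};
        int[] original = {1, 4096, 4096};
        int[] smaller = shrinkToFit(original, 8, 16L * 1024 * 1024);
        // in.nc and out.nc are placeholder file names.
        System.out.println("nccopy -c " + chunkSpec(dims, smaller) + " in.nc out.nc");
    }
}
```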

Also, if you have any control over the original writing of those files,
you might investigate writing them with a different set of chunk sizes.



> Package Version: 4.6
> Operating System: macOS High Sierra 10.13.4 (16GB ram)
> Hardware: MacBook Pro
> Description of problem: Ahoy!
> 
> We are attempting to use the NetCDF Java Library to read in a data file
> that's approximately 750mb (and could be more).  In reading some of the
> threads, I increased the min heap size to 6/8/10+GB, and am presently
> running a test class in Eclipse to read the entire file into memory.
> We run into the following error:
> 
> java.lang.OutOfMemoryError: Ran out of memory trying to read HDF5 filtered 
> chunk. Either increase the JVM's heap size (use the -Xmx switch) or reduce 
> the size of the dataset's chunks (use nccopy -c).
> at 
> ucar.nc2.iosp.hdf5.H5tiledLayoutBB$DataChunk.getByteBuffer(H5tiledLayoutBB.java:255)
> at ucar.nc2.iosp.LayoutBBTiled.hasNext(LayoutBBTiled.java:128)
> at ucar.nc2.iosp.hdf5.H5tiledLayoutBB.hasNext(H5tiledLayoutBB.java:152)
> at ucar.nc2.iosp.IospHelper.readData(IospHelper.java:348)...
> 
> Even using NetcdfFile.openFile(...) runs into the same issue when trying
> to read a specific Section, or read() the whole Variable.  That said,
> in files roughly 100 GB or less, we are not encountering this issue.
> 
> Should we consider using a different OS, extremely large heap size,
> re-evaluate the size of the data chunks used, or consider using smaller
> files in Java.  The goal here is to take these files, process/"unzip"
> them into Tiles, and  import the datasets into HBase using the Java.
> 
> Thanks in advance,
> 
> 
> 

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: GSU-352271
Department: Support netCDF Java
Priority: Critical
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.