- Subject: [python #SOS-815998]: Opening Large netCDF files on HPC
- Date: Tue, 23 Nov 2021 17:39:26 -0700
Greetings!
There are many possible reasons for the difference between the HPC system
and your local machine:
1. An issue with how the data are stored on the HPC system
2. A difference in the versions of netcdf4-python, libnetcdf (netCDF-C), or
HDF5 on your local system vs. the HPC system; if those libraries are not
configured correctly on the HPC system, that could create major issues (see
the version-check snippet below)
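To check item 2, a quick way is to print the version of each layer from
Python on both systems and compare the output (these module attributes come
with netcdf4-python):

import netCDF4 as nc4

print("netcdf4-python:", nc4.__version__)
print("libnetcdf (netCDF-C):", nc4.__netcdf4libversion__)
print("HDF5:", nc4.__hdf5libversion__)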
The first test I would run is to see how this performs on the HPC system vs.
locally:
with open('path/to/netCDFdata.nc', 'rb') as fobj:
    # Read the file in 1 MiB chunks until EOF
    while buf := fobj.read(1024 * 1024):
        continue
This reads the full file in 1 MiB chunks and can help narrow down where the
problem is: if this raw read is also slow on the HPC system, the bottleneck
is the filesystem itself rather than the netCDF/HDF5 stack.
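If you want an actual throughput number rather than a wall-clock feel, here
is a sketch of the same loop with a timer around it (the path is still your
placeholder):

import time

chunk = 1024 * 1024  # 1 MiB
nbytes = 0
start = time.perf_counter()
with open('path/to/netCDFdata.nc', 'rb') as fobj:
    # Same chunked read as above, but counting bytes
    while buf := fobj.read(chunk):
        nbytes += len(buf)
elapsed = time.perf_counter() - start
print(f"Read {nbytes / 2**20:.0f} MiB in {elapsed:.1f} s "
      f"({nbytes / 2**20 / elapsed:.1f} MiB/s)")

Comparing that rate between the HPC filesystem and your local disk should
make any raw I/O difference obvious.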
Out of curiosity, in this command:
nc4.Dataset("path/to/netCDFdata.nc", mode="r", format="NETCDF4",
            diskless=True)
Why are you opening an existing file on disk using "diskless" mode? The
normal use for that option is creating a new netCDF dataset without writing
the data to disk; when opening an existing file with diskless=True, I
believe the whole file is read into memory first, so for a 6 GB file that
alone could explain the long load time.
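For comparison, reading an existing file normally needs nothing more than
the following (keeping your placeholder path). The format keyword only
matters when creating a file, and leaving diskless off avoids pulling the
whole file into memory up front:

import netCDF4 as nc4

# Plain read-only open; netCDF4 detects the on-disk format itself
nc_data = nc4.Dataset('path/to/netCDFdata.nc', mode='r')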
Cheers!
Ryan
> Hi,
>
> I am working with a 6GB netCDF file.
>
> When I try to open the netCDF data on a GPFS HPC system, it keeps loading
> for a long time.
> However, I am able to open the data on my local computer.
>
> Do you know of anything I can do to be able to load the dataset easily on an
> HPC?
>
> Below is the code I use to open the file:
> nc_data = nc4.Dataset("path/to/netCDFdata.nc", mode="r", format="NETCDF4",
> diskless=True)
Ticket Details
===================
Ticket ID: SOS-815998
Department: Support Python
Priority: Low
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.