
[netCDF #YYU-156317]: list index / stride reads very slow

This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.


  • Subject: [netCDF #YYU-156317]: list index / stride reads very slow
  • Date: Sat, 15 Dec 2018 12:30:23 -0700

This is a known problem in some older versions of netCDF.
Performance was improved starting with version 4.6.2.
Please upgrade to that version and check whether it gives you
adequate performance.
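
If upgrading is not immediately possible, one possible workaround is to
avoid the list index altogether: read each plane with a plain integer
index (which is fast in the timings quoted below) and stack the results
in memory. The following is only a sketch under stated assumptions, not
an official recommendation. The helper names parse_version,
predates_fix and read_planes are hypothetical, and read_planes assumes
the variable holds no masked values. predates_fix only reports whether a
library version string is older than 4.6.2; it does not by itself show
that a given version is affected (4.1.1 was reported as fine below).

```python
import re

import numpy as np

FIXED_IN = (4, 6, 2)  # release named in the reply above


def parse_version(text):
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number in %r" % (text,))
    return tuple(int(part) for part in match.group(1).split("."))


def predates_fix(version_text, fixed_in=FIXED_IN):
    """True if version_text is older than the release that improved speed."""
    return parse_version(version_text) < fixed_in


def read_planes(var, indices):
    """Read var[i] once per integer index, then stack along a new first axis.

    This replaces a single var[[i, j, ...]] list-index read. Masks are not
    preserved, so this assumes the data contain no masked values.
    """
    return np.stack([np.asarray(var[i]) for i in indices], axis=0)


# Hypothetical usage with the netCDF4 Python module (not run here):
#
#   import netCDF4
#   print(predates_fix(netCDF4.__netcdf4libversion__))
#   data = netCDF4.Dataset('/net/satarch/Sjaak/test.hdf5', 'r').variables["data"]
#   print(np.mean(read_planes(data, [0, 3])))
```

With the variable shown below, each plane is 4000 x 4000 floats, so
stacking two of them needs roughly 128 MB of memory.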

> Package Version: netcdf-4.3.3.1-5.el7.x86_64
> Operating System: centos 7
> Hardware: VM
> Description of problem: The following python program is very slow using 
> netcdf-4.3.3.1 and ok with netcdf-4.1.1-3
> 
> import numpy as np
> import datetime
> from netCDF4 import Dataset
> 
> nc4file     = Dataset('/net/satarch/Sjaak/test.hdf5','r')
> data = nc4file.variables["data"]
> 
> data.shape
> st = datetime.datetime.now()
> print np.mean(data[0])
> print datetime.datetime.now() - st
> st = datetime.datetime.now()
> print np.mean(data[3])
> print datetime.datetime.now() - st
> st = datetime.datetime.now()
> print np.mean(data[[0,3]])
> print datetime.datetime.now() - st
> 
> netcdf-4.1.1-3 (centos 6.7)
> ===============
> >>> import numpy as np
> >>> import datetime
> >>> from netCDF4 import Dataset
> >>> nc4file     = Dataset('/net/satarch/CommonSense/DataCubes/Test/MODIS500/base-NDVI-out2.hdf5','r')
> >>> data = nc4file.variables["data"]
> >>> data.shape
> (744, 4000, 4000)
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[0])
> 499999.473664
> >>> print datetime.datetime.now() - st
> 0:00:00.583850
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[3])
> 3499999.36307
> >>> print datetime.datetime.now() - st
> 0:00:00.590855
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[[0,3]])
> 2000000.5161
> >>> print datetime.datetime.now() - st
> 0:00:02.076450
> 
> netcdf-4.3.3.1 (centos 7)
> ===============
> >>> import numpy as np
> >>> import datetime
> >>> from netCDF4 import Dataset
> >>> nc4file     = Dataset('/net/satarch/Sjaak/test.hdf5','r')
> >>> data = nc4file.variables["data"]
> >>> data.shape
> (744, 4000, 4000)
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[0])
> 499999.473664
> >>> print datetime.datetime.now() - st
> 0:00:00.415814
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[3])
> 3499999.36307
> >>> print datetime.datetime.now() - st
> 0:00:00.401048
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[[0,3]])
> 
> (stopped after waiting for more than 2 minutes)
> 
> ncdump
> =======
> ncdump  -sh /net/satarch/Sjaak/test.hdf5
> netcdf test {
> dimensions:
> time = 744 ;
> latitude = 4000 ;
> longitude = 4000 ;
> variables:
> int time(time) ;
> time:_Storage = "contiguous" ;
> time:_Endianness = "little" ;
> int latitude(latitude) ;
> latitude:_Storage = "contiguous" ;
> latitude:_Endianness = "little" ;
> int longitude(longitude) ;
> longitude:_Storage = "contiguous" ;
> longitude:_Endianness = "little" ;
> float data(time, latitude, longitude) ;
> data:_Storage = "chunked" ;
> data:_ChunkSizes = 1, 200, 200 ;
> 
> // global attributes:
> :_Format = "netCDF-4" ;
> }
> 

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: YYU-156317
Department: Support netCDF
Priority: Critical
Status: Open
===================