Re: argo netCDF data
- Subject: Re: argo netCDF data
- Date: Sat, 17 Aug 2002 11:34:24 -0600
Hi,
One more thing occurred to me about the netCDF file with the
corrupted header that you referenced. It would be consistent with
what I noticed if each of the omitted bytes were the hexadecimal 0d
byte, which is an ASCII CR (carriage-return) character. So if the
file were FTP'd in text mode instead of binary mode, maybe the FTP
server deleted the "0d" bytes as part of text conversion end-of-line
handling for a different platform. The problem could be fixed by
FTP'ing the file in binary mode rather than text mode (which is the
default for many ftp programs, unfortunately).
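As a rough illustration of a binary-mode transfer, here is a minimal
Python sketch using the standard ftplib module, whose retrbinary()
switches the connection to binary (image) type before retrieving. The
helper names fetch_binary and open_anonymous are made up for this
example, and the host and file names in the comment are simply the
ones quoted later in this thread.

```python
import os
from ftplib import FTP


def open_anonymous(host):
    """Connect and log in anonymously (ftplib's default login)."""
    ftp = FTP(host)
    ftp.login()
    return ftp


def fetch_binary(ftp, remote_name, local_path):
    """Retrieve remote_name in binary mode so no bytes are translated.

    `ftp` is an ftplib.FTP instance (or anything with a compatible
    retrbinary method). Returns the number of bytes written.
    """
    # Open the local file in "wb" so the local side does no newline
    # translation either.
    with open(local_path, "wb") as out:
        ftp.retrbinary(f"RETR {remote_name}", out.write)
    return os.path.getsize(local_path)


# Hypothetical usage, with the names taken from the thread:
#   ftp = open_anonymous("apapane.soest.hawaii.edu")
#   ftp.cwd("/users/poppy")
#   fetch_binary(ftp, "Q19000222.prof.nc", "Q19000222.prof.nc")
#   ftp.quit()
```

With a command-line ftp client, the equivalent is to issue the
"binary" command before "get".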
--Russ
------- Forwarded Message
Date: Fri, 16 Aug 2002 13:04:37 -0600
From: Russ Rew <address@hidden>
To: "shen yingshuo" <address@hidden>
cc: address@hidden, support-netcdf
Subject: Re: argo netCDF data
>To: address@hidden
>From: "shen yingshuo" <address@hidden>
>Subject: Re: 20020815: argo netCDF data
>Organization: School of Ocean and Earth Science and Technology, University
>of Hawaii
>Keywords: malloc failure, header corruption
Hi,
> recently we downloaded some argo netCDF from ifremer. we tried to
> open it with several netCDF-able program.. but it did not work..
> would you please help us in finding out what is wrong with this
> dataset?
>
> i put one dataset for ftp
>
> ftp apapane.soest.hawaii.edu with anonymous as user
> cd /users/poppy
> get Q19000222.prof.nc
I downloaded the above file in binary mode and could reproduce the
problem here with ncdump, which produced the error message:
$ ncdump -c ~/tmp/Q1900022_prof.nc
ncdump: /home/russ/tmp/Q1900022_prof.nc: Memory allocation (malloc) failure
Examining the file in a binary editor, it appears that a byte was
deleted from the header of the file somewhere before the byte that
represents the length of the "N_PARAM" dimension name. The effect is
that the "N" byte is read as part of the name length of a dimension
named "_PARAM...", which throws everything off after that, since the
netCDF library interprets this as specifying that the dimension name
has 1870 characters. 1870 is the decimal value of the hexadecimal
integer represented as "074e" in the hex dump below of the file
header:
00000000: 4344 4601 0000 000a 0000 000a 0000 000e CDF.............
00000010: 0000 0009 4441 5445 5f54 494d 4500 0000 ....DATE_TIME...
00000020: 0000 000e 0000 0009 5354 5249 4e47 3235 ........STRING25
00000030: 3600 0000 0000 0100 0000 0008 5354 5249 6...........STRI
00000040: 4e47 3634 0000 0040 0000 0008 5354 5249 NG64...@....STRI
00000050: 4e47 3332 0000 0020 0000 0008 5354 5249 NG32... ....STRI
00000060: 4e47 3136 0000 0010 0000 0007 5354 5249 NG16........STRI
00000070: 4e47 3800 0000 0008 0000 0007 5354 5249 NG8.........STRI
00000080: 4e47 3400 0000 0004 0000 0007 5354 5249 NG4.........STRI
00000090: 4e47 3200 0000 0002 0000 0006 4e5f 5052 NG2.........N_PR
000000a0: 4f46 0000 0000 0000 0000 074e 5f50 4152 OF.........N_PAR
000000b0: 414d 0000 0000 0300 0000 084e 5f4c 4556 AM.........N_LEV
000000c0: 454c 5300 0000 2b00 0000 0c4e 5f54 4543 ELS...+....N_TEC
000000d0: 485f 5041 5241 4d00 0000 1900 0000 074e H_PARAM........N
000000e0: 5f43 414c 4942 0000 0000 0a00 0000 094e _CALIB.........N
000000f0: 5f48 4953 544f 5259 0000 0000 0000 0000 _HISTORY........
00000100: 0000 0000 0000 0000 0000 0b00 0000 3500 ..............5.
00000110: 0000 0944 4154 415f 5459 5045 0000 0000 ...DATA_TYPE....
00000120: 0000 0100 0000 0400 0000 0c00 0000 0100 ................
00000130: 0000 0763 6f6d 6d65 6e74 0000 0000 0200 ...comment......
00000140: 0000 0944 6174 6120 7479 7065 0000 0000 ...Data type....
00000150: 0000 0200 0000 1000 0028 1800 0000 0e46 .........(.....F
00000160: 4f52 4d41 545f 5645 5253 494f 4e00 0000 ORMAT_VERSION...
...
If I insert a single 0 byte before the length of the "N_PARAM" name
(around the 168th byte), all the dimensions get read in OK, then the
global attributes are read in, then the first few variables.
(You can parse binary data like this by following the "Appendix B File
Format Specification" in the netCDF User's Guide.)
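To make the failure concrete, here is a small Python sketch that
walks the dimension list of a netCDF classic header in the way the
file format specification describes (magic, record count, dimension
tag and count, then for each dimension a 4-byte big-endian name
length, the name padded to a 4-byte boundary, and a 4-byte length).
It is only an illustration, not the library's code. The functions
parse_dims and build_header are made up for this example, and the
N_PROF length of 13 (hex 0d) is an assumption chosen to match the CR
hypothesis, not a value recovered from the file.

```python
import struct

NC_DIMENSION = 0x0A  # tag that introduces the dimension list


def parse_dims(header):
    """Return [(name, length), ...] from a netCDF classic header."""
    if header[:3] != b"CDF":
        raise ValueError("not a netCDF classic file")
    pos = 4  # skip "CDF" and the version byte
    _numrecs, tag, nelems = struct.unpack_from(">III", header, pos)
    pos += 12
    if tag != NC_DIMENSION:
        raise ValueError("dimension list tag not found")
    dims = []
    for _ in range(nelems):
        (name_len,) = struct.unpack_from(">I", header, pos)
        pos += 4
        if name_len > len(header) - pos:
            raise ValueError(f"implausible name length {name_len}")
        name = header[pos:pos + name_len].decode("ascii")
        pos += (name_len + 3) & ~3  # names are padded to 4 bytes
        (dim_len,) = struct.unpack_from(">I", header, pos)
        pos += 4
        dims.append((name, dim_len))
    return dims


def build_header(dims):
    """Build a header holding only a dimension list, for testing."""
    out = b"CDF\x01" + struct.pack(">III", 0, NC_DIMENSION, len(dims))
    for name, length in dims:
        raw = name.encode("ascii")
        out += struct.pack(">I", len(raw)) + raw
        out += b"\x00" * (-len(raw) % 4) + struct.pack(">I", length)
    return out


good = build_header([("N_PROF", 13), ("N_PARAM", 3)])  # 13 is assumed
bad = good.replace(b"\x0d", b"")  # simulate the deleted CR byte
print(parse_dims(good))
try:
    parse_dims(bad)
except ValueError as err:
    print(err)  # the name length is now read from bytes 00 00 07 4e
```

In the damaged copy the N_PROF length is read as 0 and the next four
bytes, 00 00 07 4e, are taken as the name length, which is the same
byte pattern that appears in the hex dump above.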
A similar error occurs some time after reading in the header
information for the "DATA_CENTRE" variable. Looking at the header, it
appears another byte has been deleted before the name length of the
"DATE_CREATION" variable, so that the first character "D" is
interpreted as part of the length of a variable named
"ATE_CREATION", which makes the length wrong. Inserting a byte to
restore the right length for the "DATE_CREATION" name gets further,
to where the "WMO_INST_TYPE" variable again has a byte missing
somewhere before the name.
It appears that additional single bytes have been deleted later in
the header as well.
To diagnose the cause of this problem, it would be useful to know
something about how this file was created and what subsequent
processing occurred. You could narrow down where the problem occurred
by using something like "ncdump -h" or "ncdump -c" on the file when it
is first created and subsequently at each stage of copying, moving, or
processing it to determine exactly what process deleted the bytes from
the header.
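If you still have a copy of the file from more than one stage, a
byte-level comparison can complement ncdump by showing exactly where
two copies first diverge. The following Python sketch assumes both
copies fit in memory; first_difference and compare_files are
hypothetical helper names used only for this example.

```python
def first_difference(a, b):
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One copy is a strict prefix of the other.
        return min(len(a), len(b))
    return None


def compare_files(path_a, path_b):
    """Report the sizes of two copies and where they first differ."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    return len(a), len(b), first_difference(a, b)
```

A copy that is a few bytes shorter than the original, with the first
difference falling where the original has a 0d byte, would support
the text-mode transfer explanation.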
If the file was created this way, the program that created the file is
suspect. If you suspect the problem is in the netCDF library and can
provide us with an example of a program that creates a file with such
a corrupt header, we would be very interested in trying to reproduce
and fix the problem. But I've never encountered a case of the netCDF
library creating corrupt headers with a few missing bytes. It seems
to me more likely that this is a symptom of a processing problem or a
hardware error. If the file is the result of sending bytes over a
communication channel with no error checking, for example, it may just
be caused by dropped bytes in the communications channel.
Please let us know if you find anything further about the cause of
this ...
--Russ
_____________________________________________________________________
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu
------- End of Forwarded Message