Re: 980421: SunOS 5.5.1, nc_close() wild free()-ing?
- Date: Tue, 21 Apr 1998 09:58:31 -0600
>To: address@hidden
>cc: address@hidden
>From: Phil Sackinger <address@hidden>
>Subject: SunOS 5.5.1, nc_close() wild free()-ing?
>Organization: Sandia National Labs
>Keywords: 199804211430.IAA14553
Hi Phil,
> Lately I've experienced unusual memory problems under SunOS 5.5.1 with
> our finite element application program (goma) that uses the EXODUS II
> API and netCDF.
>
> Basically, memory that has been dynamically allocated by goma for its
> own use ends up getting freed somehow deep down in netcdf. Later,
> these (int *) variables that goma needs point to wild locations that
> give a Bus Error during execution and, compiled under Purify 4.0,
> provide Memory Segment Errors (MSE) and Free Memory Reads (FMR).
>
> In more detail, an integer array exo->eb_num_nodes_per_elem[] that was
> dynamically allocated earlier in goma and filled with meaningful data
> (malloc() returning a "nice" address like 0x599848) seems to point to
> a not-so-nice address like 0x3f800000. (If it were cleanly free()-ed I
> thought maybe it would be set to point to NULL (0x0)?)
>
> The log file from Purify is shown below.
> ______________________________________________________________________________
> **** Purify instrumented gomad (pid 27063 at Mon Apr 20 15:55:39 1998)
> * Purify 4.0 Solaris 2, Copyright (C) 1992-1996 Pure Software Inc. All
> rights reserved.
> * For contact information type: "purify -help"
> * For TTY output, use the option "-windows=no"
> * Command-line: gomad -a -i ttc.input
> * Options settings: -purify -cache-dir=/tmp \
> -purify-home=/usr/local/pure/purify-4.0-solaris2 \
> -real_ild_linker=/opt/SUNWspro/bin/../SC4.2/bin/ild
> * Purify licensed to SANDIA NATIONAL LABORATORIES
> * Purify checking enabled.
>
> **** Purify instrumented gomad (pid 27063) ****
> Process 27064 about to exec /bin/sh as "sh".
>
> **** Purify instrumented gomad (pid 27063) ****
> Process 27066 about to exec /bin/sh as "sh".
>
> **** Purify instrumented gomad (pid 27063) ****
> FMR: Free memory read:
> * This is occurring while in:
> read_mesh_exoII [rd_mesh.c:267]
> main [main.c:472]
> _start [crt1.o]
> * Reading 4 bytes from 0x5e7404 in the heap.
> * Address 0x5e7404 is 1061 bytes past end of a freed block at 0x5e6f90 of
> 80 bytes.
> * This block was allocated from:
> malloc [rtlib.o]
> new_NC [nc.c:93]
> nc__open [nc.c:914]
> nc_open [nc.c:961]
> ncopen [v2i.c:167]
> ex_open [libexoIIc.a]
> * There have been 10 frees since this block was freed from:
> free [rtlib.o]
> free_NC [nc.c:84]
> nc_close [nc.c:1030]
> ncclose [v2i.c:206]
> ex_close [libexoIIc.a]
> rd_exo [rd_exo.c:826]
>
> **** Purify instrumented gomad (pid 27063) ****
> MSE: Memory segment error:
> * This is occurring while in:
> read_mesh_exoII [rd_mesh.c:267]
> main [main.c:472]
> _start [crt1.o]
> * Accessing a memory range that crosses a memory segment boundary.
> Addressing 0x3f800000 for 4 bytes ending at 0x3f800004,
> which is neither in the heap nor the main stack.
>
> ______________________________________________________________________________
>
> Initially I thought using the newer versions of the EXODUS II library
> (upgrading from 2.17 to 3.00) and the netCDF library (upgrading from
> 3.3.1 to 3.4) might alleviate this problem, but it does not seem to
> have helped.
>
> Usually Purify helps track down these problems better, but evidently
> in this case all I've been able to determine is that, in the routine
> free_NC(), when free(ncp) is performed, it appears the nice memory
> handle created by malloc() for use independent of netCDF is getting
> corrupted.
>
> The ex_close() routines do appear to call some routines like
> rm_stat_ptr() that call free() also, but Purify evidently indicates
> the problems originate with the free() calls occurring in nc.c.
>
> Since I'm not familiar with the data structures like "struct obj_stats
> **obj_ptr" used in EXODUS II nor the "struct NC" used by netCDF, I
> cannot easily determine if they are being misused to free() something
> they shouldn't.
>
>
> Also, FWIW, other symptoms include unrelated data structures in goma
> getting written over somehow with "other" data - evidently without
> attracting a warning from Purify.
>
> The routines in goma open the EXODUS/netCDF file, read the data,
> allocating memory as needed, and close the file immediately after all
> the interesting data has been read. The actual memory segment error
> occurs once the ex_close()/ncclose() has taken place when a print
> statement attempts to look at some of the data in
> exo->eb_num_nodes_per_elem[0].
>
>
> At this point I'm almost ready to suspect the Solaris malloc() of
> being broken. Building the whole thing on another platform should be
> a good check to see if this might be the case.
>
> OTOH, since we've historically undertaken much of our development
> under Solaris, it would be nice if these essential APIs worked
> solidly on the Sun.
>
> Any suggestions, comments, hints, or recommendations would be
> most appreciated.
I just ran our extensive netCDF test for the C interface, nc_test, under
SunOS 5.6 and Purify 4.1:

    Purify instrumented ./nc_test (pid 11828 at Tue Apr 21 09:44:21 1998)
    Purify 4.1 Solaris 2, Copyright (C) 1992-1997 Rational Software Corp. All
    rights reserved.
    For contact information type: "purify -help"
    For TTY output, use the option "-windows=no"
    Command-line: ./nc_test
    Options settings: -purify -purify-home=/opt/pure/purify-4.1-solaris2
    Purify licensed to UCAR Unidata or Purify Evaluation User
    Purify checking enabled.

This is a very extensive test, exercising each documented interface
multiple times. The result was

    Memory leaked: 0 bytes (0%); potentially leaked: 0 bytes (0%)

    Purify Heap Analysis (combining suppressed and unsuppressed blocks)
                           Blocks        Bytes
                  Leaked        0            0
      Potentially Leaked        0            0
                  In-Use        0            0
      ----------------------------------------
         Total Allocated        0            0

    Program exited with status code 0.
This doesn't prove there are no memory allocation errors in the netCDF
library, but yours is the only report of a symptom of a malloc/free
problem we've seen since releasing netCDF 3.4, which is now in use at
hundreds of other sites. From this, I would first suspect that the
error you are seeing is a symptom of a problem somewhere else ...
If you are still suspicious of the netCDF library on your platform
(SunOS 5.5.1), you might also try running nc_test under Purify ...
_____________________________________________________________________
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu