[LDM #GNY-856216]: LDM 6.12.6 has crashed twice today with this message...
- Subject: [LDM #GNY-856216]: LDM 6.12.6 has crashed twice today with this message...
- Date: Tue, 28 Oct 2014 16:01:53 -0600
Gilbert,
> Hello Tom and Steve,
>
> Been a rough day for weather3.admin.niu.edu. The LDM
> crashed twice today on it. Here's the latest crash
> message:
>
> Oct 28 15:10:54 weather3 pqact[11012] ERROR: [filel.c:305] Deleting failed
> PIPE entry: pid=19989, cmd="dcgrib2 -d data/gempak/logs/dcgrib2_AWC_TURB.log
> -e GEMTBL=/home/gempak/GEMPAK7/gempak/tables
> data/gempak/model/awc/YYYYMMDD_turb.gem"
dcgrib2(1) eh? That program has caused more problems...
> Oct 28 15:10:54 weather3 pqact[11012] ERROR: child 19988 exited with status 1
> Oct 28 15:10:54 weather3 pqact[11012] ERROR: child 19989 exited with status 1
The last error message above (child 19989) is due to the pqact(1) process
noticing that the dcgrib2(1) process terminated with an unsuccessful exit
status. You should check the file "data/gempak/logs/dcgrib2_AWC_TURB.log" for
the reason that dcgrib2(1) failed.
> Oct 28 15:10:54 weather3 ldmd[11006] NOTE: child 11012 terminated by signal
> 11: /home/ldm/bin/pqact -f NEXRAD3|UNIDATA /home/ldm/etc/pqact.gempak
The message above is due to the top-level LDM process noticing that the
pqact(1) process terminated due to receiving a SIGSEGV (segmentation
violation). Ouch! That shouldn't happen.
Did you build the pqact(1) program with debugging enabled? Is there a core
file? If so, what's the stack trace?
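If it helps when scanning a long log for these events, here is a minimal Python sketch that pulls out the two kinds of child-termination messages quoted above and translates signal numbers into names. The function name `summarize` and the regular expressions are illustrative assumptions based only on the wording of the messages in this ticket; they are not part of the LDM.

```python
import re
import signal

# Patterns based on the wording of the messages quoted in this ticket
# (an assumption; other LDM versions may word them differently).
EXITED = re.compile(r"child (\d+) exited with status (\d+)")
SIGNALED = re.compile(r"child (\d+) terminated by signal\s+(\d+)")


def summarize(log_text):
    """Return (pid, description) pairs for terminated children, in log order."""
    # Mail clients wrap long log lines and prefix them with "> ",
    # so strip the quote markers and rejoin everything on single spaces.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())

    events = []
    for m in EXITED.finditer(flat):
        events.append((m.start(), int(m.group(1)), f"exit status {m.group(2)}"))
    for m in SIGNALED.finditer(flat):
        num = int(m.group(2))
        try:
            name = signal.Signals(num).name  # e.g. 11 -> SIGSEGV
        except ValueError:
            name = "unknown"
        events.append((m.start(), int(m.group(1)), f"signal {num} ({name})"))

    # Sort by position in the text so the output follows the log order.
    return [(pid, desc) for _, pid, desc in sorted(events)]


sample = """\
> Oct 28 15:10:54 weather3 pqact[11012] ERROR: child 19989 exited with status 1
> Oct 28 15:10:54 weather3 ldmd[11006] NOTE: child 11012 terminated by signal
> 11: /home/ldm/bin/pqact -f NEXRAD3|UNIDATA /home/ldm/etc/pqact.gempak
"""

for pid, desc in summarize(sample):
    print(pid, desc)
```

A child that merely "exited with status 1" (dcgrib2 here) reported its own failure, whereas one "terminated by signal 11" (pqact here) was killed by the operating system, which is why the second case is the one that calls for a core file and stack trace.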
> Oct 28 15:10:54 weather3 ldmd[11006] NOTE: Killing (SIGTERM) process group
> Oct 28 15:10:54 weather3 sasquatch.tamu.edu(feed)[12090] NOTE: Exiting
> Oct 28 15:10:54 weather3 sasquatch.tamu.edu(feed)[12084] NOTE: Exiting
> Oct 28 15:10:54 weather3 weather.admin.niu.edu(feed)[8570] NOTE: Exiting
> Oct 28 15:10:54 weather3 sasquatch.tamu.edu(feed)[12087] NOTE: Exiting
> Oct 28 15:10:54 weather3 ldmd[11006] NOTE: Exiting
> Oct 28 15:10:54 weather3 96.8.93.16[11036] NOTE: Exiting
> Oct 28 15:10:54 weather3 pqact[11015] NOTE: Exiting
> Oct 28 15:10:54 weather3 sasquatch.tamu.edu(feed)[12086] NOTE: Exiting
> Oct 28 15:10:54 weather3 sasquatch.tamu.edu(feed)[12088] NOTE: Exiting
> Oct 28 15:10:54 weather3 hprcc2.unl.edu(feed)[19032] NOTE: Failure;
> COMINGSOON: RPC: Unable to receive; errno = Bad file descriptor
> Oct 28 15:10:54 weather3 pqact[11017] NOTE: Exiting
> Oct 28 15:10:54 weather3 ldm-relay1.tamu.edu(feed)[11090] NOTE: Exiting
> Oct 28 15:10:54 weather3 96.8.94.15[11035] NOTE: Exiting
> Oct 28 15:10:54 weather3 pqact[11041] NOTE: Exiting
> Oct 28 15:10:54 weather3 pqsurf[11020] NOTE: Exiting
> --More--
>
> I am getting lots of those GEMPAK errors, and I have no idea why.
It's possible that if the dcgrib2(1) process can be made to work properly, then
the parent pqact(1) process won't crash. This doesn't excuse pqact(1), but it
might be a quicker workaround than waiting for me to debug pqact(1).
Check that dcgrib2(1) log file.
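As a rough sketch of one way to look at the end of that log file, the following Python snippet prints its last few lines. The helper name `tail` and the default of 20 lines are arbitrary choices for illustration, and the relative path is copied from the pqact(1) message above, so it assumes you run this from the directory that pqact(1) treats as its working directory.

```python
from collections import deque
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    # A deque with maxlen keeps only the most recent n lines while reading.
    with open(path, errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


if __name__ == "__main__":
    # Path taken from the "-d" option in the failed PIPE entry (relative path;
    # the directory it is relative to is an assumption).
    log = Path("data/gempak/logs/dcgrib2_AWC_TURB.log")
    if log.exists():
        for line in tail(log):
            print(line)
    else:
        print(f"{log} not found; try running from pqact's working directory")
```

The messages written just before 15:10:54 are the ones most likely to explain why dcgrib2(1) exited with status 1.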
> Permission to log in if necessary granted.
>
> Gilbert
>
> *******************************************************************************
> Gilbert Sebenste ********
> (My opinions only!) ******
> Staff Meteorologist, Northern Illinois University ****
> E-mail: address@hidden ***
> web: http://weather.admin.niu.edu **
> Twitter: http://www.twitter.com/NIU_Weather **
> Facebook: http://www.facebook.com/niu.weather *
> *******************************************************************************
Regards,
Steve Emmerson
Ticket Details
===================
Ticket ID: GNY-856216
Department: Support LDM
Priority: Normal
Status: Open