
[LDM #CTE-620214]: Satellite images slow or halt each day



The error messages mean that nothing was received by the downstream LDM process 
for 35 seconds -- which is odd because the corresponding upstream LDM process 
sends something at least every 30 seconds.
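
As a rough illustration of that reasoning, the sketch below scans a list of arrival times for silences longer than the 35-second limit. The arrival times and the function name are hypothetical and are not taken from your log; only the 35-second and 30-second figures come from the discussion above.

```python
from datetime import datetime, timedelta

# From the discussion above: the downstream process complains after 35 s of
# silence, while the upstream process sends something at least every 30 s.
TIMEOUT = timedelta(seconds=35)


def silent_gaps(arrivals, timeout=TIMEOUT):
    """Return (start, end) pairs of consecutive arrivals more than `timeout` apart."""
    ordered = sorted(arrivals)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > timeout]


# Hypothetical arrival times, for illustration only.
t0 = datetime(2020, 1, 1, 12, 0, 0)
arrivals = [t0 + timedelta(seconds=s) for s in (0, 30, 60, 100, 130)]
gaps = silent_gaps(arrivals)
```

With a healthy connection every interval would be 30 seconds or less and the list of gaps would be empty. Any entry in it means something delayed traffic by more than 5 seconds beyond the upstream's schedule.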

I suspect, therefore, that there was considerable network congestion during 
that time.

Because the "Desired product class" messages all have different MD5 signatures 
for the last successfully-received data-product, the downstream LDM did receive 
data-products even during the time of congestion.

All in all, I'm afraid the error messages don't show anything that would 
account for a multiple-hour outage of radar data. Do you have any other 
evidence?

I see you're using the default value of 500 MB for the size of the queue. This 
might be too small. Would you please send the output of the command "pqmon -S"?
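
A common rule of thumb is that the queue should be able to hold at least as much data as arrives during the maximum-latency period (3600 seconds in the configuration quoted below). The sketch below is only back-of-the-envelope arithmetic: the ingest rate is purely hypothetical, the function names are mine, and "500M" is read as 500 * 10^6 bytes, which is an assumption.

```python
def min_queue_bytes(bytes_per_second, max_latency_seconds=3600):
    """Smallest queue (in bytes) that holds one max-latency period of data."""
    return int(bytes_per_second * max_latency_seconds)


def residence_seconds(queue_bytes, bytes_per_second):
    """Roughly how long a product stays in the queue before being overwritten."""
    return queue_bytes / bytes_per_second


QUEUE_BYTES = 500 * 10**6       # "500M bytes", read as 500e6 (assumption)
HYPOTHETICAL_RATE = 1_000_000   # 1 MB/s; NOT a measured value

needed = min_queue_bytes(HYPOTHETICAL_RATE)
age = residence_seconds(QUEUE_BYTES, HYPOTHETICAL_RATE)
```

At that hypothetical rate a 500 MB queue would keep products for only about 500 seconds, far short of the 3600-second maximum latency. The actual age of the oldest product in the queue is what the "pqmon" output would show.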

> Here you go,
> 
> hostname:              xxx.xxx.xxx
> os:                    Linux
> release:               4.1.12-124.25.1.el7uek.x86_64
> ldmhome:               /home/ldm
> LDM version:           6.13.6
> PATH:                  /home/ldm/ldm-6.13.6/bin:/home/ldm/decoders:/home/ldm/util:/home/ldm/bin:/home/ldm/decoders:/home/ldm/util:/home/ldm/bin:/home/ldm/perl5/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/ldm/.local/bin:/home/ldm/bin:/home/gempak/GEMPAK6.7.0/os/linux64/bin:/home/gempak/GEMPAK6.7.0/bin
> LDM conf file:         /home/ldm/etc/ldmd.conf
> pqact(1) conf file:    /home/ldm/etc/pqact.conf
> scour(1) conf file:    /home/ldm/etc/scour.conf
> product queue:         /home/ldm/var/queues/ldm.pq
> queue size:            500M bytes
> queue slots:           default
> reconciliation mode:   do nothing
> pqsurf(1) path:        /home/ldm/var/queues/pqsurf.pq
> pqsurf(1) size:        2M
> IP address:            0.0.0.0
> port:                  388
> PID file:              /home/ldm/ldmd.pid
> Lock file:             /home/ldm/.ldmadmin.lck
> maximum clients:       256
> maximum latency:       3600
> time offset:           3600
> log file:              /home/ldm/var/logs/ldmd.log
> numlogs:               7
> log_rotate:            1
> netstat:               /bin/netstat -A inet -t -n
> top:                   /bin/top -b -n 1
> metrics file:          /home/ldm/var/logs/metrics.txt
> metrics files:         /home/ldm/var/logs/metrics.txt*
> num_metrics:           4
> check time:            1
> delete info files:     0
> ntpdate(1):            /usr/sbin/ntpdate
> ntpdate(1) timeout:    5
> time servers:          ntp.ucsd.edu ntp1.cs.wisc.edu ntppub.tamu.edu otc1.psu.edu timeserver.unidata.ucar.edu
> time-offset limit:     10

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: CTE-620214
Department: Support LDM
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.