[LDM #CTE-620214]: Satellite images slow or halt each day
- Date: Mon, 12 Aug 2019 12:34:31 -0600
The error messages mean that nothing was received by the downstream LDM process
for 35 seconds -- which is odd because the corresponding upstream LDM process
sends something at least every 30 seconds.
I suspect, therefore, that there was considerable network congestion during
that time.
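One way to look for such stalls is sketched below: given the times at which the downstream LDM logged receiving something, list every gap longer than the 30-second interval mentioned above. The function name and the sample timestamps are made up for illustration. Extracting real timestamps from ldmd.log is left out because the log format depends on the LDM version and logging setup.

```python
from datetime import datetime, timedelta

def find_gaps(arrivals, limit=timedelta(seconds=30)):
    """Return (start, end, gap) for each pair of consecutive
    arrival times that are more than `limit` apart."""
    ordered = sorted(arrivals)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > limit:
            gaps.append((prev, cur, cur - prev))
    return gaps

# Hypothetical arrival times, not taken from any real log.
sample = [
    datetime(2019, 8, 12, 12, 0, 0),
    datetime(2019, 8, 12, 12, 0, 25),
    datetime(2019, 8, 12, 12, 1, 0),   # 35 s after the previous arrival
    datetime(2019, 8, 12, 12, 1, 20),
]

for start, end, gap in find_gaps(sample):
    print(f"{start:%H:%M:%S} -> {end:%H:%M:%S}: "
          f"{gap.total_seconds():.0f} s without data")
```

A handful of isolated gaps just over the limit would fit brief congestion; a gap lasting hours would be a different matter.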
Because the "Desired product class" messages all have different MD5 signatures
for the last successfully-received data-product, the downstream LDM did receive
data-products even during the time of congestion.
All in all, I'm afraid the error messages don't show anything that would
account for a multiple-hour outage of radar data. Do you have any other
evidence?
I see you're using the default value of 500 MB for the size of the queue. This
might be too small. Would you please send the output of the command "pqmon -S"?
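As a rough way to judge whether 500 MB is enough: the queue generally needs to hold at least as much data as arrives within the maximum latency (3600 seconds in the configuration quoted below). The arithmetic can be sketched as follows. The ingest rate is a placeholder, not a measurement from your system; the function name is hypothetical; and "500M" is treated as decimal megabytes for simplicity.

```python
def residence_seconds(queue_bytes, ingest_bytes_per_second):
    """Approximate time a product stays in the queue before being
    overwritten, assuming a steady ingest rate."""
    return queue_bytes / ingest_bytes_per_second

QUEUE_BYTES = 500 * 10**6      # queue size from the registry: 500M bytes
MAX_LATENCY_S = 3600           # maximum latency from the registry
ASSUMED_RATE = 1 * 10**6       # placeholder: 1 MB/s, NOT a measured value

held = residence_seconds(QUEUE_BYTES, ASSUMED_RATE)
if held < MAX_LATENCY_S:
    print(f"Queue holds about {held:.0f} s of data, "
          f"less than the {MAX_LATENCY_S} s maximum latency")
else:
    print(f"Queue holds about {held:.0f} s of data, "
          f"which covers the {MAX_LATENCY_S} s maximum latency")
```

The age of the oldest product in the queue, which pqmon reports, is the measured counterpart of this estimate.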
> Here you go,
>
> hostname: xxx.xxx.xxx
> os: Linux
> release: 4.1.12-124.25.1.el7uek.x86_64
> ldmhome: /home/ldm
> LDM version: 6.13.6
> PATH:
> /home/ldm/ldm-6.13.6/bin:/home/ldm/decoders:/home/ldm/util:/home/ldm/bin:/home/ldm/decoders:/home/ldm/util:/home/ldm/bin:/home/ldm/perl5/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/ldm/.local/bin:/home/ldm/bin:/home/gempak/GEMPAK6.7.0/os/linux64/bin:/home/gempak/GEMPAK6.7.0/bin
> LDM conf file: /home/ldm/etc/ldmd.conf
> pqact(1) conf file: /home/ldm/etc/pqact.conf
> scour(1) conf file: /home/ldm/etc/scour.conf
> product queue: /home/ldm/var/queues/ldm.pq
> queue size: 500M bytes
> queue slots: default
> reconciliation mode: do nothing
> pqsurf(1) path: /home/ldm/var/queues/pqsurf.pq
> pqsurf(1) size: 2M
> IP address: 0.0.0.0
> port: 388
> PID file: /home/ldm/ldmd.pid
> Lock file: /home/ldm/.ldmadmin.lck
> maximum clients: 256
> maximum latency: 3600
> time offset: 3600
> log file: /home/ldm/var/logs/ldmd.log
> numlogs: 7
> log_rotate: 1
> netstat: /bin/netstat -A inet -t -n
> top: /bin/top -b -n 1
> metrics file: /home/ldm/var/logs/metrics.txt
> metrics files: /home/ldm/var/logs/metrics.txt*
> num_metrics: 4
> check time: 1
> delete info files: 0
> ntpdate(1): /usr/sbin/ntpdate
> ntpdate(1) timeout: 5
> time servers: ntp.ucsd.edu ntp1.cs.wisc.edu ntppub.tamu.edu
> otc1.psu.edu timeserver.unidata.ucar.edu
> time-offset limit: 10
Regards,
Steve Emmerson
Ticket Details
===================
Ticket ID: CTE-620214
Department: Support LDM
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.