This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Carissa,

re:
> Our Boulder data center is offline right now, that is why woc is not
> responding.

OK.

re:
> It should be back online by Friday.

Thanks.

re:
> Let me know if you need any assistance with these latency issues.

Something is definitely wrong. The problem is that we don't know what that
might be.

Observations:

- the machine we operate to REQUEST CONDUIT from both conduit.ncep.noaa.gov
  and ncepldm4.woc.noaa.gov is basically idling (meaning that its load
  average is VERY low, like 0.05 for the 5-minute load number reported by
  'top'). So, any latencies we experience on this machine
  (daffy.unidata.ucar.edu) are not likely to be caused by the machine
  itself.

  The latencies that daffy is seeing from conduit.ncep.noaa.gov parallel
  those being experienced at Penn State and UW/AOS.

  Here is the URL for daffy's CONDUIT latency graph:

  http://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+daffy.unidata.ucar.edu

  Here is the same kind of plot for the UW/AOS machine:

  http://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+idd.aos.wisc.edu

  I would list Penn State's machine, but Art stopped REQUESTing/relaying
  the 0.25 degree GFS data yesterday.

- this morning, I changed the 5-way split of REQUEST lines for CONDUIT data
  to a 10-way split to see if this would result in reduced latencies on
  daffy (an illustrative REQUEST split is sketched after this list). The
  verdict is not in on this test, but there seems to be some reduction in
  the latencies being experienced. If this trend continues, it might mean
  that there is some sort of per-connection bandwidth limitation going on
  somewhere. The "classic" example of bandwidth limitation is when there is
  "packet shaping" going on. Another classic example is when there is
  simply not enough bandwidth to handle the volume of data being relayed.

- A review of the traffic flowing out of the conduit.ncep.noaa.gov cluster
  and through any/all switches seems to be in order at this point. I say
  this because our monitoring of the bandwidth flowing out of our top-level
  IDD relay cluster, idd.unidata.ucar.edu, showed us that the volumes would
  plateau on some of the real server backends, and this plateauing effect
  was not a function of a maximum in the data that was being sent. We
  learned that some of our relay cluster backend machines were connected to
  UCAR switches that had a 1 Gbps maximum, and that there were other
  machines also using lots of bandwidth on those switches. We also learned
  that the volume of data that both our cluster front end "accumulator"
  machines and our backend real servers were sending was greater than
  1 Gbps for substantial portions of the day. We worked with the network
  folks to move our machines to lessen the impact of the other machines,
  and we bonded two Gbps Ethernet interfaces together on each of the
  machines so that up to 2 Gbps could be sent from each machine (an
  illustrative bonding configuration is also sketched after this list).
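As a point of reference, here is a minimal sketch of the kind of ldmd.conf
REQUEST split being described; the patterns and upstream host are
illustrative only, not necessarily the exact entries in use on daffy. The
idea is to match the trailing digit(s) of the CONDUIT product sequence
number so that each REQUEST (and therefore each connection) carries roughly
an equal slice of the feed:

    # illustrative 5-way split: each REQUEST matches two of the ten
    # possible trailing digits of the CONDUIT sequence number
    REQUEST CONDUIT "[09]$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "[18]$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "[27]$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "[36]$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "[45]$" conduit.ncep.noaa.gov

    # illustrative 10-way split: one REQUEST (one connection) per digit
    REQUEST CONDUIT "0$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "1$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "2$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "3$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "4$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "5$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "6$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "7$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "8$" conduit.ncep.noaa.gov
    REQUEST CONDUIT "9$" conduit.ncep.noaa.gov

Since each REQUEST is serviced over its own connection, any per-connection
throughput cap (packet shaping, for example) limits each stream rather than
the whole feed, which is why going from five connections to ten can reduce
latencies if such a cap exists.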
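Similarly, here is a minimal sketch of the kind of Ethernet bonding
mentioned above, in the style of a RHEL/CentOS system; the bonding mode,
device names, and addresses are placeholders, and the switch ports have to
be configured to match (for example, an LACP port channel for mode 802.3ad):

    # /etc/sysconfig/network-scripts/ifcfg-bond0  (placeholder address)
    DEVICE=bond0
    TYPE=Bond
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=192.0.2.10
    NETMASK=255.255.255.0
    BONDING_OPTS="mode=802.3ad miimon=100"

    # /etc/sysconfig/network-scripts/ifcfg-eth0  (repeat for eth1)
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=none
    MASTER=bond0
    SLAVE=yes

With both physical interfaces enslaved to bond0, the machine can push up to
the sum of their line rates (2 Gbps here), subject to how well the bonding
mode and the switch distribute flows across the two links.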
Questions about the NCEP CONDUIT top-level setup:

- how many downstream machines are currently REQUESTing CONDUIT? And, is it
  possible that the sum of the volume of data attempting to be relayed
  exceeds 1 Gbps (or whatever your Ethernet interface and network
  supports)? If yes, this is a possible source of the latency problem.

- how much other traffic is on the NCEP network where conduit.ncep.noaa.gov
  operates?

- what is the capacity of the network through which the CONDUIT data is
  flowing?

- the real-time stats plots suggest that the sources of the CONDUIT data
  are virtual machines, for example:

  vm-lnx-conduit2.ncep.noaa.gov

  Are these machines really VMs? If yes, is it possible that there is some
  sort of limitation in the VM networking?

I am sure that there are other questions that should be posed at this
point, but I think that it is important for the above to get through soon
so that someone on your side (Data Flow team?) can start thinking about
potential bottlenecks there.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                          Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************

Ticket Details
===================
Ticket ID: GSZ-336115
Department: Support CONDUIT
Priority: Normal
Status: Closed