This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Pete,

re:
> We have some money ($16K or so) to spend on a new data ingest/archive
> machine, and I am curious if you have any suggestions on what to
> look for/avoid as I'm spec'ing it out.

We don't have much experience on the archive end, so all comments will be related to data ingest and relay.

> Ideally, this will be our primary ldm ingest/feed machine, to
> replace f5.aos.wisc.edu. I'd like to have all of the data feeds
> that I currently get running through it (DDPLUS, HDS, MCIDAS, CONDUIT,
> NIMAGE, NEXRAD, NEXRAD2, etc), be able to feed several downstream
> sites that I currently feed, and have the power and io bandwidth
> to be able to store data locally in native and decoded to gempak
> and/or netcdf format, and make this data available via nfs to our
> computer classroom and other machines on our network.

So, you want this machine to serve as your toplevel relay AND do data decoding? Our approach was to split data decoding and serving (NFS, ADDE) onto different machines/groups of machines.

You may already be aware that the toplevel IDD relay that we operate here at the UPC, idd.unidata.ucar.edu, is actually a Linux cluster composed of:

2 - accumulators: machines that request data feeds from upstream sites
1 - director: a machine that sends feed requests to the back-end data servers
4 - data servers: the machines that feed downstream sites

For a couple of years the accumulator machines consisted of a dual 1.8 GHz Opteron PC w/2 GB of RAM running Fedora Core Linux (1, then 3, then 4) and a dual 1 GHz P4 box w/3 GB of RAM running FreeBSD 4.x. We have recently upgraded the Opteron box to a dual 2.8 GHz Xeon EM64T (64-bit) box w/6 GB of RAM running 64-bit Fedora Core 5 Linux. We have plans to upgrade the FreeBSD box in the not too distant future.

The director is currently a dual 2.8 GHz, 32-bit Xeon machine, but it really doesn't need to be; we could run a much less richly configured machine as the director.

The data servers are all dual 2 GHz Opteron boxes with 14 or 16 GB of RAM. The large amount of memory allows us to keep over two hours of ALL of the data being relayed in the IDD in the LDM queue. The cluster was shown to be able to relay up to 900 Mbps of data in stress tests almost a year ago. It currently relays over 230 Mbps of data _on average_ to about 400 downstream connections, and routinely has peak relay rates of 440 Mbps. Having 4 data servers in the cluster and a Gbps network allows us to act as the failover for any/all IDD connections in the world.

The entire cluster cost under $25K to put together. A more modest cluster could be put together for under $15K. The sizing of the cluster would depend on how many downstream sites one desired to be able to feed and how much data one wanted to have available in the LDM queue.

> I'd also like to keep on the order of a year's worth of some of this
> data available online, so a large amount of storage (probably RAID5?)
> is needed as well.

I would recommend that you split your archive and ingest/relay duties between two or more machines. For instance, you could purchase a machine similar to one of the data servers we are using (dual Opteron or Xeon EM64T) with a lot of memory (>= 6 GB) for about $5K. I would dedicate a system like this to ingest and relay. I would then get another machine with a large RAID to do your data decoding, storage, and serving (NFS, etc.).
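Tying the two boxes together is just a matter of ldmd.conf entries on each side. Here is a minimal sketch, assuming hypothetical hostnames, an abbreviated feed set, and a stock pqact setup; all of these are placeholders you would adjust for your site:

    # On the ingest/relay box: request feeds from your upstream sites and
    # allow the decoder box plus your downstream sites to feed from you.
    # (Hostnames and the feed list are placeholders.)
    REQUEST IDS|DDPLUS ".*" upstream1.example.edu
    REQUEST CONDUIT    ".*" upstream2.example.edu
    ALLOW   ANY ^decoder\.aos\.wisc\.edu$
    ALLOW   ANY ^ldm\.downstream-site\.edu$

    # On the decoder/archive box (the one with the big RAID): request
    # everything from the relay box and run the decoders there.
    REQUEST ANY ".*" relay.aos.wisc.edu
    EXEC    "pqact etc/pqact.conf"

This keeps pqact and the GEMPAK/netCDF decoders entirely off of the relay box, so a busy or runaway decoder cannot interfere with the feeds to your downstream sites.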
To save one year's worth of data you will need a HUGE amount of disk storage. For instance, the LEAD project has a 40 TB RAID system that is intended to store at least 6 months of data. It is really uncertain how much data can actually be stored, but it currently appears to be much more than 6 months' worth (plus the LEAD RAID is storing lots and lots of model output, including ensembles).

> Does it make sense to have a single machine handle both of these tasks,

No.

> or would it maybe make more sense to get one machine that does the
> ingest/feed/decoding and short-term storage, and another that is
> aimed more at the long term storage?

I suggest moving the decoding off of the ingest/relay machine.

> I'm also wondering if you have any suggestions regarding SCSI/SATA,
> or iSCSI or Fibre Channel, etc..

On Linux boxes, external SCSI-based RAIDs appear to perform much better than RAIDs created using internally mounted disks and a RAID card. The RAIDs we have attached to Sun SPARC machines are all connected with Fibre Channel. These appear to work much better than the external units that are built of much cheaper disk drives, but your mileage may vary.

> What kind of machines are people currently buying for ldm
> ingest/feed/storage machines?

Penn State and U Nebraska-Lincoln are creating clusters similar to the one that we built. TAMU followed our lead in purchasing dual Opteron-based machines with lots of memory for data ingest and relay. TAMU also split off their processing from their ingest/relay duties, and it has apparently worked pretty well for them.

I realize that the cluster I described briefly above is not what you were asking for, but it includes a number of features that we feel are necessary for new ingest/relay machines: fast processors and lots of memory. Again, we don't have that much experience with archive systems (outside of LEAD, where folks are in a learning mode), so we can't say much there.

Cheers,

Tom

****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                             Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                          http://www.unidata.ucar.edu
****************************************************************************

Ticket Details
===================
Ticket ID: QXN-439339
Department: Support Platforms
Priority: Normal
Status: Closed