[LDM #HEQ-649192]: LDM fault tolerance
- Date: Mon, 20 Apr 2015 15:57:04 -0600
Geoffry,
> It looks like LDM is entirely a push system, with no way to re-request
> notifications of products from an upstream node that were missed while a
> node was down, is that correct?
If an upstream node is down for less than the minimum residency time of its
product-queue (typically one hour), then a downstream node requesting from it
won't miss anything.
> Then the only way to eliminate a single
> point of failure is to have multiple nodes receive these push notifications
> of new products (then deduping requests before potentially multiple retries
> to download the actual data)?
The source of the data is always a single point of failure.
When possible, we recommend that a downstream site make identical requests to
two distinct upstream sites. One of those upstream sites will then transfer
products in primary mode and the other will transfer products in alternate
mode. The primary mode transfer will be as fast as the network allows. The
alternate mode transfer will use very little bandwidth because the products
will likely have already arrived on the primary mode connection. If and when
the primary mode connection slows down or breaks, then the alternate mode
connection will be switched by the downstream site to primary mode.
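As a rough sketch, a pair of identical requests like the one described above might look like this in the downstream site's ldmd.conf. The two hostnames are placeholders, and the feedtype and pattern are only an example, so substitute whatever your site actually requests:

```
# Same feedtype and pattern, two different upstream hosts (placeholder names).
REQUEST ANY ".*" upstream1.example.edu
REQUEST ANY ".*" upstream2.example.edu
```

Nothing in these entries marks one as primary and the other as alternate. As described above, that choice is made and changed by the downstream site while it runs.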
The product-queue automatically removes duplicate data-products -- so having
two upstream feeds is safe as well as efficient.
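To illustrate why duplicate arrivals are harmless, here is a minimal Python sketch of a queue that rejects duplicates. It is not LDM code: the class and method names are made up for illustration, and it assumes each product is identified by an MD5 checksum of its contents.

```python
import hashlib
from collections import OrderedDict


class DedupQueue:
    """Toy stand-in for a product-queue that rejects duplicate products."""

    def __init__(self):
        # Maps signature -> product data, in arrival order.
        self._products = OrderedDict()

    def insert(self, data: bytes) -> bool:
        """Store the product; return False if it is already in the queue."""
        # Assumption: a product's identity is the MD5 checksum of its data.
        signature = hashlib.md5(data).digest()
        if signature in self._products:
            return False
        self._products[signature] = data
        return True

    def __len__(self):
        return len(self._products)


queue = DedupQueue()
primary = queue.insert(b"product A")    # arrives first on one feed
alternate = queue.insert(b"product A")  # same product from the other feed
```

Here the second insert is rejected and the queue holds one copy, regardless of which feed delivered it first.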
> This model assumes that you can request a download multiple times after you
> receive a notification that a product is available, is that true as well?
Not quite. See the above description.
Regards,
Steve Emmerson
Ticket Details
===================
Ticket ID: HEQ-649192
Department: Support LDM
Priority: Normal
Status: Closed