Re: Suggested LDM updates
- Subject: Re: Suggested LDM updates
- Date: Fri, 14 Jul 2000 17:06:32 -0600
Hi Bob,
Thanks for the clarification - I see now what you are trying to do with the
filtered products.
> We have tried the sequence number approach and it didn't work...
> In pqinsert.c you'll see that the MD5 signature, which is used
> to identify duplicates, is calculated on the product contents
> only, and does not depend on any part of the prod_info
> structure.
>
The sequence number is also part of the product contents. (I assume that's
where the LDM gets it in order to put it in the prod_info.) Here are the first
few bytes of a typical product. You'll see that the sequence number starts at
the 5th byte of the product.
001 \r \r \n 5 4 1 \r \r \n Y U I E 5
I was suggesting that you modify those bytes, not the info in prod_info.
I assume there are products for which this isn't the case. The truth is that I
don't yet know enough about the products to say which ones have a sequence
number in this location and which don't.
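For a product laid out like the sample above, overwriting the sequence number
bytes in place might look like the sketch below. It is only an illustration
under stated assumptions: the function name set_seq_num, the fixed offset of 4
and the fixed width of three digits are all hypothetical, taken from the one
sample shown here, and products with a padded or differently sized sequence
number would need different handling.

```c
#include <stdio.h>
#include <string.h>

#define SEQ_OFFSET 4   /* assumption: number follows "\001\r\r\n" */
#define SEQ_WIDTH  3   /* assumption: three digits, as in the sample */

/*
 * Hypothetical helper: overwrite the sequence number digits in place.
 * Returns 0 on success, or -1 if the buffer doesn't match the sample
 * layout or seq doesn't fit in SEQ_WIDTH digits.
 */
int
set_seq_num(char *prod, size_t len, unsigned seq)
{
    char digits[SEQ_WIDTH + 1];

    if (len < SEQ_OFFSET + SEQ_WIDTH + 3 || seq > 999)
        return -1;

    /* Expect "\001\r\r\n" before the number and "\r\r\n" right after it */
    if (memcmp(prod, "\001\r\r\n", SEQ_OFFSET) != 0 ||
        memcmp(prod + SEQ_OFFSET + SEQ_WIDTH, "\r\r\n", 3) != 0)
        return -1;

    /* Zero-padded digits; copy without the terminating '\0' */
    snprintf(digits, sizeof digits, "%03u", seq);
    memcpy(prod + SEQ_OFFSET, digits, SEQ_WIDTH);
    return 0;
}
```

Because the MD5 signature is calculated on the product contents, changing
these bytes changes the signature, whereas changing prod_info does not.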
Anyway, for the products to which this applies, there is some variation in the
format of the sequence number. I think some ingest sites pad it with blanks and
some don't; I'm not sure what all the variations are. Below is some code I wrote
that I use to determine whether to classify a product as a WMO product or not.
It does this by looking for two particular strings in the first few bytes of a
product. The characters in between those two strings are considered the
sequence number. If the user invokes the -5 option, I use this function to skip
over all those chars when calculating the checksum. That way, products that are
the same except for their sequence numbers will be considered duplicates when
the option is invoked.
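As a rough illustration of the skipping idea, the sketch below checksums a
product either as a whole or starting just past the sequence number. Several
things in it are assumptions made for the example: skip_wmo_header is a
condensed stand-in for wmo_prod, product_checksum and its skip_seq flag are
hypothetical names, and a 32-bit FNV-1a hash stands in for MD5 so that the
example needs no extra library. It is not the actual LDM code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SEQ_NUM_LEN 4

/*
 * Condensed stand-in for wmo_prod(): return a pointer just past
 * "\001\r\r\n<seq>\r\r\n", or NULL if the product doesn't start that way.
 * Like wmo_prod(), it assumes prod is NUL-terminated (strstr needs that).
 */
static const char *
skip_wmo_header(const char *prod)
{
    const char *end;

    if (strncmp(prod, "\001\r\r\n", 4) != 0)
        return NULL;
    end = strstr(prod + 4, "\r\r\n");
    if (end == NULL || end - (prod + 4) > MAX_SEQ_NUM_LEN)
        return NULL;
    return end + 3;
}

/* Placeholder checksum (32-bit FNV-1a); the LDM itself uses MD5. */
static uint32_t
fnv1a(const char *data, size_t len)
{
    uint32_t h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= (unsigned char)data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Hypothetical wrapper: when skip_seq is nonzero and the product has the
 * expected header, checksum only the bytes after the sequence number.
 * Otherwise checksum the whole product.
 */
uint32_t
product_checksum(const char *prod, size_t len, int skip_seq)
{
    const char *body = skip_seq ? skip_wmo_header(prod) : NULL;

    if (body == NULL)
        return fnv1a(prod, len);
    return fnv1a(body, len - (size_t)(body - prod));
}
```

With skip_seq set, two products that differ only in their sequence numbers
produce the same checksum; with it clear, they are checksummed byte for byte.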
I hope this helps. If you'd like more info let me know.
Anne
-----
#include <string.h>

/*
 * Determine if a product starts with the string
 * "\001\r\r\n<sequenceNumber>\r\r\n". If it doesn't, return 0. If it does,
 * return a pointer to the start of the product, skipping over those
 * leading control chars.
 *
 * A sequence number is expected to be any string of at most MAX_SEQ_NUM_LEN
 * digits with possibly leading or trailing blanks included in that count.
 * However, the only check done here is to see that the sequence number
 * consists of MAX_SEQ_NUM_LEN or fewer characters. No other checks are
 * performed.
 *
 * Note: strstr() requires prod to be NUL-terminated.
 */
char *
wmo_prod(const char *prod)
{
#define PART1_SIZE 4
#define PART2_SIZE 4
#define MAX_SEQ_NUM_LEN 4
    char part1[PART1_SIZE] = {'\001', '\r', '\r', '\n'};
    char part2[PART2_SIZE] = {'\r', '\r', '\n', '\0'}; /* '\0' is for strstr */
    char *startPart2;
    int seqNumLength;

    /*
     * If part1 is not at start of product, return
     */
    if ((strncmp(part1, prod, PART1_SIZE)) != 0)
        return 0;

    /*
     * If part2 doesn't occur somewhere after part1, return
     */
    if ((startPart2 = strstr(prod + PART1_SIZE, part2)) == 0)
        return 0;

    /*
     * Compute the length of the substring between part1 and part2 that
     * contains the sequence number
     */
    seqNumLength = (int)(startPart2 - (prod + PART1_SIZE));

    /*
     * Sanity check: if the length of the sequence number string is
     * too big, return
     */
    if (seqNumLength > MAX_SEQ_NUM_LEN)
        return 0;

    /*
     * If we got here, we've classified it as a wmo product.
     * Return a pointer to the beginning of the product.
     */
    return startPart2 + PART2_SIZE - 1; /* exclude trailing '\0' */
}
--
***************************************************
Anne Wilson UCAR Unidata Program
address@hidden P.O. Box 3000
Boulder, CO 80307
----------------------------------------------------
Unidata WWW server http://www.unidata.ucar.edu/
****************************************************