[LDM #GRO-598397]: Question about regular expression in pqact file
- Subject: [LDM #GRO-598397]: Question about regular expression in pqact file
- Date: Fri, 04 Sep 2009 13:27:56 -0600
Howard,
> I ran across the following issue when working on my pqact.conf_exp
> file.
>
> I have files that are similar to the following example:
>
> WWRFDM01-RCI_20090904T0000_20090904T0015_20090904T2330_00_ZRENCInwnd.nc.gz
>
> I want to match these files and pipe them to a decoder I have. Here is
> the regular expression I am using.
>
> ^(WWRFDM01-RCI)_(.*)T(.*)([0-9][0-9])_(.*)_(.*)T(.*)([0-9][0-9])_([0-9][0-9])(.*)_(.*RENCI.*nc.gz$)
>
> which does parse the pattern correctly.
>
> However, I want to pass the complete file name to my script. I have
> been doing this by reconstructing the original file name using the
> back references. However we have just made a change to our system and
> the filename is now being parsed using the above regex into 11
> different groups. This is a problem for which I see three possible
> solutions (and maybe there are others).
>
> 1) recode the regular expression so that it captures fewer groups. I'm
> working on that...
The regex(1) utility, which comes with the LDM package, can probably help.
Execute the command "man regex" for more information. You can also use it to
time your matches and, thus, improve their efficiency.
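If regex(1) isn't handy, a rough check of the group count is also possible in Python. This is only a sketch: Python's "re" module is a backtracking engine rather than the POSIX regular-expression library that pqact(1) uses, so the way the text is divided among the ".*" groups may differ from what pqact(1) sees. The pattern and file name are copied from your message; the variable names are just for illustration.

```python
import re

# Pattern and sample file name copied verbatim from the message above.
PATTERN = (
    r"^(WWRFDM01-RCI)_(.*)T(.*)([0-9][0-9])_(.*)_(.*)T(.*)([0-9][0-9])"
    r"_([0-9][0-9])(.*)_(.*RENCI.*nc.gz$)"
)
SAMPLE = (
    "WWRFDM01-RCI_20090904T0000_20090904T0015_20090904T2330_00_ZRENCInwnd.nc.gz"
)

compiled = re.compile(PATTERN)
match = compiled.match(SAMPLE)

# Number of capturing groups defined by the pattern itself.
print("capturing groups:", compiled.groups)

# Show what each group captured, numbered the way backreferences are.
if match:
    for number, text in enumerate(match.groups(), start=1):
        print(number, repr(text))
else:
    print("no match")
```

The group count printed here is a property of the pattern, not of the engine, so it should agree with the number of backreferences that pqact(1) makes available.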
> 2) figure out how to express \10 and \11. Do you know the syntax for
> that? I haven't found it anywhere
The pqact(1) configuration-file syntax for backreferences beyond "\9" is
"\(nn)" (e.g., "\(10)", "\(11)"). For more information, see
<http://www.unidata.ucar.edu/software/ldm/ldm-6.8.1/basics/pqact.conf.html#argref>.
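To illustrate the two reference forms, here is a small Python sketch of how such references could be expanded. It is not pqact(1)'s actual code: the function name expand_backrefs and the placeholder group values are hypothetical, and the sketch only models the "\n" and "\(nn)" forms mentioned above.

```python
import re

def expand_backrefs(template, groups):
    """Replace "\\n" (one digit) and "\\(nn)" references in template.

    groups holds the captured strings; groups[0] is group 1.
    Hypothetical helper for illustration, not part of the LDM.
    """
    def substitute(ref):
        # Exactly one of the two alternatives captured the group number.
        index = int(ref.group(1) or ref.group(2))
        return groups[index - 1]

    # First alternative: \(nn); second alternative: \n with a single digit.
    return re.sub(r"\\\((\d+)\)|\\(\d)", substitute, template)

# Placeholder captures standing in for the 11 groups of the pattern.
captures = ["g%d" % n for n in range(1, 12)]

print(expand_backrefs(r"\1 \9 \(10) \(11)", captures))
print(expand_backrefs(r"\10", captures))
```

In this sketch, "\10" expands as "\1" followed by a literal "0", which is the kind of ambiguity the parenthesized form avoids.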
> 3) Use a token that represents the entire pattern. Is there such a
> pattern and if so what is it?
If you nest the entire pattern in another pair of parentheses, then the entire
matching string is available via the backreference "\1". If you do this, then
you'll have to increment all other backreferences by one: what was "\1" becomes
"\2", and your eleventh group becomes "\(12)".
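The renumbering can be checked with a short Python sketch. Again, Python's "re" module stands in for the POSIX library that pqact(1) uses, so treat this as an illustration of the numbering only. The pattern and file name are from your message; NESTED is my own wrapping of it.

```python
import re

SAMPLE = (
    "WWRFDM01-RCI_20090904T0000_20090904T0015_20090904T2330_00_ZRENCInwnd.nc.gz"
)

# Original pattern body, without the leading "^".
BODY = (
    r"(WWRFDM01-RCI)_(.*)T(.*)([0-9][0-9])_(.*)_(.*)T(.*)([0-9][0-9])"
    r"_([0-9][0-9])(.*)_(.*RENCI.*nc.gz$)"
)

ORIGINAL = "^" + BODY
# Same pattern nested in one extra pair of parentheses.
NESTED = "^(" + BODY + ")"

original = re.match(ORIGINAL, SAMPLE)
nested = re.match(NESTED, SAMPLE)

# Group 1 of the nested pattern is the complete file name.
print(nested.group(1))

# Every original group n is now group n + 1.
for n in range(1, 12):
    assert original.group(n) == nested.group(n + 1)
```

With the nested form, the script only needs "\1" to receive the complete file name; the other references are needed only if the individual fields are still used.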
> Thanks much
> Howard
> --
> Howard Lander <mailto:address@hidden>
> Senior Research Software Developer
> Renaissance Computing Institute <http://www.renci.org>
> The University of North Carolina at Chapel Hill
> Duke University
> North Carolina State University
> 100 Europa Drive
> Suite 540
> Chapel Hill, NC 27517
> 919-445-9651
Regards,
Steve Emmerson
Ticket Details
===================
Ticket ID: GRO-598397
Department: Support LDM
Priority: Normal
Status: Closed