This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
David,

This is discouraging. I spent hours looking through raw bulletins to "try" to make the decoder correct. I might try looking at the station ID before processing; I don't know if that will help. My philosophy is that it's better to disregard bulletins/reports than to enter "bad" data into a file. That said, your example bulletin should be discarded. ugh.

Will let you know about my new ideas.

Robb...

On Thu, 4 Mar 2004, David Larson wrote:

> I do see non-US bulletins that are split "badly" according to this code
> change ...
>
> 628
> SANK31 MNMG 041706
> METAR
> MNPC 041700Z 08012KT 7000 BKN016 29/26 Q1015
> MNRS 041700Z 06010KT 9999 FEW022 BKN250 30/23 Q1012
> MNJG 041700Z 36004KT 7000 VCRA BKN016 22/17 Q1015
> MNJU 041700Z 10006KT 9999 SCT025 32/19 Q1013
> MNCH 041700Z 02010KT 9999 SCT030 33/21 Q1012
> MNMG 041700Z 07016KT 9999 SCT025 32/20 Q1011 A2988
> MNBL 041700Z 10008KT 9999 SCT019 SCT070 29/25 Q1014=
>
> Not only is each line a new report, but what makes it worse is that the
> *last* entry *is* separated by an equal sign! Yuck. Perhaps this is just a
> junk bulletin? I'm surprised that it could even go out this way.
>
> Does anyone you know even make an attempt to use the perl metar decoder
> for non-US stations? I've tried long enough to estimate the work as a
> *lot*.
>
> Dave
>
> David Larson wrote:
>
> > I've looked into this problem, which I didn't know existed.
> >
> > Your code is now:
> >
> > # Separate bulletins into reports
> > if( /=\n/ ) {
> > s#=\s+\n#=\n#g ;
> > @reports = split( /=\n/ ) ;
> > } else {
> > #@reports = split ( /\n/ ) ;
> > s#\n# #g ;
> > next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
> > $reports[ 0 ] = $_ ;
> > }
> >
> > But based on your assumption that these bulletins will not, and
> > cannot, contain multiple reports (which seems reasonable), there
> > really only needs to be one split, right? Because if there is no equal
> > sign, the entire line will be placed into the first report.
> > This seems to be a slight simplification:
> >
> > # Separate bulletins into reports
> > if( /=\n/ ) {
> > s#=\s+\n#=\n#g ;
> > } else {
> > s#\n# #g ;
> > }
> > @reports = split( /=\n/ ) ;
> > ... snip ... the next line is placed down many lines
> > next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
> >
> > Also, it is an error to have multiple time specifications in any
> > report, right? So that can be generalized as well, as I have done above.
> >
> > You asked for my comments, and well, there you have them! :-) I
> > might take a closer look at the rest of the changes as well, but that
> > will be delayed a bit.
> >
> > I sure appreciate your quick responses to all correspondence.
> >
> > Dave
> >
> > Robb Kambic wrote:
> >
> >> David,
> >>
> >> Yes, I know about the problem. The problem exists in bulletins that
> >> don't use the = sign to separate reports. The solution is to assume
> >> that bulletins that don't use = only have one report. I scanned many
> >> raw reports and this seems to be true, so I changed the code to:
> >>
> >> < @reports = split ( /\n/ ) ;
> >> ---
> >>
> >>> #@reports = split ( /\n/ ) ;
> >>> s#\n# #g ;
> >>> next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
> >>> $reports[ 0 ] = $_;
> >>>
> >>
> >> The new code is attached. I'm also working on a newer version of the
> >> decoder; it's in the ftp decoders directory, i.e.
> >>
> >> metar2nc.new and metar.cdl.new
> >>
> >> The pqact.conf entry needs to change \2:yy to \2:yyyy because it now
> >> uses the century too. The cdl is different: it merges vars that have
> >> different units into one, e.g. wind knots, mph, and m/s are all stored
> >> as m/s. It also stores all reports per station in one record. Take a
> >> look; I would appreciate any comments before it's released.
> >>
> >> Robb...
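[Editor's note: to make the splitting behavior under discussion concrete, here is a rough Python sketch of the same logic (the function name is invented for illustration; the real code is the Perl shown above). Note that a bulletin like Dave's, where only the *last* report carries an `=`, takes the `=` branch and comes out as one glued-together "report", which is exactly his complaint.]

```python
import re

def split_bulletin(text):
    # Sketch of the decoder's report-splitting logic ('split_bulletin' is an
    # invented name; the decoder itself is Perl).
    if re.search(r'=\n', text):
        # '='-separated bulletin: tidy whitespace before the newline,
        # then split one report per '=\n'
        text = re.sub(r'=\s+\n', '=\n', text)
        return [r for r in text.split('=\n') if r.strip()]
    # No '=' separators: assume a single report, but discard the bulletin
    # if two observation-time groups appear (reports run together)
    text = text.replace('\n', ' ')
    if re.search(r'\d{4,6}Z.*\d{4,6}Z', text):
        return []
    return [text]
```

A bulletin with an `=` after every report splits cleanly; a multi-report bulletin with no `=` at all is discarded by the double-time-group guard; Dave's bulletin (one `=`, on the last line only) survives as a single seven-line blob.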
> >>
> >>
> >> On Tue, 2 Mar 2004, David Larson wrote:
> >>
> >>> Robb,
> >>>
> >>> I've been chasing down a problem that seems to cause perfectly good
> >>> reports to be discarded by the perl metar decoder. There is a comment
> >>> in the 2.4.4 decoder that reads "reports appended together wrongly";
> >>> the code in this area takes the first line as the report to process
> >>> and discards the next line.
> >>>
> >>> To walk through this, I'll refer to the following report:
> >>>
> >>> 132
> >>> SAUS80 KWBC 021800 RRD
> >>> METAR
> >>> K4BL 021745Z 12005KT 3SM BR OVC008 01/M01 RMK SLP143 NOSPECI 60011
> >>> 8/2// T00061006 10011 21017 51007=
> >>>
> >>> The decoder attempts to classify the report type ($rep_type on line 257
> >>> of metar2nc). In doing so, it classifies this report as a "SPECI",
> >>> which isn't what you'd expect from visual inspection of the report.
> >>> However, perl is doing the right thing, given that it is asked to match
> >>> on #(METAR|SPECI) \d{4,6}Z?\n#, which exists in the remarks of the
> >>> report.
> >>>
> >>> The solution is probably to bind the text to the start of the line with
> >>> a caret. Seems to work pretty well so far.
> >>>
> >>> I've changed the lines (257-263) in metar2nc-v2.4.4 from:
> >>>
> >>> if( s#(METAR|SPECI) \d{4,6}Z?\n## ) {
> >>> $rep_type = $1 ;
> >>> } elsif( s#(METAR|SPECI)\s*\n## ) {
> >>> $rep_type = $1 ;
> >>> } else {
> >>> $rep_type = "METAR" ;
> >>> }
> >>>
> >>> To:
> >>>
> >>> if( s#^(METAR|SPECI) \d{4,6}Z?\n## ) {
> >>> $rep_type = $1 ;
> >>> } elsif( s#^(METAR|SPECI)\s*\n## ) {
> >>> $rep_type = $1 ;
> >>> } else {
> >>> $rep_type = "METAR" ;
> >>> }
> >>>
> >>> I simply added the caret (^) to bind the pattern to the start of the
> >>> report.
> >>>
> >>> Let me know what you think.
> >>> Dave
> >>>
> >>
> >> ===============================================================================
> >> Robb Kambic Unidata Program Center
> >> Software Engineer III Univ.
Corp for Atmospheric Research
> >> address@hidden WWW: http://www.unidata.ucar.edu/
> >> ===============================================================================
> >>
> >> ------------------------------------------------------------------------
> >>
> >> #! /usr/local/bin/perl
> >> #
> >> # usage: metar2nc cdlfile [datadir] [yymm] < ncfile
> >> #
> >> #
> >> #chdir( "/home/rkambic/code/decoders/src/metar" ) ;
> >>
> >> use NetCDF ;
> >> use Time::Local ;
> >> # process command line switches
> >> while ($_ = $ARGV[0], /^-/) {
> >> shift;
> >> last if /^--$/;
> >> /^(-v)/ && $verbose++;
> >> }
> >> # process input parameters
> >> if( $#ARGV == 0 ) {
> >> $cdlfile = $ARGV[ 0 ] ;
> >> } elsif( $#ARGV == 1 ) {
> >> $cdlfile = $ARGV[ 0 ] ;
> >> if( $ARGV[ 1 ] =~ /^\d/ ) {
> >> $yymm = $ARGV[ 1 ] ;
> >> } else {
> >> $datadir = $ARGV[ 1 ] ;
> >> }
> >> } elsif( $#ARGV == 2 ) {
> >> $cdlfile = $ARGV[ 0 ] ;
> >> $datadir = $ARGV[ 1 ] ;
> >> $yymm = $ARGV[ 2 ] ;
> >> } else {
> >> die "usage: metar2nc cdlfile [datadir] [yymm] < ncfile $!\n" ;
> >> }
> >> print "Missing cdlfile file $cdlfile: $!\n" unless -e $cdlfile ;
> >>
> >> if( -e "util/ncgen" ) {
> >> $ncgen = "util/ncgen" ;
> >> } elsif( -e "/usr/local/ldm/util/ncgen" ) {
> >> $ncgen = "/usr/local/ldm/util/ncgen" ;
> >> } elsif( -e "/upc/netcdf/bin/ncgen" ) {
> >> $ncgen = "/upc/netcdf/bin/ncgen" ;
> >> } elsif( -e "./ncgen" ) {
> >> $ncgen = "./ncgen" ;
> >> } else {
> >> open( NCGEN, "which ncgen |" ) ;
> >> $ncgen = <NCGEN> ;
> >> close( NCGEN ) ;
> >>
> >> if( $ncgen =~ /no ncgen/ ) {
> >> die "Can't find NetCDF utility 'ncgen' in PATH, util/ncgen,
> >> /usr/local/ldm/util/ncgen, /upc/netcdf/bin/ncgen, or ./ncgen : $!\n" ;
> >> } else {
> >> $ncgen = "ncgen" ;
> >> }
> >> }
> >> # the data and the metadata directories
> >> $datadir = "." if( ! $datadir ) ;
> >> $metadir = $datadir .
"/../metadata/surface/metar" ; > >> # redirect STDOUT and STDERR > >> open( STDOUT, ">$datadir/metarLog.$$.log" ) || > >> die "could not open $datadir/metarLog.$$.log: $!\n" ; > >> open( STDERR, ">&STDOUT" ) || > >> die "could not dup stdout: $!\n" ; > >> select( STDERR ) ; $| = 1 ; > >> select( STDOUT ) ; $| = 1 ; > >> > >> die "Missing cdlfile file $cdlfile: $!\n" unless -e $cdlfile ; > >> > >> # year and month > >> if( ! $yymm ) { > >> $theyear = (gmtime())[ 5 ] ; > >> $theyear = ( $theyear < 100 ? $theyear : $theyear - 100 ) ; > >> $theyear = sprintf( "%02d", $theyear ) ; > >> $themonth = (gmtime())[ 4 ] ; > >> $themonth++ ; > >> $yymm = $theyear . sprintf( "%02d", $themonth ) ; > >> } else { > >> $theyear = substr( $yymm, 0, 2 ) ; > >> $themonth = substr( $yymm, 2 ) ; > >> } > >> # file used for bad metars or prevention of overwrites to ncfiles > >> open( OPN, ">>$datadir/rawmetars.$$.nc" ) || die "could not > >> open $datadir/rawmetars.$$.nc: $!\n" ; > >> # set error handling to verbose only > >> $result = NetCDF::opts( VERBOSE ) ; > >> > >> # set interrupt handler > >> $SIG{ 'INT' } = 'atexit' ; > >> $SIG{ 'KILL' } = 'atexit' ; > >> $SIG{ 'TERM' } = 'atexit' ; > >> $SIG{ 'QUIT' } = 'atexit' ; > >> > >> # set defaults > >> > >> $F = -99999 ; > >> $A = \$F ; > >> $S1 = "\0" ; > >> $AS1 = \$S1 ; > >> $S2 = "\0\0" ; > >> $AS2 = \$S2 ; > >> $S3 = "\0\0\0" ; > >> $AS3 = \$S3 ; > >> $S4 = "\0\0\0\0" ; > >> $AS4 = \$S4 ; > >> $S8 = "\0" x 8 ; > >> $AS8 = \$S8 ; > >> $S10 = "\0" x 10 ; > >> $AS10 = \$S10 ; > >> $S15 = "\0" x 15 ; > >> $AS15 = \$S15 ; > >> $S32 = "\0" x 32 ; > >> $AS32 = \$S32 ; > >> $S128 = "\0" x 128 ; > >> $AS128 = \$S128 ; > >> > >> %CDL = ( > >> "rep_type", 0, "stn_name", 1, "wmo_id", 2, "lat", 3, "lon", 4, > >> "elev", 5, > >> "ob_hour", 6, "ob_min", 7, "ob_day", 8, "time_obs", 9, > >> "time_nominal", 10, "AUTO", 11, "UNITS", 12, "DIR", 13, "SPD", 14, > >> "GUST", 15, "VRB", 16, "DIRmin", 17, "DIRmax", 18, "prevail_VIS_SM", > >> 19, 
"prevail_VIS_KM", 20, "plus_VIS_SM", 21, "plus_VIS_KM", 22, > >> "prevail_VIS_M", 23, "VIS_dir", 24, "CAVOK", 25, "RVRNO", 26, > >> "RV_designator", 27, "RV_above_max", 28, "RV_below_min", 29, > >> "RV_vrbl", 30, "RV_min", 31, "RV_max", 32, "RV_visRange", 33, "WX", > >> 34, "vert_VIS", 35, "cloud_type", 36, "cloud_hgt", 37, > >> "cloud_meters", 38, "cloud_phenom", 39, "T", 40, "TD", 41, > >> "hectoPasc_ALTIM", 42, "inches_ALTIM", 43, "NOSIG", 44, > >> "TornadicType", 45, "TornadicLOC", 46, "TornadicDIR", 47, > >> "BTornadic_hh", 48, "BTornadic_mm", 49, > >> "ETornadic_hh", 50, "ETornadic_mm", 51, "AUTOindicator", 52, > >> "PKWND_dir", 53, "PKWND_spd", 54, "PKWND_hh", 55, "PKWND_mm", 56, > >> "WshfTime_hh", 57, "WshfTime_mm", 58, "Wshft_FROPA", 59, "VIS_TWR", 60, > >> "VIS_SFC", 61, "VISmin", 62, "VISmax", 63, "VIS_2ndSite", 64, > >> "VIS_2ndSite_LOC", 65, "LTG_OCNL", 66, "LTG_FRQ", 67, "LTG_CNS", 68, > >> "LTG_CG", 69, "LTG_IC", 70, "LTG_CC", 71, "LTG_CA", 72, "LTG_DSNT", 73, > >> "LTG_AP", 74, "LTG_VcyStn", 75, "LTG_DIR", 76, "Recent_WX", 77, > >> "Recent_WX_Bhh", 78, "Recent_WX_Bmm", 79, "Recent_WX_Ehh", 80, > >> "Recent_WX_Emm", 81, "Ceiling_min", 82, "Ceiling_max", 83, > >> "CIG_2ndSite_meters", 84, "CIG_2ndSite_LOC", 85, "PRESFR", 86, > >> "PRESRR", 87, > >> "SLPNO", 88, "SLP", 89, "SectorVIS_DIR", 90, "SectorVIS", 91, "GR", 92, > >> "GRsize", 93, "VIRGA", 94, "VIRGAdir", 95, "SfcObscuration", 96, > >> "OctsSkyObscured", 97, "CIGNO", 98, "Ceiling_est", 99, "Ceiling", 100, > >> "VrbSkyBelow", 101, "VrbSkyLayerHgt", 102, "VrbSkyAbove", 103, > >> "Sign_cloud", 104, "Sign_dist", 105, "Sign_dir", 106, "ObscurAloft", > >> 107, > >> "ObscurAloftSkyCond", 108, "ObscurAloftHgt", 109, "ACFTMSHP", 110, > >> "NOSPECI", 111, "FIRST", 112, "LAST", 113, "Cloud_low", 114, > >> "Cloud_medium", 115, "Cloud_high", 116, "SNINCR", 117, > >> "SNINCR_TotalDepth", 118, > >> "SN_depth", 119, "SN_waterequiv", 120, "SunSensorOut", 121, > >> "SunShineDur", 122, > >> "PRECIP_hourly", 123, 
"PRECIP_amt", 124, "PRECIP_24_amt", 125, > >> "T_tenths", 126, > >> "TD_tenths", 127, "Tmax", 128, "Tmin", 129, "Tmax24", 130, "Tmin24", > >> 131, "char_Ptend", 132, "Ptend", 133, "PWINO", 134, "FZRANO", 135, > >> "TSNO", 136, "PNO", 137, "maintIndicator", 138, "PlainText", 139, > >> "report", 140, "remarks", 141 ) ; > >> > >> # default netCDF record structure, contains all vars for the METAR > >> reports > >> @defaultrec = ( $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS3, > >> $A, $A, > >> $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS2, $A, $A, [( $S3, $S3, $S3, > >> $S3 )], > >> [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], [( > >> $F, $F, $F, $F )], [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], $AS32, > >> $A, > >> [( $S4, $S4, $S4, $S4, $S4, $S4 )], [( $F, $F, $F, $F, $F, $F )], > >> [( $F, $F, $F, $F, $F, $F )], [( $S4, $S4, $S4, $S4, $S4, $S4 )], > >> $A, $A, $A, $A, $A, $AS15, $AS10, $AS2, $A, $A, $A, $A, $AS4, $A, $A, > >> $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS10, $A, $A, $A, $A, $A, > >> $A, $A, $A, $A, $A, $AS2, [( $S8, $S8, $S8 )], [( $F, $F, $F )], > >> [( $F, $F, $F )], [( $F, $F, $F )], [( $F, $F, $F )], $A, $A, $A, $A, > >> $A, $A, $A, $A, $AS2, $A, $A, $A, $A, $AS2, $AS8, $A, $A, $A, $A, > >> $AS3, $A, $AS3, $AS10, $AS10, $AS10, $AS8, $AS3, $A, $A, $A, $A, $A, > >> $AS1, $AS1, $AS1, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, > >> $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS128, $AS128, > >> $AS128 ) ; > >> > >> # two fold purpose array, if entry > 0, then var is requested and > >> it's value > >> # is the position in the record, except first entry > >> @W = ( 0 ) x ( $#defaultrec +1 ) ; > >> $W[ 0 ] = -1 ; > >> > >> # open cdl and create record structure according to variables > >> open( CDL, "$cdlfile" ) || die "could not open $cdlfile: $!\n" ; > >> $i = 0 ; > >> while( <CDL> ) { > >> if( s#^\s*(char|int|long|double|float) (\w{1,25})## ) { > >> ( $number ) = $CDL{ $2 } ; > >> push( @rec, $defaultrec[ $number ] ) ; > >> 
$W[ $number ] = $i++ ;
> >> }
> >> }
> >> close CDL ;
> >> undef( @defaultrec ) ;
> >> undef( %CDL ) ;
> >>
> >> # read in station data
> >> if( -e "etc/sfmetar_sa.tbl" ) {
> >> $sfile = "etc/sfmetar_sa.tbl" ;
> >> } elsif( -e "./sfmetar_sa.tbl" ) {
> >> $sfile = "./sfmetar_sa.tbl" ;
> >> } else {
> >> die "Can't find sfmetar_sa.tbl station file.: $!\n" ;
> >> }
> >> open( STATION, "$sfile" ) || die "could not open $sfile: $!\n" ;
> >>
> >> while( <STATION> ) {
> >> s#^(\w{3,6})?\s+(\d{4,5}).{40}## ;
> >> $id = $1 ;
> >> $wmo_id = $2 ;
> >> $wmo_id = "0" . $wmo_id if( length( $wmo_id ) == 4 ) ;
> >> ( $lat, $lon, $elev ) = split ;
> >> $lat = sprintf( "%7.2f", $lat / 100 ) ;
> >> $lon = sprintf( "%7.2f", $lon / 100) ;
> >>
> >> # set these vars ( $wmo_id, $lat, $lon, $elev )
> >> $STATIONS{ "$id" } = "$wmo_id $lat $lon $elev" ;
> >> }
> >> close STATION ;
> >>
> >> # read in list of already processed reports if it exists
> >> # open metar.lst, list of reports processed in the last 4 hours.
> >> if( -e "$datadir/metar.lst" ) {
> >> open( LST, "$datadir/metar.lst" ) || die "could not open
> >> $datadir/metar.lst: $!\n" ;
> >> while( <LST> ) {
> >> ( $stn, $rtptime, $hr ) = split ;
> >> $reportslist{ "$stn $rtptime" } = $hr ;
> >> }
> >> close LST ;
> >> #unlink( "$datadir/metar.lst" ) ;
> >> }
> >> # Now begin parsing file and decoding observations breaking on cntrl C
> >> $/ = "\cC" ;
> >>
> >> # set select processing here from STDIN
> >> START:
> >> while( 1 ) {
> >> open( STDIN, '-' ) ;
> >> vec($rin,fileno(STDIN),1) = 1;
> >> $timeout = 1200 ; # 20 minutes
> >> $nfound = select( $rout = $rin, undef, undef, $timeout );
> >> # timed out
> >> if( ! $nfound ) {
> >> print "Shut down, time out 20 minutes\n" ;
> >> &atexit() ;
> >> }
> >> &atexit( "eof" ) if( eof( STDIN ) ) ;
> >>
> >> # Process each line of metar bulletins, header first
> >> $_ = <STDIN> ;
> >> #next unless /METAR|SPECI/ ;
> >> s#\cC## ;
> >> s#\cM##g ;
> >> s#\cA\n## ;
> >> s#\c^##g ;
> >>
> >> s#\d\d\d \n## ;
> >> s#\w{4}\d{1,2} \w{4} (\d{2})(\d{2})(\d{2})?.*\n## ;
> >> $tday = $1 ;
> >> $thour = $2 ;
> >> $thour = "23" if( $thour eq "24" ) ;
> >> $tmin = $3 ;
> >> $tmin = "00" unless( $tmin ) ;
> >> next unless ( $tday && defined( $thour ) ) ;
> >> $time_trans = thetime( "trans" ) ;
> >> if( s#(METAR|SPECI) \d{4,6}Z?\n## ) {
> >> $rep_type = $1 ;
> >> } elsif( s#(METAR|SPECI)\s*\n## ) {
> >> $rep_type = $1 ;
> >> } else {
> >> $rep_type = "METAR" ;
> >> }
> >> # Separate bulletins into reports
> >> if( /=\n/ ) {
> >> s#=\s+\n#=\n#g ;
> >> @reports = split( /=\n/ ) ;
> >> } else {
> >> #@reports = split ( /\n/ ) ;
> >> s#\n# #g ;
> >> next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
> >> $reports[ 0 ] = $_ ;
> >> }
> >>
> >
>

===============================================================================
Robb Kambic                          Unidata Program Center
Software Engineer III                Univ. Corp for Atmospheric Research
address@hidden                       WWW: http://www.unidata.ucar.edu/
===============================================================================
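[Editor's note: the effect of Dave's caret fix on $rep_type classification can be sketched in Python (an illustrative translation; the function names are invented, and the decoder itself is Perl).]

```python
import re

# The K4BL report from the thread, with the wrap after "60011" that puts
# "SPECI 60011\n" inside the remarks section.
REPORT = ("METAR\n"
          "K4BL 021745Z 12005KT 3SM BR OVC008 01/M01 RMK SLP143 NOSPECI 60011\n"
          "8/2// T00061006 10011 21017 51007=\n")

PATTERNS = (r'(METAR|SPECI) \d{4,6}Z?\n', r'(METAR|SPECI)\s*\n')

def rep_type_unanchored(report):
    # Old behaviour: search anywhere in the report, so the 'SPECI 60011\n'
    # fragment inside the NOSPECI remark matches first.
    for pat in PATTERNS:
        m = re.search(pat, report)
        if m:
            return m.group(1)
    return 'METAR'

def rep_type_anchored(report):
    # Dave's fix: bind the pattern to the start of the report. Perl's '^'
    # without /m anchors at the string start; re.match does the same.
    for pat in PATTERNS:
        m = re.match(pat, report)
        if m:
            return m.group(1)
    return 'METAR'
```

With the unanchored patterns the K4BL report is misclassified as SPECI; anchoring at the start of the report yields METAR, matching Dave's observation.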