How to filter stations out of the CSV files
Several of the mesonets available from MADIS are national in scope and contain thousands of stations. To cut down on the data volume, you can filter out the stations that are outside your geographic area of interest by specifying the corners of a latitude/longitude rectangle. Only stations inside the rectangle will be sent into LDAD. If the mesonet is MesoWest (also called UTMESNET), there's an additional option to include only those subproviders that are specified by the user.
The filtering scheme will also work with any mesonet obtained from a source other than MADIS, as long as the mesonet meets this requirement: the first column of the data files must contain the same site IDs as found in the first column of the station table. Note that some of the mesonets from MADIS do not meet this criterion and cannot be filtered with this scheme, but these are all regional and shouldn't need filtering. Here is the list of which mesonets can and cannot be filtered:
Mesonet | Can be Filtered | Has Header | |
---|---|---|---|
1-min ASOS | Yes | No | |
AFA | Yes | No | |
AIRNow | Yes | No | |
AKDOT | No | No | |
AK-Meso | Yes | No | |
APG | No | No | |
APRSWXNET | Yes | No | |
ARLFRD | Yes | No | |
AWS | Yes | No | |
AWX | Yes | No | |
CA-Hydro | Yes | No | |
CAIC | Yes | No | |
CODOT | Yes | No | |
CO_E-470 | Yes | No | |
DCNet | Yes | Yes | |
DDMET | Yes | No | |
DEOS | Yes | No | |
FLDOT | Yes | No | |
FL-Meso | Yes | No | |
GADOT | Yes | No | |
GLDNWS | Yes | No | |
GLOBE | Yes | No | |
GoMOOS | Yes | No | |
GPSMET | Yes | Yes | |
HADS | Yes | No | |
IADOT | No | Yes | |
IEM | Yes | No | |
INDOT | Yes | No | |
INTERNET | Yes | No | |
ITD | Yes | No | |
KSDOT | Yes | Yes | |
KYTC-RWIS | Yes | No | |
LCRA | Yes | No | |
LSU-JSU | No | Yes | |
MAP | Yes | No | |
MDDOT | No | No | |
MEDOT | No | No | |
MesoWest | Yes | No | |
MISC | Yes | No | |
MNDOT | No | No | |
MOComAgNet | Yes | No | |
MQT-Meso | Yes | No | |
NC-ECONet | Yes | No | |
NDDOT | No | No | |
NEDOR | Yes | Yes | |
NERRS | Yes | No | |
NHDOT | Yes | No | |
NJWxNet | Yes | No | |
NonFedAWOS | Yes | No | |
NOS-NWLON | Yes | No | |
NOS-PORTS | Yes | No | |
NWS-COOP | Yes | No | |
OHDOT | Yes | No | |
OK-Meso | No | Yes | |
PADEP | Yes | No | |
RAWS | Yes | No | |
RDMTR | Yes | No | |
SFWMD | Yes | No | |
UDFCD | Yes | Yes | |
UrbaNet | Yes | No | |
USouthAL | Yes | No | |
VADOT | Yes | No | |
VTDOT | Yes | No | |
WIDOT | Yes | No | |
WT-Meso | Yes | No | |
WxFlow | Yes | No | |
WXforYou | Yes | No | |
WYDOT | No | No |
The way this works is that you choose which mesonets you want to filter, then create the list of stations to be included by running a script. You then process incoming data files with a second script, which filters out the unwanted stations and sends files containing only the included stations into LDAD.
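The first of those two steps can be sketched as follows. This is only an illustration of the rectangle test, not the logic of build_station_idx.pl itself: the rectangle corners, the `|` delimiter, and the latitude/longitude column positions are all assumptions made for the example, and the station rows are made up. The only detail taken from the text above is that the site ID is in the first column of the station table.

```python
import csv
import io

# Hypothetical rectangle corners in degrees (not values from the MADIS scripts).
LAT_MIN, LAT_MAX = 26.0, 30.0
LON_MIN, LON_MAX = -100.0, -95.0


def in_rectangle(lat, lon):
    """Return True if the point is inside (or on the edge of) the rectangle."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def build_include_set(station_table, lat_col=1, lon_col=2, delimiter="|"):
    """Collect the site IDs (first column) of stations inside the rectangle.

    lat_col, lon_col, and delimiter are assumptions for illustration;
    check them against your own *Station.txt files.
    """
    included = set()
    for row in csv.reader(station_table, delimiter=delimiter):
        if len(row) <= max(lat_col, lon_col):
            continue  # skip blank or short lines
        try:
            lat, lon = float(row[lat_col]), float(row[lon_col])
        except ValueError:
            continue  # skip lines without numeric coordinates
        if in_rectangle(lat, lon):
            included.add(row[0].strip())
    return included


# Made-up station table: ID | latitude | longitude
stations = io.StringIO("KAAA|27.5|-97.2\nKBBB|45.0|-110.0\nKCCC|29.9|-95.1\n")
included = build_include_set(stations)
print(sorted(included))
```

The sketch assumes the rectangle does not cross the 180° meridian; a rectangle that does would need a different longitude test.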
Here are the details:
- Log in to ls1 as user ldad.
- Make a directory to use for the filtering. The example here (and inside the scripts) uses "/ldad/madisfilter", but you can use whatever you like.
- Get the scripts from https://madis-data.ncep.noaa.gov/madisWfo/wfo:
- Connect to https://madis-data.ncep.noaa.gov/madisWfo/ [e.g., account = CRP_madis_wfo, password = ?????]
- Change directory to wfo/scripts
- get MADISfilter.pl
- get build_station_idx.pl
- quit
- Put the scripts into /ldad/madisfilter, and make them executable.
- Make this subdirectory: /ldad/madisfilter/new.
- Edit MADISfilter.pl:
- For the location of perl (if not in /usr/local/bin/perl).
- For the filter directory (if you're not using /ldad/madisfilter).
- Edit build_station_idx.pl:
- For the location of perl (if not in /usr/local/bin/perl).
- For the filter directory (if you're not using /ldad/madisfilter).
- For the latitude/longitude corners of the rectangle to use for the filtering.
- For how you want to handle MesoWest data, if applicable.
- Copy the *Station.txt files for the mesonets you want to filter from /data/fxa/LDAD/data on ds1 into the filter directory on ls1. Then run the build_station_idx.pl script to create an "include list" that specifies only the stations that are in the rectangle:
- cd /ldad/madisfilter
- ./build_station_idx.pl
The include list that's created is called "included_stations.txt". The script will also save the previous version of that file as "included_stations.txt.old", so you can go back if something goes wrong.
* Whenever you update your LDAD station tables on ds1, you will also need to update them here on ls1, rerun build_station_idx.pl, and then copy the filtered station tables back to /data/fxa/LDAD/data.
The /ldad/madisfilter/new subdirectory will contain versions of the *Station.txt files that contain only the stations in the include list. These should be copied to /data/fxa/LDAD/data. This is important because progressive disclosure won't work if you go over 15,000 stations.
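One way to check the station count before copying the filtered tables back is sketched below. It assumes one station per non-blank line in each *Station.txt file and treats lines starting with `#` as comments; both are assumptions, so adjust the count to match your tables. The text above does not say whether the 15,000 limit applies per table or across all LDAD station tables, so the sketch simply totals the filtered tables.

```python
import glob
import os

STATION_LIMIT = 15000  # the threshold quoted above


def count_stations(path):
    """Count non-blank, non-comment lines, assuming one station per line."""
    with open(path, errors="replace") as f:
        return sum(
            1 for line in f
            if line.strip() and not line.lstrip().startswith("#")
        )


def total_stations(directory):
    """Total the station counts of every *Station.txt file in a directory."""
    pattern = os.path.join(directory, "*Station.txt")
    return sum(count_stations(p) for p in sorted(glob.glob(pattern)))


if __name__ == "__main__":
    total = total_stations("/ldad/madisfilter/new")
    print(f"{total} stations in the filtered tables (limit {STATION_LIMIT})")
    if total > STATION_LIMIT:
        print("Over the limit: consider a smaller rectangle.")
```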
- If the desc file for the mesonet to be filtered contains any header
variables, these need to be commented out in the desc file.
As none of the info in the header line of the raw data files is actually needed, this will cause no harm.
- If you are using ftp to ingest data:
- Modify any scripts that put the data files into /data/Incoming to put them somewhere else instead. Then run each file through the filter:
- cat MESOWHATEVER.dat | /ldad/madisfilter/MADISfilter.pl MESOWHATEVER.dat
The input file (MESOWHATEVER.dat) can be in any directory you like *except* /data/Incoming. You can keep these around if you want to see what was in there before the filtering, or you can just toss them once you're satisfied that the filter is working as expected.
The output from the script will be: /data/Incoming/MESOWHATEVER.dat, which will then be processed in the normal manner by LDAD.
- If you are using ldm to ingest data:
- Edit your pqact.conf file to change how you're processing incoming data.
You probably have something like this:
    FSL5 ^LDAD\.aws\.(.*) FILE -overwrite -close /data/Incoming/\1
    FSL5 ^LDAD\.raw\.(.*) FILE -overwrite -close /data/Incoming/\1
    FSL4 ^GPSIPW_CSV_(gpsmet\..*) FILE -overwrite -close /data/Incoming/\1
- Change that to:
    FSL5 ^LDAD\.aws\.(.*) PIPE -close /ldad/madisfilter/MADISfilter.pl \1
    FSL5 ^LDAD\.raw\.(.*) PIPE -close /ldad/madisfilter/MADISfilter.pl \1
    FSL4 ^GPSIPW_CSV_(gpsmet\..*) PIPE -close /ldad/madisfilter/MADISfilter.pl \1
- To activate the changes in pqact.conf, run: ldmadmin pqactHUP
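Both ingest paths above hand the raw file to MADISfilter.pl on standard input, with the file name as the only argument. The sketch below shows what a filter with that calling convention could look like. It is not the MADIS script: the one-ID-per-line format assumed for included_stations.txt and the comma-separated data rows are assumptions made for illustration. In this sketch a header line is dropped simply because its first field is not a station ID in the include list.

```python
import os
import sys

INCLUDE_LIST = "/ldad/madisfilter/included_stations.txt"
OUTPUT_DIR = "/data/Incoming"


def load_include_set(path):
    """Read the include list, assuming one station ID per line."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}


def filter_stream(lines, included):
    """Yield only the lines whose first comma-separated field is an included ID."""
    for line in lines:
        station_id = line.split(",", 1)[0].strip()
        if station_id in included:
            yield line


def main(argv):
    # Same calling convention as above: data on stdin, file name as argument.
    name = os.path.basename(argv[1])
    included = load_include_set(INCLUDE_LIST)
    with open(os.path.join(OUTPUT_DIR, name), "w") as out:
        out.writelines(filter_stream(sys.stdin, included))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv)
```

Comparing the line counts of the raw and filtered files for a few cycles is a quick way to confirm that the filter is keeping the stations you expect.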
* The MADIS team would like to thank Clark Safford of FFC for contributing the original software on which this filtering scheme is based. (Please contact us, however, and not Clark, with any questions.)
Last updated 16 March 2017