Technical Support for Environmental Data Collection Software
Email This Article
Your Name: *  
Your Email: *
Friend's Email: *
Comment:
  *Required Field
Verification Code: *

Decode CSV USBR Web Download with Abstract Web Data Source

  • Author: Michael Maloney
  • Create Time: 05/02/2017 10:52:11
  • Last Update Time: 05/02/2017 10:55:34
This article will explain how to decode a CSV (Comma-Separated Value) download from a USBR Web Site, using a list of platforms and an Abstract URL.

Consider the web reports available at:
https://www.usbr.gov/pn-bin/webdaycsv.pl?parameter=UNY%20fb,UNY%20af&incr=15&back=4&format=2

In this URL, 'UNY' is the station name for Unity Reservoir. We want two parameters:
  • fb = Forebay Elevation in ft (SHEF code HR)
  • af = Acre Feet, representing the reservoir storage (SHEF code LS)

The report selects increment = 15 minutes and a 4 hour backlog. The output looks like this:

USBR Pacific Northwest Region
Hydromet/AgriMet Data Access
Although the US Bureau of Reclamation makes efforts to maintain the accuracy of data found in the Hydromet system databases, the data is largely unverified and should be considered preliminary and subject to change. Data and services are provided with the express understanding that the United States Government makes no warranties, expressed or implied, concerning the accuracy, complete- ness, usability or suitability for any particular purpose of the information or data obtained by access to this computer system, and the United States shall be under no liability whatsoever to any individual or group entity by reason of any use made thereof.

BEGIN DATA
DATE       TIME ,  UNY     FB      ,  UNY     AF      
05/02/2017 03:30,    3818.25 ,   23340.00 
05/02/2017 03:45,    3818.23 ,   23322.00 
05/02/2017 04:00,    3818.26 ,   23350.00 
05/02/2017 04:15,    3818.25 ,   23340.00 
05/02/2017 04:30,    3818.24 ,   23331.00 
05/02/2017 04:45,    3818.26 ,   23350.00 
05/02/2017 05:00,    3818.26 ,   23350.00 
05/02/2017 05:15,    3818.26 ,   23350.00 
END DATA


See that there is no identifying header. The report is all data for the single station UNY. We would want to process only the data lines that are in CSV format between the BEGIN and END data tags.

Start by creating Site records for two sites: UNY and BUL. The only thing we really need is a site name. We used Local name type. We also added a description:




Next we created a single configuration record to handle the CSV format. We added the two sensors that are contained in the report:



For each sensor we defined a name, a SHEF data type and a sampling time of 15 minutes. Although, the report contains a time stamp for each sample, so the sampling interval is not really important.

The script to decode a message is called 'CSV' and looks like this:



The 'FIND_BEGIN' statement keeps skipping lines until it finds one that starts with the string 'BEGIN DATA'. When it finds this it skips the line with the column header and then jumps to the DATA statement.

The DATA statement skips to the next line. It then checks for a number at the start of the line. If not found, it jumps to the DONE statement. If it is a number it parses the date, time, and the two sensor values.

Now we are ready to create the two platform records on the Platforms tab. Create a new platform for each site as follows:

  • Select the site (UNY or BUL).
  • Select the config USBR-CSV-FB-AF.
  • Create a transport medium with medium type 'other', and the Medium Identifier containing the station name.

Here are the two platform records we created with the transport medium dialogs shown:






The next step is to create a Network List containing these two stations. We called the list 'USBR-CSV':



Now the Data Source to build URLs for each of the stations in the network list:


The important properties are:

  • abstractUrl - a template containing ${MEDIUMID} to be substituted with the medium ID for each platform in the network list. (see below)
  • header - other, meaning the file has no identifying header.
  • OneMessageFile - means that the entire downloaded report (file) is to be considered a single 'message' for the purposes of decoding.

The Abstract URL is all on one line, as follows:

https://www.usbr.gov/pn-bin/webdaycsv.pl?parameter=${MEDIUMID}%20fb,${MEDIUMID}%20af&incr=15&back=4&format=2

Finally, the routing spec ties it all together:


Specify the data source name and include the network list under Platform Selection.

For our tests, we specified output=pipe (meaning simply output to stdout on the command line), and Output Format=human readable. Use whatever destination and data source suits your needs.

We can run the routing spec manually with the command:

   rs USBR-test

The output is in Human Readable format:

Message for Platform UNY
                    | Elevation | Storage |
                    |    HR     |   LS    |
        UTC         |    ft     |  ac-ft  |
05/02/2017 06:00:00 | 3818.25   | 23340   |
05/02/2017 06:15:00 | 3818.26   | 23350   |
05/02/2017 06:30:00 | 3818.28   | 23368   |
05/02/2017 06:45:00 | 3818.28   | 23368   |
05/02/2017 07:00:00 | 3818.27   | 23359   |
05/02/2017 07:15:00 | 3818.27   | 23359   |
05/02/2017 07:30:00 | 3818.29   | 23377   |
05/02/2017 07:45:00 | 3818.29   | 23377   |
05/02/2017 08:00:00 | 3818.28   | 23368   |
05/02/2017 08:15:00 | 3818.28   | 23368   |

Message for Platform BUL
                    | Elevation | Storage |
                    |    HR     |   LS    |
        UTC         |    ft     |  ac-ft  |
05/02/2017 06:00:00 | 2516.1    | 23768   |
05/02/2017 06:15:00 | 2516.1    | 23768   |
05/02/2017 06:30:00 | 2516.1    | 23768   |
05/02/2017 06:45:00 | 2516.1    | 23768   |
05/02/2017 07:00:00 | 2516.1    | 23768   |
05/02/2017 07:15:00 | 2516.1    | 23768   |
05/02/2017 07:30:00 | 2516.11   | 23777   |
05/02/2017 07:45:00 | 2516.1    | 23768   |
05/02/2017 08:00:00 | 2516.1    | 23768   |
05/02/2017 08:15:00 | 2516.1    | 23768   |
05/02/2017 08:30:00 | 2516.11   | 23777   |
05/02/2017 08:45:00 | 2516.11   | 23777   |


 You can now use the scheduler to have the routing spec run automatically at desired intervals.