Decode CSV USBR Web Download with Abstract Web Data Source


This article explains how to decode a CSV (Comma-Separated Values) download from a USBR web site, using a list of platforms and an Abstract URL.

Consider the web reports available at:
https://www.usbr.gov/pn-bin/webdaycsv.pl?parameter=UNY%20fb,UNY%20af&incr=15&back=4&format=2

In this URL, 'UNY' is the station name for Unity Reservoir. We want two parameters: 'fb' (forebay elevation) and 'af' (storage in acre-feet).

The report requests a 15-minute increment (incr=15) and a 4-hour backlog (back=4). The output looks like this:

USBR Pacific Northwest Region
Hydromet/AgriMet Data Access
Although the US Bureau of Reclamation makes efforts to maintain the accuracy of data found in the Hydromet system databases, the data is largely unverified and should be considered preliminary and subject to change. Data and services are provided with the express understanding that the United States Government makes no warranties, expressed or implied, concerning the accuracy, completeness, usability or suitability for any particular purpose of the information or data obtained by access to this computer system, and the United States shall be under no liability whatsoever to any individual or group entity by reason of any use made thereof.

BEGIN DATA
DATE       TIME ,  UNY     FB      ,  UNY     AF      
05/02/2017 03:30,    3818.25 ,   23340.00 
05/02/2017 03:45,    3818.23 ,   23322.00 
05/02/2017 04:00,    3818.26 ,   23350.00 
05/02/2017 04:15,    3818.25 ,   23340.00 
05/02/2017 04:30,    3818.24 ,   23331.00 
05/02/2017 04:45,    3818.26 ,   23350.00 
05/02/2017 05:00,    3818.26 ,   23350.00 
05/02/2017 05:15,    3818.26 ,   23350.00 
END DATA


Note that there is no identifying header; the report is all data for the single station UNY. We want to process only the CSV-format data lines between the BEGIN DATA and END DATA tags.
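The report URL above can also be assembled programmatically, which is handy for testing stations and parameters by hand before building the DECODES configuration. This is a minimal Python sketch, not part of DECODES; the function name report_url is ours, while the endpoint and query parameter names (parameter, incr, back, format) come from the URL shown above:

```python
from urllib.parse import urlencode, quote

# CGI endpoint on the USBR Pacific Northwest Region site (as shown above).
BASE = "https://www.usbr.gov/pn-bin/webdaycsv.pl"

def report_url(station, params=("fb", "af"), incr=15, back=4):
    """Build a webdaycsv.pl URL for one station and its parameter codes."""
    query = {
        # Comma-separated "<station> <param>" pairs, e.g. "UNY fb,UNY af".
        "parameter": ",".join(f"{station} {p}" for p in params),
        "incr": incr,    # sampling increment in minutes
        "back": back,    # hours of backlog to retrieve
        "format": 2,     # CSV output format
    }
    # quote_via=quote encodes spaces as %20; keep commas literal.
    return BASE + "?" + urlencode(query, safe=",", quote_via=quote)

print(report_url("UNY"))
```

Running this reproduces the URL at the top of the article.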

Start by creating Site records for two sites: UNY and BUL. All that is really needed is a site name; we used the Local name type. We also added a description:




Next we created a single configuration record to handle the CSV format. We added the two sensors that are contained in the report:



For each sensor we defined a name, a SHEF data type, and a sampling time of 15 minutes. However, since the report contains a time stamp for each sample, the sampling interval is not critical.

The script to decode a message is called 'CSV' and looks like this:



The 'FIND_BEGIN' statement keeps skipping lines until it finds one that starts with the string 'BEGIN DATA'. When it finds one, it skips the line containing the column headers and then jumps to the DATA statement.

The DATA statement skips to the next line. It then checks for a number at the start of the line. If there is no number, it jumps to the DONE statement. If there is, it parses the date, time, and the two sensor values.
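For readers unfamiliar with DECODES format statements, the same FIND_BEGIN / DATA / DONE logic can be sketched in plain Python. This is only an illustration of what the script does, not the DECODES script itself, and the function name parse_report is ours:

```python
from datetime import datetime

def parse_report(text):
    """Parse the data lines of a webdaycsv.pl report, mirroring the
    FIND_BEGIN / DATA / DONE flow described above."""
    samples = []
    lines = iter(text.splitlines())
    # FIND_BEGIN: skip lines until one starts with 'BEGIN DATA'.
    for line in lines:
        if line.startswith("BEGIN DATA"):
            break
    next(lines, None)  # skip the column-header line
    # DATA: parse each line that starts with a number; anything else is DONE.
    for line in lines:
        stripped = line.strip()
        if not stripped or not stripped[0].isdigit():
            break  # DONE (e.g. the 'END DATA' line)
        when, fb, af = (field.strip() for field in stripped.split(","))
        stamp = datetime.strptime(when, "%m/%d/%Y %H:%M")
        samples.append((stamp, float(fb), float(af)))
    return samples
```

Feeding it the sample report above yields one (timestamp, fb, af) tuple per data line.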

Now we are ready to create the two platform records on the Platforms tab. Create a new platform for each site as follows:

Here are the two platform records we created with the transport medium dialogs shown:






The next step is to create a Network List containing these two stations. We called the list 'USBR-CSV':



Next, create the Data Source that will build URLs for each station in the network list:


The important properties are:

The Abstract URL is all on one line, as follows:

https://www.usbr.gov/pn-bin/webdaycsv.pl?parameter=${MEDIUMID}%20fb,${MEDIUMID}%20af&incr=15&back=4&format=2
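When the routing spec runs, the data source substitutes each platform's transport medium ID for the ${MEDIUMID} placeholder and fetches the resulting URL. The substitution can be sketched in Python for illustration (the function name expand is ours; the placeholder and station IDs are the ones used above):

```python
# The abstract URL with the ${MEDIUMID} placeholder, exactly as entered above.
ABSTRACT_URL = ("https://www.usbr.gov/pn-bin/webdaycsv.pl"
                "?parameter=${MEDIUMID}%20fb,${MEDIUMID}%20af"
                "&incr=15&back=4&format=2")

def expand(abstract_url, medium_id):
    """Substitute one platform's transport medium ID into the abstract URL,
    mimicking what the web data source does for each station in the list."""
    return abstract_url.replace("${MEDIUMID}", medium_id)

for station in ("UNY", "BUL"):  # the two stations in the USBR-CSV network list
    print(expand(ABSTRACT_URL, station))
```

For UNY this produces the same concrete URL shown at the top of the article.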

Finally, the routing spec ties it all together:


Specify the data source name and include the network list under Platform Selection.

For our tests, we specified output=pipe (meaning simply output to stdout on the command line) and Output Format=human readable. Use whatever destination and data source suit your needs.

We can run the routing spec manually with the command:

   rs USBR-test

The output is in Human Readable format:

Message for Platform UNY
                    | Elevation | Storage |
                    |    HR     |   LS    |
        UTC         |    ft     |  ac-ft  |
05/02/2017 06:00:00 | 3818.25   | 23340   |
05/02/2017 06:15:00 | 3818.26   | 23350   |
05/02/2017 06:30:00 | 3818.28   | 23368   |
05/02/2017 06:45:00 | 3818.28   | 23368   |
05/02/2017 07:00:00 | 3818.27   | 23359   |
05/02/2017 07:15:00 | 3818.27   | 23359   |
05/02/2017 07:30:00 | 3818.29   | 23377   |
05/02/2017 07:45:00 | 3818.29   | 23377   |
05/02/2017 08:00:00 | 3818.28   | 23368   |
05/02/2017 08:15:00 | 3818.28   | 23368   |

Message for Platform BUL
                    | Elevation | Storage |
                    |    HR     |   LS    |
        UTC         |    ft     |  ac-ft  |
05/02/2017 06:00:00 | 2516.1    | 23768   |
05/02/2017 06:15:00 | 2516.1    | 23768   |
05/02/2017 06:30:00 | 2516.1    | 23768   |
05/02/2017 06:45:00 | 2516.1    | 23768   |
05/02/2017 07:00:00 | 2516.1    | 23768   |
05/02/2017 07:15:00 | 2516.1    | 23768   |
05/02/2017 07:30:00 | 2516.11   | 23777   |
05/02/2017 07:45:00 | 2516.1    | 23768   |
05/02/2017 08:00:00 | 2516.1    | 23768   |
05/02/2017 08:15:00 | 2516.1    | 23768   |
05/02/2017 08:30:00 | 2516.11   | 23777   |
05/02/2017 08:45:00 | 2516.11   | 23777   |


You can now use the scheduler to have the routing spec run automatically at desired intervals.
 





