Revising Riverbot Outline and Specifications Christian Skalka

Post on 21-Dec-2015

Revising Riverbot

Outline and Specifications

Christian Skalka

The Riverbot System

• Riverbot is an information retrieval service for whitewater (ww) boaters

• Various webpages (USGS, USACE) report real-time gage data for thousands of rivers

• Only a small subset of online gages is of interest to ww boaters

Popularity of Riverbot

• Serving the online Mid-Atlantic ww boating community for over four years

• Approximately 30-40 hits per day

• Currently over 600 subscribers (stable after initial exponential growth)

Riverbot Components

1. Database (db) of gages for ww rivers, and mailing list accounts

2. Web robot for retrieving current gage data

3. Web interface for checking gage data, mailing list signup

4. Daily mailing list report

Database

• Implemented as formatted UNIX files

• Database search and editing via file I/O; locks preserve consistency

• Relatively small size of db (~650 users, data for ~60 gages) keeps file-based access efficient
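
A minimal sketch of lock-protected file I/O of the kind described above. The pipe-delimited record layout (`id|name|level`) and the function names are assumptions made for illustration; the slides do not specify Riverbot's actual file format or locking scheme, and the original is not written in Python.

```python
import fcntl

def update_gage(path, gage_id, level):
    """Rewrite one gage's level while holding an exclusive lock on the file."""
    with open(path, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no other process holds a lock
        try:
            records = [line.rstrip("\n").split("|") for line in f if line.strip()]
            for rec in records:
                if rec[0] == gage_id:
                    rec[2] = str(level)
            f.seek(0)
            f.truncate()  # file is now empty; write the updated records back
            f.writelines("|".join(rec) + "\n" for rec in records)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

def read_gages(path):
    """Read all records under a shared lock, so readers never see a half-written file."""
    gages = {}
    with open(path) as f:
        fcntl.flock(f, fcntl.LOCK_SH)
        try:
            for line in f:
                if line.strip():
                    gage_id, name, level = line.rstrip("\n").split("|")
                    gages[gage_id] = (name, float(level))
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return gages
```

With only a few hundred records, rewriting the whole file on each update stays cheap, which is the efficiency point the slide makes.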

Web Robot

• Written in Perl, using the LWP library for web access

• Retrieves web pages listed in db

• Uses Perl pattern matching to parse webpages, extract data for db update

• Activated as a cron job, several times daily
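
The fetch-and-parse step can be sketched as follows. This is an illustration in Python rather than the robot's actual Perl/LWP code, and the page layout that `GAGE_PATTERN` expects is an assumption: real USGS and USACE pages are laid out differently and would each need their own pattern.

```python
import re
import urllib.request

# Hypothetical layout: the label "Gage height" followed by a table cell
# holding the reading. This is an assumed format, not a real agency page.
GAGE_PATTERN = re.compile(
    r"Gage height.*?<td[^>]*>\s*([0-9]+(?:\.[0-9]+)?)\s*</td>",
    re.IGNORECASE | re.DOTALL,
)

def fetch(url, timeout=30):
    """Issue a single HTTP GET and return the page body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def extract_level(html):
    """Return the first gage reading found in the page, or None if absent."""
    match = GAGE_PATTERN.search(html)
    return float(match.group(1)) if match else None
```

Returning `None` when the pattern fails lets the caller keep the previous db value instead of storing a bad reading when a page changes format.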

Web Robot

• The robot is essentially a series of HTTP GET requests whose results it parses

• Respects the robot exclusion standard, which mediates robot traffic on websites:

- http://www.robotstxt.org/wc/norobots.html

- Allows any site to specify where robots can/cannot roam

- USGS, USACE allow robots
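
A short sketch of a robot exclusion check, using Python's standard `urllib.robotparser`. The robots.txt content, the `riverbot` user-agent string, and the example URLs are assumptions for illustration, not the actual rules published by USGS or USACE.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all robots may roam except under /private/.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A robot would run this check before each GET request and skip any URL for which it returns `False`.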

Mailing List Report

• Riverbot gage report mailed to subscribers daily at 8:00 AM

• Written in Perl, scheduled as a cron job

• Automatically composes report based on particular subscriber’s choices
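
Per-subscriber report composition might look like the sketch below. The subscriber record shape, the subject line, and the use of feet as the unit are assumptions; the actual report is generated by a Perl script whose format the slides do not show.

```python
from email.message import EmailMessage

def compose_report(subscriber, gage_levels):
    """Build one report containing only the gages this subscriber chose.

    subscriber:  {"email": str, "gages": [gage_id, ...]}   (assumed shape)
    gage_levels: {gage_id: (name, level_in_feet)}          (assumed shape)
    """
    lines = []
    for gage_id in subscriber["gages"]:
        if gage_id in gage_levels:
            name, level = gage_levels[gage_id]
            lines.append(f"{name}: {level:.2f} ft")
        else:
            lines.append(f"{gage_id}: no current reading")
    msg = EmailMessage()
    msg["To"] = subscriber["email"]
    msg["Subject"] = "Riverbot daily gage report"
    msg.set_content("\n".join(lines))
    return msg
```

A crontab entry such as `0 8 * * * /path/to/report-script` (path hypothetical) would run such a script daily at 8:00 AM.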

Web Interface

• Riverbot website (http://www.riverbot.com) provides a simple interface:

- View current gage levels

- Sign up for mailing list

• Written in HTML, with Perl CGI scripts to interact with the db

Riverbot Homepage

Gage Selection Page

Mailing List Signup Page

Current Site Specification

Time for the next step

• Extend coverage to entire US, revise web interface

• Allow users more fine-grained, secure control over accounts

• Implement database in SQL to accommodate expanded coverage, user base

• Enhance administration toolkit

Time for the next step

• New site currently being implemented by David Van Horn as independent study

• Work-in-progress

• Available by mid-summer(?)

New Site Specification

New user account features

• Users are distinct entities, and may have several distinct accounts

• Web interface allows user and account creation, editing, and deletion

• User passwords authenticate changes to user and account profiles
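
One way to model users who own several accounts, with password-authenticated changes, is sketched below using SQLite. The table and column names, the use of an email address as the user key, and the PBKDF2 parameters are all assumptions; the slides do not describe the new site's schema, and its backend is PLT Scheme, not Python.

```python
import hashlib
import hmac
import os
import sqlite3

# Hypothetical schema: one row per user, any number of accounts per user.
SCHEMA = """
CREATE TABLE users (
    id      INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE,
    salt    BLOB NOT NULL,
    pw_hash BLOB NOT NULL
);
CREATE TABLE accounts (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    label   TEXT NOT NULL
);
"""

def _hash(password, salt):
    # Store a salted, slow hash rather than the password itself.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def create_user(db, email, password):
    salt = os.urandom(16)
    cur = db.execute(
        "INSERT INTO users (email, salt, pw_hash) VALUES (?, ?, ?)",
        (email, salt, _hash(password, salt)),
    )
    return cur.lastrowid

def authenticate(db, email, password):
    """Return the user's id if the password matches, else None."""
    row = db.execute(
        "SELECT id, salt, pw_hash FROM users WHERE email = ?", (email,)
    ).fetchone()
    if row is not None and hmac.compare_digest(row[2], _hash(password, row[1])):
        return row[0]
    return None

def add_account(db, email, password, label):
    """Create an account for the user, but only after authenticating."""
    user_id = authenticate(db, email, password)
    if user_id is None:
        raise PermissionError("authentication failed")
    db.execute("INSERT INTO accounts (user_id, label) VALUES (?, ?)", (user_id, label))
```

Keeping users and accounts in separate tables is what lets one user hold several accounts under a single password.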

Accounts Management Interface

Administration Interface

• Common tasks: deleting accounts, adding and editing gage information

• Administration is currently hacked together as a collection of shell scripts and manual file editing

• A new implementation and good programming practice call for better tools

• A web interface is simple and effective

Site Administration Interface

Implementation Details

• New backend written in PLT Scheme

- Modern dialect of Lisp, developed at Brown/NEU/Rice

- Functional, safe language

• Site running on PLT Scheme Webserver

- Modern webserver under development at NEU

Why PLT Scheme: Safety

• PLT Scheme is a safe language:

- Predictable behavior

- Buffer overflow attacks prevented

• Web sites written in PLT Scheme more secure than:

- Websites written in e.g. C

- Websites running on e.g. Apache

Why PLT Scheme: State

• Subsequent webpages naturally viewed as I/O during phases of single program

• Fact: HTTP does not maintain state between requests

• With CGI scripting:

- State is a hack; maintained in URLs, dynamically generated form actions

- Various phases of computation must be defined as separate scripts
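
The URL-based state passing described above can be sketched as follows. The script path and the field names are hypothetical, and the sketch uses Python's standard `urllib.parse` rather than the Perl CGI code the current site runs.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def next_action_url(script, state):
    """Phase N: pack the state gathered so far into the URL of phase N+1."""
    return script + "?" + urlencode(state)

def recover_state(url):
    """Phase N+1: a separate script must unpack the state from its own URL."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}
```

Every value makes a round trip through the client as a string, so each phase has to re-parse and re-validate everything it receives. That overhead, repeated in every script, is what makes the technique feel like a hack.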

Why PLT Scheme: State

• PLT Scheme webservers use continuations:

- Phases of computation represented in single program

- During particular phase, continuation represents phases still to be computed

- Webserver maintains db of continuations, accessed by generated url in form action

• A principled approach to modelling state within the confines of http requests
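
The continuation-based design can be approximated in Python with generators, as a rough sketch only. The `ContinuationServer` class, the `/k/` URL prefix, and the `signup` flow are hypothetical and are not the PLT Scheme Webserver API. A generator can also be resumed only once, whereas a true Scheme continuation can be re-entered, so this models the idea rather than the full mechanism.

```python
import secrets

class ContinuationServer:
    """Keeps suspended computations in a table keyed by a generated URL."""

    def __init__(self):
        self._suspended = {}  # token -> suspended generator

    def _step(self, gen, value, first=False):
        try:
            page = next(gen) if first else gen.send(value)
        except StopIteration as done:
            # The program finished; there is no further phase to resume.
            return {"page": done.value, "action": None}
        token = secrets.token_urlsafe(16)
        self._suspended[token] = gen  # the rest of the computation
        return {"page": page, "action": "/k/" + token}

    def start(self, program):
        return self._step(program(), None, first=True)

    def resume(self, action, form_data):
        gen = self._suspended.pop(action[len("/k/"):])
        return self._step(gen, form_data)

def signup():
    """All phases of the interaction, written as one straight-line program."""
    email = yield "Enter your email address"
    gage = yield "Choose a gage"
    return f"Subscribed {email} to {gage}"
```

Each `yield` plays the role of sending a page and waiting for the form submission. Local variables such as `email` survive between requests without being encoded in URLs or hidden fields.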

Another Issue

• Problem: incorrectly entered and dead email addresses mean bouncebacks

• Most significant administrative task is deleting bounceback accounts

• New user interface may help, but what happens when site goes national?

Automated Administration

• Solution: use mail preprocessing to filter bouncebacks from incoming messages

• Route bouncebacks to logs

• Automatically delete accounts that bounce back repeatedly

• Many available mail filters, e.g. procmail
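
The filtering logic could be sketched as below, independent of which mail filter delivers the messages. The threshold of three bounces, the sender heuristic, and the reliance on an `X-Failed-Recipients` header (which only some mail servers add) are assumptions. A production filter would need to handle more bounce formats.

```python
from collections import Counter

BOUNCE_LIMIT = 3  # assumed threshold; the slides do not give a number

def is_bounce(headers):
    """Heuristic: bounces usually come from a mailer daemon or postmaster."""
    sender = headers.get("From", "").lower()
    return "mailer-daemon" in sender or "postmaster" in sender

def handle_message(headers, subscribers, bounce_counts, log):
    """Route one incoming message; returns 'deliver', 'logged', or 'deleted'."""
    if not is_bounce(headers):
        return "deliver"  # ordinary mail goes on to the administrator
    address = headers.get("X-Failed-Recipients", "").strip().lower()
    log.append(address)  # bouncebacks are routed to the log
    bounce_counts[address] += 1
    if bounce_counts[address] >= BOUNCE_LIMIT:
        subscribers.discard(address)  # repeated bounces: drop the account
        return "deleted"
    return "logged"
```

Here `subscribers` is a set of addresses, `bounce_counts` is a `Counter`, and `log` is a list. Requiring several bounces before deletion avoids dropping an account over one temporary delivery failure.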

Miscellaneous Upgrades

• New site graphics, riverbot logo

• Bumper stickers

• Showcase whitewater photography?

Conclusion

• Riverbot is a popular and useful website serving the whitewater community

• Time to go national (international?)

• Significant implementation, integration of advanced languages and software systems
