Berkeley Sensor Database Summary
2011-11-16 20:53 shy.ang
- Based on the Observation Data Model (ODM), MySQL, Apache Web Server, and Perl
- Modified ODM schema
- Data Loader
- Web Interface
- Administrative and reporting functions
Data Loader
Runs hourly to check for new measurements and populate the Sensor Database. It performs a basic “sanity check” on incoming values and issues email alerts to the datastream's contact if invalid conditions are found.
The functions it performs are:
- Checks for new timestamps arriving from the field (compares new timestamps to the latest one in the DB)
- Checks for new devices and logger configuration changes in the logger file
- Gets metadata from the database for each new measurement (i.e., variable name and units, station name, contact, etc.)
- Performs sanity checks: is the new value within the device's spec range? Has the device stopped reporting?
- Flags any values that fail the sanity check, prepares alerts
- Converts raw data to geophysical units as required
- Assigns a Data Quality Level (raw, converted, etc.) for each datastream
- Inserts new records into the MySQL database
- Updates statistics
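The loader steps above can be sketched in code. This is a minimal illustration, not the actual loader (which is written in Perl); the function names, the linear conversion, and the calibration values are all assumptions.

```python
# Hypothetical sketch of one Data Loader pass for a single measurement:
# range check against device specs, conversion to geophysical units, flagging.

def sanity_check(value, spec_min, spec_max):
    """Return True if the raw value lies within the device's spec range."""
    return spec_min <= value <= spec_max

def load_measurement(raw_value, spec_min, spec_max, slope, offset):
    """Convert a raw reading to geophysical units, flagging out-of-range values."""
    flagged = not sanity_check(raw_value, spec_min, spec_max)
    # A simple linear calibration is assumed here for illustration.
    converted = slope * raw_value + offset
    return {"raw": raw_value, "value": converted, "flagged": flagged}

# Example: a sensor reporting raw counts in 0..4095, converted with an
# assumed slope/offset to geophysical units.
rec = load_measurement(2048, 0, 4095, slope=0.025, offset=-20.0)
```

A flagged record would then trigger the email alert described above before being inserted into MySQL.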
User Interface
The Web Interface interacts with the Sensor Database via the Apache Web Server, MySQL, and Perl (with supporting Perl modules). Users can:
- Query all data, or only specific station(s) or variable(s)
- View query results as a graph, or optionally in a table
- Download query results, or in bulk by station
- View all metadata (stations, methods, sites, etc.)
- View data statistics
- Report "incidents" (such as damage to a device) that affect data quality
- Add and edit all metadata
- Read documentation about the database
Controlling Access to the Data
Web-based login
“people” table: access level 1-4
“Station” table: access code 1-4
SQL queries incorporating the user's access level are generated dynamically.
Data Integrity
The requirements for our system are:
(1) All versions of data (raw, converted, derived, and corrected) must be stored. Converted, corrected and derived levels of data should be annotated.
(2) Users must be able to flag questionable data.
(3) The system must do a basic "sanity check" on incoming data.
(4) Any flags or comments on the data should be displayed with query results.
Data Quality Levels
QualityControlLevelCode is assigned to each incoming measurement by the Data Loader.
Data Qualifiers
QualifierID is assigned to each incoming measurement by the Data Loader.
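The two assignments above can be illustrated together. The specific code values and the qualifier identifier below are invented for the example; only the field names `QualityControlLevelCode` and `QualifierID` come from the source.

```python
# Sketch of the Data Loader tagging each measurement with a quality level
# and, when a sanity check fails, a qualifier. Codes here are assumptions.

QC_RAW, QC_CONVERTED = 0, 1          # assumed QualityControlLevelCode values
QUALIFIER_OUT_OF_RANGE = "OOR"       # assumed QualifierID value

def annotate(measurement, converted, in_range):
    """Attach quality-control metadata to one measurement record."""
    measurement["QualityControlLevelCode"] = QC_CONVERTED if converted else QC_RAW
    measurement["QualifierID"] = None if in_range else QUALIFIER_OUT_OF_RANGE
    return measurement

m = annotate({"value": 31.2}, converted=True, in_range=False)
```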
Administrative and Reporting Modules
Monitor the workflow, update statistics, refresh data caches, and issue reports.
Workflow Monitor
- datalogger polling: A server in the field polls dataloggers hourly to collect new data, and retrieves it over a wireless network.
- data transfer from the field: Once per hour, the field server transmits new data to a staging area at UC Berkeley.
- data loading: Once per hour, the Data Loader checks the staging area for new data and loads it into the database as described above.
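Since each of the three stages above runs hourly, the Workflow Monitor can detect a stalled stage by checking how stale its newest data is. The sketch below assumes a two-hour threshold; the real monitor's rules are not described in the source.

```python
# Hypothetical freshness check for the hourly pipeline stages
# (polling, transfer, loading). Threshold is an assumption.

from datetime import datetime, timedelta

def stalled_stages(last_seen, now, max_age=timedelta(hours=2)):
    """Return the names of stages whose newest timestamp is older than max_age."""
    return [stage for stage, ts in last_seen.items() if now - ts > max_age]

# Example: loading has not run for four hours, so it is reported as stalled.
now = datetime(2011, 11, 16, 12, 0)
seen = {"polling": datetime(2011, 11, 16, 11, 30),
        "loading": datetime(2011, 11, 16, 8, 0)}
stalled = stalled_stages(seen, now)
```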
Updating Statistics
Refreshing Data Caches
Bulk Data Download:
The growth of the database makes querying large spans of data slow.
“Bulk Data Download”: all the data for a specific station can be downloaded as a single Zip file.
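The per-station Zip idea can be sketched as follows; the file naming and CSV layout are assumptions, since the source only says the data is packaged as a Zip file per station.

```python
# Sketch of pre-building a station's bulk-download archive, so users can
# fetch everything at once instead of running long queries.

import io
import zipfile

def build_station_zip(station_name, csv_rows):
    """Pack a station's full data dump into an in-memory Zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("%s.csv" % station_name, "\n".join(csv_rows))
    return buf.getvalue()

blob = build_station_zip("StationA", ["time,value", "2011-11-16 20:00,31.2"])
```

A cron job could rebuild these archives periodically so downloads stay cheap even as the database grows.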
Caching Metadata:
Metadata is cached to minimize the number of queries needed to satisfy a user request.
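Because metadata (variable names, units, station info) changes rarely, a simple cache avoids repeated lookups. This is a minimal sketch; `fetch_metadata` is a hypothetical database call, not part of the real system.

```python
# Minimal metadata cache: query the database only on a cache miss.

_cache = {}

def get_metadata(key, fetch_metadata):
    """Return cached metadata for key, calling fetch_metadata only on a miss."""
    if key not in _cache:
        _cache[key] = fetch_metadata(key)
    return _cache[key]
```

Repeated requests for the same variable or station then cost a dictionary lookup instead of a round trip to MySQL.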