![Page 1: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/1.jpg)
Vega: A Flexible Data Model for Environmental Time Series Data
L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz
![Page 2: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/2.jpg)
Storing High Resolution Sensor Data in a Relational Database
• Deploy system
• Create data table
• Date/Time column
• Each variable is a unique column
Mendota_Buoy_Table:
![Page 3: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/3.jpg)
Accommodate Additional Site
• Create Additional Table
• Table Name from Site Name
Mendota_Buoy_Table:
Long_Lake_Buoy_Table:
• What about 5 sites?
• Or 10?
![Page 4: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/4.jpg)
Changes in Measured Variables
• Add or remove variables
• End up with many NULL fields
• ‘Legacy Structure’
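The NULL-field problem can be reproduced in a few lines. The following is a minimal sketch using Python's built-in `sqlite3` module; the column names (`AirTemp`, `WaterTemp`, `WindSpeed`) and the sample readings are assumptions made for illustration, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical wide table: one column per measured variable.
conn.execute(
    "CREATE TABLE Mendota_Buoy_Table ("
    " DateTime TEXT PRIMARY KEY,"
    " AirTemp REAL,"
    " WaterTemp REAL)"
)
# Made-up reading recorded before the new sensor exists.
conn.execute(
    "INSERT INTO Mendota_Buoy_Table VALUES ('2007-01-01 12:00', -5.1, 0.4)"
)

# A new variable is added later, so the table gains a column.
conn.execute("ALTER TABLE Mendota_Buoy_Table ADD COLUMN WindSpeed REAL")
conn.execute(
    "INSERT INTO Mendota_Buoy_Table VALUES ('2007-06-01 12:00', 21.3, 18.9, 3.2)"
)

# Every row recorded before the change carries NULL for the new variable.
null_rows = conn.execute(
    "SELECT COUNT(*) FROM Mendota_Buoy_Table WHERE WindSpeed IS NULL"
).fetchone()[0]
print(null_rows)
```

Removing a variable has the mirror-image effect: the column usually stays in place for the sake of old rows and is NULL for every new one.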
![Page 5: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/5.jpg)
Add Complex Metadata
• Add Metadata
  – Sensor Info
  – Data steward
  – Offset (depth, height)
  – Sampling Method
• Combine in Field Name
  – DO_05M
  – DO_DOPTO_05M
  – DO_YSI_10M
  – DO_YSI_CALIBRATED_10M
  – WIND_SPEED_VECTOR_AVG
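To see why metadata packed into field names is fragile, consider what software has to do to get it back out. The sketch below is a hypothetical parser, not part of any existing tool; its splitting rule (first token is the variable, a trailing token such as `05M` is an offset) is an assumption chosen only to illustrate the problem.

```python
FIELD_NAMES = [
    "DO_05M",
    "DO_DOPTO_05M",
    "DO_YSI_10M",
    "DO_YSI_CALIBRATED_10M",
    "WIND_SPEED_VECTOR_AVG",
]


def naive_parse(field_name):
    """Guess (variable, qualifiers, offset) by splitting on underscores."""
    parts = field_name.split("_")
    variable = parts[0]
    last = parts[-1]
    # Treat a trailing token such as '05M' as a depth/height offset.
    if last.endswith("M") and last[:-1].isdigit():
        return variable, parts[1:-1], last
    return variable, parts[1:], None


for name in FIELD_NAMES:
    print(name, "->", naive_parse(name))
```

The rule works for the dissolved oxygen fields but reports the variable of `WIND_SPEED_VECTOR_AVG` as `WIND`, and it cannot tell whether `YSI` or `CALIBRATED` is a sensor, a method, or a processing step. Each new naming pattern needs another special case.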
![Page 6: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/6.jpg)
Long-term datasets are becoming more common
[Figure: Sparkling Lake: Air Temperature. Air Temp (C), axis from -30 to 40, versus Date, 1/1/1987 to 1/1/2011.]
![Page 7: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/7.jpg)
Vega Data Model
• Goals
  – Accommodate dataset changes over time
    • Eliminate legacy structure
  – Easy to understand and develop software
  – Maintain rapid query times
• Inspired by the CUAHSI ODM
![Page 8: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/8.jpg)
Central Concepts
• Values
  – Individual observation (floating point format)
  – Air temp at airport at 12:00 1-1-2007 (-5.1 °C)
  – Individually linked to metadata
• Data Streams
  – Group of Values which vary only in time
  – Individual time series
  – All air temp sampled at airport
    • Wind speed is a different ‘Data Stream’
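The two concepts map naturally onto two tables. What follows is a minimal sketch in Python with `sqlite3`, not the actual Vega schema: the `Streams` table and its columns (`Site`, `Variable`, `Unit`), the `Value` column, and all readings except the -5.1 °C airport example are assumptions. Only the `DateTime` and `StreamID` column names appear in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical metadata table: one row per Data Stream.
    CREATE TABLE Streams (
        StreamID INTEGER PRIMARY KEY,
        Site     TEXT NOT NULL,
        Variable TEXT NOT NULL,
        Unit     TEXT NOT NULL
    );
    -- One row per observation; each Value links to its stream's metadata.
    CREATE TABLE "Values" (
        StreamID INTEGER NOT NULL REFERENCES Streams(StreamID),
        DateTime TEXT NOT NULL,
        Value    REAL NOT NULL
    );
    """
)

conn.executemany(
    "INSERT INTO Streams VALUES (?, ?, ?, ?)",
    [(1, "Airport", "Air temp", "C"), (2, "Airport", "Wind speed", "m/s")],
)
conn.executemany(
    'INSERT INTO "Values" VALUES (?, ?, ?)',
    [
        (1, "2007-01-01 12:00", -5.1),  # the example Value from the slide
        (1, "2007-01-01 13:00", -4.8),  # made-up follow-on reading
        (2, "2007-01-01 12:00", 3.2),   # wind speed lives in its own stream
    ],
)

# One Data Stream is one time series: values that vary only in time.
air_temp = conn.execute(
    'SELECT DateTime, Value FROM "Values" WHERE StreamID = ? ORDER BY DateTime',
    (1,),
).fetchall()
print(air_temp)
```

In a layout like this, adding a new variable or site means inserting a row into `Streams`; no table is altered and no existing row gains a NULL field.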
![Page 9: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/9.jpg)
Vega: Simple
![Page 10: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/10.jpg)
Indexing
• Speeds up searching through large tables
  – Vega impossible without it
• Similar to an alphabetized phonebook
• With Index:
  – Time ~ Log(number of rows)
• Without Index:
  – Time ~ number of rows
• Values Index (also Unique)
  – DateTime
  – StreamID
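A unique index over the two columns can be sketched as follows with `sqlite3`. The index name, the column order `(StreamID, DateTime)`, and the `Value` column are assumptions; the slides name only the two indexed columns and state that the index is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Values" (StreamID INTEGER, DateTime TEXT, Value REAL)'
)
# Hypothetical index name; it both speeds lookups and enforces uniqueness.
conn.execute(
    'CREATE UNIQUE INDEX idx_values_stream_time ON "Values" (StreamID, DateTime)'
)
conn.execute('INSERT INTO "Values" VALUES (?, ?, ?)', (1, "2007-01-01 12:00", -5.1))

# A second value for the same stream and timestamp is rejected.
try:
    conn.execute(
        'INSERT INTO "Values" VALUES (?, ?, ?)', (1, "2007-01-01 12:00", -4.0)
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask the query planner how it would look up a single value.
plan_rows = conn.execute(
    'EXPLAIN QUERY PLAN SELECT Value FROM "Values" '
    "WHERE StreamID = ? AND DateTime = ?",
    (1, "2007-01-01 12:00"),
).fetchall()
plan = " ".join(row[-1] for row in plan_rows)
print(duplicate_rejected, plan)
```

The plan text should name the index, which indicates a tree search rather than a scan of every row. The exact wording of the plan differs between SQLite versions.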
![Page 11: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/11.jpg)
Performance
• Time to query a database of 40 million Values
  – One Value: 0.07 sec
  – ~20k Values: 0.5 sec
• Data Volumes
  – GLEON: ~90,000 new values per day
  – Currently storing 30 million values
  – Values table: 2.6 GB
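The data volume figures imply a storage cost per value and a yearly growth rate. The short calculation below is a back-of-envelope sketch; it assumes 1 GB = 10^9 bytes, a constant ingest rate, and that the 2.6 GB table size corresponds to the 30 million stored values.

```python
values_stored = 30_000_000   # from the slide
table_bytes = 2.6e9          # 2.6 GB, assuming 1 GB = 10**9 bytes
new_values_per_day = 90_000  # from the slide

# Average storage cost of one row in the Values table.
bytes_per_value = table_bytes / values_stored

# Growth if the ingest rate stays constant for a year.
values_per_year = new_values_per_day * 365
growth_gb_per_year = values_per_year * bytes_per_value / 1e9

print(round(bytes_per_value, 1), values_per_year, round(growth_gb_per_year, 2))
```

Under these assumptions each value costs roughly 87 bytes, and the network adds about 33 million values (a little under 3 GB) per year, so the table would roughly double in size within a year.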
![Page 12: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/12.jpg)
Software Development Gains
• Software for one site works for all sites
• Example: HTML
  – Many document formatting standards
  – HTML emerged as standard
  – Millions of websites can be read by one browser
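In code, the gain is that a retrieval routine never mentions a site-specific table or column. The sketch below assumes a shared table with `StreamID`, `DateTime`, and `Value` columns; `fetch_series` is a hypothetical helper, and the sample readings and stream-to-site assignments are made up.

```python
import sqlite3


def fetch_series(conn, stream_id, start, end):
    """Return (DateTime, Value) rows for one stream with start <= DateTime < end."""
    return conn.execute(
        'SELECT DateTime, Value FROM "Values" '
        "WHERE StreamID = ? AND DateTime >= ? AND DateTime < ? "
        "ORDER BY DateTime",
        (stream_id, start, end),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Values" (StreamID INTEGER, DateTime TEXT, Value REAL)')
conn.executemany(
    'INSERT INTO "Values" VALUES (?, ?, ?)',
    [
        (1, "2007-01-01 12:00", -5.1),  # stream 1: say, a Mendota buoy variable
        (1, "2007-01-02 12:00", -3.0),
        (2, "2007-01-01 12:00", -7.4),  # stream 2: say, a Long Lake buoy variable
    ],
)

# The same function serves both sites; only the stream identifier changes.
mendota = fetch_series(conn, 1, "2007-01-01", "2007-01-02")
long_lake = fetch_series(conn, 2, "2007-01-01", "2007-01-02")
print(mendota, long_lake)
```

With one table per site, the equivalent routine would need the table name and column names as inputs, or a separate copy per site.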
![Page 13: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/13.jpg)
Current software for GLEON and Madison LTER: Data Acquisition
![Page 14: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/14.jpg)
Data Retrieval: dbBadger.gleonrcn.org
![Page 15: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/15.jpg)
Data QA/QC
![Page 16: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/16.jpg)
Vision
• Simple software package
  – No IT support required
  – Facilitate web-enabled data sharing
• Future
  – Expand to all GLEON sites
  – Include those with custom IM system in place
![Page 17: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/17.jpg)
Acknowledgements
• This work was supported by National Science Foundation grants DEB-0217533, DBI-0639229, and DBI-0446017 and by the Gordon and Betty Moore Foundation.
![Page 18: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/18.jpg)
![Page 19: Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz](https://reader035.vdocuments.site/reader035/viewer/2022062618/5514bb01550346f06e8b66c7/html5/thumbnails/19.jpg)
Performance
[Figure: Time to Execute Query (sec), axis from 0 to 100, versus Values Stored, 3,000,000 to 33,000,000. Series: ODM:37316, ODM:15537, ODM:7245, ODM:1, Vega:44639, Vega:17,279, Vega:8639, Vega:1.]