pairs user’s manual & api documentation · pairs user’s manual table of contents 1. ......

Post on 19-Apr-2020

Page 1: PAIRS User’s Manual & API Documentation · PAIRS User’s Manual Table of Contents 1. ... PAIRS USER DATA UPLOAD ... the browser periodically). Fig 2.1 shows the log in page. Siyuan

PAIRS User’s

Manual & API

Documentation

PAIRS User’s Manual

Table of Contents

1. INTRODUCTION
2. PAIRS ACCESS
   A. Web address & Log in
3. PAIRS WEB MENU
   A. Query
      3.A.1. Submit new
      3.A.2. Area of interest
   B. JOBS
   C. Metadata
      3.C.1. Dataset
      3.C.2. Data layers
      3.C.3. Data table
      3.C.4. Data regions
      3.C.5. Color Table
   D. Administration
   E. Help
   F. Logout
4. JOIN DATASETS AND FILTER WITH CONDITIONS
   A. Temporal data retrieval from PAIRS
   B. Data filtering and joining principles
   C. Add condition (filtering and joining)
   D. Aggregate
5. EXAMPLES OF WEB QUERY
6. PAIRS API
   A. Dataset (/ws/datasets)
   B. Datalayer (/ws/datalayers)
   C. Submitting a Query (/ws/query/submit)
      6.C.1. Spatial Coverage
      6.C.2. Temporal Coverage
      6.C.3. Data Selection
      6.C.4. Filtering
   D. Query Job (/ws/queryjobs)
   E. Area of Interest (/ws/queryaois)
   F. Query Examples
      6.F.1. Single point query
      6.F.2. Rectangular area query
      6.F.3. Examples with GFS and MODIS
      6.F.4. Using wget
7. PAIRS USER DATA UPLOAD
   A. Create a New Dataset (only for PAIRS Admins)
   B. Create a New DataLayer
   C. Uploading Data Onto PAIRS
   D. Technical Notes
8. DATA TABLE CREATION and UPLOAD
   A. Create a New Data Table
   B. Upload Point Data into Data Table
   C. Query Point Data on PAIRS
Appendix
   A. DETAILS OF DATASETS
      1) High resolution satellite biweekly Landsat 8
      2) High resolution satellite biweekly Landsat 8 (SR)
      3) Medium resolution satellite daily: Aqua (13), Aqua (09 SR), Terra (13), Terra (09 SR)
      4) Prism Climate Data
      5) USA weather forecast
      6) California weather condition measurements
      7) Global weather forecast
      8) ECMWF (European Center for Medium-Range Weather Forecasting)
      9) Historical crop planting map
      10) Elevation
      11) Soil properties
      12) Reference Evapotranspiration
      13) SMT – IBM’s cognitive forecast in USA
      14) SMT-IBM’s Long Term Forecast Globally
   B. Acknowledgements
   C. Glossary
   D. References

1. INTRODUCTION

Physical Analytics Integrated Data Repository and Services (PAIRS) is a big data analytics platform coupled with a massive store of aligned, pre-processed geospatial data for macroscopic analytics. The spatially and temporally indexed data store, built from various data layers, is the key differentiator of PAIRS: it drastically accelerates analytics workflows by minimizing data discovery and processing for large-scale analytics [1]. Complex multilayer queries can be completed orders of magnitude faster than through conventional data services. PAIRS is based on open source Hadoop/HBase distributed data technology, and the PAIRS API uses a RESTful Web Service implementation.

PAIRS accepts spatial queries in the form of physical boundaries (polygons, rectangles), or single points,

combined with temporal queries in the form of time intervals or single dates. Spatial queries can also be

based on characteristics of data layers (e.g., satellite, weather, soil type, etc.), where only those areas

that satisfy certain criteria are returned. This filtering process can be applied to different data sets

within a single query. These filtering and cross-layer type of queries can be extremely powerful, allowing

users to leverage the big-data platform while only downloading data from areas that they are ultimately

interested in.

Current datasets include up-to-date global satellite imagery, weather and climate data, topography, historical measurement data, drone images, soil properties, land use data, and one-of-a-kind analytics datasets. The analytics layers currently available on PAIRS are (1) a significantly improved short term weather forecast (SMT, approximately 30 % better than other publicly available forecasts) based on machine learning [2], (2) a significantly improved long term weather forecast (daily forecast, 6 months ahead) based on machine learning, and (3) global reference evapotranspiration (ET0) forecasts. The datasets can be accessed through a web interface (Sections 2-5) or an API (Section 6). Queried data is returned in standard file formats (e.g., GeoTIFF for two-dimensional data, CSV or JSON for point data).

Another key value of the PAIRS platform is that users can upload their own data onto the platform and have it integrated and cross-linked in the same way as all the other data sets available on PAIRS. Detailed instructions for uploading user data are in Section 7.

An introduction video to PAIRS is available at https://www.youtube.com/watch?v=CneLY8XAp-Q, and a demo video is available at https://www.youtube.com/watch?v=HtiF7McC8ck.

2. PAIRS ACCESS

A. Web address & Log in

PAIRS web access is provided through https://pairs.res.ibm.com (to get the most recent updates, refresh

the browser periodically).

Fig 2.1 shows the log in page.

Fig. 2.1 Log in page

A user account and password are required for the use of PAIRS. The signup process is self-explanatory.

Once a user account has been granted, you can log in with your account and password to see the main

menu. There are 6 main menu items on the main page: Query, Jobs, Metadata, Administration, Help,

and Logout. We will introduce them one by one in the next section.

3. PAIRS WEB MENU

A. Query

The Query tab leads to a page where query parameters can be entered. There are two modes: “Submit new” issues a query, and “Area of Interest” lets users upload their own polygons, which then become available for new queries.

3.A.1. Submit new

A “Submit new” query requires three types of input: Spatial coverage, Temporal coverage, and Data selection, as seen in Fig 3.1 (the Query - Submit New page).

Spatial coverage can be defined as either: “Single Point”, “Polygon”, or “Area”.

Temporal coverage can be an “interval” of days or a single “date” of interest.

Data selection defines the datasets and parameters of interest for the query. We currently have 5

categories of data available: Satellite, Weather, Survey, Analytics, and Client data (only visible to the

owner of the data).

Single point is the most convenient way to retrieve data for a single location or a set of point locations. Fig 3.1 shows the Single point query interface. A detailed tutorial video on Point Query is available here: https://pairs.res.ibm.com/manual/videos/point_query.webm

Please note the sign convention for Latitude and Longitude: Latitude is positive in the Northern Hemisphere and negative in the Southern Hemisphere, while Longitude is positive in the Eastern Hemisphere and negative in the Western Hemisphere. For example, in the continental USA the Latitude is positive and the Longitude is negative. In Australia, the Latitude is negative and the Longitude is positive. The interactive map interface lets users click anywhere on the map to define the latitude and longitude.
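The sign convention above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `hemispheres` and the sample coordinates are assumptions made for this example and are not part of PAIRS.

```python
def hemispheres(latitude: float, longitude: float) -> tuple[str, str]:
    """Return the (N/S, E/W) hemispheres implied by signed coordinates."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    north_south = "N" if latitude >= 0 else "S"   # positive latitude = north
    east_west = "E" if longitude >= 0 else "W"    # positive longitude = east
    return north_south, east_west


# Illustrative points only: one in the central USA, one in southeastern Australia.
print(hemispheres(38.5, -98.0))
print(hemispheres(-37.0, 144.0))
```

A check like this is a cheap way to catch a swapped sign before submitting a point query.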

Fig. 3.2 shows the Polygon query interface. Polygons are predefined on the “Area of Interest” (AOI) page under the Query menu. In a Polygon query, the available polygons are listed in the form “name (key)”; for example, “Australia-Victoria_State(Aus-Victoria)” is the entry for the state of Victoria, Australia. Currently only KML files are supported for polygon upload. Polygons uploaded by the user are shown under the Personal tab, while polygons shared by other users within the same user group are shown under the Group tab. A new feature called “search repository” provides a base set of polygons that includes all the states of the USA (all countries worldwide are coming soon). Start typing to search; for example, typing “usa” lists all the states. Clicking an entry selects it for the query.

A rectangular Area query is defined by two corners: Latitude/Longitude (from SW), the southwest corner, and Latitude/Longitude (to NE), the northeast corner of the rectangular area of interest, as shown in Fig 3.3. The rectangle can also be defined by clicking and dragging the mouse on the map. Once defined on the map, it can be dragged to a different location and resized.
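As a sketch of the SW/NE corner convention, the following Python class validates a rectangle and tests whether a point falls inside it. The class name `Rectangle`, its field names, and the sample coordinates are hypothetical; the sketch also assumes the rectangle does not cross the antimeridian (±180° longitude).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    sw_lat: float  # southwest corner latitude
    sw_lon: float  # southwest corner longitude
    ne_lat: float  # northeast corner latitude
    ne_lon: float  # northeast corner longitude

    def __post_init__(self):
        # The SW corner must lie south and west of the NE corner.
        if self.sw_lat >= self.ne_lat:
            raise ValueError("SW latitude must be smaller than NE latitude")
        if self.sw_lon >= self.ne_lon:
            # Assumption: no antimeridian crossing.
            raise ValueError("SW longitude must be smaller than NE longitude")

    def contains(self, lat: float, lon: float) -> bool:
        return (self.sw_lat <= lat <= self.ne_lat
                and self.sw_lon <= lon <= self.ne_lon)


# Illustrative box in the central USA (negative longitudes, positive latitudes).
box = Rectangle(sw_lat=37.0, sw_lon=-102.0, ne_lat=40.0, ne_lon=-94.6)
print(box.contains(38.5, -98.0))
print(box.contains(41.0, -98.0))
```

Entering the two corners in the wrong order is a common mistake; the constructor check makes it fail early.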

Fig. 3.2 Polygon query

Fig. 3.3 Rectangular Area query

Interval, as shown in Fig 3.1, defines the start and end of the query's temporal range. For example, an interval from 2015-07-01 to 2015-07-02 will retrieve data with timestamp >= “2015-07-01 00:00:00” and <= “2015-07-02 00:00:00”.

Date is used to pick a single timestamp for the query. 2015-07-01 00:00:00 will retrieve data from the

closest available temporal point at or before 2015-07-01 00:00:00.
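The two temporal modes can be sketched in Python as follows. This is a minimal illustration of the selection rules described above, not the PAIRS implementation; the function names and the list of available timestamps are made up for the example.

```python
from datetime import datetime


def select_interval(timestamps, start, end):
    """Interval mode: keep every timestamp with start <= t <= end."""
    return sorted(t for t in timestamps if start <= t <= end)


def select_date(timestamps, date):
    """Date mode: the closest timestamp at or before `date`, or None."""
    candidates = [t for t in timestamps if t <= date]
    return max(candidates) if candidates else None


# Hypothetical timestamps available for one data layer.
available = [
    datetime(2015, 6, 30, 18),
    datetime(2015, 7, 1, 0),
    datetime(2015, 7, 1, 12),
    datetime(2015, 7, 2, 0),
    datetime(2015, 7, 2, 6),
]

print(select_interval(available, datetime(2015, 7, 1), datetime(2015, 7, 2)))
print(select_date(available, datetime(2015, 7, 1, 9)))
```

Note that both interval bounds are inclusive, and that a Date query never looks forward in time: if nothing exists at or before the requested timestamp, nothing is selected.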

Datasets are grouped into 5 main categories: Satellite, Weather, Survey, Analytics, and Clients. The parameters or bands of a dataset are called data layers. For example, the different bands of satellite images are layers of a satellite dataset, while the different weather parameters are layers of a specific weather model. The ECMWF (European Centre for Medium-Range Weather Forecasts) weather forecast, for example, has dozens of parameters. To date we have ingested the key weather parameters into PAIRS, including temperature, wind, pressure, precipitation, solar irradiance, etc.

The details of the available datasets are listed in the appendix section.

Once a dataset is chosen, the corresponding available layers will populate the datalayer field. The user

can add datalayers to the selection by highlighting them and clicking the double right arrow.

Multiple Datasets and multiple Layers can be selected for the same query. You can also deselect one

layer at a time by using the left arrow.

Click SUBMIT to submit the request; the JOBS page will open automatically. Each request is logged and saved in the user's account, together with the retrieved data, so the user does not have to wait for the query to complete when it requests a large amount of data.

There are two quick warnings built into the query submission. The first is the large data size warning. If the retrieved data size is larger than 200MB, a pop-up box will show “Large Query Warning! The query requests PAIRS to search through xxxExx MB of data. Continue?”. You can choose to continue, or CANCEL the query and scale down its size. Please see Fig 3.4.

The second warning is the “no data at the center location” message shown in Fig 3.5. In this example, the query is trying to get CIMIS (California Weather Condition Measurements) data in Kansas. Whenever the center point of the query area has no data, this warning box appears. You can CANCEL and correct the query conditions, or CONTINUE if you know that data exists elsewhere in your selected area, away from its geometric center.

Fig. 3.4 Large query warning box.

Fig. 3.5 No data at the center location of interest warning box.

Once the data has been retrieved, you can find it under “JOBS” on the main menu. Small datasets should

be available quickly, and large datasets can take some time. You don’t need to wait for the result to

show up in the JOBS menu. The results will be there even after you log out and log in again. This is

different from a typical web search function. Data retrieval is done in the background. Here are some

empirical guidelines for queries:

- Elevation has a very high resolution of 10 meters, so it is recommended to query an area below the level of a county, typically around 100 square miles or less.
- The crop planting map and Landsat data have a resolution of 30 meters, so it is recommended to query below the level of a medium size state, around 1,000 square miles or less.
- MODIS satellite data has a resolution of ~250 meters; this dataset can be reliably queried at the level of a medium size state, ~250,000 square miles or less.
- All weather data have relatively coarse resolution, so they can be queried at the country level, ~millions of square miles.
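One way to read these guidelines is in terms of the number of grid cells a query touches. The short Python sketch below converts an area and a resolution into an approximate pixel count per layer and timestamp. It assumes square pixels of the stated size and ignores map projection effects; the function name is hypothetical and the calculation says nothing about how PAIRS itself estimates query size.

```python
SQ_METERS_PER_SQ_MILE = 2_589_988.11  # (1609.344 m) ** 2


def estimated_pixels(area_sq_miles: float, resolution_m: float) -> float:
    """Approximate number of square pixels needed to cover the area."""
    return area_sq_miles * SQ_METERS_PER_SQ_MILE / resolution_m ** 2


# (dataset, suggested maximum area in square miles, resolution in meters)
guidelines = [
    ("elevation", 100, 10),
    ("crop planting map / Landsat", 1_000, 30),
    ("MODIS", 250_000, 250),
]

for name, area, resolution in guidelines:
    millions = estimated_pixels(area, resolution) / 1e6
    print(f"{name}: ~{millions:.1f} million pixels per layer and timestamp")
```

Under these assumptions the three suggested limits correspond to roughly 2.6, 2.9, and 10.4 million pixels, so the guidelines keep a single layer and timestamp within the range of a few million to about ten million grid cells.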

Basic aggregation can be performed by checking the Min, Max, or Mean fields. This aggregates the data

for its Minimum, Maximum, or Mean over the selected temporal period.
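The sketch below illustrates what a temporal Min, Max, or Mean aggregation produces. It assumes the aggregation is done cell by cell over a stack of rasters, one per timestamp, which is an interpretation of the description above rather than a statement of how PAIRS computes it. The function name and sample values are made up.

```python
def aggregate(stack, how):
    """Aggregate a list of equally shaped 2D grids (one per timestamp) cell by cell."""
    reducers = {
        "Min": min,
        "Max": max,
        "Mean": lambda values: sum(values) / len(values),
    }
    reduce_values = reducers[how]
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [reduce_values([grid[r][c] for grid in stack]) for c in range(cols)]
        for r in range(rows)
    ]


# Three hypothetical timestamps of a 2 x 2 layer.
day1 = [[1, 4], [7, 2]]
day2 = [[3, 2], [5, 8]]
day3 = [[2, 6], [9, 5]]

print(aggregate([day1, day2, day3], "Min"))
print(aggregate([day1, day2, day3], "Max"))
print(aggregate([day1, day2, day3], "Mean"))
```

Whatever the number of timestamps in the interval, the result is a single grid per layer and aggregation type.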

Additional filtering and join operations can be added using the “Add condition” button as will be

explained in section 4.

3.A.2. Area of interest

Areas of interest are predefined regions that can be used to submit polygon queries, as shown in Fig 3.2. They are specified using GIS shape formats; currently only KML is supported. Fig 3.6 shows the Area of Interest page, where users can upload their own KML polygon files.

Fig. 3.6 Area of Interest

Clicking the Add sign above will bring you to the polygon upload page shown in Fig 3.7.

Fig. 3.7 Upload polygon file in kml format to define Area of Interest

Specify the Key and Name, then click Browse… to choose a KML file on your computer. In this case we choose KS.kml. You can share your polygon files with other users within your default user group by checking the box next to “Share with group?”. Once the polygon is uploaded, it will show up under “Submit new” queries in your Personal Polygon list and, if you chose to share it, in the Group Polygon list of the other users in your group.
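For users who do not have a KML file at hand, a simple polygon file can be generated with a few lines of Python. The sketch below writes a single Placemark with one Polygon following the general KML 2.2 layout (coordinates are given as longitude,latitude,altitude). Which KML features the PAIRS upload page actually accepts is not specified here, so treat this as an assumption to verify; the function name, polygon name, and coordinates are illustrative only.

```python
from xml.sax.saxutils import escape


def polygon_kml(name, ring):
    """Build a minimal KML document for one polygon.

    ring: list of (latitude, longitude) vertices; the ring is closed automatically.
    """
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # KML rings must end on their first vertex
    # KML coordinate order is longitude,latitude[,altitude].
    coordinates = " ".join(f"{lon},{lat},0" for lat, lon in ring)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Placemark>\n"
        f"    <name>{escape(name)}</name>\n"
        "    <Polygon><outerBoundaryIs><LinearRing>\n"
        f"      <coordinates>{coordinates}</coordinates>\n"
        "    </LinearRing></outerBoundaryIs></Polygon>\n"
        "  </Placemark>\n"
        "</kml>\n"
    )


# Illustrative rectangle in the central USA, listed as (lat, lon) pairs.
example_ring = [(37.0, -102.0), (37.0, -94.6), (40.0, -94.6), (40.0, -102.0)]
kml_text = polygon_kml("Example box", example_ring)
print(kml_text)
```

The returned string can be saved to a file with a .kml extension and selected with the Browse… button.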

B. JOBS

Jobs are queries submitted to PAIRS. Once a query is submitted, you are automatically directed to the JOBS page. The JOBS page shows the jobs that are still in progress, with progress percentages, and lists all completed jobs from previous queries. The QueryJobs page is shown in Fig 3.8 with its download and visualize functions. There are 8 main features labeled in the figure. Explore each of them; the tutorial video is also helpful: https://www.youtube.com/watch?v=HtiF7McC8ck

Fig. 3.8 JOBS page

A completed job will let the user download all the files in GeoTIFF format by clicking the download button

next to the job name. Users are encouraged to download the results and carry out further analytics for

their study.

C. Metadata

Metadata contains overviews of available data in terms of Datasets, Layers, Regions, and Color Table.

3.C.1. Dataset

The first item in the drop-down menu of the METADATA menu is Dataset (Fig. 3.9). The displayed list

can be narrowed by Filtering on a string entered in the Name field (press Filter again after entering a

Name). Clicking the file Size icon in the last row will show the dataset size in MB.

Fig. 3.9 Available datasets under METADATA

3.C.2. Data layers

The second item in the METADATA drop-down menu is data layer. As discussed above (Section 3.A.1), a datalayer represents a parameter or band of a dataset. For a chosen dataset, clicking the Filter button will show only the datalayers associated with that dataset (Fig 3.10). The layer name is shown together with a Column Family and Column Qualifier, which can be ignored for now. The Level is the resolution that the data layer is projected onto: the higher the number, the higher the resolution. The addition sign lets users upload their own data layers within a data set when granted permission. Please note that users cannot add a new layer to a data set they do not own. For example, users cannot edit any of the base data sets we provide. Section 7 (PAIRS User Data Upload) explains how to use this function to add layers to a data set that the user has uploaded.

Fig. 3.10 Example of available data layers in the case of PRISM data, accessed through METADATA and

Data Layers.

3.C.3. Data table

We are developing a new capability, called data table, to host CSV files on PAIRS. This will support many kinds of time series data, such as data collected from Internet of Things (IoT) sensors; these are essentially time series with location information. Multiple data tables can be added to a new or existing data set, so a data table is the sibling of a data layer, which holds 2D raster data. A separate section describes how data tables work and how to upload and query this type of data; please refer to Section 8 for details. Fig 3.11 shows an example of the data table view under the pairsadmin dataset (trial users won’t have access).

Fig 3.11 An example of data table view under the pairsadmin dataset

3.C.4. Data regions

The third item in the METADATA drop-down menu is data region (Fig. 3.12). A datalayer is associated with a spatial coverage and a temporal coverage. Data region shows the detailed temporal availability of each region. For example, MODIS satellite data comes in tiles, which are defined by a horizontal/vertical index. Satellite images all have regions, which can be scenes, tiles, or other definitions of regions. Most other data sets do not have regions. For example, weather data comes as a single region for each parameter under a data set.

Fig. 3.12 Data Region view

3.C.5. Color Table

Color Table lets users define their preferred color scales for specific data sets and data layers, as shown in Fig 3.8, item #4. As shown in Fig 3.13, users can add, edit, and delete color tables. Please do not delete or modify any existing color tables, because this list is shared by all users. You can add your favorite color table, and it will become available for the whole community to use. We plan to implement user-specific color table collections, so that users will be able to edit or delete only their own color tables while still being able to use all the available ones.

Fig. 3.13 Color Table submenu view: top is the current list, and bottom is editing view of Radiator color

table.

D. Administration

Password change is done through the Administration page as seen in Fig 3.14.

Fig. 3.14 Password change under Administration tab

E. Help

User manual and tutorial video links are available under the submenu of the Help page. There is also an

Acknowledgement page for the sources and citation of data sets.

F. Logout

Logout will bring you back to the log in page.

4. JOIN DATASETS AND FILTER WITH CONDITIONS

A. Temporal data retrieval from PAIRS

It is very easy to join datasets in PAIRS, both spatially and temporally.

PAIRS supports two types of temporal query: a snapshot at a single timestamp (Date) and an Interval.

• Snapshot (Date): In this mode, PAIRS will return a single snapshot for each of the selected datalayers. Only one timestamp is entered in the query, and for each data layer the closest timestamp at or before the snapshot is returned.

Fig. 4.1 Illustration of snapshot temporal data query

In this case, if all 3 datalayers were selected, PAIRS would return the 01/01/14 timestamp of A, the 07/01/14 timestamp of B, and the 01/01/14 timestamp of C. The timestamp chosen for each layer is the closest one at or before the snapshot.

• Interval: in this mode, PAIRS will return all the data that falls between the two timestamps entered in

the query. This can be zero, one, or many timestamps for each chosen data layer.

Fig. 4.2 Illustration of interval temporal data query

In this query, if all the three datalayers are selected PAIRS would return: the 01/01/14 timestamp of A;

the 12/01/13, 01/05/14, 07/01/14, 12/01/14 timestamps of B; and the 01/01/14 timestamp of C.

B. Data filtering and joining principles

PAIRS allows different filters to be applied during a query, so that only the filtered data is returned for use in your analytics. Here are some examples:

Query 1: simple filter on the selected layer. (Filtering works together with the spatial selection functions described under the Query menu; it is available only for polygon or rectangular regions, not for single point queries.)

Fig. 4.3 Filtering on single data layer on a chosen spatial area

Here the filter (Data Layer A EQ 8) is applied to the same layer that was selected under Data selection. The result looks like the raster on the right side of Fig 4.3.

Query 2: filter applied on a different layer with the same resolution.

Here the layer used in the filter (Data Layer B EQ 4) is not entered in the selected layers. PAIRS will apply

the filter to find the spatial coverage and return the data for the selected layers with the filter applied as

shown in Fig. 4.4.

Fig. 4.4 Filtering on different datalayers with the same spatial grid resolution
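The first two cases can be mimicked on small grids in Python. The sketch below is only an illustration of the principle: grids are nested lists on the same spatial grid, cells that fail the filter are replaced by a no-data marker, and all names and values are invented for the example (they are not the values shown in the figures).

```python
NODATA = None  # hypothetical marker for cells removed by the filter


def apply_filter(selected, filter_layer, predicate):
    """Keep cells of `selected` where `predicate` holds on `filter_layer`."""
    return [
        [value if predicate(f) else NODATA for value, f in zip(s_row, f_row)]
        for s_row, f_row in zip(selected, filter_layer)
    ]


layer_a = [[8, 3], [5, 8]]
layer_b = [[4, 4], [1, 4]]

# Query 1: filter on the selected layer itself (Data Layer A EQ 8).
print(apply_filter(layer_a, layer_a, lambda v: v == 8))

# Query 2: layer A is returned, but the filter is on layer B (Data Layer B EQ 4).
print(apply_filter(layer_a, layer_b, lambda v: v == 4))
```

In the second call, layer B only decides where data is returned; the returned values still come from layer A.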


Query 3: filter applied on a different layer with a different resolution

In this case the filter (Data Layer C EQ 5) was applied on a different layer that also has a different (lower) resolution. PAIRS will align all the layers using the higher resolution and return the filtered data (see Fig. 4.5).

Fig. 4.5 Filtering and joining datalayers with different spatial grid resolution
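The filter-and-join behavior of the three queries can be illustrated on small grids. This Python sketch is not the PAIRS implementation: the grids are made up, the function names are hypothetical, and aligning the coarser layer by repeating each cell (nearest neighbor) is an assumption about how layers are brought to the finer resolution.

```python
def upsample(grid, factor):
    """Repeat every cell `factor` times in both directions (nearest neighbor)."""
    out = []
    for row in grid:
        wide = [cell for cell in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def apply_filter(selected, filter_layer, value):
    """Keep cells of `selected` where `filter_layer` equals `value`; others become None."""
    return [
        [s if f == value else None for s, f in zip(s_row, f_row)]
        for s_row, f_row in zip(selected, filter_layer)
    ]

# Hypothetical data: layer A on a 4x4 grid, layer C on a coarser 2x2 grid
layer_a = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 8, 7, 6],
           [5, 4, 3, 2]]
layer_c = [[5, 0],
           [0, 5]]

aligned_c = upsample(layer_c, 2)              # bring C to the resolution of A
result = apply_filter(layer_a, aligned_c, 5)  # corresponds to "Data Layer C EQ 5"
for row in result:
    print(row)
```

Only the cells of layer A that lie under a matching cell of layer C survive; everything else is returned as no data (None in this sketch).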

Add condition (filtering and joining)

The filtering and joining functions introduced in 4.2 are realized by using the “Add Condition” button at the bottom of the query page. The Add Condition button lets users add operations that filter the data for the selected layers in polygon or rectangular spatial queries, as shown in Fig 4.6.

OPERATION defines the operation that can be applied. The options are:

• Equals to: value needs to be equal to

• Greater than: value needs to be greater than

• Less than: value needs to be less than

• Between: value needs to be in between two values

• Among: value needs to be among the list

VALUE is the value that should be applied to the condition.

Multiple conditions can be connected with logical operators. Currently only two operators are available: AND and OR. “AND” returns true only if all the conditions are true. “OR” returns true if any of the conditions is true. In cases where multiple conditions are connected with both AND and OR, AND


takes precedence over OR. For example, a filter entered as A OR B AND C, where A, B, and C are conditions, will be executed as A OR (B AND C).
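The effect of this precedence rule can be checked by enumerating all truth values of three conditions. In this Python sketch the booleans stand in for the outcome of conditions A, B and C; the function names are hypothetical.

```python
from itertools import product

def manual_grouping(a, b, c):
    """A OR (B AND C): the grouping the manual describes."""
    return a or (b and c)

def other_grouping(a, b, c):
    """(A OR B) AND C: the grouping that is not applied."""
    return (a or b) and c

# Truth assignments for which the two groupings disagree
differing = [
    (a, b, c)
    for a, b, c in product([False, True], repeat=3)
    if manual_grouping(a, b, c) != other_grouping(a, b, c)
]
print(differing)  # [(True, False, False), (True, True, False)]
```

The two groupings differ exactly when A is true and C is false, so the order of evaluation matters whenever AND and OR are mixed in one filter.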

Fig. 4.6 Data filtering & joining

Aggregate

In cases where a temporal aggregate of a data layer is of interest, PAIRS offers the option of

downloading only the calculated aggregate rather than the complete set of raw data timestamps.

Currently offered aggregation functions include min, max, and mean; they are chosen by checking the corresponding box. Multiple aggregation functions can be chosen in the same query. Fig. 4.7 shows such an example. The functions are applied over the temporal period of the query.
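A temporal aggregate reduces the stack of timestamps of one datalayer to a single grid per function. The following Python sketch shows the idea on made-up values. It assumes that the aggregation is computed cell by cell, which the manual does not spell out, and the function name is hypothetical.

```python
from statistics import mean

def aggregate(stack, functions=("min", "max", "mean")):
    """Reduce a time-ordered list of equally sized grids to one grid per function."""
    reducers = {"min": min, "max": max, "mean": mean}
    rows, cols = len(stack[0]), len(stack[0][0])
    return {
        name: [[reducers[name]([grid[r][c] for grid in stack]) for c in range(cols)]
               for r in range(rows)]
        for name in functions
    }

# Hypothetical values of one datalayer at three timestamps on a 2x2 grid
stack = [
    [[1, 4], [2, 8]],
    [[3, 2], [2, 6]],
    [[5, 6], [2, 1]],
]
result = aggregate(stack)
print(result["min"])
print(result["max"])
print(result["mean"])
```

Requesting several functions at once returns one aggregated grid per function instead of the three raw timestamps.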


Fig. 4.7 Aggregate retrieved data

5. EXAMPLES OF WEB QUERY

Here are a few video demos showing examples of cross join of different datalayers.

https://www.youtube.com/watch?v=aDlHsxyRlys (Orange Farms in Florida)

https://www.youtube.com/watch?v=Bx_c1pykelQ (Wild Fire Potential)

https://www.youtube.com/watch?v=igJcm6uWFcQ (Multiple demos)

6. PAIRS API

PAIRS provides an API for users to write scripts that perform queries. Use your PAIRS access URL <PAIRS URL>: https://pairs.res.ibm.com/ as a prefix to the /ws/… paths described below, i.e. <PAIRS URL>/ws/..., e.g.

https://pairs.res.ibm.com/ws/datasets/list

Returned data will be in either JSON or CSV format for point queries, and GeoTiff format for

area/polygon queries.

Table 1 in Appendix A shows the current list of datasets with their id, key, name, level, and status.

Dataset (/ws/datasets)

A dataset object is defined by the following properties:

• id (numeric): unique ID of a dataset

• key (text): unique string key

• name (text): the dataset name

• level (numeric): the PAIRS resolution level associated with the dataset

• layers (list<Datalayer>): list of all datalayers of this dataset

These are the operations available for datasets:

• Get (/ws/datasets/<dataset ID>): returns a full description of the dataset with the ID provided, e.g. /ws/datasets/5

• List (/ws/datasets/list): returns a list of all datasets available

• Search (/ws/datasets/search): returns a list of all datasets that satisfy the filter – any dataset

property can be used to filter this list, e.g.

/ws/datasets/search?name=satellite

Datalayer (/ws/datalayers)

The PAIRS datalayer represents a layer of data in raster format. A datalayer has a spatial as well as a

temporal coverage. These are the properties associated with datalayers:

• id (numeric): unique ID of a datalayer

• name (text): the datalayer's name

• dataset (text): the parent dataset of this datalayer

• type (text): the datatype, options available are

bt: byte (integer), 1 byte

sh: short (integer), 2 bytes

in: integer (integer), 4 bytes

fl: float (floating point number), 4 bytes

db: double (floating point number), 8 bytes

These are the operations available for datalayers:

• Get (/ws/datalayers/<datalayer ID>): returns a full description of the datalayer

with the ID provided, e.g.

/ws/datalayers/111

• List (/ws/datalayers/list): returns a list of all datalayers available

• Search (/ws/datalayers/search): returns a list of all datalayers that satisfy the filter, e.g.

/ws/datalayers/search?name=NDVI

Submitting a Query (/ws/query/submit)

The PAIRS API can be used to submit many kinds of query. The URL to submit a query is:


/ws/query/submit

Currently three types of queries are supported on PAIRS: rectangular, polygon, and point. They have a

lot in common, except for the spatial coverage specification.

The parameters required to submit a query can be divided into 4 types: spatial coverage, temporal

coverage, data selection, and filtering conditions.

Here are some examples together with the definitions:

Spatial Coverage

There are three types of Area of Interest (AoI): rectangular area, polygon, and point.

Rectangular Area (spatial area/bounding box, type=square): To perform a query on a rectangular region, two coordinates need to be provided: the lower-left (SW) corner and the upper-right (NE) corner. Each coordinate is given as latitude followed by longitude, and all values are separated by commas (,). Here are some examples:

/ws/query/submit?type=square&coordinates=38,-122,39,-121&datalayers=26015&intervals=03/31/16

This will query a rectangular region with bounding box from 38N, 122W to 39N, 121W for cloud cover

from ECMWF for the date 03/31/2016

/ws/query/submit?type=square&coordinates=-40.55,105.2,-40,105.6&datalayers=26015&intervals=03/31/16

This will query a rectangular region with bounding box from 40.55S, 105.2E to 40S, 105.6E

The response from these queries will be similar to the following:

{ "id": "1456870865483_69044", "status": "Running", "start": 1458833191762, "pql": null, "swLat": 38, "swLon": -122, "neLat": 39, "neLon": -121, "exPercent": 0, "flag": false }

The id field above is the job id you need to download the data or query the status of the query.
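The job ID can be extracted from the response with any JSON parser. The following Python sketch parses the response shown above and derives the two paths documented for query jobs; the variable names are hypothetical.

```python
import json

# Response text copied from the example above
response_text = """{ "id": "1456870865483_69044", "status": "Running",
"start": 1458833191762, "pql": null, "swLat": 38, "swLon": -122,
"neLat": 39, "neLon": -121, "exPercent": 0, "flag": false }"""

job = json.loads(response_text)
job_id = job["id"]

# Paths for checking the status and downloading the result of this job
status_path = f"/ws/queryjobs/{job_id}"
download_path = f"/ws/queryjobs/download/{job_id}"

print(job_id, job["status"])
print(status_path)
print(download_path)
```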

Polygon (spatial area/polygon): PAIRS supports the submission of queries using a predefined area of interest (AoI). The AoI has to be pre-loaded before it can be used to submit new queries. To list the AoIs associated with your profile, see section 6.5 of this document (/ws/queryaois/ will list all your AoIs with their IDs). Once you have the AoI, you can specify it during query submission using the aoi parameter. Here are some examples:


/ws/query/submit?type=poly&aoi=22&datalayers=25005&intervals=05/01/16

This will use the AoI with ID 22 (the Central Valley in California; 25005 is the rain forecast from SMT-Long Term Forecast)

/ws/query/submit?type=poly&aoi=kansas&datalayers=25005&intervals=05/01/16

This will use the AoI with key equal to kansas

Point (point location): The third query type supported by PAIRS is the point query. In this case you get the data from all selected layers for all the points. Unlike the previous two types, the point query can return the data in either CSV or JSON format; the default is JSON. Here are some examples:

/ws/query/submit?type=point&formatType=csv&coordinates=38,-122,38.1,-122.1&datalayers=111,140&intervals=02/01/15

This will return crop planting (111) and elevation (140) data for two locations, 38N, 122W and 38.1N, 122.1W, in CSV format.

/ws/query/submit?type=point&coordinates=38,-122,40,-121&datalayers=51&intervals=03/31/16,05/31/16

This will return data for two locations, 38N, 122W and 40N, 121W (51 is MODIS_aqua satellite NDVI), for the time between 03/31/2016 and 05/31/2016 in JSON format.

Temporal Coverage

PAIRS supports two different types of temporal coverage when it comes to querying: snapshot and

interval.

Snapshot (date): In this mode, PAIRS will return a snapshot of all the datalayers selected for the given

timestamp. Only one timestamp is provided. Here is an example of a snapshot time query:

/ws/query/submit?type=point&coordinates=38,-122,40,-121&datalayers=51&intervals=02/01/15

This returns the data available in PAIRS closest to, and on or before, Feb 1, 2015.

Interval (time frame): In this mode, PAIRS will return all versions of the data in between the two

timestamps defined. Here is an example of time interval query:

/ws/query/submit?type=point&coordinates=38,-122,40,-121&datalayers=51&intervals=02/01/14,02/01/15

This returns the data available within the time frame from Feb 1, 2014 to Feb 1, 2015.
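The intervals parameter in these examples uses the MM/DD/YY date format, with one date for a snapshot and two comma-separated dates for an interval. A small Python helper can produce it from date objects. The function name is hypothetical, and rejecting an end date that precedes the start date is an added sanity check, not documented PAIRS behavior.

```python
from datetime import date

def intervals_param(start, end=None):
    """Format one date (snapshot) or two dates (interval) as MM/DD/YY."""
    if end is None:
        return start.strftime("%m/%d/%y")
    if end < start:
        # Added sanity check; the manual does not say how PAIRS reacts here
        raise ValueError("end date must not precede start date")
    return f"{start:%m/%d/%y},{end:%m/%d/%y}"

print(intervals_param(date(2015, 2, 1)))                    # 02/01/15
print(intervals_param(date(2014, 2, 1), date(2015, 2, 1)))  # 02/01/14,02/01/15
```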

Data Selection

Data selection will use the datalayer ID which can be retrieved from the metadata API of datalayers,

section 6.2. Here are two examples:


/ws/query/submit?type=point&coordinates=38,-122,40,-121&datalayers=10&intervals=08/01/16

returns datalayer 10 only

/ws/query/submit?type=point&coordinates=38,-122,40,-121&datalayers=10,80,90&intervals=08/01/16

returns datalayers 10, 80 and 90
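The spatial, temporal, and data selection parameters can be assembled programmatically. The sketch below builds a submit URL with Python's standard library. The function name and its argument layout are hypothetical; only the parameter names (type, coordinates, aoi, datalayers, intervals) and the base URL come from this manual.

```python
from urllib.parse import urlencode

BASE_URL = "https://pairs.res.ibm.com"  # the <PAIRS URL> given in this manual

def submit_url(query_type, datalayers, intervals, coordinates=None, aoi=None, **extra):
    """Assemble a /ws/query/submit URL from the parameters described above."""
    if query_type not in ("square", "poly", "point"):
        raise ValueError(f"unsupported query type: {query_type}")
    params = {"type": query_type}
    if coordinates is not None:
        params["coordinates"] = ",".join(str(c) for c in coordinates)
    if aoi is not None:
        params["aoi"] = aoi
    params["datalayers"] = ",".join(str(layer) for layer in datalayers)
    params["intervals"] = intervals
    params.update(extra)  # further parameters, e.g. formatType="csv"
    # Keep "," and "/" readable as in the manual; other special characters are escaped
    return f"{BASE_URL}/ws/query/submit?{urlencode(params, safe=',/')}"

print(submit_url("point", [10, 80, 90], "08/01/16", coordinates=(38, -122, 40, -121)))
```

Building the query string this way also avoids stray spaces in the parameter list and escapes any spaces that are part of a value, such as a timestamp with a time of day.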

Filtering

PAIRS provides different kinds of filters to be applied during a query. The parameter to specify the filter

is filter.pql. Here are some examples:

/ws/query/submit?filter.pql=10EQ8

simple filter on the same layer selected: layer ID 10 equals 8

/ws/query/submit?filter.pql=10GT8AND20EQ100

filter applied on a different layer: layer ID 10 greater than 8 and layer ID 20 equals 100

The filter can be a combination of multiple expressions connected by a logical operator. Each expression has 3 elements: <LAYER> <OPERATOR> <VALUE>. Here

LAYER is the ID of the layer that this filter should be applied to

OPERATOR defines the operation that should be applied. The options are:

EQ (Equals): value needs to be equal to <VALUE>

GT (Greater than): value needs to be greater than <VALUE>

LT (Lower than): value needs to be lower than <VALUE>

BT (Between): value needs to be in between two comma-separated values <VALUE>, e.g. 10 BT 1,4.6 for the values of the datalayer with ID 10 in between 1 and 4.6 (boundary values not included!). If the first value is greater than the second one, an error is thrown

AM (Among): value needs to be among a comma-separated list <VALUE>, e.g. 8 AM 1,5.3,2.12 for values matching 1, 5.3 or 2.12 in the layer with ID 8

VALUE is the value(s) that should be applied in the expression

Expressions are connected to each other by a logical operator. There are two options available right

now:

AND logical “and” having precedence over

OR logical “or”

The former will only return true if all the expressions are true and the latter will return true if any of the

expressions are true.
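The filter grammar described above is simple enough to generate with a few lines of code. This Python sketch builds filter.pql strings in the space-free form used by the URL examples. The function names are hypothetical and the validation merely mirrors the rules stated in this section.

```python
OPERATORS = ("EQ", "GT", "LT", "BT", "AM")

def expression(layer_id, operator, *values):
    """Build one <LAYER><OPERATOR><VALUE> expression such as 10GT8."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator}")
    if operator == "BT":
        # Two values are required, and the first must not exceed the second
        valid = len(values) == 2 and values[0] <= values[1]
    elif operator == "AM":
        valid = len(values) >= 1
    else:
        valid = len(values) == 1
    if not valid:
        raise ValueError(f"wrong values for {operator}: {values}")
    return f"{layer_id}{operator}{','.join(str(v) for v in values)}"

def combine(connector, *expressions):
    """Join expressions with AND or OR, without spaces, as in the URL examples."""
    if connector not in ("AND", "OR"):
        raise ValueError(f"unknown connector: {connector}")
    return connector.join(expressions)

print(combine("AND", expression(10, "GT", 8), expression(20, "EQ", 100)))  # 10GT8AND20EQ100
print(expression(10, "BT", 1, 4.6))                                        # 10BT1,4.6
```

Checking the BT bounds locally reports the error before the query is submitted.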

Here are two examples for using filtering with query conditions:


/ws/query/submit?type=square&coordinates=3.4,114.0,21.0,130.5&intervals=10/17/16 03:35&datalayers=51&filter.pql=51GT3000

/ws/query/submit?type=square&coordinates=-4.3,109.6,1.7,117.9&intervals=10/17/16 20:00&datalayers=51&filter.pql=51GT5000AND26015LT0.5

Query Job (/ws/queryjobs)

A QueryJob represents any query submitted to PAIRS. These objects are used to retrieve status of a

submitted query, as well as getting the data back from a finished query. These are the properties of a

QueryJob object.

• id (text): the unique ID of a query job

• status (text): the current status of the job, options are:

Running: Job not finished yet

Succeeded: Job successfully finished. Data ready for download

Failed: Job failed – technical issue.

NoDataFound: Job finished, but no data found in the area requested.

• start (date): the start time of the current query job

• pql (text): description of all filters (PAIRS Query Language) used on this query job

• folder (text): the folder from which the data can be downloaded through FTP

These are the operations available for a QueryJob:

• Get (/ws/queryjobs/<job ID>): returns a full description of the query job with the ID provided, for example:

{ "id": "1456870865483_69044", "status": "Succeeded", "start": 1458833191762, "pql": null, "swLat": 38, "swLon": -122, "neLat": 39, "neLon": -121, "exPercent": 0, "flag": false }


• Download (/ws/queryjobs/download/<job ID>): if the job is done and data was retrieved, this operation returns all the GeoTiff files zipped in one file.
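Because area queries run as jobs, a script typically polls the Get operation until the status is no longer Running and only then calls Download. The Python sketch below shows such a loop. The function name, the polling interval, and the retry limit are hypothetical, and the HTTP call is replaced by a stand-in so that the example runs without network access.

```python
import time

def wait_for_job(fetch_job, job_id, poll_seconds=10, max_polls=60):
    """Poll a query job until it leaves the Running state and return its final status."""
    for _ in range(max_polls):
        job = fetch_job(job_id)          # expected to return the QueryJob as a dict
        if job["status"] != "Running":
            return job["status"]         # Succeeded, Failed or NoDataFound
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")

# Stand-in for a real HTTP call to /ws/queryjobs/<job ID>
replies = iter([{"status": "Running"}, {"status": "Running"}, {"status": "Succeeded"}])
status = wait_for_job(lambda job_id: next(replies), "1456870865483_69044", poll_seconds=0)
print(status)  # Succeeded
```

Only the Succeeded status indicates that data is ready for download; Failed and NoDataFound end the loop as well but leave nothing to fetch.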

Area of Interest (/ws/queryaois)

Area of interest (AoI) is a pre-defined region that can be used to submit queries. It is specified using a GIS shape format. Currently only KML with a single polygon is supported. Section 3.1.2 describes how to load your own AoI. Here are the properties of an AoI object:

• id (text): the unique ID of an area of interest

• key (text): this is a unique key defined by the user that can be used to submit queries

• name (text): the area of interest name

Query Examples

Single point query

Note: You can use the PAIRS metadata API to translate names to a corresponding ID.

<PAIRS URL>/ws/query/submit?type=point&coordinates=38,-122&datalayers=111&intervals=02/01/15

RESULT:

[{"dataset": {"id": 11, "key": "cropscape-prs", "name": "Historical crop planting map (USA)"},

"datalayer": {"id": 111, "name": "Crop"},

"lat": 38, "lon": -122,

"timestamp": 1388534400000,

"value": 176, "group": null}]

This is the JSON representation of the data. In particular, it represents the value of the “cropscape” layer

for the geo-location 38,-122, namely 176.
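A point query result like the one above can be turned into readable rows with a JSON parser. The Python sketch below does this for the example result. Interpreting the timestamp as milliseconds since the UNIX epoch in UTC is an assumption based on the magnitude of the value; the variable names are hypothetical.

```python
import json
from datetime import datetime, timezone

# Result text copied from the example above
result_text = """[{"dataset": {"id": 11, "key": "cropscape-prs", "name": "Historical crop planting map (USA)"},
"datalayer": {"id": 111, "name": "Crop"},
"lat": 38, "lon": -122,
"timestamp": 1388534400000,
"value": 176, "group": null}]"""

rows = []
for record in json.loads(result_text):
    # Assumption: the timestamp counts milliseconds since the UNIX epoch in UTC
    when = datetime.fromtimestamp(record["timestamp"] / 1000, tz=timezone.utc)
    rows.append((record["datalayer"]["name"], record["lat"], record["lon"],
                 when.date().isoformat(), record["value"]))

print(rows)  # [('Crop', 38, -122, '2014-01-01', 176)]
```

Under that assumption the returned timestamp is January 1, 2014, which lies before the requested snapshot date of 02/01/15, as expected for a snapshot query.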

Here is an example to query multiple points by adding the lat/lon pairs separated by comma:

<PAIRS URL>/ws/query/submit?type=point&coordinates=38,-121,38.2,-122&datalayers=111&intervals=02/01/15

The RESULT JSON file is:

[{"dataset": {"id": 11, "key": "cropscape-prs", "name": "Historical crop planting map (USA)", "crs": null, "ftpPassword": null},

"datalayer": {"id": 111, "name": "Crop", "crs": null, "shortKey": null, "interval": 31536000, "unitsbl": null},

"lat": 38.20000076293945, "lon": -122.0,

"timestamp": 1420070400000,

"value": 176.0, "group": null},

{"dataset": {"id": 11, "key": "cropscape-prs", "name": "Historical crop planting map (USA)", "crs": null, "ftpPassword": null},

"datalayer": {"id": 111, "name": "Crop", "crs": null, "shortKey": null, "interval": 31536000, "unitsbl": null},

"lat": 38.0, "lon": -121.0,

"timestamp": 1420070400000,

"value": 121.0, "group": null}]

Rectangular area query

<PAIRS URL>/ws/query/submit?type=square&coordinates=38,-122,38.5,-121.5&datalayers=111&intervals=02/01/15

RESULT:

{"id": "1448399768880_3736",

"status": "Running",

"start": 1448471857213,

"pql": null,

"swLat": 38, "swLon": -122, "neLat": 38.5, "neLon": -121.5,

"exPercent": 0}

All area queries on PAIRS run offline. When you submit your query, a job with a corresponding job ID is created to extract the data, and the result of your query submission is the job information. As the previous example shows, the status of the job is “Running”.

You can then monitor the job using another URL:

<PAIRS URL>/ws/queryjobs/1448399768880_3736

RESULT:

{"id": "1448399768880_3736",

"status": "Succeeded",

"start": 1448471857213,

"pql": null,

"swLat": 38, "swLon": -122, "neLat": 38.5, "neLon": -121.5,

"exPercent": 0}

Now the job is finished, “Succeeded”, and it is ready to be downloaded using this URL:

<PAIRS URL>/ws/queryjobs/download/1448399768880_3736


In contrast

<PAIRS URL>/ws/queryjobs/download/1454563001483_0006

will return an error.

The last part of the link is the job ID that you received when you submitted your query. This URL allows you to download a ZIP file with one or many GeoTIFF images containing the data requested in your query.
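Once downloaded, the ZIP file can be inspected with Python's standard zipfile module. In the sketch below the download is replaced by a small archive built in memory. The member names and the .tif/.tiff extensions are assumptions, since the manual does not describe the file names inside the archive, and the function name is hypothetical.

```python
import io
import zipfile

def list_geotiffs(zip_bytes):
    """Return the names of the GeoTIFF members inside a downloaded archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith((".tif", ".tiff"))]

# Stand-in for a real download: a small archive built in memory
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("layer_111.tiff", b"placeholder bytes, not a real image")
    archive.writestr("readme.txt", b"not an image")

print(list_geotiffs(buffer.getvalue()))  # ['layer_111.tiff']
```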

These are two very simple queries that you can submit to check that the API works for you. You will have to authenticate to submit them.

PAIRS outputs standard formats. The first query returns JSON (text) that can be parsed by many languages. The second query returns a GeoTIFF image, a special image format with geo-location information readable by most GIS software tools such as QGIS.

Examples with GFS and MODIS

GFS Temperature

<PAIRS URL>/ws/query/submit?type=point&formatType=csv&coordinates=51.506,-0.114&datalayers=16100&intervals=02/01/14,02/01/16

MODIS Aqua NDVI

<PAIRS URL>/ws/query/submit?type=point&coordinates=51.506,-0.114&datalayers=51&intervals=02/01/16

MODIS Terra NDVI

<PAIRS URL>/ws/query/submit?type=point&coordinates=51.506,-0.114&datalayers=71&intervals=02/01/16

Using wget

Most Linux distributions ship with the GNU Wget command line tool. You can employ it directly

to submit and download your query results from the PAIRS web server.

E.g. after you have submitted a one-by-one degree square area query and obtained the corresponding job ID, say 1448399768897_0783, from the JSON response saved in response.txt:


wget -O response.txt --user=xxxxxx --password=xxxxxxxx "<PAIRS URL>/ws/query/submit?type=square&coordinates=38,-122,39,-121&datalayers=51&intervals=02/01/16"

you can download the result as QueryResult.zip using

wget -O QueryResult.zip --user=xxxxxx --password=xxxxxxxx "<PAIRS URL>/ws/queryjobs/download/1448399768897_0783"

In the case of a point query it is even simpler since you get the result directly in JSON format

(saved as PointQueryResult.txt in this example):

wget -O PointQueryResult.txt --user=xxxxxx --password=xxxxxxxx "<PAIRS URL>/ws/query/submit?type=point&coordinates=38,-122&datalayers=51&intervals=02/01/16"
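The same requests can be issued from a script instead of wget. The Python sketch below uses only the standard library and sends the credentials as HTTP Basic authentication. That the PAIRS server accepts Basic authentication is an assumption, and the function names are hypothetical.

```python
import base64
import urllib.request

def authenticated_request(url, user, password):
    """Create a GET request carrying HTTP Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

def fetch(url, user, password, timeout=60):
    """Send the request and return the raw response body as bytes."""
    request = authenticated_request(url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return reply.read()

# Building the request needs no network access; calling fetch() does
request = authenticated_request("https://pairs.res.ibm.com/ws/datasets/list", "xxxxxx", "xxxxxxxx")
print(request.full_url)
```

For a point query the bytes returned by fetch() hold the JSON or CSV text; for a finished area query they hold the ZIP archive.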

7. PAIRS USER DATA UPLOAD

The greater value of the PAIRS platform lies in the fact that users can upload their geospatial data onto the platform, where the data will be aligned and cross-linked with all the existing basic datasets. This section assumes you already have access to the IBM PAIRS web interface: https://pairs.res.ibm.com/. Otherwise, please request an account below the login boxes to get access to IBM PAIRS first, before you upload your own data.

For each new dataset you need to upload into PAIRS, PAIRS will issue a unique FTP account for uploading data into that dataset table. You can add as many datalayers as needed to each dataset. For example, a satellite image can contain multiple spectral bands, and each band is considered a datalayer of the same dataset for this satellite. Each datalayer can have many images collected at many different points in time, called timestamps, as well as at many different locations. IBM PAIRS users can have multiple FTP accounts, e.g. if you have multiple datasets to upload into PAIRS. Once you are assigned as the owner of a dataset, you can share the account within your organization so that different users can upload data belonging to the same dataset.

First you need to request permission to upload geospatial raster format data onto PAIRS. Please send an

email to [email protected] with a brief description of the dataset and datalayers you intend to ingest

into PAIRS, together with a sample file. The parameters needed are shown in Fig 7.1. A separate PAIRS

upload account and password will be created for you to FTP your data into our server at

pairs.mmthub.com. Once the dataset table is created, then the users can upload different data layers

onto the same dataset. You will be notified of the account and password information for FTP.

An administrator will first create a new dataset for uploading data (only administrators can do this).


Create a New Dataset (only for PAIRS Admins)

This step is for the PAIRS administrators only. It is a reserved privilege. Log into the IBM PAIRS web interface and select METADATA > Dataset. Click the add symbol to create a new dataset; a table for dataset intake will be shown as follows:

Fig 7.1. Table for creating a new Dataset. Please provide these parameters when contacting PAIRS.

Key will be automatically generated from the Name you provided. Please note that the dash (-) must not be used in names, since it is reserved for the key.

Level sets the resolution onto which the dataset will be projected. We project the data onto the closest finer resolution level. For example, if the original data resolution is 0.02 degree, choose Level 15 (resolution 0.016384 degree), which is slightly finer than 0.02 degree, so that no data is lost. Level 14 is 0.032768 degree, which is coarser than 0.02 degree.
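The level choice can be written as a small search. The two resolutions quoted above differ by a factor of two, so the Python sketch below assumes that every level halves the cell size of the previous one. That rule is extrapolated from Levels 14 and 15 only, and the level range searched is hypothetical, so check the values shown in the web interface before relying on it.

```python
def level_resolution(level):
    """Resolution in degrees, extrapolated from the two levels quoted in the manual."""
    return 0.016384 * 2 ** (15 - level)

def closest_finer_level(native_resolution, min_level=1, max_level=29):
    """Return the coarsest level whose resolution is at least as fine as the data."""
    for level in range(min_level, max_level + 1):
        if level_resolution(level) <= native_resolution:
            return level
    raise ValueError("no level fine enough in the assumed range")

print(closest_finer_level(0.02))  # 15
```

For data at 0.02 degree the search skips Level 14 (0.032768 degree, too coarse) and stops at Level 15 (0.016384 degree).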

Categories we have are satellite, weather, survey, analytics. Additional data categories can be added

upon request.

Select the coordinate reference system (CRS) used by the data to be uploaded. The coordinate reference system determines how to interpret your file geospatially. In this example, the data is in the WGS84 (EPSG:4326) projection. A convenient way of specifying the CRS is by EPSG code (http://www.epsg.org/ ), but you may also use PROJ.4 strings (https://trac.osgeo.org/proj/ ). This information can be updated any time you upload a data file. It serves as fallback information in case IBM PAIRS is unable to detect the CRS from the file. Some geospatial data has the projection information embedded in its accompanying metadata section. That is why a sample data file can be very helpful: it saves you the time of figuring out what information to put into these fields. Once the table is saved, you will see the new dataset name in the list when going back to the METADATA > Dataset page. The password for the account is automatically generated, and the Admin will be able to retrieve it and send it to the user.


The administrator will send the account ID and password to the user, so they can start creating new

datalayers. You also need to make the user an administrator of that dataset. Then the user will be able

to create new datalayers.

Create a New DataLayer

Once logged into PAIRS, select METADATA > Data layer. Clicking the add symbol will open a table to create a new data layer as shown in Fig 7.2:

DataSet: Select the name of your new PAIRS dataset.

Name: Choose a descriptive name for your new datalayer.

key: Automatically generated from Name, you will need this for uploading your data (make a note).

Please do not use dash (-) in the name.

DataType: Set the data type for the raster pixels of the datalayer. When you upload data, IBM PAIRS will enforce conversion to this type, no matter which format your uploaded data is in (make a careful decision on this).

Level: Set the IBM PAIRS grid resolution you want your data to be projected onto. From the information

in brackets you can read off the corresponding resolution in degrees and in kilometers at the equator.

CRS: Select the default coordinate reference system used by the data to be uploaded. A convenient way of specifying it is by EPSG code (http://www.epsg.org/ ), but you may also use PROJ.4 strings (https://trac.osgeo.org/proj/ ). This information can be overridden, as detailed below, each time you upload data to a datalayer.

ColorTable: Specify a color scale bar that is used to visualize your uploaded data on the IBM PAIRS web

interface. There are a few preloaded selections, and you can create your own color scale bar also, which

upon creation will show up in the selection.


Fig. 7.2 Creating new data layer for user upload.

Uploading Data Onto PAIRS

To upload data you need to connect to IBM PAIRS through SFTP (SSH File Transfer Protocol). Use any tool that supports one of the protocols at your convenience, e.g. FileZilla, sftp, lftp, FireFTP, WinSCP, etc. Log in using your credentials:

• FTP server: pairs-ftp.res.ibm.com

• FTP account ID: pairsftp-DataSet key (pairsftp-pairsadmintestdata in the above example)

• FTP password: IBM PAIRS FTP password provided by the PAIRS Admin


E.g. if you use FileZilla with FTP, your login as user might look like:

Fig. 7.3 login interface of FileZilla

Your base folder after logging in will have two folders:

1. UPLOAD: write permission, place for your data to be uploaded. UPLOAD contains two subfolders: 2draster and 1dpoint. Right now we support raster file upload in the 2draster/ folder, which is the one covered here. Please upload your data and meta files there.

2. LOGS: read only, information on uploaded data

IBM PAIRS supports all types of raster data that are properly geo-referenced and readable by GDAL. A few metadata parameters need to be specified in the corresponding *.meta file. Without a .meta file, the raw data will not be loaded. In the root directory, besides the two folders above, there is a sample .meta file: pairs-upload-sample.tif.meta. You can make a copy of the sample, specify the parameters, and save the file under the UPLOAD/2draster/ folder. The sample details all the metadata information you can set. Much of this information is optional; however, you are required to provide at least the minimum information.

Here is a simpler example of the meta file:

timestamp=2016-02-10 05:43:16 #(here the preferred time format is yyyy-mm-dd HH:MM:SS timezone, where HH:MM:SS timezone is optional, default is 00:00:00 GMT)

pairsdatalayer=cloud-optical-depth #(here it uses the key of the datalayer name)

band=1 #(it specifies which band to load, most data only has 1 band, but satellite images can have multiple bands)

datainterpolation=bilinear #(if there is a preference on how the data should be interpolated, please specify here. Near=nearest neighbor;)

inputnodata=-9999 #(if there is a specific assigned nodata value, please specify here)

Once you have finished editing the .meta file and are ready to upload the data, you need to change the name of the .meta file to match the data file name with .meta appended (suppose you want to upload a GeoTiff file named myUpload.tif; then the final meta file name needs to be myUpload.tif.meta). Within less than a minute after the names match, both the meta file and the data file will be moved onto our ingestion server, and they will disappear from your FTP folder on pairs.mmthub.com. At the same time, a log file will be generated and saved under the /LOGS/ folder. You can see the details of your meta file there, e.g.:


1472423281: Moving '20160210_054317_0c73_analytic.tif' for uploading with settings (from 20160210_054317_0c73_analytic.tif.meta):

timestamp=2016-02-10 05:43:16

pairsdatalayer= cloud-optical-depth

band=1

datainterpolation=bilinear

inputnodata=-9999

INFO: The timestamp interpretation from 20160210_054317_0c73_analytic.tif.meta is 'Wed Feb 10 05:43:16 GMT 2016'
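The key=value layout of the .meta file and its naming rule (data file name plus .meta) lend themselves to being generated by a script. The Python sketch below produces the file name and content for one raster file. The function name is hypothetical, the keys are those shown in the examples of this section, and which of them are mandatory is not stated in the manual.

```python
def meta_file(data_name, timestamp, datalayer_key, **optional):
    """Return the name and text of the .meta file for one raster file."""
    settings = {"timestamp": timestamp, "pairsdatalayer": datalayer_key}
    settings.update(optional)  # e.g. band=1, datainterpolation="bilinear"
    text = "".join(f"{key}={value}\n" for key, value in settings.items())
    return data_name + ".meta", text

name, text = meta_file(
    "myUpload.tif",
    "2016-02-10 05:43:16",
    "cloud-optical-depth",
    band=1,
    datainterpolation="bilinear",
    inputnodata=-9999,
)
print(name)  # myUpload.tif.meta
print(text)
```

Generating the name from the data file name rules out the mismatch that would keep the pair from being picked up.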

Technical Notes:

• If you use SFTP access, make sure that the user and the group of your uploaded data (raw data and *.meta file) have read and write permission.

• As long as you do not create a name-matched *.meta file for your uploaded raw data, the IBM PAIRS FTP server will not start uploading your data.

• If the data gets picked up, both myUpload.tif and myUpload.tif.meta will disappear from the UPLOAD folder. You will find some logging information in LOGS in a file named <pickuptime>-myUpload.tif.log. The <pickuptime> is the UNIX epoch time, counting seconds since midnight of January 1, 1970 in Coordinated Universal Time, so that you can upload data with the same file name multiple times. An example of the .log file is shown below:

timestamp=2016-03-16 15:00:00

pairsdatalayer=my-layer_3

band=1

geospatialprojection=EPSG:4326

datainterpolation=bilinear

inputnodata=-9999
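Since the log file name starts with the pickup time in UNIX epoch seconds, a script can recover when a file was picked up. The Python sketch below splits a log file name of the form <pickuptime>-<data file>.log; the function name and the sample file name are hypothetical.

```python
from datetime import datetime, timezone

def parse_log_name(log_name):
    """Split '<pickuptime>-<data file>.log' into pickup time (UTC) and data file name."""
    pickup, _, rest = log_name.partition("-")
    if not pickup.isdigit() or not rest.endswith(".log"):
        raise ValueError(f"unexpected log file name: {log_name}")
    when = datetime.fromtimestamp(int(pickup), tz=timezone.utc)
    return when, rest[:-len(".log")]

when, data_file = parse_log_name("1472423281-myUpload.tif.log")
print(when.isoformat(), data_file)
```

Sorting the log files of repeated uploads by this prefix gives the order in which they were picked up.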

Please do not hesitate to email us any questions and feedback you have regarding data upload to:

[email protected]

You can check on the PAIRS web interface whether the data has been uploaded properly by querying the datalayer for the specific spatial and temporal coverage.


8. DATA TABLE CREATION and UPLOAD

Create a New Data Table

We have already covered the hosting of 2D gridded (raster) data on PAIRS thoroughly. This section introduces another type of spatio-temporal data: time series point data with geospatial information (basically latitude and longitude information, or an administrative location such as a state in the USA). Much sensor data falls into this category: it has the sensor location, which is a geospatial point, and a time series of records. The Point Data Table is introduced on PAIRS to integrate this type of data with the existing raster data.

At the time of writing, we are working on a new license that will allow users to upload their own data, so this feature is only available to IBM groups for now. Once the license is in place, all trial users will be able to use this feature to upload both raster and point data.

To create a point Data Table, first go to the METADATA menu and open the Data Table submenu. Select a Dataset that is assigned for the point data table and click Filter. Fig 8.1 shows the view of the “pairsadmin” dataset after choosing the Data Table view.

Fig 8.1 Add new point Data Table: METADATA -> Data Table -> Filter -> add icon

This brings you to the AddPoint Table page shown in Fig 8.2. The columns must be added in the same order as they appear in your data spreadsheet. Let’s use Table 8.1 as an example. Four column types are mandatory: Value, Longitude, Latitude, and Timestamp, because PAIRS keys the data by its latitude, longitude, and timestamp.

Each click on the add icon adds one column configuration to the table definition, as shown in Fig 8.2.


First we add the name column with Attribute Value and data type String. Then we add the second column, lon, with Attribute Longitude; its data type is preset to Double. The third column, lat, has Attribute Latitude, and its data type is also preset to Double. The fourth column, unixtime, has Attribute Timestamp and data type Signed Long Integer. More columns can be added after these if needed.

Fig 8.3 shows the list of Attribute types that PAIRS supports: Latitude, Longitude, Region, Timestamp, Version Timestamp, Device ID, Other property, and Value.

Sometimes the table does not contain time information. In that case the current time or the time of the record can be used instead, since some properties, such as a wind farm location, do not change much over time.

Sometimes the location information is an address, which can be converted to latitude / longitude coordinates with geocoding tools.

Note that the order of the point Data Table configuration follows the exact order of the columns in the data. This is very important because, once the table is set up on PAIRS, the data is uploaded as csv files whose columns must be in that order.

After you click the SAVE button, the table is created as shown in Fig 8.4, and you can start uploading data. You will be able to query the point data together with raster data in the vicinity of each location, which is covered in Section 8.C.

Table 8.1 Example point data table

name lon lat unixtime depth

AAA -100 31 1451606400 50

BBB -101 32 1451606400 60

CCC -102 33 1451606400 70

DDD -103 34 1451606400 80


Fig 8.2 Screenshot of the add point Data Table page. The add icon on this page adds column configuration information to the table one column at a time.

Fig 8.3 The Attribute types available for different columns.


Fig 8.4. The new data table “test” shows up after the creation.

B. Upload Point Data into Data Table

The point data upload follows mostly the same process as the raster data upload described in Section 7 above. After you create the point data table configuration as described in Section 8.A, additional subfolders appear on the SFTP server (in this case, the username is pairsftp-pairsadmin):

UPLOAD/point/pairs_pointdata_48_test/

Here pairs_pointdata_48_test corresponds to the new data table we created: 48 is the dataset ID for pairsadmin, and test is the name of the data table. Drop a csv file WITHOUT A HEADER into this folder, and it will get uploaded. The .csv file uploaded for this example is:

AAA,-100,31,1451606400,50

BBB,-101,32,1451606400,60

CCC,-102,33,1451606400,70

DDD,-103,34,1451606400,80
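A headerless csv file like the one above can be produced with a short script. The sketch below is only an illustration in Python: the helper names are hypothetical, and the folder pattern pairs_pointdata_<dataset ID>_<table name> is inferred from the single example in this section rather than from a documented rule.

```python
import csv
import io

# Rows in the same column order as the table definition:
# name, lon, lat, unixtime, depth (values taken from Table 8.1).
ROWS = [
    ("AAA", -100, 31, 1451606400, 50),
    ("BBB", -101, 32, 1451606400, 60),
    ("CCC", -102, 33, 1451606400, 70),
    ("DDD", -103, 34, 1451606400, 80),
]


def upload_folder(dataset_id, table_name):
    """Build the upload folder path (pattern assumed from the example above)."""
    return f"UPLOAD/point/pairs_pointdata_{dataset_id}_{table_name}/"


def to_headerless_csv(rows):
    """Serialize rows as csv text with no header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(upload_folder(48, "test"))
    print(to_headerless_csv(ROWS), end="")
```

Writing the rows from tuples in a fixed order, rather than from a dictionary or a spreadsheet export with a header, keeps the file aligned with the column order configured for the table.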

The SFTP server upload directory looks like Fig 8.5, with the subfolders UPLOAD and point. In our case, the data is uploaded into the /test directory. After the upload, we will look at the data through the query interface in Section 8.C.

Fig 8.5 The SFTP upload folder and subfolders for point data table upload.


C. Query Point Data on PAIRS

Querying point data is an extension of the raster data query interface described in Section 3.A. A point query can be run by itself or combined with a raster data query. As an example, Fig 8.6 defines a spatial area that covers the region of the point data locations.

Fig 8.6 Define the spatial area of interest.

Fig 8.7 defines the raster dataset GFS Global Weather Model in this example. Fig 8.8 chooses a datalayer

Ground pressure.


Fig 8.7 Choose a raster dataset for a 2-dimensional query over the same area.

Below the “Add a condition” control of the 2D raster query fields and above the Submit button is the “Add point data” button, which expands the point data query fields, as shown in Fig 8.9.

Fig 8.8 Click “Add point data” to expand a table for point data query.

Once the point data query fields are expanded, we can choose the dataset, the point data table, and the values to query. The results can be filtered and ordered before clicking SUBMIT.


Fig 8.9 Point data query criteria: choose the dataset, the point data table, and the column of value.

The query results have two sections, one for the raster data and the other for the point data, as shown in Fig 8.10. Both raster data and csv data are displayed over the map. Each circular dot represents the available row(s) for that location in the point data table.

Fig 8.10 Query result page showing both the 2D query image and the csv table points. Under Layers, GFS lists the multiple files returned by the query. Under Point Data Tables, ‘test’ is listed as the queried table. Clicking any circular dot on the map shows the table result for that point.


Downloading the query results (next to the query job name) retrieves the 2D raster files zipped together with the point data table, which contains all the points within the query area and time range.

This section will expand as we further develop the capabilities of time series data tables.


Appendix

A. DETAILS OF DATASETS

Table 1. List of Datasets

ID Key Name Level Status

1 lsat7-etm Landsat 7 (USGS and NASA satellite imagery) 21 beta

2 lsat8-lev1 High resolution satellite biweekly Landsat8 21 beta

3 lsat8-lev2 High resolution satellite biweekly Landsat8(SR) 21 beta

5 modis-aqua-13-q1 Medium resolution satellite daily Aqua (13) 18 product

6 modis-aqua-09-q1 Medium resolution satellite daily Aqua (09 SR) 18 product

7 modis-terra-13-q1 Medium resolution satellite daily Terra (13) 18 product

8 modis-terra-09-q1 Medium resolution satellite daily Terra (09 SR) 18 product

9 prism-daily-prs PRISM Climate Data 14 product

11 cropscape-prs Historical crop planting map (USA) 21 product

12 nam-forecast USA Weather Forecast 14 product

13 cimis-raster California weather condition measurements 15 product

14 ned-elevation Elevation 23 product

15 ibm-analytics Reference Evapotranspiration 14 product

16 gfs-forecast Global Weather Forecast 11 product

17 blend2d-forecast SMT (Self-learning weather modeling and forecasting technology) 14 product

18 soil-gssurgo Soil properties (USA) 23 beta

24 daymet Daymet 16 product

25 cfs-forecast SMT (Long Term Forecast) 11 product

26 ecmwf ECMWF 13 product


Table 2. Details of Datasets


High resolution satellite biweekly Landsat 8

Worldwide satellite data from Landsat 8 at 30m resolution, acquired every 16 days. It has the following bands (datalayers) on PAIRS:

Lsat8 bands PAIRS Name Unit Column Family Resolution ID

Band 1 - Coastal aerosol Coastal aerosol a.u.(-1.2 to 1.2) c1 0.000256 201

Band 2 - Blue Blue a.u.(-1.2 to 1.2) c2 0.000256 202

Band 3 - Green Green a.u.(-1.2 to 1.2) c3 0.000256 203

Band 4 - Red Red a.u.(-1.2 to 1.2) c4 0.000256 204

Band 5 - Near Infrared (NIR) Near Infrared (NIR) a.u.(-1.2 to 1.2) c5 0.000256 205

Band 6 - SWIR 1 SWIR 1 a.u.(-1.2 to 1.2) c6 0.000256 206

Band 7 - SWIR 2 SWIR 2 a.u.(-1.2 to 1.2) c7 0.000256 207

Band 8 - Panchromatic Panchromatic a.u.(-1.2 to 1.2) c8 0.000256 208

Band 9 - Cirrus Cirrus a.u.(-1.2 to 1.2) c9 0.000256 209

Band 10 - Thermal Infrared (TIRS) 1 TIRS 1 K c10 0.000256 210

Band 11 - Thermal Infrared (TIRS) 2 TIRS 2 K c11 0.000256 211
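The Resolution column in these tables is given in degrees. As a rough aid for relating it to the nominal ground resolution, the Python sketch below converts a grid spacing in degrees to an approximate cell size in meters. It assumes a spherical approximation of about 111,320 m per degree, which is not a value taken from PAIRS, and the function name is hypothetical.

```python
import math

# Approximate length of one degree of latitude (or of longitude at the
# equator) in meters. This is an assumed round figure, not a PAIRS constant.
METERS_PER_DEGREE = 111_320.0


def cell_size_m(resolution_deg, latitude_deg=0.0):
    """Return the approximate (north-south, east-west) cell size in meters."""
    north_south = resolution_deg * METERS_PER_DEGREE
    # East-west spacing shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    for latitude in (0.0, 40.0):
        ns, ew = cell_size_m(0.000256, latitude)
        print(f"lat {latitude:4.1f}: {ns:.1f} m x {ew:.1f} m")
```

Under this approximation a spacing of 0.000256 degree is a little under 30 m at the equator, which is consistent with the 30m Landsat 8 resolution quoted above.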

High resolution satellite biweekly Landsat 8 (SR)

Landsat 8 Level 2 is a surface-reflectance-corrected dataset. It has the same resolution as Landsat 8 Level 1 and is post-processed by NASA. It has the following datalayers:


Lsat8 SR bands PAIRS Name Unit Column Family Resolution ID

Band 1 - Coastal aerosol Coastal aerosol a.u.[-2e3,1.6e4] c1 0.000256 301

Band 2 - Blue Blue a.u.[-2e3,1.6e4] c2 0.000256 302

Band 3 - Green Green a.u.[-2e3,1.6e4] c3 0.000256 303

Band 4 - Red Red a.u.[-2e3,1.6e4] c4 0.000256 304

Band 5 - Near Infrared (NIR) Near Infrared (NIR) a.u.[-2e3,1.6e4] c5 0.000256 305

Band 6 - SWIR 1 SWIR 1 a.u.[-2e3,1.6e4] c6 0.000256 306

Band 7 - SWIR 2 SWIR 2 a.u.[-2e3,1.6e4] c7 0.000256 307

NDVI NDVI a.u.[-1,1] c8 0.000256 308

SAVI SAVI a.u.[-1,1] c9 0.000256 309

MSAVI MSAVI a.u.[-1,1] c10 0.000256 310

EVI EVI a.u.[-1,1] c11 0.000256 311

CLOUD CLOUD a.u.[0,7] c12 0.000256 312

NDMI NDMI a.u.[-1,1] c13 0.000256 313

NBR NBR a.u.[-1,1] c14 0.000256 314

NBR 2 NBR 2 a.u.[-1,1] c15 0.000256 315

Cloud Mask Cloud Mask a.u. c16 0.000256 316

Cloud Mask Confidence Cloud Mask Confidence a.u. c17 0.000256 317


Medium resolution satellite daily: Aqua (13), Aqua (09 SR), Terra (13), Terra (09 SR)

The MODIS instrument flies on two satellites, Aqua and Terra. MODIS Aqua (13) / MODIS Terra (13) have the following datalayers:

MODIS 13 bands PAIRS Name Unit Column Family Resolution ID

250m 16 days NDVI NDVI a.u.[-2e2,1e4] b0 0.002048 51/71

250m 16 days red reflectance (Band 1) Red a.u.[0,1e4] b1 0.002048 52/72

250m 16 days NIR reflectance (Band 2) NIR a.u.[0,1e4] b2 0.002048 53/73

250m 16 days blue reflectance (Band 3) Blue a.u.[0,1e4] b3 0.002048 54/74

250m 16 days MIR reflectance (Band 7) MIR a.u.[0,1e4] b4 0.002048 55/75

MODIS Aqua (09) / MODIS Terra (09) have the following datalayers:

MODIS SR bands PAIRS Name Unit Column Family Resolution ID

250m Surface Reflectance Band 1 (620–670 nm) Band 1 K c0 0.002048 61/81

250m Surface Reflectance Band 2 (841–876 nm) Band 2 K c1 0.002048 62/82

PRISM Climate Data

PRISM data consists of historical daily weather condition measurements in the USA. It has the following datalayers:


PRISM parameters PAIRS Name Units Column Family Resolution ID

Daily total precipitation (rain+melted snow) Precipitation Inch -> mm b0 0.032768 91

Daily maximum temperature Temperature Max F -> kelvin b1 0.032768 92

Daily minimum temperature Temperature Min F -> kelvin b2 0.032768 93

Daily mean temperature, calculated as (tmax+tmin)/2 Temperature Mean F -> kelvin b3 0.032768 94
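The Units column above lists the conversions applied to the PRISM values (inch to mm, Fahrenheit to kelvin). The Python sketch below spells out those standard conversions and the mean temperature formula from the table. The function names are hypothetical and the code is not the PAIRS ingestion code.

```python
def inches_to_mm(length_in):
    """Convert a precipitation depth from inches to millimeters."""
    return length_in * 25.4


def fahrenheit_to_kelvin(temp_f):
    """Convert a temperature from degrees Fahrenheit to kelvin."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


def daily_mean(tmax, tmin):
    """Daily mean temperature as defined in the table: (tmax + tmin) / 2."""
    return (tmax + tmin) / 2.0


if __name__ == "__main__":
    tmax_k = fahrenheit_to_kelvin(86.0)
    tmin_k = fahrenheit_to_kelvin(68.0)
    print(inches_to_mm(0.5), tmax_k, tmin_k, daily_mean(tmax_k, tmin_k))
```

Because the Fahrenheit to kelvin conversion is linear, the mean can be taken either before or after converting and gives the same result.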

USA weather forecast

USA weather forecast is a 3km resolution weather forecast with historical data. It has the following

layers in PAIRS:

Parameters PAIRS Name Units Column Family Resolution ID

Ground temperature Ground temperature K c1 0.032768 1200

Ground relative humidity Ground relative humidity % c2 0.032768 1300

Solar irradiance Solar irradiance W/m2 c3 0.032768 1400

Wind toward east Wind toward east m/s c4 0.032768 1500

Wind toward north Wind toward north m/s c5 0.032768 1600

Pressure_GND Pressure_GND Pa c6 0.032768 1700

Precipitation (mm/s) precip mm/s c7 0.032768 1800
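The two wind layers above are the eastward and northward components of the wind, and precipitation is stored as a rate in mm/s. As an illustration of how these layers might be combined after download, the Python sketch below derives a wind speed from the two components and rescales a precipitation rate to mm/hour. The function names are hypothetical.

```python
import math


def wind_speed(toward_east, toward_north):
    """Wind speed in m/s from its eastward and northward components."""
    return math.hypot(toward_east, toward_north)


def precip_mm_per_hour(rate_mm_per_s):
    """Rescale a precipitation rate from mm/s to mm/hour."""
    return rate_mm_per_s * 3600.0


if __name__ == "__main__":
    print(wind_speed(3.0, 4.0))
    print(precip_mm_per_hour(0.001))
```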

California weather condition measurements

California Irrigation Management Information System (CIMIS) is a California weather condition

measurements dataset, which provides gridded data for the state of California. It has the following

datalayers:


CIMIS parameters PAIRS Name Units Column Family Unit Conversion Resolution ID

Reference evapotranspiration Reference evapotranspiration mm c00 mm 0.016384 130

Net radiation Net radiation W/m2 c01 (MJ/m2)/11.57 0.016384 131

Net long-wave radiation Net long-wave radiation W/m2 c02 (MJ/m2)/11.57 0.016384 132

Clear sky solar radiation Clear sky solar radiation W/m2 c03 (MJ/m2)/11.57 0.016384 133

Clearness factor Clearness factor No unit c04 0.016384 134

Daily minimum air temperature Daily minimum air temperature K c05 273.15+C 0.016384 135

Daily maximum air temperature Daily maximum air temperature K c06 273.15+C 0.016384 136

Dew point temperature Dew point temperature K c07 273.15+C 0.016384 137

Wind speed Wind speed m/s c08 m/s 0.016384 138

Global weather forecast

The Global weather forecast dataset is a worldwide forecast model from NOAA with 0.5 degree spatial resolution. A 10-day forecast is ingested into PAIRS to provide weather forecasts around the world. All parameters follow the same conventions as the USA weather forecast, except that precipitation is a precipitation rate averaged over 3 hours. The Global weather forecast has the following datalayers available on PAIRS:

Parameters PAIRS Name Units Column Family Resolution ID

Temp_2m_Gnd Ground temperature K c1 0.262144 16100

RH_2m_Gnd Ground relative humidity % c2 0.262144 16200

Total_Sh_Dw_inline Solar irradiance W/m2 c3 0.262144 16300


Wind_u_10m_Gnd Wind toward east m/s c4 0.262144 16400

Wind_v_10m_Gnd Wind toward north m/s c5 0.262144 16500

Pres_GND Pressure_GND Pa c6 0.262144 16600

precip precip mm/s c7 0.262144 16700

ECMWF (European Centre for Medium-Range Weather Forecasts)

ECMWF issues a global weather forecast 10 days ahead at 0.125 degree spatial resolution, at 3-hourly intervals for the first 6 days and 6-hourly intervals for the remaining 4 days. We have ingested 15 surface parameters into PAIRS, spatially interpolated onto a PAIRS grid of 0.065536 degree. In addition, the accumulated solar radiation parameters have been interpolated into instantaneous values using a clear sky profile. Accumulated total precipitation and convective precipitation have been converted to an averaged precipitation rate for the interval.
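To make the time structure described above concrete, the Python sketch below lists forecast lead times that are 3-hourly up to day 6 and 6-hourly up to day 10, and shows one plausible way to turn two accumulated precipitation totals into an average rate over the interval between them. Both functions are hypothetical; in particular, starting the list at hour 3 rather than hour 0 is an assumption, and this is not the PAIRS ingestion code.

```python
def ecmwf_lead_hours():
    """Lead times in hours: every 3 h through day 6, then every 6 h through day 10."""
    three_hourly = list(range(3, 6 * 24 + 1, 3))            # 3, 6, ..., 144
    six_hourly = list(range(6 * 24 + 6, 10 * 24 + 1, 6))    # 150, 156, ..., 240
    return three_hourly + six_hourly


def accumulated_to_rate(acc_start_mm, acc_end_mm, interval_hours):
    """Average rate (mm/hour) between two accumulated totals."""
    return (acc_end_mm - acc_start_mm) / interval_hours


if __name__ == "__main__":
    hours = ecmwf_lead_hours()
    print(len(hours), hours[:3], hours[-3:])
    print(accumulated_to_rate(2.0, 5.0, 3))
```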

Parameter PAIRS Name Units Column Family Resolution ID

Ground temperature Ground temperature K c1 0.065536 26001

Solar irradiance_GHI Global Horizontal Irradiance w/m2 c2 0.065536 26002

Solar irradiance_DNI Direct Normal Irradiance w/m2 c3 0.065536 26003

Wind toward east_10m Wind toward east_10m m/s c4 0.065536 26004

Wind toward north_10m Wind toward north_10m m/s c5 0.065536 26005

Wind toward east_100m Wind toward east_100m m/s c6 0.065536 26006

Wind toward north_100m Wind toward north_100m m/s c7 0.065536 26007

Dewpoint Dewpoint kelvin c8 0.065536 26008

Surface Albedo Surface Albedo No unit (0-1) c9 0.065536 26009

Max_precip_rate Max precipitation rate mm/hour c10 0.065536 26010

Min_precip_rate Min precipitation rate mm/hour c11 0.065536 26011


Total_precipitation_rate_avg Total precipitation mm/hour c12 0.065536 26012

Convective_precip_rate_avg Convective precipitation mm/hour c13 0.065536 26013

Ground Pressure Ground Pressure pa c14 0.065536 26014

Cloud Cover Cloud Cover No unit (0-1) c15 0.065536 26015

Historical crop planting map

USDA issues crop information yearly at 30m resolution. PAIRS has ingested data from 2008 to 2015. Details are available on the following website:

http://nassgeodata.gmu.edu/CropScape/

Name PAIRS Name Units Column Family Resolution ID

CROP Crop none b0 0.000256 111

The crop index is listed here for look up purposes:

Value Category Value Category Value Category

1 Corn 55 Caneberries 206 Carrots

2 Cotton 56 Hops 207 Asparagus

3 Rice 57 Herbs 208 Garlic

4 Sorghum 58 Clover/Wildflowers 209 Cantaloupes

5 Soybeans 59 Sod/Grass Seed 210 Prunes

6 Sunflower 60 Switchgrass 211 Olives

10 Peanuts 61 Fallow/Idle Cropland 212 Oranges

11 Tobacco 62 Pasture/Grass 213 Honeydew Melons

12 Sweet Corn 63 Forest 214 Broccoli


13 Pop or Orn Corn 64 Shrubland 216 Peppers

14 Mint 65 Barren 217 Pomegranates

21 Barley 66 Cherries 218 Nectarines

22 Durum Wheat 67 Peaches 219 Greens

23 Spring Wheat 68 Apples 220 Plums

24 Winter Wheat 69 Grapes 221 Strawberries

25 Other Small Grains 70 Christmas Trees 222 Squash

26 Dbl Crop WinWht/Soybeans 71 Other Tree Crops 223 Apricots

27 Rye 72 Citrus 224 Vetch

28 Oats 74 Pecans 225 Dbl Crop WinWht/Corn

29 Millet 75 Almonds 226 Dbl Crop Oats/Corn

30 Speltz 76 Walnuts 227 Lettuce

31 Canola 77 Pears 229 Pumpkins

32 Flaxseed 81 Clouds/No Data 230 Dbl Crop Lettuce/Durum Wht

33 Safflower 82 Developed 231 Dbl Crop Lettuce/Cantaloupe

34 Rape Seed 83 Water 232 Dbl Crop Lettuce/Cotton

35 Mustard 87 Wetlands 233 Dbl Crop Lettuce/Barley

36 Alfalfa 88 Nonag/Undefined 234 Dbl Crop Durum Wht/Sorghum

37 Other Hay/Non Alfalfa 92 Aquaculture 235 Dbl Crop Barley/Sorghum


38 Camelina 111 Open Water 236 Dbl Crop WinWht/Sorghum

39 Buckwheat 112 Perennial Ice/Snow 237 Dbl Crop Barley/Corn

41 Sugarbeets 121 Developed/Open Space 238 Dbl Crop WinWht/Cotton

42 Dry Beans 122 Developed/Low Intensity 239 Dbl Crop Soybeans/Cotton

43 Potatoes 123 Developed/Med Intensity 240 Dbl Crop Soybeans/Oats

44 Other Crops 124 Developed/High Intensity 241 Dbl Crop Corn/Soybeans

45 Sugarcane 131 Barren 242 Blueberries

46 Sweet Potatoes 141 Deciduous Forest 243 Cabbage

47 Misc Vegs & Fruits 142 Evergreen Forest 244 Cauliflower

48 Watermelons 143 Mixed Forest 245 Celery

49 Onions 152 Shrubland 246 Radishes

50 Cucumbers 176 Grassland/Pasture 247 Turnips

51 Chick Peas 190 Woody Wetlands 248 Eggplants

52 Lentils 195 Herbaceous Wetlands 249 Gourds

53 Peas 204 Pistachios 250 Cranberries

54 Tomatoes 205 Triticale 254 Dbl Crop Barley/Soybeans
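Since the Crop datalayer stores these numeric index values, a lookup table is needed to turn query results into category names. The Python sketch below shows one way to do this with a small subset of the table above; the dictionary is deliberately incomplete and the function names are hypothetical.

```python
from collections import Counter

# Subset of the crop index table above, for illustration only.
CROP_CATEGORIES = {
    1: "Corn",
    2: "Cotton",
    3: "Rice",
    5: "Soybeans",
    24: "Winter Wheat",
    36: "Alfalfa",
    81: "Clouds/No Data",
    111: "Open Water",
    254: "Dbl Crop Barley/Soybeans",
}


def crop_name(value):
    """Return the category for an index value, or a placeholder if unlisted."""
    return CROP_CATEGORIES.get(value, f"Unknown ({value})")


def count_by_category(values):
    """Count how many pixels fall into each category name."""
    return Counter(crop_name(v) for v in values)


if __name__ == "__main__":
    pixels = [1, 1, 5, 24, 1, 5, 999]
    for name, count in count_by_category(pixels).most_common():
        print(name, count)
```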

Elevation

There is a 10-m resolution dataset for elevation for the USA.

Name PAIRS Name Units Column Family Resolution ID


Elevation Elevation Meter c1 0.000008 140

Soil properties

We are in the process of ingesting soil property survey data into PAIRS.

Soil PAIRS Name Units Column Family Resolution ID

gssurgo_slope_1356998400 Slope % C1 0.000256 18001

gssurgo_runoff_1356998400 RunOff n.u. C2 0.000256 18002

gssurgo_component_1356998400 Component n.u. C3 0.000256 18003

gssurgo_ec_1356998400 Electrical Conductivity dS/m C4 0.000256 18004

gssurgo_cec_1356998400 Cation Exchange Capacity meq/100g C5 0.000256 18005

gssurgo_ph_1356998400 pH pH C6 0.000256 18006

gssurgo_silt_1356998400 Silt total % C7 0.000256 18007

gssurgo_sand_1356998400 Sand total % C8 0.000256 18008

gssurgo_clay_1356998400 Clay total % C9 0.000256 18009

gssurgo_om_1356998400 Organic Matter % C10 0.000256 18010

gssurgo_bd_1356998400 BulkDensity (1/3 bar) g/cm3 C11 0.000256 18011

gssurgo_awc_1356998400 Available Water Holding Capacity n.u. C12 0.000256 18012

gssurgo_sar_1356998400 Sodium Adsorption Ratio n.u. C13 0.000256 18013

gssurgo_horizon-dep_1356998400 Horizon Depth cm C14 0.000256 18014

gssurgo_dep-restrict-layer_1356998400 Depth to a Restrictive Layer cm C15 0.000256 18015


gssurgo_drainage_1356998400 Drainage n.u. C16 0.000256 18016

gssurgo_horizon_1356998400 Horizon n.u. C17 0.000256 18017

gssurgo_surf-albedo_1356998400 Surface Albedo n.u. C18 0.000256 18018

Reference Evapotranspiration

PAIRS hosts several unique analytics. Two of them are in the Weather category: SMT (self-learning weather modeling and forecast) and SMT (long term seasonal forecast). The Evapotranspiration model is hosted under the Analytics category. When models are developed from other datasets on PAIRS and validated, we ingest the derived analytical layers back into PAIRS as a separate dataset. Currently, daily reference evapotranspiration is available for the continental USA as well as on a global scale (at coarser resolution than the USA data layer). Reference evapotranspiration is critical in irrigation forecasting and decision making.

Analytics Layers PAIRS Name Units Comments Resolution ID

GFS based evapotranspiration ET0 mm/day ET0 for global scale 0.262144 15200-10

NAM based evapotranspiration ET0 mm/day ET0 for USA 0.032768 15100-10

ECMWF based evapotranspiration ET0 mm/day ET0 for global scale 0.065536 15300-10

SMT – IBM’s cognitive forecast in USA

An improved weather forecast based on a model-blending machine learning algorithm is generated daily for the continental USA. The resolution is the same as that of the USA weather forecast. The solar irradiance and wind speed parameters are particularly important for the renewable energy industry, and we deliver the forecast to renewable energy utility customers daily. It has the following datalayers:

Parameters PAIRS Name Units Column Family Resolution ID


Temp_2m_Gnd Ground Temperature K c1 0.032768 17100

RH_2m_Gnd Ground relative humidity % c2 0.032768 17200

Total_Sh_Dw_inline Solar irradiance W/m2 c3 0.032768 17300

Wind_speed Wind speed m/s c4 0.032768 17400

SMT-IBM’s Long Term Forecast Globally

A seasonal forecast projecting 6 months ahead is issued by NOAA daily. Based on NOAA’s forecast, we built an improved model using machine learning. The new analytics layers are in the Weather category, under the dataset called SMT (Long Term Forecast). It has the following data layers:

Parameters PAIRS Name Units Column Family Resolution ID

Ground temperature Ground temperature K C1 0.262144 25001

Solar irradiance Solar irradiance w/m2 C2 0.262144 25002

Wind toward east Wind toward east m/s C3 0.262144 25003

Wind toward north Wind toward north m/s C4 0.262144 25004

Categorical Rain rain_or_not n.u. C5 0.262144 25005

Precip Rate* precip_rate mm/hour C6 0.262144 25006

Precipitable water* precip_water kg/m2 C7 0.262144 25007


B. Acknowledgements

-Landsat 7 and Landsat 8 datasets are derived from the U.S. Geological Survey (USGS)/NASA Landsat Program

The USGS home page is http://www.usgs.gov

The NASA home page is http://www.nasa.gov

-MODIS datasets are derived from USGS MODIS program datasets

-NED dataset is derived from data available from the USGS

See USGS Visual Identity System Guidance http://www.usgs.gov/visual-id/ for further details

-NED dataset is distributed by the Land Processes Distributed Active Archive Center (LP DAAC)

It is located at USGS/EROS, Sioux Falls, SD. http://lpdaac.usgs.gov

-Global Forecast System (GFS), North American Mesoscale (NAM), and Climate Forecast System (CFS) datasets are derived products from NOAA datasets

The NOAA home page is http://www.noaa.gov/

-soil data is derived from SSURGO datasets distributed by USDA under Creative Commons License

The web page of USDA is http://www.usda.gov/

-ECMWF datasets are derived Type B and Type C products from data and products of the European Centre for Medium-Range Weather Forecasts (copyright© 2016 ECMWF)

-PRISM dataset is derived from PRISM Climate Group, Oregon State University

-Cropscape data is from the USDA National Agricultural Statistics Service

The web page of NASS is http://nassgeodata.gmu.edu/CropScape/

- Daymet historical weather dataset is derived from Daymet dataset distributed by Oak Ridge National

Laboratory, which is under NASA's EarthData license policy https://earthdata.nasa.gov/

Citation to Daymet data is in this web page:

https://daac.ornl.gov/DAYMET/guides/Daymet_mosaics.html#Daymet_m_citation

Thornton, P.E., Running, S.W., White, M.A. 1997. Generating surfaces of daily meteorological

variables over large regions of complex terrain. Journal of Hydrology 190: 214 - 251.

http://dx.doi.org/10.1016/S0022-1694(96)03128-9

- CIMIS dataset is obtained from California Irrigation Management Information System at

http://www.cimis.water.ca.gov/


- US Census data is published by the U.S. Census Bureau

- Sentinel satellite data contains modified Copernicus Sentinel data. Sentinel data is available from

https://sentinel.esa.int/

- Administrative boundary maps are made with Natural Earth. Free vector and raster map data @

naturalearthdata.com

- World elevation data is from U.S. Geological Survey. The USGS home page is http://www.usgs.gov


C. Glossary

PAIRS – Physical Analytics Integrated Data Repository and Services

Geospatial – relating to geographical characteristics

Temporal – relating to time

SMT – Self learning weather Modeling and forecasting Technology

Data set – refers to a specific data collection

Datalayer – refers to the different bands or parameters within a data set

Data region – refers to the location of the data file (for satellite mainly, such as tiles)

Metadata – refers to the information about the data or file

Filter – refers to selection conditions applied to choose certain data points or layers

Job – refers to a submitted request to PAIRS

ID – a unique numeric identification number for a specific parameter

Key – a unique character identifier

lsat7-etm – Landsat 7, launched on April 15, 1999, is the seventh satellite of the Landsat program. It provides USGS and NASA 30-meter resolution satellite imagery every 16 days from the Enhanced Thematic Mapper Plus bands.

lsat8-lev1 – Landsat 8 (launched on Feb 11, 2013, is the eighth satellite in the Landsat program; USGS

and NASA 30-meter resolution satellite imagery every 16 days replacing lsat7). Level 1 is the

unprocessed instrument measurements

lsat8-lev2 – The Landsat 8 surface reflectance product is a data set derived and estimated from level 1. It gives the surface spectral reflectance for each band as it would have been measured at ground level if there were no atmospheric scattering or absorption. For a detailed guide, see reference [3].

modis 09-q1 – Moderate Resolution Imaging Spectroradiometer is a key instrument aboard the Terra

and Aqua satellites with a medium imagery resolution of 250-meter. Product 09-q1 is the derived

surface reflectance 8-Day level 3 data set for band 1 and band 2. Each 09-q1 image pixel contains the

best possible L2G observation during an 8-day period as selected on the basis of high observation

coverage, low view angle, the absence of clouds or cloud shadow, and aerosol loading. Details on MODIS

products can be found in reference [4].

modis 13-q1 – Vegetation indices are used for global monitoring of vegetation conditions and are used

in products displaying land cover and land cover changes. NDVI is Normalized Difference Vegetation


Index. NDVI is calculated from the red and near-infrared bands; the MODIS 13-q1 product also includes blue and mid-infrared reflectance layers. Global MOD13Q1 data are provided every 16 days at 250-meter spatial resolution as a gridded level-3 product.

prism-daily-prs – PRISM Climate Data is developed by Oregon State University using measurements from various weather networks within the USA. It is a two-dimensional gridded historical weather model. The data set we ingested in PAIRS is AN81d, the daily weather parameters at 4km resolution [5].

cropscape-prs – Historical crop planting map (USA), issued annually by USDA at 30-meter resolution.

nam-forecast – USA Weather Forecast issued by NOAA, a 3km resolution weather forecast data set together with its archive.

cimis-raster – California Irrigation Management Information System (CIMIS) is a California weather

condition measurements dataset, which provides gridded data for the state of California.

ned-elevation – Elevation in the USA, a 10-meter resolution data set.

ibm-analytics – Reference Evapotranspiration

gfs-forecast – Global Weather Forecast issued by NOAA

blend2d-forecast – SMT (Self-learning weather modeling and forecasting technology) is a machine

learned ensemble forecast model developed by us

soil-gssurgo – Soil properties (USA) database gssurgo

daymet – Daymet is similar to Prism and is a calculated two dimensional historical weather data set. It

has a resolution of 1km.

cfs-forecast – SMT (Long term forecast) is a seasonal forecast data set projecting 6 months ahead, based on NOAA’s Climate Forecast System (CFS). The temporal resolution is 6-hourly.

ecmwf – The 10-day-ahead forecast issued by the European Centre for Medium-Range Weather Forecasts (ECMWF). It is 3-hourly for up to 6 days and 6-hourly afterwards. The spatial resolution is 0.125° x 0.125°.


D. References

[1] L.J. Klein, F.J. Marianno, C.M. Albrecht, M. Freitag, and H.F. Hamann, "PAIRS: A scalable geo-spatial data analytics platform," 2015 IEEE Conference on Big Data, pp. 1290-1298, 2015.

[2] Siyuan Lu, Y. Hwang, I. Khabibrakhmanov, F.J. Marianno, X. Shao, J. Zhang, et al., "Machine Learning Based Multi-Physical-Model Blending for Enhancing Renewable Energy Forecast – Improvement via Situation Dependent Error Correction," Proceedings of European Control Conference 2015, pp. 283-290, 2015.

[3] U. S. G. S. Department of the Interior. Product Guide: Provisional Landsat 8 Surface Reflectance Code (LASRC) Product. Available: http://landsat.usgs.gov/documents/provisional_lasrc_product_guide.pdf

[4] L. DAAC. MODIS Products Table. Available: https://lpdaac.usgs.gov/dataset_discovery/modis/modis_products_table

[5] PRISM. Descriptions of PRISM Spatial Climate Datasets for the Conterminous United States. Available: http://prism.oregonstate.edu/documents/PRISM_datasets.pdf