
Speeding Complex XML Analytics: Skytide vs. XQuery

By Tom Tortolani, VP Product Management, Skytide, Inc.

For over 15 years, Business Analytics software has provided great benefit to many organizations worldwide. However, this software was designed to work well with highly structured, normalized data. Given the flexibility of the XML data model to include varying levels of descriptive information within each document, there is a lot of data available for analysis which is difficult or impossible to force into a normalized record/field format. Skytide is specifically designed to build analytic models on such variable data structures.

Business analytics offers a way to turn large amounts of data into meaningful information for end-user analysis. However, it is not designed to be optimal for every query requirement. In some cases, data is better queried directly from the data source, for example when detailed address information for individual customers is needed. While it is possible to bring such information into Skytide, this type of data is not typically required for business analytics. Knowing the city or state of a customer, however, is beneficial when data is analyzed by geographic aggregates.

XQuery Limitations

The XQuery language is often not the best choice to express and execute complex XML analytical queries. One reason is that XQuery 1.0 does not have an explicit GROUP BY construct for easy aggregation of data. Similarly, OLAP functions, which have long been integrated in the SQL standard, are still absent from XQuery. This makes it difficult for users to write analytical queries in XQuery, and difficult for database systems to optimize and execute them efficiently. Another option is to use SQL/XML to combine XQuery and SQL such that XQuery reads atomic data values from the XML data, and advanced SQL functions are used to aggregate or summarize these values. However, such queries are very complex and typically do not deliver the same performance as querying an in-memory data cube.
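The two-stage approach described above, extracting atomic values from XML and then aggregating them with SQL, can be sketched in a few lines. This is only an illustration of the idea and not DB2 SQL/XML: Python's `xml.etree.ElementTree` stands in for the XQuery extraction step, an in-memory SQLite table stands in for the relational engine, and the element and attribute names (`Order`, `state`, `amount`) are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order documents; real FIXML orders are far richer.
ORDERS_XML = """
<Orders>
  <Order state="West Virginia" amount="1200.50"/>
  <Order state="West Virginia" amount="300.00"/>
  <Order state="Ohio" amount="750.25"/>
</Orders>
"""

def extract_rows(xml_text):
    """Stage 1: pull atomic values out of the XML (the role XQuery plays)."""
    root = ET.fromstring(xml_text)
    return [(o.get("state"), float(o.get("amount"))) for o in root.iter("Order")]

def aggregate_by_state(rows):
    """Stage 2: aggregate with SQL GROUP BY (the role of the SQL engine)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (state TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT state, COUNT(*), SUM(amount), MAX(amount) "
        "FROM orders GROUP BY state ORDER BY state"
    ).fetchall()
    conn.close()
    return result

summary = aggregate_by_state(extract_rows(ORDERS_XML))
for state, n, total, largest in summary:
    print(f"{state}: {n} orders, total {total:.2f}, max {largest:.2f}")
```

Even in this toy form the query logic is split across two languages and two data models, which is the complexity the paragraph above refers to.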

Skytide Multi-Dimensional Analytics

The Skytide multi-dimensional data cube is the optimal data model for Business Analytics. Queries are designed to be easy for a non-technical user to create in various tabular and cross tabular formats, as shown below.

About the Author

At Skytide, Tom is responsible for identifying market and customer demands, translating them into product requirements, defining the product and packaging strategy, and bringing new products to market.

Prior to Skytide, Tom was responsible for bringing leading-edge products to market for Provato, after which he served and Aceva Technologies. He was among the first 10 employees at Hyperion Solutions/Arbor Software, and was a key contributor to the creation of the BI and OLAP industry. Tom is the primary inventor on US patent 6,317,750 for advanced multi-dimensional navigational techniques.

Tech Spotlight

This article was developed as part of a joint white paper written by Skytide & IBM, Skytide Analytical Platform for DB2 9: Large Volume Business Analytics for XML Data.

Cubes are formed by defining dimensions, which are used in analytic models to organize related data (measures) into categories that are easy for the user to understand for query and reporting needs. Dimensions typically contain multiple levels of aggregation. For example, a Period dimension contains daily, monthly, and yearly levels. The actual values of a dimension are referred to as members. For example, a Customer dimension contains 100,000 members (Customers). Skytide cubes offer a large range of reporting and analysis possibilities. For example, Order data by country by month by security type that compares premium to non-premium customers can easily be queried and analyzed for trends and outlying activity. In fact, numerous permutations of analytic queries that compare dimensions by other dimensions are easily performed on cube data.
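The cube vocabulary used here (dimensions, levels, members, measures) can be made concrete with a small sketch. It is a minimal illustration under stated assumptions, not a description of how the Skytide server stores or computes cubes: the fact rows, the `PERIOD_LEVELS` mapping, and the `roll_up` function are all hypothetical names introduced for this example.

```python
from collections import defaultdict

# Hypothetical fact rows: (date, country, customer_type, order_amount).
FACTS = [
    ("2007-01-15", "US", "premium", 500.0),
    ("2007-01-20", "US", "non-premium", 120.0),
    ("2007-02-03", "US", "premium", 250.0),
    ("2007-02-10", "DE", "premium", 400.0),
]

# The Period dimension has three levels; each maps a daily date to a member.
PERIOD_LEVELS = {
    "day": lambda d: d,        # member such as 2007-01-15
    "month": lambda d: d[:7],  # member such as 2007-01
    "year": lambda d: d[:4],   # member such as 2007
}

def roll_up(facts, period_level):
    """Sum the order-amount measure by (period member, country, customer type)."""
    to_member = PERIOD_LEVELS[period_level]
    cells = defaultdict(float)
    for date, country, cust_type, amount in facts:
        cells[(to_member(date), country, cust_type)] += amount
    return dict(cells)

by_month = roll_up(FACTS, "month")
by_year = roll_up(FACTS, "year")
print(by_month[("2007-01", "US", "premium")])
print(by_year[("2007", "US", "premium")])
```

Comparing premium to non-premium customers at a given level then amounts to reading two cells that differ only in the customer-type member.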

Skytide provides a Designer UI that lets the application designer point to various data sources, relate the data sources, and define cubes. Figure 1 shows a cube defined with various dimensions.

Assuming that there are numerous articles marked up in this manner, it would be useful to be able to answer some analytic questions, such as:

One significant benefit of the data cube is the ability to easily perform ad-hoc queries and analysis in a point-and-click fashion with no query language. Once a cube is defined, it is easy to form various queries by dragging dimensions from the cube into any row or column location in the right-side pane of the Skytide Designer window.

For example, the following query produces aggregate order amount data for a given state (West Virginia) and stock security sector (Energy) in a tabular result set, as shown in the Skytide Designer in figure 2.

This query was easy to build in Skytide by dragging cube dimensions onto an empty palette. In this case, order amounts are aggregated across all time periods. Although this type of query is possible against XML, it is not easily written by end users. For example, the identical query against data in IBM’s DB2 pureXML database is declared as shown in figure 3.

Skytide provides multiple advantages over this XQuery. First, many users will find it easier to create such a query in Skytide’s GUI than in XQuery notation. Second, the basic data filtering as well as the column sorting can easily be changed by mouse clicks on various column headings. This provides instant query results for different states and stock sectors, or a different ordering, at no or negligible additional cost. The XQuery in DB2, however, would need to be rerun if results for a different state or stock sector are desired. This would be particularly expensive for the more complex queries shown below.


The more common business analytics query is a cross-tabular view of order data aggregated across multiple dimensions. For example, viewing performance measures by state and by month is easily accomplished by dragging dimensions onto a query page, as shown in figure 4.
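A cross-tabular view of this kind places the members of one dimension on the rows and the members of another on the columns. The sketch below pivots a handful of aggregated cells into such a table. It is an illustration only: the `CELLS` data and the `cross_tab` function are hypothetical and do not reflect Skytide's internal implementation.

```python
# Hypothetical pre-aggregated cells: (state, month) -> order amount.
CELLS = {
    ("Ohio", "2007-01"): 750.0,
    ("Ohio", "2007-02"): 100.0,
    ("West Virginia", "2007-01"): 1500.0,
}

def cross_tab(cells):
    """Pivot (row, column) cells into a table and append a total per row."""
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    table = {}
    for r in rows:
        # Missing cells count as zero so every row has the same width.
        values = [cells.get((r, c), 0.0) for c in cols]
        table[r] = values + [sum(values)]
    return cols + ["Total"], table

header, table = cross_tab(CELLS)
print("State".ljust(15) + "".join(h.rjust(10) for h in header))
for state, values in table.items():
    print(state.ljust(15) + "".join(f"{v:10.1f}" for v in values))
```

Swapping the row and column dimensions only changes how the same cells are laid out, which is why such views can be rearranged without going back to the source data.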

Figure 1: Skytide Designer User interface

Figure 2: Skytide Drag-and-Drop Query

Figure 3: XQuery Format

C1: max_stock_order

declare default element namespace "http://www.fixprotocol.org/FIXML-4-4";
declare namespace s="http://tpox-benchmark.com/security";
declare namespace c="http://tpox-benchmark.com/custacc";
let $order :=
  for $ss in db2-fn:xmlcolumn("SECURITY.SDOC")/s:Security[s:SecurityInformation/s:StockInformation/s:Industry = "Energy"]
  for $ord in db2-fn:xmlcolumn("ORDER.ODOC")/FIXML/Order[Instrmt/@Sym = $ss/s:Symbol/fn:string(.)]
  for $cs in db2-fn:xmlcolumn("CUSTACC.CADOC")/c:Customer[c:Addresses/c:Address/c:State = "West Virginia"]/c:Accounts/c:Account[@id = $ord/@Acct/fn:string(.)]
  return $ord/OrdQty/@Cash
return string(max($order))
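For readers less familiar with XQuery, the logic of the query in figure 3 is a three-way join followed by a maximum: select securities in one industry, select accounts of customers in one state, and take the largest cash amount among the orders that match both. The Python sketch below mirrors that logic on flattened, hypothetical data. The record layouts, the symbols `ENG1` and `TEC1`, and the function `max_order_cash` are assumptions made for illustration and are not part of the TPoX schema or of DB2.

```python
# Hypothetical, flattened stand-ins for the three XML collections in the query.
SECURITIES = [
    {"symbol": "ENG1", "industry": "Energy"},
    {"symbol": "TEC1", "industry": "Technology"},
]
CUSTOMERS = [
    {"state": "West Virginia", "accounts": ["A1", "A2"]},
    {"state": "Ohio", "accounts": ["B1"]},
]
ORDERS = [
    {"symbol": "ENG1", "account": "A1", "cash": 900.0},
    {"symbol": "ENG1", "account": "A2", "cash": 1400.0},
    {"symbol": "ENG1", "account": "B1", "cash": 5000.0},  # customer in another state
    {"symbol": "TEC1", "account": "A1", "cash": 7000.0},  # security in another industry
]

def max_order_cash(securities, customers, orders, industry, state):
    """Return the largest order cash amount for one industry and one state."""
    # Securities filtered by industry (first "for" clause of the XQuery).
    symbols = {s["symbol"] for s in securities if s["industry"] == industry}
    # Accounts of customers in the given state (third "for" clause).
    accounts = {a for c in customers if c["state"] == state for a in c["accounts"]}
    # Orders joined to both sets (second "for" clause plus the account predicate).
    amounts = [
        o["cash"] for o in orders
        if o["symbol"] in symbols and o["account"] in accounts
    ]
    return max(amounts) if amounts else None

result = max_order_cash(SECURITIES, CUSTOMERS, ORDERS, "Energy", "West Virginia")
print(result)
```

As with the XQuery, asking for a different state or industry means evaluating the joins again from the underlying records.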


About Skytide

Skytide delivers business analytical solutions that provide timely and unprecedented insight into the constantly changing environment in which today’s businesses operate. The XML-based Skytide Analytical Platform is the first and only solution available today that can understand complex data from virtually any source, including unstructured data such as XML and HTML, delivering the visibility necessary to make critical business decisions. Skytide customers include Fortune 1000 companies across a wide range of market segments, including manufacturing, financial services, healthcare, utilities, and retail. Founded in 2003, Skytide is a privately held, venture-backed company headquartered in California’s Silicon Valley.

This type of cross-tabular query is far too complex to declare in XQuery and would perform poorly.

Final Thoughts:

The Skytide server is optimized to handle complex data calculation and aggregation from the source XML data. The dimensional model of the cube lets the user construct a virtually unlimited number of tabular and cross-tabular (pivot) query results, empowering end users to quickly analyze large amounts of XML data, often in a matter of seconds.

Put Skytide to Work for You

Discover how the Skytide Analytical Platform can uncover the power of your data. Contact us today at [email protected] or 650.292.1900.

Figure 4: Multi-dimensional view of performance measures

© 2007 Skytide, Inc. All rights reserved. Skytide and the Skytide logo are registered trademarks of Skytide, Inc. All other trademarks are the property of their respective owners.

Skytide, Inc.

1820 Gateway Drive, Suite 300 San Mateo, CA 94404

Phone: 1.650.292.1900

Fax: 1.650.312.1400

E-mail: [email protected]

Internet: www.skytide.com