February 24, 2006 1
The Design And Application Of A Generic Query Toolkit
Presented by: Lichun (Jack) Zhu
Course: 60-520
Winter 2006
Instructor: Dr. A. K. Aggarwal
University of Windsor

Agenda
  Introduction
  Contemporary Business Intelligence Solutions
  The Design of Generic Query Toolkit
  Applications
  Ongoing Work & Future Scope
  Summary
  Demo
  Q & A

Introduction
Traditional way of developing information systems: the waterfall model.
  Problems: hard-coded programs, little flexibility when specifications change.
Prototype-based development: a spiral, iterative approach.
  Allows better communication, locates problems as early as possible, and needs RAD tool support.
My project: the Generic Query Toolkit.
  An extended SQL language (GQL) and a query interface generator.

Related Subject - Contemporary Business Intelligence Solutions
What is Business Intelligence?
  BI is the process of turning data into information, and then into knowledge.
  It uses all available means, including data warehousing, data mining, and decision support techniques, to collect, organize, and process enterprise data.
  The goal of BI is to support analysis and decision making and to improve the competitiveness of the enterprise.

Contemporary Business Intelligence Solutions
Common BI tools
  Business Objects
  Brio
  Cognos
  etc.
Common features of BI software
  Customizable report and query interface automation
  OLAP / Data Mining Analysis
  Data Integration
  Broadcast / Push Information

Contemporary Business Intelligence Solutions
Problems of commercial BI software
  Highly complicated systems with a steep learning curve
  Too expensive for small projects
Open Source BI software
  Pentaho Business Intelligence Project, which integrates the Mondrian OLAP server, JPivot, Weka Data Mining, etc.

The Design of Generic Query Toolkit
  Architecture
  GQL Language Specification
  Components
  How do all these components work together?

Architecture
[Architecture diagram. Elements shown:]
  Clients: Web Browser and Client App (HTTP, SOAP/WSDL); Mobile Device through a WAP Gateway (WML/XHTML, HTTP)
  Application Server - Tomcat
    GQL Viewer (JSP, Servlet, Struts… TODO: report tools, JPivot OLAP, Weka Data Mining)
    GQL Server (Web Services based on Axis)
  GQL App at Server Side
    GQL Parser (TODO: language extension)
    GQL Daemon (TODO: scheduler, workflow), connected through JDBC to the Database/Datamart
  Metadata Repository (p_query, p_queryq ...)
    Hibernate O-R mapping: 1. get waiting tasks, 2. set task status
    Hibernate O-R mapping: 1. get script, 2. put into queue, 3. get task info
  File System Cache Directory (in compressed XML format)
    Write query result into the cache directory
    Read cached query results for display

GQL Language Specification
The GQL language is an extension of SQL. Field attributes and query criteria attributes replace the select-expressions and condition-expressions of SQL statements.
Display attributes
  Field_Attribute ::= {Field_Name; Field_Description; Field_Type; Display_Attribute [; [Aggregate_Attribute]; [Key_Attribute]]}
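A field attribute written in this grammar can be split into its parts with very little code. The following is a minimal sketch, not the toolkit's JFlex/CUP parser: the class name, the function name, and the assumption that no part contains a semicolon are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class FieldAttribute:
    name: str            # Field_Name: SQL expression, optionally with an alias
    description: str     # Field_Description: label shown to the user
    ftype: str           # Field_Type, e.g. INTEGER, STRING, DATE, MONEY
    display: str         # Display_Attribute, e.g. SHOW
    aggregate: str = ""  # optional Aggregate_Attribute, e.g. SUM
    key: str = ""        # optional Key_Attribute, e.g. GROUP


def parse_field_attribute(text: str) -> FieldAttribute:
    """Parse one '{...}' field attribute into its semicolon-separated parts."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("field attribute must be wrapped in braces")
    # Assumption: the parts themselves never contain a semicolon.
    parts = [p.strip() for p in text[1:-1].split(";")]
    if not 4 <= len(parts) <= 6:
        raise ValueError("expected 4 to 6 semicolon-separated parts")
    parts += [""] * (6 - len(parts))  # pad the optional trailing attributes
    return FieldAttribute(*parts)
```

For instance, parse_field_attribute("{id;Item;INTEGER;SHOW;;GROUP}") yields an empty aggregate and the key GROUP.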

GQL Language Specification
Query criteria attributes
  Condition_Attribute ::= <Condition_Expression; Condition_Description; Condition_Type [; [Domain]; [Required_Attribute]; [Default_Attribute]; [Hint]]>
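The criteria grammar can be read the same way: three mandatory parts followed by up to four optional ones. The sketch below maps the parts to names taken from the grammar; the dictionary keys and the function name are hypothetical, and the Domain part is kept as raw text because its interpretation is not specified here.

```python
CONDITION_PARTS = ("expression", "description", "type",
                   "domain", "required", "default", "hint")


def parse_condition_attribute(text: str) -> dict:
    """Parse one '<...>' query criteria attribute into a dict of named parts."""
    text = text.strip()
    if not (text.startswith("<") and text.endswith(">")):
        raise ValueError("condition attribute must be wrapped in angle brackets")
    # Assumption: the parts themselves never contain a semicolon.
    parts = [p.strip() for p in text[1:-1].split(";")]
    if not 3 <= len(parts) <= len(CONDITION_PARTS):
        raise ValueError("expected 3 to 7 semicolon-separated parts")
    parts += [""] * (len(CONDITION_PARTS) - len(parts))  # missing optionals
    return dict(zip(CONDITION_PARTS, parts))
```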

GQL Language Specification
For example:
select {id;Item;INTEGER;SHOW;;GROUP},
       {mark;Type;STRING;SHOW;;GROUP},
       {catelog;Category;STRING;SHOW;;GROUP},
       {cdate;Date;DATE;SHOW;;GROUP},
       {sum(income) incom;Credit;MONEY;SHOW;SUM},
       {sum(outcome) outcom;Debit;MONEY;SHOW;SUM},
       {sum((income-outcome)) pure;Net;MONEY;SHOW;SUM}
from t_dace
where <id;Item;INTEGER;#select id,name from t_item where id between 500 and 999 order by id>
  and <note;Description;STRING>
  and <mark;Type;STRING;#1>
  and <catelog;Category;STRING;#3>
  and <cdate;Date;DATE>
  and <income*exrate;Credit;MONEY>
  and <outcome*exrate;Debit;MONEY>
group by #1, #2, #3, #4
order by #1, #2, #3, #4;

GQL Language Specification
Generated User Interface:

Architecture of GQL Toolkit
Metadata Repository
  Query directory – p_query
  Task queue – p_queryq

p_query
  PK: seq
  Columns: id, explain, refqry, perms, kind, script, refnum, template

p_queryq
  PK: uid
  FK1: seq
  Columns: id, stime, etime, condflds, datapath, status, tellno, errmsg, refnum, server, datasize

Components of GQL Toolkit
GQL Parser (using JFlex and CUP)
  Parse: generate internal objects that represent field and criteria attributes
  XML interface: get, bind
Sample XML Schema
<?xml version="1.0" encoding="UTF-8"?>
<SQLGenerator xmlns="Parameters" seq="1" title="Revenue/Expense Analysis" ...>
  <Fields>
    <ID0 fstr="id" fdesc="Item" ftype="INTEGER" fprecision="0" fdatefmt="" fflag="SHOW" fagg="" fkey="GROUP" fappend="" factive="ENABLE" />
    <ID1 fstr="mark" fdesc="Type" ftype="STRING" fprecision="0" fdatefmt="" fflag="SHOW" fagg="" fkey="GROUP" fappend="" factive="ENABLE" />
    ...
    <ID6 fstr="sum((income-outcome)) pure" fdesc="Pure" ftype="MONEY" fprecision="2" fdatefmt="" fflag="SHOW" fagg="SUM" fkey="" fappend="" factive="ENABLE" />
  </Fields>
  <Condis>
    <ID0 fstr="id" fdesc="Item" ftype="INTEGER" fprecision="0" fdatefmt="" facq="" fdefault="" fcomment="" fop="=" fexpflag="0">
      <fvalue>"501|Cash","502|Saving","503|Checking",...</fvalue>
      <fexp>501,512</fexp>
    </ID0>
    ...
    <ID6 fstr="outcome*exrate" fdesc="Debit" ftype="MONEY" fprecision="2" fdatefmt="" facq="" fdefault="" fcomment="" fop="=" fexpflag="0">
      <fvalue />
      <fexp />
    </ID6>
  </Condis>
</SQLGenerator>
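A client of the XML interface has to read fields and criteria back out of such a schema. The sketch below uses Python's standard ElementTree on a shortened, well-formed sample modeled on the schema above; the function name and the returned structure are assumptions, not part of the toolkit.

```python
import xml.etree.ElementTree as ET

# Shortened sample modeled on the slide; attributes trimmed for brevity.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<SQLGenerator xmlns="Parameters" seq="1" title="Revenue/Expense Analysis">
  <Fields>
    <ID0 fstr="id" fdesc="Item" ftype="INTEGER" fflag="SHOW" fagg="" fkey="GROUP"/>
    <ID1 fstr="mark" fdesc="Type" ftype="STRING" fflag="SHOW" fagg="" fkey="GROUP"/>
  </Fields>
  <Condis>
    <ID0 fstr="id" fdesc="Item" ftype="INTEGER" fop="=" fexpflag="0">
      <fvalue>"501|Cash","502|Saving"</fvalue>
      <fexp>501,512</fexp>
    </ID0>
  </Condis>
</SQLGenerator>"""

NS = {"p": "Parameters"}  # the schema declares xmlns="Parameters"


def read_schema(xml_bytes: bytes):
    """Return (title, fields, conditions) read from a GQL XML schema."""
    root = ET.fromstring(xml_bytes)
    fields = [dict(el.attrib) for el in root.find("p:Fields", NS)]
    conditions = []
    for el in root.find("p:Condis", NS):
        cond = dict(el.attrib)
        # fexp carries the value the user entered for this criterion.
        cond["fexp"] = el.findtext("p:fexp", default="", namespaces=NS).strip()
        conditions.append(cond)
    return root.get("title"), fields, conditions
```

Calling read_schema(SAMPLE.encode("utf-8")) returns the title, two field dictionaries, and one condition whose fexp is "501,512".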

Components of GQL Toolkit
GQL Parser
  Execute: generate the target SQL statements (using an expression reduction algorithm)
Generated Sample SQL
select mark, catelog, sum(income) incom, sum(outcome) outcom, sum((income-outcome)) pure
from t_dace
where id between 501 and 512 and mark = 'P' and cdate >= '01-01-2006'
group by mark, catelog
order by mark, catelog
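The reduction algorithm itself is not shown on the slides. One plausible reading, inferred only from the sample above, is that fields the user hides and criteria the user leaves empty are dropped before the statement is assembled. The sketch below illustrates that reading; the function name and the tuple layouts are hypothetical, and real code should pass values as bound parameters rather than splice them into the text.

```python
def build_sql(table, fields, conditions):
    """Assemble a select statement from the attributes that remain active.

    fields:     (expression, shown, is_group_key) tuples
    conditions: (expression, predicate) tuples; predicate is None when the
                user entered nothing for that criterion
    """
    shown = [f for f in fields if f[1]]
    select_list = ", ".join(expr for expr, _, _ in shown)
    # Only shown group keys survive, so "#n" references are resolved here
    # to the remaining key expressions.
    keys = [expr for expr, _, is_key in shown if is_key]
    where = [f"{expr} {pred}" for expr, pred in conditions if pred]

    sql = f"select {select_list} from {table}"
    if where:
        sql += " where " + " and ".join(where)
    if keys:
        sql += " group by " + ", ".join(keys)
        sql += " order by " + ", ".join(keys)
    return sql
```

With id and cdate hidden and only the id, mark, and cdate criteria filled in, this produces the same statement as the generated sample.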

Components of GQL Toolkit
GQL Daemon (using Hibernate, SAX, JDOM)
  Runs in the background, multi-threaded

Procedure run()
Begin
  Set the status of the task to “Running”;
  Try
    Get the script from the corresponding p_query persistence object;
    Create a new instance of the GQL Parser class and call its Parse method to parse the script;
    Get the XML schema stored in the condfld attribute of the p_queryq persistence object;
    Call GQLParser.XMLBindFieldsAndConditions to bind the XML schema;
    Call GQLParser.Execute to get a list of SQL statements;
    Submit these SQL statements to the database server one by one;
    Export the query results and save them into the cache directory as a compressed XML document;
    Set the status of the task to “Success”;
  Exception
    Set the status of the task to “Error” and record the accompanying error message;
  End;
End.
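The procedure above can be sketched in a few lines. This is an illustration only: the task is a plain dictionary standing in for the p_queryq persistence object, its keys follow the column names as best they can be read from the metadata slide, and the three callables are hypothetical stand-ins for Hibernate access, the GQL Parser, and the JDBC call.

```python
import gzip
from pathlib import Path


def run_task(task, get_script, parse_and_generate, run_sql, cache_dir):
    """Run one queued task and record its outcome on the task dict."""
    task["status"] = "Running"
    try:
        script = get_script(task["seq"])  # script from p_query
        # Parse the script, bind the stored criteria, get the SQL statements.
        statements = parse_and_generate(script, task["condflds"])
        # Submit the statements one by one; each call returns XML text.
        xml_parts = [run_sql(sql) for sql in statements]
        path = Path(cache_dir) / f"{task['uid']}.xml.gz"
        with gzip.open(path, "wt", encoding="utf-8") as out:
            out.write("\n".join(xml_parts))  # compressed XML in the cache
        task["datapath"] = str(path)
        task["status"] = "Success"
    except Exception as exc:
        task["status"] = "Error"
        task["errmsg"] = str(exc)
    return task
```

Catching every exception is deliberate here: as in the pseudocode, a failed task must end in the “Error” state with its message recorded instead of stopping the daemon.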

Components of GQL Toolkit
GQL Daemon
Result data format, compatible with the Delphi ClientDataSet
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DATAPACKET Version="2.0">
  <METADATA>
    <FIELDS>
      <FIELD attrname="date_" fieldtype="date" WIDTH="23"/>
      <FIELD attrname="account_no" fieldtype="string" WIDTH="9"/>
      <FIELD attrname="trans_num" fieldtype="r8"/>
      <FIELD attrname="trans_amt" fieldtype="r8" SUBTYPE="Money"/>
    </FIELDS>
    <PARAMS LCID="1033"/>
  </METADATA>
  <ROWDATA>
    <ROW date_="20040128" account_no="11000" trans_num="2" trans_amt="240.34" />
    <ROW date_="20040129" account_no="11004" trans_num="1" trans_amt="436.40" />
    <ROW date_="20040130" account_no="11000" trans_num="2" trans_amt="1240.75" />
    …
  </ROWDATA>
</DATAPACKET>
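Because every row is a flat list of attributes, a viewer can load this format with any XML parser. The sketch below reads a shortened copy of the packet above with Python's standard ElementTree; the function name is hypothetical, and treating the "r8" field type as numeric is an assumption about the ClientDataSet format.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Shortened copy of the sample packet.
PACKET = """<DATAPACKET Version="2.0">
  <METADATA>
    <FIELDS>
      <FIELD attrname="date_" fieldtype="date" WIDTH="23"/>
      <FIELD attrname="account_no" fieldtype="string" WIDTH="9"/>
      <FIELD attrname="trans_num" fieldtype="r8"/>
      <FIELD attrname="trans_amt" fieldtype="r8" SUBTYPE="Money"/>
    </FIELDS>
  </METADATA>
  <ROWDATA>
    <ROW date_="20040128" account_no="11000" trans_num="2" trans_amt="240.34"/>
    <ROW date_="20040129" account_no="11004" trans_num="1" trans_amt="436.40"/>
  </ROWDATA>
</DATAPACKET>"""


def read_datapacket(xml_text: str):
    """Return the rows of a DATAPACKET as a list of dicts."""
    root = ET.fromstring(xml_text)
    types = {f.get("attrname"): f.get("fieldtype") for f in root.iter("FIELD")}
    rows = []
    for row in root.iter("ROW"):
        record = {}
        for name, value in row.attrib.items():
            # Assumption: "r8" marks a numeric column; Decimal keeps money exact.
            record[name] = Decimal(value) if types.get(name) == "r8" else value
        rows.append(record)
    return rows
```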

Components of GQL Toolkit
GQL Server
  Uses Hibernate and Apache Axis; supports SOAP/WSDL
  Provides an intermediate Access Service and a GQL Service
  The Access Service includes user login, password change, and system logging services
  The GQL Service communicates with the presentation layer, cooperates with the GQL Daemon, and manages the query queue table

Components of GQL Toolkit
GQL Viewer
  Currently uses JSP, Servlet, Struts, and various other tag libraries, with XSLT to present the data
  Communicates with the GQL Server, constructs the user interface, feeds back user input, monitors the task queue, and displays query results
  Connects with legacy client applications

Architecture of GQL Toolkit
How do all these components work together?
1. Select a query from the directory …
[Diagram: Query Interface, GQL Viewer, GQL Server, Metadata (queries and task queue)]
  The user selects a query from the directory in the Query Interface.
  GQL Viewer calls getXMLSchema() on GQL Server.
  GQL Server returns the XML Schema.
  GQL Viewer builds the Query Interface.

Architecture of GQL Toolkit
How do all these components work together?
2. Input criteria, then submit the query …
[Diagram: GQL Viewer, GQL Server, Metadata (queries and task queue), File System Cache Directory, Data Display, Task Monitor]
  The user inputs criteria and submits.
  GQL Viewer calls CheckCachedQuery() using the XML Schema bound with the input data.
  If a matching query is found and the owner chooses to view the data, the XML data is returned from the cache and the cached matching data is displayed.
  Otherwise, the query is put into the queue and the task monitor is displayed.
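How CheckCachedQuery() decides that two queries match is not stated. A simple assumption is that a query matches when both its query number and its bound criteria are identical, which can be tested with a hash. The sketch below illustrates the cache-or-queue decision under that assumption; the function names, the dictionary used as a cache index, and the “Waiting” status value are all hypothetical.

```python
import hashlib


def cache_key(seq, bound_schema_xml):
    """Derive a stable key from the query number and the bound criteria."""
    digest = hashlib.sha256()
    digest.update(str(seq).encode("utf-8"))
    digest.update(b"\0")  # separator so (1, "23") and (12, "3") differ
    digest.update(bound_schema_xml.encode("utf-8"))
    return digest.hexdigest()


def submit_query(seq, bound_schema_xml, cache, queue, use_cache=True):
    """Return ("cached", path) on a cache hit, else enqueue and return ("queued", key)."""
    key = cache_key(seq, bound_schema_xml)
    if use_cache and key in cache:
        return "cached", cache[key]
    # No usable cached result: put the task into the queue for the daemon.
    queue.append({"seq": seq, "condflds": bound_schema_xml, "status": "Waiting"})
    return "queued", key
```

The use_cache flag models the owner's choice: a matching result can still be bypassed so that the query runs again.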

Architecture of GQL Toolkit
How do all these components work together?
3. The GQL Daemon detects and runs the query …
[Diagram: Database, GQL Daemon, Task Monitor, Metadata (queries and task queue), GQL Viewer, GQL Server, Data Display, File System Cache Directory]
  GQL Daemon detects the waiting task and creates a thread to run it. The resulting data is exported to the cache directory as XML data.
  From the Task Monitor, the user views the data. Other actions: delete data, make footnote.
  GQL Viewer calls ExtractCondflds() and ExtractData() to get the XML Schema and data. Other actions: Cleardata(), MarkQuery().
  The Data Display shows the data.
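The detect-and-dispatch step can be sketched with one thread per waiting task. This is an illustration under assumptions: tasks are plain dictionaries, the status values “Waiting” and “Running” are guesses at what the queue table stores, and run_task is any callable that processes a single task.

```python
import threading


def dispatch_waiting(tasks, run_task, lock):
    """Start one thread per waiting task and return the started threads."""
    threads = []
    with lock:
        for task in tasks:
            if task["status"] == "Waiting":
                # Claim the task under the lock so a later poll skips it.
                task["status"] = "Running"
                threads.append(threading.Thread(target=run_task, args=(task,)))
    for thread in threads:
        thread.start()
    return threads
```

A daemon would call this function periodically. Marking the task before its thread starts keeps two polls from picking up the same task.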

Applications
  The Management Information & Report System for DCC Project – Jiangsu Branch, China Construction Bank, 2003
  The Long Credit Card Management Information System (CMIS) of China Construction Bank, 2002
  Long Card Data Analysis System – Shanghai Branch, China Construction Bank, 2001

Ongoing work and future scope
  GQL language extension
  Report template support and multi-format data export support
  OLAP support
  Data mining support
  WAP support
  Scheduler and workflow support
  GQL visual designer

Summary and Conclusion
Goal
  Build a testbed for research on new data warehousing techniques and for testing new data mining algorithms
  Provide valuable solutions for future commercial use in the Business Intelligence area

Demonstration

Q & A
Thank You