Clarke, R. J (2001) L909-10: 1
Office Automation & Intranets
BUSS 909
Lecture 10
Intranet Functionality 2: Textual Media and Database Integration
Agenda
we will consider a number of issues relating to text resources:
  how intranets can provide information
  re-purposing text documents into HTML
  the development and operation of doc bases
  integrating databases with intranet applications (and the associated technical issues of state and persistence)
Quick Publishing
Quick Publishing
Before Intranets
traditional print publications:
  require designers, printers, binding, shipping and receiving, and so many professionals
  require many expensive and time-consuming decisions
  often use source materials that are out of date (risking professional & legal difficulties)
Quick Publishing
quick publishing is typically an organisation’s first major intranet activity
there is an overlap between quick publishing and information management (the latter generally involves database publishing; refer to Lecture 3)
Quick Publishing
circumvents the time taken to print and publish material by creating digital documents for web publication straight from the document source applications (word processors and DTP applications)
digital documents are posted on the intranet, eg. the NetObjects TeamFusion (1992) Acrobat file on the BUSS909 Intranet
Quick Publishing
information is available for viewing, downloading and printing
some of these kinds of documents are referred to as brochure-ware
but may include more substantial materials such as policy documents
this use of intranets treats the WWW like an information parts store
Quick Publishing
information parts stores are more than just a collection of files: they invite use, eg. the PPT files on the BUSS909 Intranet
information on intranets can be updated and re-published very quickly
this bypasses mailrooms and printing agencies
it removes the need for multiple physical copies, so there are no costs for additional copies
Quick Publishing
some companies have measurable cost reductions:
  normally intranets can reduce some paper use, but that will not happen immediately
  copying costs may in fact increase, while it would be expected that overall paper use will decrease
  additional costs may be incurred internally while the IT/IS department tools up for Quick Publishing
Quick Publishing
generally “really quick publishing” is accomplished by using an electronic publishing application like Acrobat (see BUSS909 Intranet)
but as soon as you want to create real digital documents (with hyperlinks etc) then you are looking at converting documents to HTML
very soon problems can emerge!
Web/Database Integration
Client/Server DB Applications: Traditional -vs- Web-based
Traditional | Web-based
Platform-dependent | Platform-independent
Client is natively compiled and therefore executes fast | Client is an interpreter (HTML, Java, JavaScript, Microsoft ActiveX, etc) and is therefore slower
Installation necessary | No installation necessary, depending on model used
Fat client; maintenance needs incurred | Thin client; maintenance is minimized
New, unfamiliar interface | One common, familiar interface across applications
Rich, custom GUI constructs possible | Limited set of GUI constructs; with Java applets, custom-coded ones add to download time
Difficult to integrate with existing applications | Easy to integrate with existing applications
Difficult to add multimedia | Easy to add multimedia
Persistent connection to database | Nonpersistent connection to databases
Intranets: Web Database Applications- Components
the fundamental components of web database applications together form the architecture of a web database application
a database gateway is a combination of one or more of the first three layers of the architecture: browser, application logic, and database connection
Intranets: Web Database Applications- Components
web database applications are composed of four components or layers:
  browser layer
  application logic layer
  database connection layer
  database layer
Intranets: Web Database Applications- Components
[Figure: layered architecture of a web database application]
Layers shown: Browser Layer, Application Logic Layer, Database Gateway Layer, Database Layer
Components shown: external “helper” program, HTML document, Java applet, Java application, CGI program, proprietary Web server, Web server API module, vendor’s database API, command-line interface to database, ODBC, JDBC, database (RDBMS)
Intranets: Web Database Applications- Two tiered
web database applications can consist of multiple tiers; there are two varieties:
  two-tiered applications
  three-tiered applications
two-tiered applications consist of a client, which supplies the user interface and the database connection, and the database
Intranets: Web Database Applications- Three tiered
three-tiered applications consist of a client which supplies the user interface, a middle tier which supports the database connection, and the database
additional tiers can be used for operations such as security and state management (see Technical Note 4)
Web-Database Integration
there are two ways to integrate Web database applications with other applications:
  by directly linking applications of one technology base, which generally involves straightforward coding
  by passing data between two applications, which generally involves CGI (Common Gateway Interface) scripts
Web-Database Integration
CGI Protocol
the CGI protocol is generally the heart of integration; it was designed to do this in web applications
CGI scripting was the first way that web sites could be integrated with databases and other resources external to the Web
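A minimal sketch of what a CGI program does, written in Python for illustration (the lecture does not prescribe a language). The web server passes request details to the program in environment variables such as QUERY_STRING, and the program writes a header block, a blank line and a body to standard output. The function name handle_request and the "name" parameter are assumptions made for this example.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response from the variables the web server sets."""
    # For a GET request the server puts the text after '?' in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter used only for this illustration.
    name = params.get("name", ["guest"])[0]
    # A CGI response is a header block, a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + f"Hello, {name}\n"


if __name__ == "__main__":
    # When run by a web server, the real environment supplies the request.
    sys.stdout.write(handle_request(os.environ))
```

Keeping the logic in a function that takes the environment as a dictionary makes it possible to try the program without a web server, for example handle_request({"QUERY_STRING": "name=Ann"}).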
Web-Database Integration
Methods for Passing Data
integration of Web database applications with other applications can be accomplished with CGI coupled with one of several methods for passing data:
  Hidden Fields
  URL Parameters
  Cookies
  JavaScript
Web-Database Integration
Passing Data: Hidden Fields (1)
a hidden HTML form field is commonly used as a simple storage container for data that needs to be passed from page to page of an HTML-based CGI application
For example:
<input type="hidden" name="sessionID" value="jwsr438kowkmgl">
Web-Database Integration
Passing Data: Hidden Fields (2)
CGI programs automatically populate the field with a session ID that can be used to look up session information stored in a database
suppose an intranet requires employees to authenticate their identity by logging in with a user name and password
Web-Database Integration
Passing Data: Hidden Fields (3)
at login the employee can be assigned a randomly generated unique key that is stored in the database along with the user’s name
the authentication procedure stores the key in a hidden HTML form field
during the session the employee moves within the site; the hidden field can be referenced by new resources to determine who is accessing them
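The login procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the lecture's implementation: a Python dictionary named sessions stands in for the database table, and the function names login, hidden_field and who_is_accessing are invented for the example. Only the field name sessionID comes from the earlier slide.

```python
import html
import secrets

# Stand-in for the database table that maps session keys to user names.
sessions = {}


def login(user_name):
    """Assign a randomly generated key and store it with the user's name."""
    key = secrets.token_hex(8)  # 16 hexadecimal characters
    sessions[key] = user_name
    return key


def hidden_field(key):
    """Render the hidden form field that carries the key to the next page."""
    value = html.escape(key, quote=True)
    return f'<input type="hidden" name="sessionID" value="{value}">'


def who_is_accessing(form):
    """Look up the user from submitted form data; None if the key is unknown."""
    return sessions.get(form.get("sessionID", ""))
```

Every page the employee visits has to include the field in its forms, because the key only travels when a form is submitted.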
Web-Database Integration
Passing Data: Hidden Fields (4)
this functionality can be used in contexts other than simply authenticating users
hidden fields are also an ideal way of passing data between Lotus Domino applications and straight CGI or Web Server API applications, because Domino’s collaborative document management system presents its documents to the Web as HTML
Web-Database Integration
Passing Data: URL Parameters (1)
URL parameters, the string following the question mark (?) in the URL of a GET request, are another storage area for data that can be accessed by Web database applications (see Lecture 8)
like HTML hidden form fields, the URL parameter string can be retrieved by CGI programs, web server API programs etc
Web-Database Integration
Passing Data: URL Parameters (2)
the URL parameters can be used in the same way that HTML hidden fields are used
for example, a session key can be stored in the URL and used to access user authentication, user privileges, and session state information
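A short sketch of carrying a session key in the URL, using Python's standard urllib.parse module. The function names and the host intranet.example are assumptions for illustration; the parameter name sessionID follows the hidden field example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def add_session_key(url, key):
    """Append the session key to a URL as a query parameter."""
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode({"sessionID": key})


def read_session_key(url):
    """Return the session key from a URL's query string, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("sessionID")
    return values[0] if values else None
```

For instance, add_session_key("http://intranet.example/reports", "jwsr438kowkmgl") yields a link that any CGI program can decode with read_session_key. Unlike a hidden field, the key is visible in the address bar and in server logs.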
Web-Database Integration
Passing Data: Cookies (1)
Cookies are pieces of information sent by the Web server in HTTP headers and stored on the client machine
they are used for the same purposes as HTML hidden form fields and URL parameters… but have an additional feature...
Web-Database Integration
Passing Data: Cookies (2)
the data stored in a Cookie can be retrieved across multiple Web browsing sessions
when a user quits an instance of a Web browser, any data stored in an HTML hidden field or URL parameter during the session is lost, but if...
Web-Database Integration
Passing Data: Cookies (3)
if the data is stored in a cookie on the client machine then the information is retrievable even after the user quits the browser
as a consequence the data will be ready to be accessed during a later session
Web-Database Integration
Passing Data: Cookies (4)
cookies relevant to a URL are sent back to the server and accessible via the environment variable HTTP_COOKIE
Cookies are retrievable from Java (via JavaScript), JavaScript, CGI, and Web server modules such as Lotus Domino
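A hedged sketch of both halves of the cookie exchange from a CGI program's point of view, using Python's standard http.cookies module. The function names and the cookie name sessionID are assumptions for this example; HTTP_COOKIE is the environment variable named above.

```python
from http.cookies import SimpleCookie


def make_set_cookie_header(name, value, max_age_seconds):
    """Build the response header that asks the browser to store a cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    # A lifetime is what lets the cookie outlive the browser session.
    cookie[name]["max-age"] = max_age_seconds
    cookie[name]["path"] = "/"
    return cookie.output()  # a single "Set-Cookie: ..." header line


def read_cookie(environ, name):
    """Read one cookie from the HTTP_COOKIE variable set by the web server."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    return cookie[name].value if name in cookie else None
```

The CGI program would print the header from make_set_cookie_header before the blank line that ends its headers, and call read_cookie(os.environ, "sessionID") on later requests.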
Web-Database Integration
Passing Data: JavaScript (1)
JavaScript can be accessed from Java; this provides a bridge between a Java applet and the document in which it lives, and adds new capabilities to Web database application programming
an applet can know about any forms residing on the same page as its own and can access their fields
Web-Database Integration
Passing Data: JavaScript (2)
with versions higher than JDK 1.1, if an applet resides within a frame, then it can access parent and sibling frames:
  reading data from them
  executing JavaScript functions defined in them, or
  overwriting the framed documents completely
Persistence & State
Persistence
persistent database connections are highly efficient data channels between a database client and the DBMS, ideal for database applications
but they allow a single application to exhaust these valuable data channels, as an application may require more than one constant connection
Non-Persistence
however, Web-based database applications, and web applications in general, do not have persistence
the non-persistent connection architecture of the Web is a mixed blessing
State
non-persistent connections mean that programmers must take care of application state management
in order to understand persistence we must understand state
the state of a system is expressed through the values that it currently holds as a result of its execution
Persistence and State
persistence is the result of remembering or tracking all the incremental intermediate changes in the state of the system (its objects, movements, or the actions of various media)
persistence is the capability of remembering a state across different applications or time periods
Persistence and State
for Traditional Applications
for traditional applications, managing persistence is easy- the application’s state can be kept in memory as long as the computer has enough memory
stand-alone applications do not interact with any other applications, clients, or servers, and so do not need to make their state available or dependent on external factors
Resource Allocation Model: Traditional Databases
[Figure: Process 1, Process 2, Process 3 ... Process n, each connected to a database allowing n connections]
Connections are maintained to the database; idle sessions waste resources
Persistence and State
for Web Applications (1)
state maintenance in Web database applications adds another level of complexity which must be handled by programmers
HTTP, the main protocol of the Web, is connectionless, which means that once an HTTP request is sent and the response is received, the connection for the communication is closed
Persistence and State
for Web Applications (2)
if a connection could be kept open between the client and the server, the server could at any time query the client for state information and vice versa
the server would know the identity of the client user throughout the session once the user had logged in
Persistence and State
for Web Applications (3)
in practice the server has no constant ‘memory’ of the user’s identity, even after the user has logged in
therefore, HTTP clients must make a new connection for each server request
in a stateless request, the transaction is atomic
Persistence and State
for Web Applications (4)
programmers must also address the added overhead of creating new database connections each time a CGI program or Web server module requires database access
database connections are ‘expensive’ because they take time- which is a problem when implementing Web applications
Resource Allocation Model: Web Databases
[Figure: several Web Database Query requests served by Process 1, Process 2, Process 3 ... Process n, which connect to a database allowing n connections]
Non-persistent connections optimize the number of available database connections at any time, which promotes the sharing of database resources
Persistent connections reduce the overhead of accessing databases from the WWW or Intranet
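The trade-off between the two models can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for the DBMS (with a real networked DBMS, opening a connection costs far more), and the names query_non_persistent and PersistentClient are invented for the example.

```python
import sqlite3


def query_non_persistent(database, sql):
    """Web style: open a connection, run one query, release the connection."""
    conn = sqlite3.connect(database)  # connection cost is paid on every call
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()  # the connection is free for other clients again


class PersistentClient:
    """Traditional style: hold one connection open for the client's lifetime."""

    def __init__(self, database):
        self.conn = sqlite3.connect(database)  # connection cost is paid once

    def query(self, sql):
        return self.conn.execute(sql).fetchall()

    def close(self):
        self.conn.close()
```

The persistent client avoids the repeated connection cost but occupies one of the n available connections even while idle. The non-persistent function frees its connection after each query, so any state it needs has to arrive with the request.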
Doc Bases
Doc Bases
Introduction...
I have mentioned in previous lectures that the success of the Internet owes much to its simplicity
internet protocols connect clients to servers using standard ASCII text
programmers can write distributed programs using almost any programming or scripting language; all languages can be applied to produce and consume text-based protocols
Doc Bases
…Introduction...
Why bother developing when you can buy distributed computing from Microsoft?
people simply assume that developing distributed applications will be difficult
the reason is that these proprietary solutions use proprietary APIs, packaged into proprietary components that speak inaccessible binary protocols to other proprietary components
Doc Bases
…Introduction
looking at a newsgroup message (NNTP) shows how simple it can be...
each NNTP message consists of a few elements...
Path: localhost!not-for-mail
From: Rodney Clarke <[email protected]>
Newsgroups: test
Subject: sample message
Date: Tue, 14 Apr 1999 22:29:59 -0400
Message-ID: <[email protected]>
NNTP-Posting-host: localhost
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.5 [en] (WinNT; I)
This is the body of the sample message!
Doc Bases
the Newsgroups: header identifies which group the message was posted to,
the Message-ID: header is automatically generated by the news server and uniquely identifies the specific message, and
there are of course other fields including From:, Subject:, Date: and so on whose meanings are familiar to you.
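Because the message is plain text with a simple header format, a few lines in almost any language can take it apart. The sketch below uses Python's standard email.parser module as one possible tool; the function name split_message is an assumption, and the sample reuses only the headers that are fully legible in the message above.

```python
from email.parser import Parser

SAMPLE = """\
Path: localhost!not-for-mail
Newsgroups: test
Subject: sample message
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii

This is the body of the sample message!
"""


def split_message(text):
    """Split a news message into its structured headers and free-form body."""
    message = Parser().parsestr(text)
    # Headers become name/value pairs; the body stays as unstructured text.
    return dict(message.items()), message.get_payload()
```

Calling split_message(SAMPLE) returns a dictionary in which, for example, the "Newsgroups" entry is "test", together with the body text.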
Doc Bases
General Definition...
a collection of these news messages in a tree of subdirectories is the news server’s primary message database
this collection of messages is an example of an Internet-style data store or document database, which is being given the name docbase (see Udell, 1999)
Doc Bases
…General Definition...
‘docbase’ is used to describe ASCII text that is structured according to some specific rules
in the case of news messages these rules are defined by the appropriate standard for USENET messages (RFC 1036)
Doc Bases
General Definition...
this kind of Internet data store holds semi-structured information, defined as a combination of data that is:
  structured, for example: news messages have defined key dimensions (such as author, subject, date), together with
  unstructured, for example: a message body contains free-form text
Doc Bases
Impose Rules on Web Page Collections (1)
a collection of web pages won’t hold the same kind of semi-structured data:
  web pages are not required to carry headers, for example
  nor are they required to exhibit more complex structures than can be expressed in XML
but you can choose to impose these kinds of rules on a collection of web pages
Doc Bases
Impose Rules on Web Page Collections (2)
imposing rules on web pages converts them into containers for semi-structured data and collections of them form a web doc base.
semi-structured data is at the heart of groupware and requires the kind of skills necessary in publishing as well as conventional data management
Doc Bases
Produce, Transform and Deliver
it is possible to develop programs which produce, transform and deliver web doc bases.
using exactly the same Internet protocols that we are already familiar with, these web doc bases can then be turned into CSCW/groupware applications
Doc Bases
Structure (1)
doc bases consist of several elements, including a repository format, which defines a doc base:
  the repository format for mail and news data stores is usually the copies of the headers and message bodies
  for web doc bases the repository format is often a markup language, eg. HTML, XHTML (a new standard which requires HTML documents to be ‘well formed’), SGML, or XML
Doc Bases
Structure (2): Input and Delivery Format
an input tool moves content into the repository; it is often a text editor, an export filter, or a web form and its handler
the delivery format is the form of the data store that a server application uses to deliver a document to a client application
Doc Bases
Structure (3): Transformation & Viewing Tools
when the repository and delivery formats differ, a transformation tool bridges the gap between them; this is usually, but not always, the case for web doc bases
a viewing tool is a client application that reads and displays a delivery format: generally a web browser or news reader, or on occasion a mail reader.
Doc Bases
Implementation (1): XML Query Languages
there are several ways in which docbases can be implemented
XML-compatible query languages are now emerging, although some of these are not stable (they are still experimental, but they will become increasingly viable in the next 2 yrs)
there are some practical ways of implementing scalable, real-world doc bases now…
Doc Bases
Implementation (2): XML and Perl
perhaps the most practical way to implement a web-based docbase is to use XML as the repository format and then develop a translator in Perl using the XML::Parser module, which connects Perl to an XML parser called expat
Perl is considered to be one of the few languages of choice for web masters
Doc Bases
How they work- Programmers Analogy
if you program, then developing these kinds of systems is like:
  turning the repository into a form of source code
  the translator becomes a compiler, and
  the deliverable HTML pages can be thought of as a form of object code
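The analogy can be made concrete with a very small translator. The lecture suggests Perl with XML::Parser; this sketch swaps in Python and its standard xml.etree.ElementTree module purely for illustration. The repository element names (tip, title, body, para) are hypothetical and not taken from the lecture or from Udell.

```python
import html
import xml.etree.ElementTree as ET


def translate(xml_source):
    """'Compile' one XML repository document into a deliverable HTML page."""
    doc = ET.fromstring(xml_source)
    title = html.escape(doc.findtext("title", default=""))
    paragraphs = [html.escape(p.text or "") for p in doc.findall("body/para")]
    lines = [
        "<html>",
        f"<head><title>{title}</title></head>",
        "<body>",
        f"<h1>{title}</h1>",
    ]
    lines += [f"<p>{text}</p>" for text in paragraphs]
    lines += ["</body>", "</html>"]
    return "\n".join(lines)
```

Because every page is generated from the same rules, a change to the translator restyles the whole doc base on the next run, much as recompiling rebuilds object code from unchanged source.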
Doc Bases
Example of Help Desk Doc Base (1)
a hypothetical doc base could be created with information on how to create web pages and perform procedures using NetObjects
this kind of archive of helpful tips is added to when new solutions are found to interesting problems in the company involving an Intranet/Extranet site
Doc Bases
Example of Help Desk Doc Base (2)
web content in almost any form provides opportunities for creating groupware
there are possibilities to connect internal groups (product development, support, marketing and other groups) with external groups (existing and prospective customers, business partners)
Doc Bases
Example of Help Desk Doc Base (2)
eg. a link on a web page <a href="mailto:[email protected]">Author</a>
can be rendered automatically by the translator into a parameterised mailto: <a href="mailto:[email protected]?subject=Technical+Feedback|\August+1998|NetObjects+Client|Page+Layout">[email protected]</a>
this allows the reader of a technical paper to contact the systems support officer; the mail header is automatically filled out for them
Doc Bases
Example of Help Desk Doc Base (3)
the translator can do this because it knows about the doc base structure: automatic translator substitution ensures that the context of the message is provided and consistent
the recipient could use a client-side mail filter to manage the messages from the doc base, organise them and count them by technical paper or weblet section (as with messages to me about A2 and A3)
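A sketch of how a translator might build such a parameterised mailto: link from what it knows about the doc base. The function name mailto_link, the address support@intranet.example and the list of context parts are assumptions for illustration. The sketch percent-encodes the subject with urllib.parse.quote rather than using "+" for spaces as the slide does, since mail clients are more likely to decode percent-encoding.

```python
import html
from urllib.parse import quote


def mailto_link(address, label, context_parts):
    """Build an HTML mailto: link whose subject records the doc base context."""
    # Join the context with "|" so a mail filter can split it apart again.
    subject = "|".join(context_parts)
    href = f"mailto:{address}?subject={quote(subject)}"
    return f'<a href="{html.escape(href)}">{html.escape(label)}</a>'
```

Because the translator fills in the subject the same way on every page, the recipient's mail filter can sort and count feedback by splitting the subject on "|".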
References
Greer, T. (1998) Understanding Intranets. Strategic Technology Series. Microsoft Press.
Ju, P. (1997) Databases on the Web: Designing and Programming for Network Access. Pencom Web Works / M&T Books.
Udell, J. (1999) Practical Internet Groupware: Building Tools for Collaboration. Cambridge: O’Reilly & Associates Inc.