Clarke, R. J (2001) L909-10: 1 Office Automation & Intranets BUSS 909 Lecture 10 Intranet Functionality 2: Textual Media and Database Integration

Post on 19-Dec-2015

Clarke, R. J (2001) L909-10: 1

Office Automation & Intranets

BUSS 909

Lecture 10: Intranet Functionality 2:

Textual Media and Database Integration

Agenda

we will consider a number of issues relating to text resources- how intranets can provide information:
- re-purposing text documents into HTML
- the development and operation of doc bases
- integrating databases with Intranet applications (and the associated technical issues of state and persistence)

Quick Publishing

Quick Publishing: Before Intranets

traditional print publications:
- require designers, printers, binding, shipping, receiving- many professionals
- require many expensive and time-consuming decisions
- often the source materials are out of date (risking professional & legal difficulties)

Quick Publishing

quick publishing is typically an organisation’s first major intranet activity

there is an overlap between quick publishing and information management (the latter generally involves database publishing- refer to Lecture 3)

Quick Publishing

circumvents the time taken to print and publish material by creating digital documents for web publication straight from the document source applications (word processors and DTP applications)

digital documents are posted on the intranet. Eg: NetObjects TeamFusion (1992) Acrobat file on the BUSS909 Intranet

Quick Publishing

information is available for viewing, downloading and printing

some of these kinds of documents are referred to as brochure-ware

but may include more substantial materials such as policy documents

this use of intranets treats the WWW like an information parts store

Quick Publishing

information parts stores are more than just a collection of files- they invite use- eg. PPT files in the BUSS909 Intranet

information on intranets can be updated and re-published very quickly, bypassing mailrooms and printing agencies

this removes the need for multiple physical copies- no costs for additional copies

Quick Publishing

some companies have measurable cost reductions:
- normally intranets can reduce some paper use but that will not happen immediately
- copying costs may in fact increase while it would be expected that overall paper use will decrease
- additional costs may be incurred internally while the IT/IS department tools up for Quick Publishing

Quick Publishing

generally “really quick publishing” is accomplished by using an electronic publishing application like Acrobat (see BUSS909 Intranet)

but as soon as you want to create real digital documents (with hyperlinks etc) then you are looking at converting documents to HTML

very soon problems can emerge!

Web/Database Integration

Client/Server DB Applications: Traditional -vs- Web-based

| Traditional | Web-based |
|---|---|
| Platform-dependent | Platform-independent |
| Client is natively compiled and therefore executes fast | Client is an interpreter (HTML, Java, JavaScript, Microsoft ActiveX, etc) and is therefore slower |
| Installation necessary | No installation necessary, depending on model used |
| Fat client; maintenance needs incurred | Thin client; maintenance is minimized |
| New, unfamiliar interface | One common, familiar interface across applications |
| Rich, custom GUI constructs possible | Limited set of GUI constructs; with Java applets, custom-coded ones add to download time |
| Difficult to integrate with existing applications | Easy to integrate with existing applications |
| Difficult to add multimedia | Easy to add multimedia |
| Persistent connection to database | Nonpersistent connection to databases |

Intranets: Web Database Applications- Components

the fundamental components of web database applications form the architecture of a web database application

a database gateway is a combination of one or more of the first three layers of the architecture: browser, application logic, and database connection

Intranets: Web Database Applications- Components

web database applications are composed of four components or layers:
- browser layer
- application logic layer
- database connection layer
- database layer

Intranets: Web Database Applications- Components

[Figure: components of a web database application, grouped by layer]

Browser Layer: external “helper” program, HTML document, Java applet

Application Logic Layer: Java application, CGI program, proprietary Web server, Web server API module

Database Gateway Layer: vendor’s database API, command-line interface to database, ODBC, JDBC

Database Layer: database (RDBMS)

Intranets: Web Database Applications- Two tiered

web database applications can consist of multiple tiers- there are two varieties:
- two-tiered applications
- three-tiered applications

two-tiered applications consist of a client, which supplies both the user interface and the database connection, and the database

Intranets: Web Database Applications- Three tiered

three tiered applications consist of a client which supplies the user interface, a middle tier which supports the database connection, and the database

additional tiers can be used for operations such as security and state management (see Technical Note 4)
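
The three-tier split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the RDBMS, and the `staff` table, the `MiddleTier` class and `render_page` are hypothetical names, not part of the lecture material.

```python
import sqlite3

# Database tier: an in-memory SQLite database stands in for the RDBMS.
def create_database():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE staff (name TEXT, extension TEXT)")
    db.executemany("INSERT INTO staff VALUES (?, ?)",
                   [("Alice", "4101"), ("Bob", "4102")])
    db.commit()
    return db

# Middle tier: owns the database connection and exposes one narrow operation.
class MiddleTier:
    def __init__(self, db):
        self._db = db

    def lookup_extension(self, name):
        row = self._db.execute(
            "SELECT extension FROM staff WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None

# Client tier: supplies the user interface only; it never issues SQL itself.
def render_page(middle_tier, name):
    extension = middle_tier.lookup_extension(name)
    if extension is None:
        return f"<p>No entry for {name}</p>"
    return f"<p>{name}: ext. {extension}</p>"
```

In a two-tiered version the SQL in `lookup_extension` would sit inside the client itself; moving it to a middle tier is what lets further tiers (security, state management) be slotted in later.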

Web-Database Integration

there are two ways to integrate Web database applications with other applications:
- by directly linking applications of one technology base- generally involves straight-forward coding
- by passing data between two applications- generally involves CGI (Common Gateway Interface) scripts

Web-Database Integration: CGI Protocol

CGI Protocol is generally the heart of integration- it was designed to do this in web applications

CGI scripting was the first way that web sites were able to be integrated with databases and other resources external to the Web
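
As a rough illustration of what a CGI program does, the sketch below reads the request from environment variables (here only `QUERY_STRING`) and builds the reply that would be written to standard output. The `name` parameter and the function name `cgi_response` are assumptions made for the example.

```python
import html
import os
from urllib.parse import parse_qs

def cgi_response(environ=os.environ):
    """Build the reply a CGI program would write to standard output."""
    # The web server passes the part of the URL after '?' in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["guest"])[0]
    body = "<html><body><p>Hello, %s</p></body></html>" % html.escape(name)
    # A CGI reply is a header block, a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body
```

A real script would finish with `print(cgi_response(), end="")`; any database lookup would happen between reading the parameters and building the body.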

Web-Database Integration: Methods for Passing Data

Integration of Web database applications with other applications can be accomplished with CGI coupled with one of several methods for passing data:
- Hidden Fields
- URL Parameters
- Cookies
- JavaScript

Web-Database Integration: Passing Data: Hidden Fields (1)

a hidden HTML form field is commonly used as a simple storage container for data that needs to be passed from page to page of an HTML-based and CGI application

For example:

<input type="hidden" name="sessionID" value="jwsr438kowkmgl">

Web-Database Integration: Passing Data: Hidden Fields (2)

CGI programs automatically populate the field with a session ID that can be used to look up session information stored in a database

suppose an intranet requires employees to authenticate their identity by logging in with a user name and password

Web-Database Integration: Passing Data: Hidden Fields (3)

at login the employee can be assigned a randomly generated unique key that is stored in the database along with the user's name

the authentication procedure stores the key in a hidden HTML form field

during the session the employee moves within the site; the hidden field can be referenced by new resources to determine who is accessing them
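
The login procedure described on these slides might look roughly like the following. It is a sketch, not the lecture's implementation: the `sessions` table, the function names and the use of an in-memory SQLite database are all assumptions for illustration.

```python
import html
import secrets
import sqlite3

def open_session_store():
    """Create the (hypothetical) table that maps session keys to user names."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sessions (session_id TEXT PRIMARY KEY, user_name TEXT)")
    return db

def login(db, user_name):
    """Assign a random unique key at login and store it with the user's name."""
    session_id = secrets.token_hex(16)
    db.execute("INSERT INTO sessions VALUES (?, ?)", (session_id, user_name))
    db.commit()
    return session_id

def hidden_field(session_id):
    """Render the hidden form field that carries the key from page to page."""
    return '<input type="hidden" name="sessionID" value="%s">' % html.escape(
        session_id, quote=True)

def who_is_accessing(db, form):
    """Look up the user for the sessionID value submitted with a form."""
    row = db.execute("SELECT user_name FROM sessions WHERE session_id = ?",
                     (form.get("sessionID", ""),)).fetchone()
    return row[0] if row else None
```

Every page the CGI program generates has to include `hidden_field(...)` in its forms, otherwise the key is lost on the next request.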

Web-Database Integration: Passing Data: Hidden Fields (4)

this functionality can be used in contexts other than simply authenticating users

hidden fields are also an ideal way of passing data between Lotus Domino applications and straight CGI or Web Server API applications because Domino’s collaborative document management system is written in HTML

Web-Database Integration: Passing Data: URL Parameters (1)

URL parameters, the string following the question mark (?) in the URL of a GET request, are another storage area for data that can be accessed by Web database applications (see Lecture 8)

like HTML hidden form fields, the URL parameter string can be retrieved by CGI programs, web server API programs etc

Web-Database Integration: Passing Data: URL Parameters (2)

the URL parameters can be used in the same way that HTML hidden fields are used

for example, a session key can be stored in the URL and used to access user authentication, user privileges, and session state information
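
A minimal sketch of carrying a session key in the URL, using Python's standard `urllib.parse`. The parameter name `sessionID` follows the hidden-field example; the host name in the usage note is a placeholder.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def add_session_key(url, session_key):
    """Append the session key to a URL as a query-string parameter."""
    separator = "&" if urlsplit(url).query else "?"
    return url + separator + urlencode({"sessionID": session_key})

def read_session_key(url):
    """Recover the session key from the part of the URL after the '?'."""
    params = parse_qs(urlsplit(url).query)
    values = params.get("sessionID")
    return values[0] if values else None
```

For instance, `add_session_key("http://intranet.example/policies.cgi", "jwsr438kowkmgl")` yields a link whose query string carries the key, and every link the application writes into a page would need the same treatment.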

Web-Database Integration: Passing Data: Cookies (1)

Cookies are pieces of information sent by the Web server in HTTP headers and stored on the client machine

they are used for the same purposes as HTML hidden form fields and URL parameters… but have an additional feature...

Web-Database Integration: Passing Data: Cookies (2)

the data stored in a Cookie can be retrieved across multiple Web browsing sessions

when a user quits an instance of a Web browser, any data stored in a HTML hidden field or URL parameter during the session is lost, but if...

Web-Database Integration: Passing Data: Cookies (3)

if the data is stored in a cookie on the client machine then the information is retrievable even after the user quits the browser

as a consequence the data will be ready to be accessed during a later session

Web-Database Integration: Passing Data: Cookies (4)

cookies relevant to a URL are sent back to the server and accessible via the environment variable HTTP_COOKIE

Cookies are retrievable from Java (via JavaScript), JavaScript, CGI, and Web server modules such as Lotus Domino
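
The round trip through `HTTP_COOKIE` can be sketched with Python's standard `http.cookies` module. The cookie name `sessionID` and the lifetime are illustrative assumptions; setting `max-age` is what lets the value outlive the browser session.

```python
import os
from http.cookies import SimpleCookie

def set_cookie_header(session_key, max_age_seconds):
    """Build the Set-Cookie header a CGI program would print before its page."""
    cookie = SimpleCookie()
    cookie["sessionID"] = session_key
    # Without an expiry the browser discards the cookie when it quits.
    cookie["sessionID"]["max-age"] = max_age_seconds
    return cookie.output()

def read_cookie(name, environ=os.environ):
    """Read one cookie back from the HTTP_COOKIE environment variable."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None
```

On a later visit the browser sends the stored cookie back, the server places it in `HTTP_COOKIE`, and `read_cookie("sessionID")` recovers the key.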

Web-Database Integration: Passing Data: JavaScript (1)

JavaScript can be accessed from Java; this provides a bridge between a Java applet and the document on which it lives, and adds new capabilities to Web database application programming

an applet can know about any forms residing on the same page as its own and can access these fields

Web-Database Integration: Passing Data: JavaScript (2)

in versions higher than JDK 1.1, if an applet resides within a frame, then it can access parent and sibling frames:
- reading data from them
- executing JavaScript functions defined in them, or
- overwriting the framed documents completely

Persistence & State

Persistence

persistent database connections are highly efficient data channels between a database client and the DBMS- ideal for database applications

they allow single applications to exhaust these valuable data channels as applications may require more than one constant connection

Non-Persistence

however, Web-based database applications, and web applications in general, do not have persistence

the non-persistent connection architecture of the Web is a mixed blessing

State

non-persistent connections mean that programmers must take care of application state management

in order to understand persistence we must understand state

the state of a system is expressed through the values that it currently holds as a result of its execution

Persistence and State

persistence is the result of remembering or tracking all the incremental intermediate changes in the state of the system (its objects, movements, or the actions of various media)

persistence is the capability of remembering a state across different applications or time periods

Persistence and State for Traditional Applications

for traditional applications, managing persistence is easy- the application’s state can be kept in memory as long as the computer has enough memory

stand-alone applications do not interact with any other applications, clients, or servers, and so do not need to make their state available or dependent on external factors

Resource Allocation Model: Traditional Databases

[Figure: Process 1, Process 2, Process 3, ... Process n, each connected to a database allowing n connections]

Connections are maintained to the database; idle sessions waste resources

Persistence and State for Web Applications (1)

state maintenance in Web database applications adds another level of complexity which must be handled by programmers

HTTP, the main protocol of the Web, is connectionless, which means that once an HTTP request is sent and the response is received, the connection for the communication is closed

Persistence and State for Web Applications (2)

if a connection could be kept open between the client and the server, the server could at any time query the client for state information and vice versa

the server would know the identity of the client user throughout the session once the user had logged in

Persistence and State for Web Applications (3)

the server has no constant ‘memory’ of the user's identity even after the user has logged in

therefore, HTTP clients must make a new connection for each server request

in a stateless request, the transaction is atomic

Persistence and State for Web Applications (4)

programmers must also address the added overhead of creating new database connections each time a CGI program or Web server module requires database access

database connections are ‘expensive’ because they take time- which is a problem when implementing Web applications

Resource Allocation Model: Web Databases

[Figure: Process 1, Process 2, Process 3, ... Process n each issue separate Web database queries to a database allowing n connections]

Non-persistent connections optimize the number of available database connections at any time, which promotes the sharing of database resources

Persistent connections reduce the overhead of accessing databases from the WWW or Intranet
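
One way to get some of both benefits is to keep a small, fixed set of connections open on the server side and lend them to requests as they arrive. The lecture does not prescribe this technique; the `ConnectionPool` class below is a hypothetical sketch of it, again using SQLite as a stand-in database.

```python
import queue
import sqlite3

class ConnectionPool:
    """Keep a fixed number of open connections and lend them to requests."""

    def __init__(self, size, database=":memory:"):
        self._idle = queue.Queue()
        self.opened = 0  # how many connections were ever created
        for _ in range(size):
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.opened += 1

    def run_query(self, sql, params=()):
        connection = self._idle.get()      # borrow; blocks if all are in use
        try:
            return connection.execute(sql, params).fetchall()
        finally:
            self._idle.put(connection)     # hand it back for the next request
```

Ten requests served through `ConnectionPool(2)` open only two connections in total, whereas a CGI program that connects on every request would open ten.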

Doc Bases

Doc Bases: Introduction...

I have mentioned in previous lectures that the success of the Internet owes much to its simplicity:
- internet protocols connect clients to servers using standard ASCII text
- programmers can write distributed programs using almost any programming or scripting language- all languages can be applied to produce and consume text-based protocols

Doc Bases… Introduction...

Why bother developing when you can buy distributed computing from Microsoft?
- people simply assume that developing distributed applications will be difficult
- the reason is that these proprietary solutions use proprietary APIs, packaged into proprietary components that speak inaccessible binary protocols to other proprietary components

Doc Bases…Introduction

looking at the news group mail message (NNTP) shows how simple it can be...

each NNTP transaction consists of a few elements...

Path: localhost!not-for-mail
From: Rodney Clarke <[email protected]>
Newsgroups: test
Subject: sample message
Date: Tue, 14 Apr 1999 22:29:59 -0400
Message-ID: <[email protected]>
NNTP-Posting-host: localhost
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.5 [en] (WinNT; I)

This is the body of the sample message!

Doc Bases

Newsgroups: is the header which identifies which group the message was posted to,

Message-ID: is automatically generated by the news server and uniquely identifies the specific message, and

there are of course other fields including From:, Subject:, Date: and so on whose meanings are familiar to you.
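
Because the message is plain text with a simple header/body layout, a few lines of any scripting language can take it apart. The sketch below uses Python's standard `email` parser, which handles this header style; the addresses and Message-ID in `SAMPLE` are made-up placeholders, not the values from the slide.

```python
from email import message_from_string

# Hypothetical sample message; the addresses and Message-ID are placeholders.
SAMPLE = """\
Path: localhost!not-for-mail
From: Rodney Clarke <rodney@example.org>
Newsgroups: test
Subject: sample message
Message-ID: <12345@example.org>
Content-Type: text/plain; charset=us-ascii

This is the body of the sample message!
"""

def split_message(text):
    """Return (structured header fields, unstructured body) for one message."""
    message = message_from_string(text)
    headers = {name: value for name, value in message.items()}
    return headers, message.get_payload()
```

The header dictionary gives the fields that can be indexed and searched (author, subject, group), while the body stays free-form text.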

Doc Bases: General Definition...

a collection of these news messages in a tree of subdirectories is the news server's primary message database

this collection of messages is an example of an Internet-style data store or document database, which is being given the name of docbase (see Udell, 1999)

Doc Bases…General Definition...

‘docbase’ is used to describe ASCII text that is structured according to some specific rules

in the case of news messages these rules are defined by the appropriate standard for USENET messages (RFC 1036)

Doc Bases: General Definition...

this kind of Internet data store holds semi-structured information, defined as a combination of data that is:
- structured- for example: news messages have defined key dimensions (such as author, subject, date), together with,
- unstructured- for example: a message body contains free-form text.

Doc Bases: Impose Rules on Web Page Collections (1)

a collection of web pages won’t hold the same kind of semi-structured data:
- web pages are not required to carry headers, for example
- nor are they required to exhibit the more complex structures that can be expressed in XML

but you can choose to impose these kinds of rules on a collection of web pages

Doc Bases: Impose Rules on Web Page Collections (2)

imposing rules on web pages converts them into containers for semi-structured data and collections of them form a web doc base.

semi-structured data is at the heart of groupware and requires the kind of skills necessary in publishing as well as conventional data management

Doc Bases: Produce, Transform and Deliver

it is possible to develop programs which produce, transform and deliver web doc bases.

using exactly the same Internet protocols that we are already familiar with, these web doc bases can then be turned into CSCW/groupware applications

Doc Bases: Structure (1)

Doc bases consist of several elements, including a repository format, which defines a doc base:
- the repository format for mail and news data stores is usually the copies of the headers and message bodies
- for web doc bases the repository format is often a markup language, eg. HTML, XHTML (a new standard which requires HTML documents to be ‘well formed’), SGML, or XML

Doc Bases: Structure (2): Input and Delivery Format

an input tool moves content into the repository- often a text editor, export filter or a web form and its handler

the delivery format is the data store that a server application uses to deliver a document to a client application

Doc Bases: Structure (3): Transformation & Viewing Tools

when the repository and delivery formats differ, a transformation tool bridges the gap between them- this is usually true for web doc bases but not always

a viewing tool is a client application that reads and displays a delivery format, generally by means of a web browser, news reader or on occasion a mail reader.

Doc Bases: Implementation (1): XML Query Languages

there are several ways in which docbases can be implemented

XML compatible Query Languages are now emerging although some of these are not stable (still experimental, but they will become increasingly viable in the next 2 yrs)

there are some practical ways for implementing scalable, real world doc bases, now…

Doc Bases: Implementation (2): XML and Perl

perhaps the most practical way to implement a web based docbase is to:
- use XML as the repository format, and then
- develop a translator in Perl using the XML::Parser module that connects Perl to an XML parser called expat.

Perl is considered to be one of the few languages of choice for web masters

Doc Bases: How they work- Programmers Analogy

if you program then developing these kinds of systems is like:
- turning the repository into a form of source code
- the translator becomes a compiler, and
- the deliverable HTML pages can be thought of as a form of object code
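
The source/compiler/object-code analogy can be made concrete with a small translator. The lecture suggests Perl with XML::Parser; the sketch below swaps in Python's `xml.parsers.expat`, which wraps the same expat parser. The `<tips>`, `<tip>` and `<step>` elements are a made-up repository format used only for illustration.

```python
import html
from xml.parsers import expat

# Hypothetical repository format: <tip title="..."> elements holding <step> items.
REPOSITORY = """<tips>
  <tip title="Publishing a page">
    <step>Open the site file</step>
    <step>Choose Publish</step>
  </tip>
</tips>"""

def translate(xml_text):
    """Translate the XML repository ("source") into HTML ("object code")."""
    out = []
    text = []

    def start(name, attrs):
        if name == "tip":
            out.append("<h2>%s</h2>\n<ol>" % html.escape(attrs.get("title", "")))
        elif name == "step":
            text.clear()  # start collecting the text of this step

    def chars(data):
        text.append(data)

    def end(name):
        if name == "step":
            out.append("<li>%s</li>" % html.escape("".join(text).strip()))
        elif name == "tip":
            out.append("</ol>")

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return "\n".join(out)
```

Re-running `translate` after every change to the repository is the docbase equivalent of recompiling: the HTML pages are never edited by hand.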

Doc Bases: Example of Help Desk Doc Base (1)

a hypothetical doc base could be created with information on how to create web pages and perform procedures using NetObjects

this kind of archive of helpful tips is added to when new solutions are found to interesting problems in the company involving an Intranet/Extranet Site

Doc Bases: Example of Help Desk Doc Base (2)

web content in almost any form provides opportunities for creating groupware

possibilities to connect internal groups (product development, support, marketing and other groups) with external groups (existing and prospective customers, business partners)

Doc Bases: Example of Help Desk Doc Base (2)

eg. a link on a web page <a href="mailto:[email protected]">Author</a>

can be rendered automatically by the translator into a parameterised mailto: <a href="mailto:[email protected]?subject=Technical+Feedback|\August+1998|NetObjects+Client|Page+Layout">[email protected]</a>

this allows the reader of a technical paper to contact the systems support officer- the mail header is automatically filled out for them
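
The substitution the translator performs might be sketched as follows. The function name `feedback_link` and the address in the usage note are hypothetical; note also that this sketch percent-encodes spaces as `%20`, which is the usual encoding inside a mailto: link, rather than the `+` shown on the slide.

```python
from urllib.parse import quote

def feedback_link(address, context_fields):
    """Build a mailto: link whose subject records where the reader came from."""
    subject = "|".join(["Technical Feedback"] + list(context_fields))
    # Keep '|' readable as the field separator; encode everything else unsafe.
    href = "mailto:%s?subject=%s" % (address, quote(subject, safe="|"))
    return '<a href="%s">%s</a>' % (href, address)
```

For example, `feedback_link("support@example.org", ["August 1998", "NetObjects Client", "Page Layout"])` produces a link whose subject line already names the issue date, product and section.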

Doc Bases: Example of Help Desk Doc Base (3)

the translator can do this because it knows about the doc base structure:
- automatic translator substitution ensures that the context of the message is provided and consistent

the recipient could use a client-side mail filter to manage the messages from the doc base, organise them and count them by technical paper or weblet section (as with messages to me about A2 and A3)

References

Greer, T. (1998) Understanding Intranets Strategic Technology Series Microsoft Press

Ju, P. (1997) Databases on the Web- Designing and Programming for Network Access Pencom Web Works/ M&T Books

Udell, J. (1999) Practical Internet Groupware: Building Tools for Collaboration Cambridge: O’Reilly & Associates Inc.