mis - chapter 11 - data management - warehousing, analyzing, mining, and visualization


Page 1: MIS - Chapter 11 - Data Management - Warehousing, Analyzing, Mining, And Visualization

PART IV

Managerial and Decision Support Systems

10. Knowledge Management
11. Data Management: Warehousing, Analyzing, Mining, and Visualization
12. Management Decision Support and Intelligent Systems

CHAPTER 11

Data Management: Warehousing, Analyzing, Mining, and Visualization

Harrah’s Entertainment

11.1 Data Management: A Critical Success Factor

11.2 Data Warehousing

11.3 Information and Knowledge Discovery with Business Intelligence

11.4 Data Mining Concepts and Applications

11.5 Data Visualization Technologies

11.6 Marketing Databases in Action

11.7 Web-based Data Management Systems

Minicases: (1) Sears / (2) Dallas Area Rapid Transit

LEARNING OBJECTIVES

After studying this chapter, you will be able to:

● Recognize the importance of data, their managerial issues, and their life cycle.

● Describe the sources of data, their collection, and quality issues.

● Describe document management systems.

● Explain the operation of data warehousing and its role in decision support.

● Describe information and knowledge discovery and business intelligence.

● Understand the power and benefits of data mining.

● Describe data presentation methods and explain geographical information systems, visual simulations, and virtual reality as decision support tools.

● Discuss the role of marketing databases and provide examples.

● Recognize the role of the Web in data management.

0006D_c11_490-540.qxd 10/9/03 8:31 PM Page 490


FINDING DIAMONDS BY DATA MINING AT HARRAH’S

➥ THE PROBLEM

Harrah’s Entertainment (harrahs.com) is a very profitable casino chain. With 26 casinos in 13 states, it had $4 billion in sales in 2002 and net income of $235 million. One of Harrah’s casinos is located on the Las Vegas strip, which typifies the marketing issues that casino owners face. The problem is very simple: how to attract visitors to come and spend money in your casino, and to do it again and again. There is no other place like the Las Vegas strip, where dozens of megacasinos and hundreds of small ones lure visitors by operating attractions ranging from fiery volcanoes to pirate ships.

Most casino operators use intuition to plan inducements for customers. Almost all have loyalty cards, provide free rooms to frequent members, give you free shows, and more. The problem is that there is little differentiation among the casinos. Casinos believe they must give those incentives to survive, but do they help casinos to excel? Harrah’s is doing better than most competing casinos by using management and marketing theories facilitated by information technology, under the leadership of Gary Loveman, a former Harvard Business School professor.

➥ THE SOLUTION

Harrah’s strategy is based on technology-based CRM and the use of customer database marketing to test promotions. This combination enables the company to fine-tune marketing efforts and service-delivery strategies that keep customers coming back. Noting that 82.7 percent of its revenue comes from slot machines, Harrah’s started by giving each player a loyalty smart card. A smart-card reader on each slot machine in all 26 of its casinos records each customer’s activities. (Readers are also available in Harrah’s restaurants, gift shops, etc., to record any spending.)

By logging your activities, you earn credit, as in other loyalty programs, for which you get free hotel rooms, dinners, etc. Such programs are run by most competitors, but Harrah’s goes a step further: It uses a 300-gigabyte transactional database, known as a data warehouse, to analyze the data recorded by the card readers. By tracking millions of individual transactions, the IT systems assemble a vast amount of data on customer habits and preferences. These data are fed into the enterprise data warehouse, which contains not only millions of transactional data points about customers (such as names, addresses, ages, genders) but also details about their gambling, spending, and preferences. This database has become a very rich repository of customer information, and it is mined for decision support.

The information found in Harrah’s database indicated that a loyalty strategy based on same-store (same casino, in this case) sales growth could be very beneficial. The goal is to get a customer to visit your establishment regularly. For example, analysis discovered that the company’s best customers were middle-aged and senior adults with discretionary time and income, who enjoyed playing slot machines. These customers did not typically stay in a hotel, but visited a casino on the way home from work or on a weekend night out. These customers responded better to an offer of $60 of casino chips than to a free room, two steak meals, and $30 worth of chips, because they enjoyed the anticipation and excitement of gambling itself (rather than seeing the trip as a vacation getaway).

This strategy offered a way to differentiate Harrah’s brand. Understanding the lifetime value of the customers became critical to the company’s marketing strategy. Instead of focusing on how much people spend in the casinos during a single visit, the company began to focus on their total spending over a long time. And, by gathering more and more specific information about customer preferences, running experiments and analyses on the newly collected data, and determining ways of appealing to players’ individual interests, the company was able to increase the amount of money customers spent there.
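The shift from single-visit spending to lifetime value can be illustrated with a minimal sketch. The visit records, customer IDs, and function names below are hypothetical and are not taken from Harrah’s systems; the point is only that ranking customers by total spending over time can differ from ranking them by their largest single visit.

```python
from collections import defaultdict

# Hypothetical visit records: (customer_id, visit_date, amount_spent).
visits = [
    ("C1", "2002-01-05", 40.0),
    ("C1", "2002-01-19", 55.0),
    ("C1", "2002-02-02", 45.0),
    ("C2", "2002-01-12", 120.0),
]


def lifetime_spend(records):
    """Sum each customer's spending across all recorded visits."""
    totals = defaultdict(float)
    for customer_id, _date, amount in records:
        totals[customer_id] += amount
    return dict(totals)


def largest_single_visit(records):
    """Return each customer's largest single-visit spend, for comparison."""
    best = {}
    for customer_id, _date, amount in records:
        best[customer_id] = max(best.get(customer_id, 0.0), amount)
    return best


totals = lifetime_spend(visits)
peaks = largest_single_visit(visits)
```

In this made-up data, customer C2 has the largest single visit, but the frequent visitor C1 has the higher total, which is the kind of customer the case describes as most valuable.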

As in other casinos with loyalty programs, players are segregated into three tiers, and the biggest spenders get priority in waiting lines and in awards. There is a visible differentiation in customer service based on the three-tier hierarchy, and every experience in Harrah’s casinos was redesigned to drive customers to want to earn a higher-level card. Customers have responded by doing what they can to earn the higher-tiered cards.

However, Harrah’s transactional database is doing much more than just calculating gambling spending. For example, the casino knows which specific customers were playing at particular slot machines and at what time. Using data mining techniques, Harrah’s can discover which specific machines appealed to specific customers. This knowledge enabled Harrah’s to configure the casino floor with a mix of slot machines that benefited both the customers and the company.
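As a rough illustration of the idea of linking customers to machines, the sketch below counts plays per customer and machine and reports each customer’s most-played machine. This is simple frequency counting, not the data mining technique Harrah’s actually used (the case does not say which techniques were applied), and the play records and machine names are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical card-reader log: (customer_id, machine_id), one entry per play session.
plays = [
    ("C1", "slot-A"),
    ("C1", "slot-A"),
    ("C1", "slot-B"),
    ("C2", "slot-C"),
]


def favorite_machines(records):
    """Return the most frequently played machine for each customer."""
    counts = defaultdict(Counter)
    for customer_id, machine_id in records:
        counts[customer_id][machine_id] += 1
    return {
        customer_id: counter.most_common(1)[0][0]
        for customer_id, counter in counts.items()
    }


favorites = favorite_machines(plays)
```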

In addition, by measuring all employee performance on the metrics of speed and friendliness and analyzing these results with data mining, the company is able to provide its customers with better experiences as well as earn more money for the employees. Harrah’s implemented a bonus plan to reward hourly workers with extra cash for achieving improved customer satisfaction scores. (Bonuses totaling $43 million were paid over three years.) The bonus program worked because the reward depends on everyone’s performance. The general manager of a lower-scoring property might visit a colleague at a higher-scoring casino to find out what he could do to improve his casino’s scores.

➥ THE RESULTS

Harrah’s experience has shown that the better the experience a guest has and the more attentive you are to him or her, the more money will be made. For Harrah’s, good customer service is not a matter of an isolated incident or two but of daily routine. So, while somewhere along the Las Vegas strip a “Vesuvian” volcano erupts loudly every 15 minutes, a fake British frigate battles a pirate ship at regular intervals, and sparkling fountains dance in a lake, Harrah’s continues to enhance benefits to its Total Rewards program, improves customer loyalty through customer service supported by data mining, and of course makes lots of money.

Sources: Compiled from Loveman (2003) and from Levinson (2001).

➥ LESSONS LEARNED FROM THIS CASE

The opening case about Harrah’s illustrates the importance of data to a large entertainment company. It shows that it is necessary to collect vast amounts of data, organize and store them properly in one place, and then analyze them and use the results to make better marketing and other corporate decisions. The case shows us that new data go through a process with several stages: Data are collected, processed, and stored in a data warehouse. Then, data are processed by analytical tools such as data mining and decision modeling. The findings of the data analysis direct promotional and other decisions. Finally, continuous collection and analysis of fresh data provide management with feedback regarding the success of management strategies.

In this chapter we explain how this process is executed with the help of IT. We will also deal with some additional topics that typically supplement the data management process.

11.1 DATA MANAGEMENT: A CRITICAL SUCCESS FACTOR

The Difficulties of Managing Data, and Some Potential Solutions

As illustrated throughout this textbook, IT applications cannot be done without using some kind of data. In other words, without data you cannot have most IT applications, nor can you make good decisions. Data, as we have seen in the opening case, are at the core of management and marketing operations. However, there are increasing difficulties in acquiring, keeping, and managing data.

Since data are processed in several stages and possibly in several places, they may be subject to some problems and difficulties.

DATA PROBLEMS AND DIFFICULTIES. Managing data in organizations is difficult for various reasons:

● The amount of data increases exponentially with time. Much past data must be kept for a long time, and new data are added rapidly. However, only small portions of an organization’s data are relevant for any specific application, and those relevant data must be identified and found to be useful.

● Data are scattered throughout organizations and are collected by many individuals using several methods and devices. Data are frequently stored in several servers and locations, and in different computing systems, databases, formats, and human and computer languages.

● An ever-increasing amount of external data needs to be considered in making organizational decisions.

● Data security, quality, and integrity are critical, yet are easily jeopardized. In addition, legal requirements relating to data differ among countries and change frequently.

● Selecting data management tools can be a major problem because of the huge number of products available.

These difficulties, and the critical need for timely and accurate information, have prompted organizations to search for effective and efficient data management solutions.

SOLUTIONS TO MANAGING DATA. Historically, data management has been geared to supporting transaction processing by organizing the data in a hierarchical format in one location. This format supports secured and efficient high-volume processing; however, it may be inefficient for queries and other ad-hoc applications. Therefore, relational databases, based on organization of data in rows and columns, were added to facilitate end-user computing and decision support.
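To make the point about rows, columns, and ad-hoc queries concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table name, column names, and figures are invented for illustration; the chapter does not prescribe any particular database product.

```python
import sqlite3

# An in-memory relational database: data are organized in rows and columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "chips", 60.0),
        ("East", "rooms", 150.0),
        ("West", "chips", 30.0),
    ],
)

# An ad-hoc question posed after the fact: total sales per region.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()
```

The query was not anticipated when the table was designed, which is the kind of end-user, decision-support use the paragraph above attributes to relational databases.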

With the introduction of client/server environments and Web technologies, databases became distributed throughout organizations, creating problems in finding data quickly and easily. This was the major reason that Harrah’s sought the creation of a data warehouse. As we will see later, intranets, extranets, and Web technologies can also be used to improve data management.

It is now well recognized that data are an asset, although they can be a burden to maintain. Furthermore, the use of data, converted to information and knowledge, is power. The purpose of appropriate data management is to ease the burden of maintaining data and to enhance the power from their use. To see how this is done, let’s begin by examining how data are processed during their life cycle.

Data Life Cycle Process

Businesses do not run on raw data. They run on data that have been processed to information and knowledge, which managers apply to business problems and opportunities. As seen in the Harrah’s case, knowledge fuels solutions. Everything from innovative product designs to brilliant competitive moves relies on knowledge (see Markus et al., 2002). However, because of the difficulties of managing data, cited earlier, deriving knowledge from accumulated data may not be simple or easy.

Transformation of data into knowledge and solutions is accomplished in several ways. In general, it resembles the process shown in Figure 11.1. It starts with new data collection from various sources. These data are stored in one or more databases. Then the data are preprocessed to fit the format of a data warehouse or data marts, where they are stored. Users then access the warehouse or data mart and take a copy of the needed data for analysis. The analysis is done with data analysis and mining tools (see Chopoorian et al., 2001), which look for patterns, and with intelligent systems, which support data interpretation.

Note that not all data processing follows this process. Small and medium companies do not need data warehouses, and even many large companies do not need them. (We will see later who needs them.) In such cases data go directly from data sources or databases to an analysis. An example of direct processing is an application that uses real-time data. These can be processed as they are collected and immediately analyzed. Many Web data are of this type. In such a case, as we will see later, we use Web mining instead of data mining.

FIGURE 11.1 Data life cycle. [Diagram not reproduced. Labels recoverable from the figure: Data Sources (Internal Data, External Data, Personal Data); Data Warehouse, Meta Data, and Data Marts; Data Analysis (OLAP, Queries; EIS, DSS; Data Mining; Legacy, Mainframe); Result (Data Visualization, Decisions, Knowledge Management, Business Intelligence); Solutions (SCM, CRM, EC, Strategy, Others).]

The result of all these activities is the generation of decision support and knowledge. Both the data (at various times during the process) and the knowledge (derived at the end of the process) may need to be presented to users. Such a presentation can be accomplished by using different visualization tools. The created knowledge may be stored in an organizational knowledge base (as shown in Chapter 10) and used, together with decision support tools, to provide solutions to organizational problems. The elements and the process shown in Figure 11.1 are discussed in the remaining sections of this chapter.
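The life cycle described in this section (collect data from several sources, preprocess them into one warehouse format, then analyze a copy) can be sketched in a few lines. Everything here is an assumption made for illustration: the two sources, their field names, and the functions preprocess, load_warehouse, and analyze are hypothetical, and a real warehouse involves far more than a Python list.

```python
# Hypothetical raw records from two sources that use different field names.
internal = [{"cust": "C1", "spend": "40"}, {"cust": "C2", "spend": "120"}]
external = [{"customer_id": "C1", "amount": 55.0}]


def preprocess(record):
    """Map a source record onto one warehouse format: customer_id, amount."""
    customer_id = record.get("customer_id", record.get("cust"))
    amount = float(record.get("amount", record.get("spend")))
    return {"customer_id": customer_id, "amount": amount}


def load_warehouse(*sources):
    """Preprocess every record from every source into a single store."""
    return [preprocess(record) for source in sources for record in source]


def analyze(rows):
    """A stand-in for analysis tools: total amount per customer."""
    totals = {}
    for row in rows:
        key = row["customer_id"]
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return totals


warehouse = load_warehouse(internal, external)
# Users take a copy of the needed data and analyze the copy.
result = analyze(list(warehouse))
```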

Data Sources

The data life cycle begins with the acquisition of data from data sources. Data sources can be classified as internal, personal, and external.

INTERNAL DATA SOURCES. An organization’s internal data are about people, products, services, and processes. Such data may be found in one or more places. For example, data about employees and their pay are usually stored in the corporate database. Data about equipment and machinery may be stored in the maintenance department database. Sales data can be stored in several places: aggregate sales data in the corporate database, and details at each regional database. Internal data are usually accessible via an organization’s intranet.

PERSONAL DATA. IS users or other corporate employees may document their own expertise by creating personal data. These data are not necessarily just facts, but may include concepts, thoughts, and opinions. They include, for example, subjective estimates of sales, opinions about what competitors are likely to do, and certain rules and formulas developed by the users. These data can reside on the user’s PC or be placed on departmental or business units’ databases or on the corporate knowledge bases.

EXTERNAL DATA SOURCES. There are many sources for external data, ranging from commercial databases to sensors and satellites. Government reports constitute a major source for external data. Data are available on CD-ROMs and memory chips, on Internet servers, as films, and as sound or voices. Pictures, diagrams, atlases, and television are other sources of external data. Hundreds of thousands of organizations worldwide place publicly accessible data on their Web servers, flooding us with data. Most external data are irrelevant to any single application. Yet, much external data must be monitored and captured to ensure that important data are not overlooked. Large amounts of external data are available on the Internet.

THE INTERNET AND COMMERCIAL DATABASE SERVICES. Many thousands of databases all over the world are accessible through the Internet. Much of the database access is free. A user can access Web pages of vendors, clients, and competitors. He or she can view and download information while conducting research. Some external data flow to an organization on a regular basis through EDI or through other company-to-company channels. Much external data are free; other data are available from commercial database services.

A commercial online database publisher sells access to specialized databases, newspapers, magazines, bibliographies, and reports. Such a service can provide external data to users in a timely manner and at a reasonable cost. Many commercial database publishers will customize the data for each user. Several thousand services are currently available, most of which are accessible via the Internet. Many consulting companies sell reports online (e.g., aberdeen.com).


Methods for Collecting Raw Data

The diversity of data and the multiplicity of sources make the task of data collection fairly complex. Sometimes it is necessary to collect raw data in the field. In other cases it is necessary to elicit data from people.

Raw data can be collected manually or by instruments and sensors. Some examples of manual data collection methods are time studies, surveys, observations, and contributions from experts. Data can also be scanned or transferred electronically. Although a wide variety of hardware and software exists for data storage, communication, and presentation, much less effort has gone into developing software tools for data capture in environments where complex and unstable data exist. Insufficient methods for dealing with such situations may limit the effectiveness of IT development and use. One exception is the Web. Clickstream data are those that can be collected automatically, using special software, from a company’s Web site or from what visitors are doing on the site (see Chapter 5, and Turban et al., 2004). In addition, the use of online polls and questionnaires is becoming very popular (see Baumer, 2003, and Ray and Tabor, 2003).

The collection of data from multiple external sources may be an even more complicated task. One way to improve it is to use a data flow manager (DFM), which takes information from external sources and puts it where it is needed, when it is needed, in a usable form (e.g., see smartdraw.com). A DFM consists of (1) a decision support system, (2) a central data request processor, (3) a data integrity component, (4) links to external data suppliers, and (5) the processes used by the external data suppliers.

The complexity of data collection can create data-quality problems. Therefore, regardless of how they are collected, data need to be validated. A classic expression that sums up the situation is “garbage in, garbage out” (GIGO). Safeguards for data quality are designed to prevent data problems.

Data Quality and Integrity

Data quality (DQ) is an extremely important issue since quality determines the data’s usefulness as well as the quality of the decisions based on the data (Creese and Veytsel, 2003). Data are frequently found to be inaccurate, incomplete, or ambiguous, particularly in large, centralized databases. The economic and social damage from poor-quality data has been estimated to cost organizations billions of dollars (see Redman, 1998). According to Brauer (2001), data quality is the cornerstone of effective business intelligence.

Interest in data quality goes back generations. For example, according to Hasan (2002), treatment of numerical data for quality can be traced to the year 1881. An example of typical data problems, their causes, and possible solutions is provided in Table 11.1. For a discussion of data auditing and controls, see Chapter 15.

Strong et al. (1997) conducted extensive research on data quality problems. Some of the problems are technical ones such as capacity, while others relate to potential computer crimes. The researchers divided these problems into the following four categories and dimensions.

● Intrinsic DQ: Accuracy, objectivity, believability, and reputation.

● Accessibility DQ: Accessibility and access security.

● Contextual DQ: Relevancy, value added, timeliness, completeness, amount of data.

● Representation DQ: Interpretability, ease of understanding, concise representation, consistent representation.


Strong et al. (1997) have suggested that once the major variables and relationships in each category are identified, an attempt can be made to find out how to better manage the data.

Different categories of data quality are proposed by Brauer (2001). They are: standardization (for consistency), matching (of data if stored in different places), verification (against the source), and enhancement (adding of data to increase its usefulness).
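The four categories named above can be pictured as four small operations on records. The sketch below is only one possible reading of those category names; the function bodies, the choice of a name field as the matching key, and the sample records are assumptions made for illustration, not Brauer’s definitions.

```python
def standardize(name):
    """Standardization: trim, collapse whitespace, and use one letter case."""
    return " ".join(name.split()).upper()


def match(records_a, records_b):
    """Matching: pair records from two stores whose standardized names agree."""
    index = {standardize(r["name"]): r for r in records_b}
    return [
        (a, index[standardize(a["name"])])
        for a in records_a
        if standardize(a["name"]) in index
    ]


def verify(record, source_record):
    """Verification: check a stored record against its source."""
    return record == source_record


def enhance(record, extra):
    """Enhancement: add fields to a record without overwriting existing ones."""
    return {**extra, **record}


# Hypothetical records held in two different places.
crm = [{"name": "john  smith", "id": 1}]
billing = [{"name": "John Smith", "city": "Reno"}]
pairs = match(crm, billing)
```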

An area of increasing importance is data quality assurance that must be performed very fast, in real time. Many decisions are being made today in such an environment. For how to handle data in such a case, see Creese and Veytsel (2003) and Bates (2003).

Another major data quality issue is data integrity. This concept means that data must be accurate, accessible, and up-to-date. Older filing systems may lack integrity. That is, a change made in the file in one place may not be made in a related file in another place. This results in conflicting data.
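The conflicting-data problem just described can be shown with a small, hypothetical example: two files keep the same employee’s address, one file is updated, and the other is not. The file contents and the function name find_conflicts are invented for this sketch.

```python
def find_conflicts(file_a, file_b):
    """Return the keys present in both files whose values disagree."""
    shared = file_a.keys() & file_b.keys()
    return sorted(key for key in shared if file_a[key] != file_b[key])


# Hypothetical employee-address data kept in two separate files.
payroll = {"E1": "12 Oak St", "E2": "9 Elm Ave"}
benefits = {"E1": "12 Oak St", "E2": "40 Pine Rd", "E3": "7 Main St"}

conflicts = find_conflicts(payroll, benefits)
```

Here the address for E2 was changed in one file only, so the check flags it. A check like this only detects the conflict; it cannot tell which copy is correct.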

DATA QUALITY IN WEB-BASED SYSTEMS. Data are collected on a routine basis or for a special application on the Internet. In either case, it is necessary to organize and store the data before they can be used. This may be a difficult task, especially when media-rich Web sites are involved. See Online File W11.1 for an example of a multimedia Web-based system. For a comprehensive approach on how to ensure quality of Internet-generated data, see Creese and Veytsel (2003).

TABLE 11.1 Data Problems and Possible Solutions

Problem: Data are not correct.
  Typical cause: Raw data were entered inaccurately.
  Possible solutions (in some cases): Develop a systematic way to ensure the accuracy of raw data. Automate (use scanners or sensors).
  Typical cause: Data derived by an individual were generated carelessly.
  Possible solutions (in some cases): Carefully monitor both the data values and the manner in which the data have been generated. Check for compliance with collection rules.

Problem: Data are not timely.
  Typical cause: The method for generating the data was not rapid enough to meet the need for the data.
  Possible solutions (in some cases): Modify the system for generating the data. Move to a client/server system. Automate.

Problem: Data are not measured or indexed properly.
  Typical cause: Raw data were gathered according to a logic or periodicity that was not consistent with the purposes of the analysis.
  Possible solutions (in some cases): Develop a system for rescaling or recombining the improperly indexed data. Use intelligent search agents.

Problem: Needed data simply do not exist.
  Typical cause: No one ever stored the data needed now.
  Possible solutions (in some cases): Whether or not it is useful now, store data for future use. Use the Internet to search for similar data. Use experts.
  Typical cause: Required data never existed.
  Possible solutions (in some cases): Make an effort to generate the data or to estimate them (use experts). Use neural computing for pattern recognition.

Source: Compiled and modified from Alter (1980).

Document Management

There are several major problems with paper documents. For example, in maintaining paper documents, we can pose the following questions: (1) Does everyone have the current version? (2) How often does it need to be updated? (3) How secure are the documents? and (4) How can the distribution of documents to the appropriate individuals be managed in a timely manner? The answers to these and similar questions may be difficult.

Electronic data processing overcomes some of these problems. One of the earliest IT-enabled tools of data management is called document management. When documents are provided in electronic form from a single repository (typically a Web server), only the current version is provided. For example, many firms maintain their telephone directories in electronic form on an intranet to eliminate the need to copy and distribute hard copies of a directory that requires constant corrections. Also, with data stored in electronic form, access to various documents can be restricted as required (see Becker, 2003).
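The idea that a single repository always serves the current version can be sketched as follows. The class DocumentRepository and its methods are hypothetical and deliberately minimal; a real document management system adds access control, workflow, and persistent storage.

```python
class DocumentRepository:
    """A toy single repository that keeps every version of each document."""

    def __init__(self):
        # Document name -> list of contents, oldest first.
        self._versions = {}

    def publish(self, name, content):
        """Store a new version and return its 1-based version number."""
        self._versions.setdefault(name, []).append(content)
        return len(self._versions[name])

    def current(self, name):
        """Readers always receive the latest published version."""
        return self._versions[name][-1]

    def history(self, name):
        """Return a copy of all versions, for audit or archiving."""
        return list(self._versions[name])


repo = DocumentRepository()
first = repo.publish("phone-directory", "Ext. 100: Front desk")
second = repo.publish("phone-directory", "Ext. 100: Front desk; Ext. 101: Billing")
latest = repo.current("phone-directory")
```

Because every reader asks the repository rather than holding a private copy, a correction published once is what everyone sees, which is the advantage over distributing hard copies of a directory.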

WHAT IS DOCUMENT MANAGEMENT? Document management is the automated control of electronic documents, page images, spreadsheets, word processing documents, and other complex documents through their entire life cycle within an organization, from initial creation to final archiving. Document management offers various benefits: It allows organizations to exert greater control over production, storage, and distribution of documents, yielding greater efficiency in the reuse of information, the control of a document through a workflow process, and the reduction of product cycle times.

Electronic delivery of documents has been around since 1999, with UPS and the U.S. Post Office playing a major role in such service. They deliver documents electronically over a secured system (e-mail is not secured), and they are able to deliver complex “documents” such as large files and multimedia videos (which can be difficult to send via e-mail). (See exchange.ups.com, and take the test drive there.) The need for greater efficiency in handling business documents to gain an edge on the competition has fueled the increased availability of document management systems, also known as electronic document management. Essentially, document management systems (DMSs) provide information in an electronic format to decision makers. The full range of functions that a document management system may perform includes document identification, storage, and retrieval; tracking; version control; workflow management; and presentation. The Thomas Cook Company, for example, uses a document management system to handle travel-refund applications. The system works on the PC desktop and has automated the workflow process, helping the firm double its volume of business while adding only about 33 percent more employees (see Cole, 1996). Another example is the Massachusetts Department of Revenue, which uses imaging systems to increase productivity of tax return processing by about 80 percent (see civic.com/pubs, 2001).

Document management deals with knowledge in addition to data and information. See Asprey and Middleton (2003) for an overview and for the relationship of document management with knowledge management.

The major tools of document management are workflow software, authoring tools, scanners, and databases (object-oriented mixed with relational, known as object-relational database management systems; see Technology Guide 3). Document management systems usually include computerized imaging systems that can result in substantial savings, as shown in Online File W11.2.

One of the major vendors of document management is Lotus Development Corporation. Its document databases and their replication property provide many advantages for group work and information sharing (see lotus.com). For further discussion see imrgold.com and docuvantage.com.

WEB-BASED DMS. In many organizations, documents are now viewed as multimedia objects with hyperlinks. The Web provides easy access to pages of information. DMSs excel in this area (see examples in A Closer Look 11.1). Web-enabled DMSs also make it easy to put information on intranets, since many of them provide instantaneous conversion of documents to HTML. BellSouth, for example,

Here are some examples of how companies use document management systems to manage data and documents:

The Surgery Center of Baltimore stores all of its medical records electronically, providing instant patient information to doctors and nurses anywhere and any time. The system also routes charts to the billing department, whose employees can scan and e-mail any related information to insurance providers and patients. The DMS also helps maintain an audit trail, including providing records for legal purposes or action. Business processes have been expedited by more than 50 percent, the cost of such processes is significantly lower, and morale of office employees in the center is up (see laserfiche.com/newsroom/baltimore.html).

American Express is using a DMS to collect and process over one million customer satisfaction surveys each year. The data are collected in templates of over 600 different survey forms, in 12 languages, in 11 countries. The system (TELEform from Alchemy and Cardiff Software) is integrated with AMEX’s legacy system and is capable of distributing processed results to many managers. The staff that processes these forms has been reduced from 17 to 1, saving AMEX over $500,000 each year (see imrgold.com/en/case-studies/fin_AMEX_Cardiff.asp).

LifeStar, an ambulance service in Tulare, California, iskeeping all historical paper documents on optical disks.Hundreds of boxes with documents were digitized, and soare all new documents. Furthermore, all optical disks arebacked up and are kept in different locations for securitypurposes (see laserfiche.com/newsroom/tulare.html).

Toronto, Canada, Works and Emergency ServicesDepartment uses a Web-based record document-retrievalsolution. With it, employees have immediate access todrawings and the documents related to roads, buildings,utility lines, and more. Quick access to these documents

enables emergency crews to solve problems, and evensave lives, much faster. Laptop computers are installed ineach departmental vehicle, loaded with maps, drawings,and historical repair data. (see laserfiche.com/newsroom/torantoworks.html).

The University of Cincinnati, a state university in Ohio,is required to provide authorized access to the personnelfiles of 12,000 active employees and tens of thousand ofretirees. There are over 75,000 queries about the person-nel records every year, and answers need to be foundamong 2.5 million records. Using antiquated microfilmsystem to find answers tooks days. The solution was aDMS that digitized all paper and microfilm documents,making them available via the Internet and the intranet.An authorized employee can now use a browser andaccess a document in seconds (see imrgold.com/en/case_studies/edu_Univ_of_Cin.asp).

The European Court of Human Rights (44 countries inEurope) created a Web-based document and KM systemwhich was originally stored on an intranet and now isstored in a separate organizational knowledge base. TheDMS have had over 20 million hits in 2002 (CanadaNewsWire, 2003). Millions of euros are saved each yearjust on printing and mailing documents.

McDonnell-Douglas (now part of the Boeing Com-pany) distributed aircraft service bulletins to its customersaround the world using the Internet. The company usedto distribute a staggering volume of bulletins to over 200airlines, using over 4 million pages of documentationevery year. Now it is all on the Web, saving money andtime both for the company and for its customers.

Motorola uses a DMS not only for document storageand retrieval, but also for small-group collaboration andcompanywide knowledge sharing. It develops virtualcommunities where people can discuss and publish infor-mation, all with the Web-enabled DMS.

A CLOSER LOOK11.1 HOW COMPANIES USE DOCUMENT MANAGEMENT SYSTEMS

0006D_c11_490-540.qxd 10/9/03 8:31 PM Page 499

Page 11: MIS - Chapter 11 - Data Management - Warehousing, Analyzing, Mining, And Visualization

500 CHAPTER 11 DATA MANAGEMENT: WAREHOUSING, ANALYZING, MINING,

saves an estimated $17.5 million each year through its intranet-enabled forms-management system. For an example of Web-enabled document managementsystems see Delcambre et al. (2003). Other examples of how companies use doc-ument management systems, both on and off the Web, are shown in A CloserLook 11.1.

In all of the examples cited in the text and in A Closer Look 11.1, time and money are saved. Also, documents are not lost or mixed up. An issue related to document management systems is how to provide the privacy and security of personal data. We address that issue in Chapters 15 and 16.

11.2 DATA WAREHOUSING

Many large and even medium-size companies are using data warehousing to make it easier and faster to process, analyze, and query data.

Transactional versus Analytical Processing

Data processing in organizations can be viewed either as transactional or analytical. The data in transaction processing systems (TPSs) are organized mainly in a hierarchical structure (Technology Guide 3) and are centrally processed. The databases and the processing systems involved are known as operational systems, and the results are mainly transaction reports. This is done primarily for fast and efficient processing of routine, repetitive data.

Today, however, the most successful companies are those that can respond quickly and flexibly to market changes and opportunities, and the key to this response is the effective and efficient use of data and information, as shown in the Harrah’s case. This is done not only via transaction processing, but also through a supplementary activity, called analytical processing, which involves analysis of accumulated data, frequently by end users. Analytical processing, also referred to as business intelligence, includes data mining, decision support systems (DSS), enterprise information systems (EIS), Web applications, querying, and other end-user activities. Placing strategic information in the hands of decision makers aids productivity and empowers users to make better decisions, leading to greater competitive advantage. A good data delivery system should support easy data access by the end users themselves, as well as quick, accurate, flexible, and effective decision making.

There are basically two options for conducting analytical processing. One is to work directly with the operational systems (the “let’s use what we have” approach), using software tools and components known as front-end tools and middleware (see Technology Guide 3). The other is to use a data warehouse. The first option can be optimal for companies that do not have a large number of end users running queries and conducting analyses against the operating systems. It is also an option for departments that consist mainly of users who have the necessary technical skills for extensive use of tools such as spreadsheets (see BIXL, 2002) and graphics (see Technology Guide 2). Although it is possible for those with fewer technical skills to use query and reporting tools, these tools may not be effective, flexible, or easy enough to use in many cases.

Since the mid-1990s, there has been a wave of front-end tools that allow end users to ease these problems by directly conducting queries and reporting on data stored in operational databases. The problem with this approach, however, is that the tools are only effective with end users who have a medium to high degree of knowledge about databases. This situation improved drastically with the use of Web-based tools. Yet, when data are in several sources and in several formats, it is difficult to bring them together to conduct an analysis.

The second option, a data warehouse, overcomes these limitations and provides for improved analytical processing. It involves three concepts:

1. A business representation of data for end users

2. A Web-based environment that gives the users query and reporting capabilities

3. A server-based repository (the data warehouse) that allows centralized security and control over the data

The Data Warehouse

The Harrah’s case illustrates some major benefits of a data warehouse, which is a repository of subject-oriented historical data that is organized to be accessible in a form readily acceptable for analytical processing activities (such as data mining, decision support, querying, and other applications). The major benefits of a data warehouse are (1) the ability to reach data quickly, since they are located in one place, and (2) the ability to reach data easily and frequently by end users with Web browsers. To aid the accessibility of data, detail-level operational data must be transformed to a relational form, which makes them more amenable to analytical processing. Thus, data warehousing is not a concept by itself, but is interrelated with data access, retrieval, analysis, and visualization. (See Gray and Watson, 1998, and Inmon, 2002.)

FIGURE 11.2 Data warehouse framework and views. (Source: Drawn by E. Turban.) [Figure labels: operational systems/data (POS, legacy, ERP, misc. OLTP, external EDI, external Web documents); extraction, transformation, load (ETL); metadata repository; enterprise data warehouse; replication; data marts (marketing, finance, management); federated data warehouse (finance, marketing, supply chain data); middleware and Internet; data access; business intelligence (DSS, custom-built applications with 4GL tools, EIS and reporting, relational query tools, Web browser, OLAP/ROLAP, data mining).]

The process of building and using a data warehouse is shown in Figure 11.2. The organization’s data are stored in operational systems (left side of the figure). Using special software called ETL (extraction, transformation, load), data are processed and then stored in a data warehouse. Not all data are necessarily transferred to the data warehouse. Frequently only a summary of the data is transferred. The data that are transferred are organized within the warehouse in a form that is easy for end users to access. The data are also standardized. Then, the data are organized by subject, such as by functional area, vendor, or product. In contrast, operational data are organized according to a business process, such as shipping, purchasing, or inventory control, and/or functional department. (Note that ERP data can be input to the data warehouse, and that ERP and SCM decisions use the output from the data warehouse. See Grant, 2003.)
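The extraction, transformation, and load steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems, their field names, and the sample figures are hypothetical, and a commercial ETL tool would do far more (scheduling, error handling, metadata capture).

```python
from collections import defaultdict

# Hypothetical operational data, organized by business process (sales transactions).
pos_sales = [
    {"store": "S1", "product": "grill", "date": "2003-07-01", "amount": 200.0},
    {"store": "S1", "product": "grill", "date": "2003-07-15", "amount": 250.0},
    {"store": "S2", "product": "patio chair", "date": "2003-07-03", "amount": 80.0},
]
# A hypothetical legacy system with different field names and amounts in cents.
legacy_sales = [
    {"STORE_ID": "S2", "ITEM": "GRILL", "YYYYMM": "200307", "AMT_CENTS": 30000},
]


def extract():
    """Extraction: pull the raw rows from each operational source."""
    return pos_sales, legacy_sales


def transform(pos_rows, legacy_rows):
    """Transformation: standardize both sources to one common format."""
    standardized = []
    for r in pos_rows:
        standardized.append(
            {"product": r["product"].lower(), "month": r["date"][:7], "amount": r["amount"]}
        )
    for r in legacy_rows:
        month = r["YYYYMM"][:4] + "-" + r["YYYYMM"][4:]
        standardized.append(
            {"product": r["ITEM"].lower(), "month": month, "amount": r["AMT_CENTS"] / 100}
        )
    return standardized


def load(rows):
    """Load: keep only a summary, organized by subject (product) and month."""
    summary = defaultdict(float)
    for r in rows:
        summary[(r["product"], r["month"])] += r["amount"]
    return dict(summary)


warehouse = load(transform(*extract()))
for key, total in sorted(warehouse.items()):
    print(key, total)
```

Note that the store identifier is dropped during the load step: as the text says, frequently only a summary of the operational data reaches the warehouse.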

Data warehouses provide for the storage of metadata, meaning data about data. Metadata include software programs about data, rules for organizing data, and data summaries that are easier to index and search, especially with Web tools. Finally, middleware tools enable access to the data warehouse (see Technology Guide 3, and Rundensteiner et al., 2000).

CHARACTERISTICS OF A DATA WAREHOUSE. The major characteristics of data warehousing are:

1. Organization. Data are organized by subject (e.g., by customer, vendor, product, price level, and region), and contain information relevant for decision support only.

2. Consistency. Data in different operational databases may be encoded differently. For example, gender data may be encoded 0 and 1 in one operational system and “m” and “f” in another. In the warehouse they will be coded in a consistent manner.

3. Time variant. The data are kept for many years so they can be used for trends, forecasting, and comparisons over time.

4. Nonvolatile. Once entered into the warehouse, data are not updated.

5. Relational. Typically the data warehouse uses a relational structure.

6. Client/server. The data warehouse uses the client/server architecture mainly to provide the end user an easy access to its data.

7. Web-based. Today’s data warehouses are designed to provide an efficient computing environment for Web-based applications (Rundensteiner et al., 2000).
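The consistency characteristic (item 2 above) can be illustrated with a small sketch. The source-system names are hypothetical, and which digit stands for which gender is an assumption made only for the example; the text does not specify it.

```python
# Hypothetical code tables, one per operational system.
# Assumption for illustration only: 0 means male and 1 means female in "billing".
CODE_MAPS = {
    "billing": {0: "M", 1: "F"},
    "crm": {"m": "M", "f": "F"},
}


def standardize_gender(source, raw_value):
    """Translate a source-specific gender code into the single warehouse code."""
    try:
        return CODE_MAPS[source][raw_value]
    except KeyError:
        # Unknown source or code: flag it instead of guessing.
        return "UNKNOWN"


records = [("billing", 0), ("billing", 1), ("crm", "f"), ("crm", "x")]
cleaned = [standardize_gender(src, val) for src, val in records]
print(cleaned)
```

Flagging unrecognized codes, rather than silently mapping them, keeps data-quality problems visible before the data become nonvolatile in the warehouse.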

BENEFITS. Moving information off the mainframe presents a company with the unique opportunity to restructure its IT strategy. Companies can reinvent the way in which they shape and form their application data, empowering end users to conduct extensive analysis with these data in ways that may not have been possible before (e.g., see Minicase 1, about Sears). Another immediate benefit is providing a consolidated view of corporate data, which is better than providing many smaller (and differently formatted) views. For example, separate applications may track sales and coupon mailings. Combining data from these different applications may yield insights into the cost-efficiency of coupon sales promotions that would not be immediately evident from the output data of either application alone. Integrated within a data warehouse, however, such information can be easily extracted and analyzed.
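The coupon example can be made concrete with a minimal sketch. The promotion names, costs, and sales figures below are invented for illustration; the point is only that the ratio cannot be computed from either application's data alone.

```python
# Hypothetical output of the coupon-mailing application: promotion -> mailing cost.
coupon_mailings = {
    "spring_grills": 5000.0,
    "summer_patio": 8000.0,
}
# Hypothetical output of the sales application: (promotion of redeemed coupon, sale amount).
sales = [
    ("spring_grills", 12000.0),
    ("spring_grills", 9000.0),
    ("summer_patio", 6000.0),
]


def revenue_per_mailing_dollar(mailings, sales_rows):
    """Combine both data sets: sales revenue generated per dollar spent on mailings."""
    revenue = {promo: 0.0 for promo in mailings}
    for promo, amount in sales_rows:
        if promo in revenue:
            revenue[promo] += amount
    return {promo: revenue[promo] / cost for promo, cost in mailings.items()}


ratios = revenue_per_mailing_dollar(coupon_mailings, sales)
for promo, ratio in ratios.items():
    print(promo, round(ratio, 2))
```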

Another benefit is that data warehousing allows information processing to be offloaded from expensive operational systems onto low-cost servers. Once this is done, the end-user tools can handle a significant number of end-user information requests. Furthermore, some operational system reporting requirements can be moved to decision support systems, thus freeing up production processing.

These benefits can improve business knowledge, provide competitive advantage (see Watson et al., 2002), enhance customer service and satisfaction (see Online File W11.3), facilitate decision making, and help in streamlining business processes.

COST. The cost of a data warehouse can be very high, both to build and to maintain. Furthermore, it may be difficult and expensive to incorporate data from obsolete legacy systems. Finally, there may be a lack of incentive to share data. Therefore, a careful feasibility study must be undertaken before a commitment is made to data warehousing.

ARCHITECTURE AND TOOLS. There are several basic architectures for data warehousing. Two common ones are two-tier and three-tier architectures. (See Gray and Watson, 1998.) In three-tier architecture, data from the warehouse are processed twice and deposited in an additional multidimensional database, organized for easy multidimensional analysis and presentation (see Section 11.3), or replicated in data marts. For a Web-based architecture see Rundensteiner et al., 2000. The architecture of the data warehouse determines the tools needed for its construction (see Kimball and Ross, 2002).

PUTTING THE WAREHOUSE ON THE INTRANET. Delivery of data warehouse content to decision makers throughout the enterprise can be done via an intranet. Users can view, query, and analyze the data and produce reports using Web browsers. This is an extremely economical and effective method of delivering data (see Kimball and Ross, 2002, and Inmon, 2002).

SUITABILITY. Data warehousing is most appropriate for organizations in which some of the following apply: large amounts of data need to be accessed by end users (see the Harrah’s and Sears cases); the operational data are stored in different systems; an information-based approach to management is in use; there is a large, diverse customer base (such as in a utility company or a bank); the same data are represented differently in different systems; data are stored in highly technical formats that are difficult to decipher; and extensive end-user computing is performed (many end users performing many activities; for example, Sears has 5,000 users).

Some of the successful applications are summarized in Table 11.2. Hundreds of other successful applications are reported (e.g., see client success stories and case studies at Web sites of vendors such as Brio Technology Inc., Business Objects, Cognos Corp., Information Builders, NCR Corp., Oracle, Platinum Technology, Software A&G, and Pilot Software). For further discussion see Gray and Watson (1998) and Inmon (2002). Also visit the Data Warehouse Institute (dw-institute.org).

Data Marts, Operational Data Stores, and Multidimensional Databases

Data warehouses are frequently supplemented with or substituted by the following: data marts, operational data stores, and multidimensional databases.

DATA MARTS. The high cost of data warehouses confines their use to large companies. An alternative used by many other firms is creation of a lower cost, scaled-down version of a data warehouse called a data mart. A data mart is a small warehouse designed for a strategic business unit (SBU) or a department.

The advantages of data marts include: low cost (prices under $100,000 versus $1 million or more for data warehouses); significantly shorter lead time for implementation, often less than 90 days; and local rather than central control, conferring power on the using group. They also contain less information than the data warehouse. Hence, they have more rapid response and are more easily understood and navigated than an enterprisewide data warehouse. Finally, they allow a business unit to build its own decision support systems without relying on a centralized IS department.

There are two major types of data marts:

1. Replicated (dependent) data marts. Sometimes it is easier to work with a small subset of the data warehouse. In such cases one can replicate some subsets of the data warehouse in smaller data marts, each of which is dedicated to a certain area, as was shown in Figure 11.2. In such a case the data mart is an addition to the data warehouse.

2. Standalone data marts. A company can have one or more independent data marts without having a data warehouse. Typical data marts are for marketing, finance, and engineering applications.
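As a rough sketch of the first type, a replicated (dependent) data mart can be thought of as a copied subset of the warehouse dedicated to one subject area. The warehouse rows and field names below are hypothetical.

```python
# Hypothetical warehouse rows, each tagged with the subject area it belongs to.
warehouse = [
    {"subject": "marketing", "region": "East", "metric": "campaign_cost", "value": 5000},
    {"subject": "finance", "region": "East", "metric": "revenue", "value": 120000},
    {"subject": "marketing", "region": "West", "metric": "campaign_cost", "value": 7000},
    {"subject": "engineering", "region": "West", "metric": "defects", "value": 12},
]


def replicate_mart(warehouse_rows, subject):
    """Copy the warehouse rows for one subject area into a separate data mart."""
    return [dict(row) for row in warehouse_rows if row["subject"] == subject]


marketing_mart = replicate_mart(warehouse, "marketing")
print(len(marketing_mart), "rows replicated to the marketing data mart")
```

Because the rows are copied rather than shared, the mart can be queried locally without touching the warehouse, which is what gives the using group its local control.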

TABLE 11.2 Summary of Strategic Uses of Data Warehousing

Industry | Functional Areas of Use | Strategic Use
Airline | Operations and Marketing | Crew assignment, aircraft deployment, mix of fares, analysis of route profitability, frequent flyer program promotions
Apparel | Distribution and Marketing | Merchandising, and inventory replenishment
Banking | Product Development, Operations, and Marketing | Customer service, trend analysis, product and service promotions, reduction of IS expenses
Credit Card | Product Development and Marketing | Customer service, new information service for a fee, fraud detection
Health Care | Operations | Reduction of operational expenses
Investment and Insurance | Product Development, Operations, and Marketing | Risk management, market movements analysis, customer tendencies analysis, portfolio management
Personal Care Products | Distribution and Marketing | Distribution decision, product promotions, sales decision, pricing policy
Public Sector | Operations | Intelligence gathering
Retail Chain | Distribution and Marketing | Trend analysis, buying pattern analysis, pricing policy, inventory control, sales promotions, optimal distribution channel
Steel | Manufacturing | Pattern analysis (quality control)
Telecommunications | Product Development, Operations, and Marketing | New product and service promotions, reduction of IS budget, profitability analysis

Source: Park (1997), p. 19, Table 2.

OPERATIONAL DATA STORES. An operational data store is a database for transaction processing systems that uses data warehouse concepts to provide clean data. That is, it brings the concepts and benefits of the data warehouse to the operational portions of the business, at a lower cost. It is used for short-term decisions involving mission-critical applications rather than for the medium- and long-term decisions associated with the regular data warehouse. These decisions depend on much more current information. For example, a bank needs to know about all the accounts for a given customer who is calling on the phone. The operational data store can be viewed as situated between the operational data (in legacy systems) and the data warehouse. A comparison between the two is provided by Gray and Watson (1998).

MULTIDIMENSIONAL DATABASES. Multidimensional databases are specialized data stores that organize facts by dimensions, such as geographical region, product line, salesperson, or time. The data in multidimensional databases are usually preprocessed and stored in what is called a (multidimensional) data cube. Facts, such as quantities sold, are placed at the intersection of the dimensions. One such intersection might be the quantities of widgets sold by Ms. Smith in the Morristown, New Jersey, branch of XYZ Company in July 2003. Dimensions often have a hierarchy. Sales figures, for example, might be presented by day, by month, or by year. They might also roll up an organizational dimension from store to region to company. Multidimensional databases can be incorporated in a data warehouse, sometimes as its core, or they can be used as an additional layer. See Online File W11.4 for details.
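A toy data cube makes the idea of facts at the intersection of dimensions easier to see. The sketch below uses a plain Python dictionary; the branches other than Morristown, the other salespeople, and all quantities are invented, and a real multidimensional database would store and index the cube very differently.

```python
from collections import defaultdict

# Each fact (quantity sold) sits at the intersection of four dimensions:
# (branch, product, salesperson, month). All values are made up.
cube = {
    ("Morristown, NJ", "widget", "Ms. Smith", "2003-07"): 120,
    ("Morristown, NJ", "widget", "Mr. Jones", "2003-07"): 80,
    ("Morristown, NJ", "gadget", "Ms. Smith", "2003-08"): 40,
    ("Trenton, NJ", "widget", "Mr. Lee", "2003-07"): 60,
}
DIMENSIONS = ("branch", "product", "salesperson", "month")


def roll_up(facts, keep, month_to_year=False):
    """Aggregate quantities over every dimension that is not listed in `keep`."""
    totals = defaultdict(int)
    for coords, qty in facts.items():
        named = dict(zip(DIMENSIONS, coords))
        if month_to_year:
            named["month"] = named["month"][:4]  # climb the time hierarchy: month -> year
        totals[tuple(named[d] for d in keep)] += qty
    return dict(totals)


# One cell of the cube: widgets sold by Ms. Smith in Morristown in July 2003.
single_cell = cube[("Morristown, NJ", "widget", "Ms. Smith", "2003-07")]
by_branch = roll_up(cube, keep=("branch",))
by_product_year = roll_up(cube, keep=("product", "month"), month_to_year=True)
print(single_cell, by_branch, by_product_year)
```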

11.3 INFORMATION AND KNOWLEDGE DISCOVERY WITH BUSINESS INTELLIGENCE

Business Intelligence

Once the data are in the data warehouse and/or data marts they can be accessed by managers, analysts, and other end users. Users can then conduct several activities. These activities are frequently referred to as analytical processing or, more commonly, as business intelligence. (Note: For a glossary of these and other terms, see Dimensional Insight, 2003.) Business intelligence (BI) is a broad category of applications and techniques for gathering, storing, analyzing, and providing access to data to help enterprise users make better business and strategic decisions (see Oguz, 2003, and Moss and Atre, 2003). The process of BI usually, but not necessarily, involves the use of a data warehouse, as seen in Figure 11.3.

HOW BUSINESS INTELLIGENCE WORKS. Operational raw data are usually kept in corporate databases. For example, a national retail chain that sells everything from grills and patio furniture to plastic utensils has data about inventory, customer information, data about past promotions, and sales numbers in various databases. Though all this information may be scattered across multiple systems, and may seem unrelated, data warehouse building software can bring it together in the data warehouse. In the data warehouse (or mart) tables can be linked, and data cubes (another term for multidimensional databases) are formed. For instance, inventory information is linked to sales numbers and customer databases, allowing for extensive analysis of information. Some data warehouses have a dynamic link to the databases; others are static.


Using business intelligence software, the user can run queries, request ad-hoc reports, or conduct any other analyses. For example, analysis can be carried out by performing multilayer queries. Because all the databases are linked, you can search for the products of which a store has too much. You can then determine which of these products commonly sell with popular items, based on previous sales. After planning a promotion to move the excess stock along with the popular products (by bundling them together, for example), you can dig deeper to see where this promotion would be most popular (and most profitable). The results of your request can be reports, predictions, alerts, and/or graphical presentations. These outputs can then be disseminated to support decision-making tasks.
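The first two layers of the multilayer query described above can be sketched as follows. The inventory levels, stocking targets, past baskets, and the rule that "too much" means more than 1.5 times the target are all assumptions chosen for the illustration.

```python
from collections import Counter

# Hypothetical linked data for one store.
inventory = {"charcoal": 900, "grill": 40, "plastic utensils": 1200, "patio chair": 30}
stock_target = {"charcoal": 300, "grill": 50, "plastic utensils": 650, "patio chair": 40}
popular_items = {"grill"}
past_baskets = [
    {"grill", "charcoal"},
    {"grill", "charcoal", "plastic utensils"},
    {"patio chair", "plastic utensils"},
    {"grill", "patio chair"},
]

# Layer 1: which products does the store have too much of?
# Assumed rule: stock above 1.5 times the target counts as excess.
overstocked = {p for p, qty in inventory.items() if qty > 1.5 * stock_target[p]}

# Layer 2: how often was each overstocked product sold together with a popular item?
co_sales = Counter()
for basket in past_baskets:
    if basket & popular_items:
        for product in basket & overstocked:
            co_sales[product] += 1

# Products at the top of this list are candidates for bundling in a promotion.
bundle_candidates = co_sales.most_common()
print(bundle_candidates)
```

A third layer, not shown, would repeat the analysis per store or region to see where the promotion is likely to be most profitable.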

More advanced applications of business intelligence include outputs such as financial modeling, budgeting, resource allocation, and competitive intelligence. Advanced business intelligence systems include components such as decision models, business performance analysis, metrics, data profiling and reengineering tools, and much more. (For details see dmreview.com.)

FIGURE 11.3 How business intelligence works. [Figure labels: raw data and databases (point of sale, Web store, supply chain, call center); data warehouse (customer, inventory, sales); business intelligence tools (query, data mining, forecasting); results (alert, report, graph); insight.]

THE TOOLS AND TECHNIQUES OF BUSINESS INTELLIGENCE. BI employs a large number of tools and techniques. The major applications include the activities of query and reporting, online analytical processing (OLAP), DSS, data mining, forecasting, and statistical analysis. A major BI vendor is SAS (sas.com). Other vendors include MicroStrategy, Cognos, SPSS, and Business Objects. We have divided the BI tools into two major categories: (1) information and knowledge discovery and (2) decision support and intelligent analysis. In each category there are several tools and techniques, as shown in Figure 11.4. In this chapter we will describe the information and knowledge discovery category, while Chapter 12 is dedicated to decision support and intelligent systems.

The Tools and Techniques of Information and Knowledge Discovery

Information and knowledge discovery differs from decision support in its main objective: discovery. Once discovery is done, the results can be used for decision support. Let’s distinguish first between information and knowledge discovery.

FIGURE 11.4 Categories of business intelligence. [Figure labels: business intelligence divides into information and knowledge discovery (ad-hoc query, OLAP, data mining, Web mining) and decision support and intelligent systems (DSS, group DSS, executive and enterprise support, integrated decision support, management science analysis, profitability and other analysis, applied artificial intelligence, metrics scorecard, BPI, BPM).]

THE EVOLUTION OF INFORMATION AND KNOWLEDGE DISCOVERY. Information discovery started in the late sixties and early seventies with data collection techniques. It was basically simple data collection and answered queries that involved one set of historical data. This analysis was extended to answer questions that involved several sets of data with tools such as SQL and relational database management systems (see Table 11.3 for the evolution). During the 1990s, the need for better tools to deal with the ever-increasing amount of data was recognized. This resulted in the creation of the data warehouse and the appearance of OLAP and multidimensional databases and presentation. When the amount of data to be analyzed exploded in the mid-1990s, knowledge discovery emerged as an important analytical tool.

The process of extracting useful knowledge from volumes of data is known as knowledge discovery in databases (KDD), or just knowledge discovery, and it is the subject of extensive research (see Fayyad et al., 1996). KDD’s major objective is to identify valid, novel, potentially useful, and ultimately understandable patterns in data. KDD is useful because it is supported by three technologies that are now sufficiently mature: massive data collection, powerful multiprocessor computers, and data mining and other algorithms. KDD processes have appeared under various names and have shown different characteristics. As time has passed, KDD has become able to answer more complex business questions. In this section we will describe two tools of information discovery: ad-hoc queries and OLAP. Data mining, as a KDD technique, is described in Section 11.4. We discuss multidimensionality in Section 11.5. Web-based query tools are described in Section 11.7.

AD-HOC QUERIES AND REPORTING. Ad-hoc queries allow users to request, in real time, information from the computer that is not available in periodic reports. Such answers are needed to expedite decision making. The system must be intelligent enough to understand what the user wants. Simple ad-hoc query systems are often based on menus. More intelligent systems use structured query language (SQL) and query-by-example approaches, which are described in Technology Guide 3. The most intelligent systems are based on natural language understanding (Chapter 12) and some can communicate with users using voice recognition. Later on we will describe the use of Web tools to facilitate queries.
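To give a feel for an SQL-based ad-hoc query, the sketch below answers a question of the kind "What were unit sales in New England last March?" using Python's built-in sqlite3 module. The table layout and the sample rows are hypothetical.

```python
import sqlite3

# Build a tiny in-memory sales table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("New England", "2003-03-04", 120),
        ("New England", "2003-03-18", 80),
        ("New England", "2003-04-02", 50),
        ("Midwest", "2003-03-10", 200),
    ],
)


def unit_sales(connection, region, year_month):
    """Ad-hoc query: total units sold in one region during one month (YYYY-MM)."""
    row = connection.execute(
        "SELECT COALESCE(SUM(units), 0) FROM sales "
        "WHERE region = ? AND substr(sale_date, 1, 7) = ?",
        (region, year_month),
    ).fetchone()
    return row[0]


march_units = unit_sales(conn, "New England", "2003-03")
print(march_units)
```

A menu-based or natural language front end would generate a statement like this one on the user's behalf, so the user never has to write SQL directly.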

Querying systems are frequently combined with reporting systems that generate routine reports. For an example of such a combination in a radio rental store see Amato-McCoy, 2003. For Web-based information discovery tools see Online File W11.5.

TABLE 11.3 Stages in the Evolution of Knowledge Discovery

Evolutionary Stage | Business Question | Enabling Technologies | Characteristics
Data collection (1960s) | What was my total revenue in the last five years? | Computers, tapes, disks | Retrospective, static data delivery
Data access (1980s) | What were unit sales in New England last March? | Relational databases (RDBMS), structured query language (SQL) | Retrospective, dynamic data delivery at record level
Data warehousing and decision support (early 1990s) | What were the sales in region A, by product, by salesperson? | OLAP, multidimensional databases, data warehouses | Retrospective, dynamic data delivery at multiple levels
Intelligent data mining (late 1990s) | What’s likely to happen to the Boston unit’s sales next month? Why? | Advanced algorithms, multiprocessor computers, massive databases | Prospective, proactive information delivery
Advanced intelligent system; complete integration (2000–2004) | What is the best plan to follow? How did we perform compared to metrics? | Neural computing, advanced AI models, complex optimization, Web services | Proactive, integrative; multiple business partners

Source: Based on material from accure.com (Accure Software).

ONLINE ANALYTICAL PROCESSING. The term online analytical processing (OLAP) was introduced in 1993 by E. F. Codd to describe a set of tools that can analyze data to reflect actual business needs. These tools were based on a set of 12 rules: (1) multidimensional view, (2) transparency to the user, (3) easy accessibility, (4) consistent reporting, (5) client/server architecture, (6) generic dimensionality, (7) dynamic sparse matrix handling, (8) multiuser support, (9) cross-dimensional operations, (10) intuitive data manipulation, (11) flexible reporting, and (12) unlimited levels of dimension and aggregation. For details see Codd et al., 1993. Let’s see how these rules may work:

Assume that a business organizes its sales force by region, say Eastern and Western. These two regions might then be broken down into states. In an OLAP database, this organization would be used to structure the sales data so that the VP of sales could see the sales figures for each region. The VP might then want to see the Eastern region broken down by state so that the performance of individual state sales managers could be evaluated. Thus OLAP reflects the business in the data structure.

The power of OLAP is in its ability to create these business structures (sales regions, product categories, fiscal calendar, partner channels, etc.) and combine them in such a way as to allow users to quickly answer business questions. “How many blue sweaters were sold via mail-order in New York so far this year?” is the kind of question that OLAP is very good at answering. Users can interactively slice the data and drill down to the details they are interested in.
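The region-to-state drill-down and the blue sweater question can be imitated with a short sketch. It is not how an OLAP server is built; it only shows the slice and drill-down operations on a handful of invented sales facts.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, state, product, channel, units).
facts = [
    ("Eastern", "New York", "blue sweater", "mail-order", 30),
    ("Eastern", "New York", "blue sweater", "store", 45),
    ("Eastern", "New York", "red sweater", "mail-order", 10),
    ("Eastern", "New Jersey", "blue sweater", "mail-order", 20),
    ("Western", "California", "blue sweater", "mail-order", 25),
]
FIELDS = ("region", "state", "product", "channel")


def aggregate(rows, by, **filters):
    """Sum units grouped by the `by` fields, after slicing the data with `filters`."""
    totals = defaultdict(int)
    for *dims, units in rows:
        record = dict(zip(FIELDS, dims))
        if all(record[k] == v for k, v in filters.items()):
            totals[tuple(record[f] for f in by)] += units
    return dict(totals)


# Top-level view for the VP of sales: totals per region.
by_region = aggregate(facts, by=("region",))
# Drill down: the Eastern region broken down by state.
eastern_by_state = aggregate(facts, by=("state",), region="Eastern")
# Slice: blue sweaters sold via mail-order in New York.
ny_blue_mail = aggregate(
    facts, by=("state",), state="New York", product="blue sweater", channel="mail-order"
)
print(by_region, eastern_by_state, ny_blue_mail)
```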

In terms of the technology, an OLAP database can be implemented on top of an existing relational database (this is called ROLAP, for Relational OLAP) or it can be implemented via a specialized multidimensional data store (this is called MOLAP, for Multidimensional OLAP). In ROLAP, the data request is translated into SQL and the relational database is queried for the answer. In MOLAP, the specialized data store is preloaded with the answers to (all) possible queries so that any request for data can be returned quickly. Obviously there are performance and storage tradeoffs between these two approaches. (Another technology called HOLAP attempts to combine these two approaches.)
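The tradeoff between the two approaches can be seen in a deliberately simplified sketch. The ROLAP-style function builds an SQL statement at request time, while the MOLAP-style store precomputes every combination of dimension values in advance. The table, the two dimensions, and the data are hypothetical, and real products are far more elaborate.

```python
import sqlite3
from itertools import combinations

rows = [("East", "sweater", 10), ("East", "scarf", 5), ("West", "sweater", 7)]
DIMS = ("region", "product")

# ROLAP-style: the request is translated into SQL when it arrives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


def rolap_total(**filters):
    """Answer a request by generating and running a SQL query."""
    unknown = set(filters) - set(DIMS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    where = " AND ".join(f"{col} = ?" for col in filters) or "1 = 1"
    sql = f"SELECT COALESCE(SUM(units), 0) FROM sales WHERE {where}"
    return conn.execute(sql, tuple(filters.values())).fetchone()[0]


# MOLAP-style: totals for every combination of dimension values are preloaded.
def build_molap(fact_rows):
    store = {}
    for region, product, units in fact_rows:
        record = {"region": region, "product": product}
        for n in range(len(DIMS) + 1):
            for dims in combinations(DIMS, n):
                key = tuple((d, record[d]) for d in dims)
                store[key] = store.get(key, 0) + units
    return store


molap = build_molap(rows)


def molap_total(**filters):
    """Answer a request with a single lookup in the precomputed store."""
    key = tuple((d, filters[d]) for d in DIMS if d in filters)
    return molap.get(key, 0)


print(rolap_total(region="East"), molap_total(region="East"), len(molap))
```

Even in this tiny case, three fact rows expand into eight precomputed cells, which hints at the storage cost that MOLAP pays for its fast lookups.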

Unlike online transaction processing (OLTP) applications, OLAP involves examining many data items (frequently many thousands or even millions) in complex relationships. In addition to answering users’ queries, OLAP may analyze these relationships and look for patterns, trends, and exceptions.

A typical OLAP query might access a multigigabyte, multiyear sales database in order to find all product sales in each region for each product type. (See the Harrah’s case.) After reviewing the results, an analyst might further refine the query to find sales volume for each sales channel within region or product classifications. As a last step, the analyst might want to perform year-to-year or quarter-to-quarter comparisons for each sales channel. This whole process must be carried out online with rapid response time so that the analysis process is undisturbed.
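The refinement sequence just described (totals by region, then by sales channel within region, then a year-to-year comparison) can be sketched in a few lines. The records, field names, and figures below are hypothetical, and a real OLAP tool would run these steps against the warehouse rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical (year, region, channel, sales) records, invented for illustration.
SALES = [
    (2002, "East", "retail", 100.0),
    (2002, "East", "mail-order", 50.0),
    (2003, "East", "retail", 120.0),
    (2003, "East", "mail-order", 45.0),
    (2002, "West", "retail", 80.0),
    (2003, "West", "retail", 100.0),
]
FIELD_INDEX = {"year": 0, "region": 1, "channel": 2}


def totals_by(records, *fields):
    """Aggregate sales by the requested fields, e.g. ('region',) or ('region', 'channel')."""
    out = defaultdict(float)
    for rec in records:
        key = tuple(rec[FIELD_INDEX[f]] for f in fields)
        out[key] += rec[3]
    return dict(out)


def year_over_year(records, prev_year, this_year):
    """Percentage change in sales per (region, channel) between two years."""
    totals = totals_by(records, "region", "channel", "year")
    changes = {}
    for (region, channel, year), amount in totals.items():
        if year != this_year:
            continue
        base = totals.get((region, channel, prev_year))
        if base:  # skip pairs with no (or zero) sales in the base year
            changes[(region, channel)] = round(100.0 * (amount - base) / base, 1)
    return changes


by_region = totals_by(SALES, "region")              # first view: region totals
by_channel = totals_by(SALES, "region", "channel")  # refinement: channel within region
growth = year_over_year(SALES, 2002, 2003)          # last step: year-to-year comparison
print(by_region, by_channel, growth, sep="\n")
```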

The 12 original rules of Codd are reflected in today’s software. For example, today’s software permits access to very large amounts of data, such as several years of sales data, usually with a browser, and analysis of the relationships between many types of business elements, such as sales, products, regions, and channels. It enables users to process aggregated data, such as sales volumes, budgeted dollars, and dollars spent, to compare aggregated data over time, and to present data in different perspectives, such as sales by region versus sales by product or by product within each region. Today’s software also enables complex calculations between data elements, such as expected profit calculated as a function of sales revenue for each type of sales channel in a particular region, and responds quickly to user requests so that users can pursue an analytical thought process without being stymied by the system.

OLAP can be combined with data mining to provide very sophisticated multidimensional decision support, as illustrated by Fong et al., 2002. By attaching a rule-based component to the OLAP, one can make such an integrated system an intelligent data mining system (see Lau et al., 2001). For more information, products, and vendors, visit olapreport.com and olap.com.

Although OLAP and ad-hoc queries are very useful in many cases, they are retrospective in nature and cannot provide the automated and prospective knowledge discovery that is done by advanced data mining techniques.

11.4 DATA MINING CONCEPTS AND APPLICATIONS

Data mining is becoming a major tool for analyzing large amounts of data, usually in a data warehouse (Nemati and Barko, 2002), as well as for analyzing Web data. Data mining derives its name from the similarities between searching for valuable business information in a large database and mining a mountain for a vein of valuable ore. Both processes require either sifting through an immense amount of material or intelligently probing it to find exactly where the value resides (see the Harrah’s case at the start of the chapter). In some cases the data are consolidated in a data warehouse and data marts (e.g., see Chopoorian et al., 2001). In others they are kept on the Internet and intranet servers.

Capabilities of Data Mining

Given databases of sufficient size and quality, data mining technology can generate new business opportunities by providing these capabilities:

● Automated prediction of trends and behaviors. Data mining automates the process of finding predictive information in large databases. Questions that traditionally required extensive hands-on analysis can now be answered directly and quickly from the data. A typical example of a predictive problem is targeted marketing. Data mining can use data on past promotional mailings to identify the targets most likely to respond favorably to future mailings. Other predictive examples include forecasting bankruptcy and other forms of default, and identifying segments of a population likely to respond similarly to given events.

● Automated discovery of previously unknown patterns. Data mining tools identify previously hidden patterns in one step. An example of pattern discovery is the analysis of retail sales data to identify seemingly unrelated products that are often purchased together, such as baby diapers and beer. Other pattern discovery problems include detecting fraudulent credit card transactions and identifying invalid (anomalous) data that may represent data entry keying errors.

When data mining tools are implemented on high-performance parallel-processing systems, they can analyze massive databases in minutes. Often, these databases will contain data stored for several years. Faster processing means that users can experiment with more models to understand complex data. High speed makes it practical for users to analyze huge quantities of data. Larger databases, in turn, yield improved predictions (see Winter, 2001, and Hirji, 2001).

Data mining also can be conducted by nonprogrammers. The “miner” is often an end user, empowered by “data drills” and other power query tools to ask ad-hoc questions and get answers quickly, with little or no programming skill. Data mining tools can be combined with spreadsheets and other end-user software development tools, making it relatively easy to analyze and process the mined data. Data mining appears under different names, such as knowledge extraction, data dipping, data archeology, data exploration, data pattern processing, data dredging, and information harvesting. “Striking it rich” in data mining often involves finding unexpected, valuable results.

Data mining consists of two steps: building a data cube and using the cube to extract data for the mining functions that the mining tool supports. A data cube is multidimensional, as shown in Online File W11.6. Data miners can use several tools and techniques; see a list and definitions in A Closer Look 11.2.

A large number of data mining applications exist, both in business (see Apte et al., 2002) and in other fields. According to a GartnerGroup report (gartnergroup.com), more than half of all the Fortune 1000 companies worldwide are using data mining technology.

The Tools of Data Mining

The most commonly used techniques for data mining are the following.

● Case-based reasoning. The case-based reasoning approach uses historical cases to recognize patterns (see Chapter 12). For example, customers of Cognitive Systems, Inc., utilize such an approach for helpdesk applications. One company has a 50,000-query case library. New cases are matched quickly against the 50,000 samples in the library, providing more than 90 percent accurate and automatic answers to queries.

● Neural computing. Neural computing is a machine learning approach by which historical data can be examined for pattern recognition. These patterns can then be used for making predictions and for decision support (details are given in Chapter 12). Users equipped with neural computing tools can go through huge databases and, for example, identify potential customers of a new product or companies whose profiles suggest that they are heading for bankruptcy. Most practical applications are in financial services, in marketing, and in manufacturing.

● Intelligent agents. One of the most promising approaches to retrieving information from the Internet or from intranet-based databases is the use of intelligent agents. As vast amounts of information become available through the Internet, finding the right information is more difficult. This topic is further discussed in Chapters 5 and 12.

● Association analysis. Association analysis is a relatively new approach that uses a specialized set of algorithms that sort through large data sets and express statistical rules among items. (See Moad, 1998, for details.)

● Other tools. Several other tools can be used. These include decision trees, genetic algorithms, nearest-neighbor method, and rule induction. For details, see Inmon, 2002.
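To make the idea of "statistical rules among items" in association analysis concrete, here is a minimal sketch that counts item pairs across shopping baskets and reports rules by support and confidence. The baskets and thresholds are invented for illustration, and the brute-force pair counting is a simplification; production tools use more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
BASKETS = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules for item pairs.

    support    = share of all baskets that contain both items
    confidence = share of baskets with lhs that also contain rhs
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, round(confidence, 2)))
    return sorted(rules)


rules = pair_rules(BASKETS)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support}, confidence={confidence}")
```

Support filters out pairs that are too rare to matter, and confidence states how often the right-hand item accompanies the left-hand one; both thresholds here are arbitrary choices.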

The most common information types are:

● Classification. Infers the defining characteristics of a certain group (e.g., customers who have been lost to competitors).

● Clustering. Identifies groups of items that share a particular characteristic. Clustering differs from classification in that no predefining characteristic is given.

● Association. Identifies relationships between events that occur at one time (e.g., the contents of a shopping basket). For an application in law enforcement see Brown and Hagen, 2003.

● Sequencing. Similar to association, except that the relationship exists over a period of time (e.g., repeat visits to a supermarket or use of a financial planning product).

● Forecasting. Estimates future values based on patterns within large sets of data (e.g., demand forecasting).
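The difference between classification and clustering in the list above can be shown with a small sketch. The first function uses the nearest-neighbor method on cases that already carry a label (the predefined characteristic); the second is a bare-bones one-dimensional k-means that receives no labels and lets the groups emerge from the data. The monthly-spending figures, labels, and starting centers are all hypothetical.

```python
def nearest_neighbor_classify(labeled_cases, value):
    """Classification: assign the label of the closest past case."""
    closest = min(labeled_cases, key=lambda case: abs(case[0] - value))
    return closest[1]


def kmeans_1d(values, centers, iterations=10):
    """Clustering: group unlabeled values around centers that move to each group's mean."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical monthly spending of past customers, labeled by outcome.
PAST_CUSTOMERS = [(10, "stayed"), (12, "stayed"), (50, "lost"), (52, "lost")]
print(nearest_neighbor_classify(PAST_CUSTOMERS, 48))

# The same kind of figures without labels: clustering finds the two groups itself.
centers, groups = kmeans_1d([10, 12, 11, 50, 52, 49], centers=(10, 50))
print(centers, groups)
```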

There are a large number of commercial products available for conducting data mining (e.g., dbminer.com, data-miner.com, and spss.com). For a directory see kdnuggets.com/software.

A CLOSER LOOK 11.2 DATA MINING TECHNIQUES AND INFORMATION TYPES

Data Mining Applications

A SAMPLER OF DATA MINING APPLICATIONS. Data mining is used extensively today for many business applications (see Apte et al., 2002), as illustrated by the 12 representative examples that follow. Note that the intent of most of these examples is to identify a business opportunity in order to create a sustainable competitive advantage.

1. Retailing and sales. Predicting sales; determining correct inventory levels and distribution schedules among outlets; and loss prevention. For example, retailers such as AAFES (stores on military bases) use data mining to combat fraud by employees at their 1400 stores, using the Fraud Watch solution from a Canadian company, Triversity (see Amato-McCoy, 2003b). Eddie Bauer (see Online File W11.7) uses data mining for several applications.

2. Banking. Forecasting levels of bad loans and fraudulent credit card use, credit card spending by new customers, and which kinds of customers will best respond to (and qualify for) new loan offers.

3. Manufacturing and production. Predicting machinery failures; finding key factors that control optimization of manufacturing capacity.

4. Brokerage and securities trading. Predicting when bond prices will change; forecasting the range of stock fluctuations for particular issues and the overall market; determining when to buy or sell stocks.

5. Insurance. Forecasting claim amounts and medical coverage costs; classifying the most important elements that affect medical coverage; predicting which customers will buy new insurance policies.

6. Computer hardware and software. Predicting disk-drive failures; forecasting how long it will take to create new chips; predicting potential security violations.

7. Policework. Tracking crime patterns, locations, and criminal behavior; identifying attributes to assist in solving criminal cases.

8. Government and defense. Forecasting the cost of moving military equipment; testing strategies for potential military engagements; predicting resource consumption.

9. Airlines. Capturing data on where customers are flying and the ultimate destination of passengers who change carriers in hub cities; thus, airlines can identify popular locations that they do not service and check the feasibility of adding routes to capture lost business.

10. Health care. Correlating demographics of patients with critical illnesses; developing better insights on symptoms and their causes and how to provide proper treatments.

11. Broadcasting. Predicting what is best to air during prime time and how to maximize returns by interjecting advertisements.

12. Marketing. Classifying customer demographics that can be used to predict which customers will respond to a mailing or buy a particular product (as illustrated in Section 11.6).

Text Mining and Web Mining

TEXT MINING. Text mining is the application of data mining to nonstructured or less-structured text files (see Berry, 2002). Data mining takes advantage of the infrastructure of stored data to extract predictive information. For example, by mining a customer database, an analyst might discover that everyone who buys product A also buys products B and C, but does so six months later. Text mining, however, operates with less structured information. Documents rarely have strong internal infrastructure, and when they do, it is frequently focused on document format rather than document content. Text mining helps organizations to do the following: (1) find the “hidden” content of documents, including additional useful relationships, and (2) group documents by common themes (e.g., identify all the customers of an insurance firm who have similar complaints).
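Grouping documents by common themes, as in the insurance-complaint example, can be approximated with word counts and cosine similarity. The sketch below is a deliberately simple illustration: the complaint texts, the stopword list, and the 0.3 threshold are assumptions, and commercial text mining tools use far richer language processing.

```python
import math
import re
from collections import Counter

# A tiny, hypothetical stopword list; real tools use much larger ones.
STOPWORDS = {"the", "a", "my", "was", "is", "and", "to", "of", "for",
             "on", "i", "has", "not", "been"}


def vectorize(text):
    """Turn free text into a bag-of-words vector, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two word-count vectors (0.0 when either is empty)."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def group_by_theme(docs, threshold=0.3):
    """Greedy grouping: each document joins the first group whose first member it resembles."""
    groups = []  # list of (seed_vector, [document indices])
    for i, text in enumerate(docs):
        vec = vectorize(text)
        for seed, members in groups:
            if cosine(seed, vec) >= threshold:
                members.append(i)
                break
        else:
            groups.append((vec, [i]))
    return [members for _, members in groups]


# Hypothetical customer complaints, invented for illustration.
COMPLAINTS = [
    "My claim payment was delayed for weeks",
    "Payment on my claim has been delayed again",
    "The agent was rude on the phone",
    "Rude agent hung up the phone",
]
themes = group_by_theme(COMPLAINTS)
print(themes)
```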

WEB MINING. Web mining is the application of data mining techniques to discover actionable and meaningful patterns, profiles, and trends from Web resources (see Linoff and Berry, 2002). The term Web mining is used to refer to both Web-content mining and Web-usage mining. Web-content mining is the process of discovering information from the content of Web pages and documents. Web-usage mining is the process of analyzing Web access logs and other information connected to user browsing and access patterns on one or more Web localities.

Web mining is used in the following areas: information filtering (e-mails, magazines, and newspapers); surveillance (of competitors, patents, technological development); mining of Web-access logs for analyzing usage (clickstream analysis); assisted browsing; and services that fight crime on the Internet.
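Clickstream analysis starts from the raw access log. The sketch below assumes log lines in the widely used Common Log Format and uses a handful of invented sample lines; it counts successful page requests and rebuilds each visitor's path through the site. Treating one host address as one visitor is a simplification.

```python
import re
from collections import Counter, defaultdict

# host ident user [time] "METHOD url protocol" status bytes
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) \S+'
)

# Hypothetical access-log lines, invented for illustration.
SAMPLE_LOG = """\
10.0.0.1 - - [10/Oct/2003:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
10.0.0.1 - - [10/Oct/2003:13:56:01 -0700] "GET /books/123 HTTP/1.0" 200 5120
10.0.0.2 - - [10/Oct/2003:13:56:10 -0700] "GET /index.html HTTP/1.0" 200 2326
10.0.0.1 - - [10/Oct/2003:13:57:12 -0700] "GET /cart HTTP/1.0" 200 810
10.0.0.2 - - [10/Oct/2003:13:58:00 -0700] "GET /missing HTTP/1.0" 404 209
"""


def clickstreams(log_text):
    """Return (hits per URL, ordered list of URLs per host) for successful requests."""
    hits = Counter()
    paths = defaultdict(list)
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        host, _when, _method, url, status = match.groups()
        if status.startswith("2"):  # count only successful responses
            hits[url] += 1
            paths[host].append(url)
    return hits, dict(paths)


hits, paths = clickstreams(SAMPLE_LOG)
print(hits.most_common(1))
print(paths)
```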

In e-commerce, Web content mining is especially critical, due to the large number of visitors to e-commerce sites, about 2.5 billion during the Christmas 2002 season (Weiss, 2003). For example, when you look for a certain book on Amazon.com, the site will also provide you with a list of books purchased by the customers who have purchased the specific book you are looking for. By providing such mined information, the Amazon.com site minimizes the need for additional search and provides customers with a valuable service.

According to Etzioni (1996), Web mining can perform the following functions:

● Resource discovery: locating unfamiliar documents and services on the Web.

● Information extraction: automatically extracting specific information from newly discovered Web resources.

● Generalization: uncovering general patterns at individual Web sites and across multiple sites.

Miner3D (miner3d.com) is a suite of visual data analysis tools including a Web-mining tool that displays hundreds and even thousands of search hits on a single screen. The actual search for Web pages is performed through any major search engine, and this add-on tool presents the resulting search in the form of a 3-D graphic instead of displaying links to the first few pages. For details on a number of Web-mining products see kdnuggets.com/software/web.html. Also see spss.com and bayesia.com (free downloads).

Failures of Data Warehouses and Data Mining

Since their early inceptions, data warehouses and mining have produced many success stories. However, there have also been many failures. Carbone (1999) defined levels of data warehouse failures as follows: (1) warehouse does not meet the expectations of those involved; (2) warehouse was completed, but went severely over budget in relation to time, money, or both; (3) warehouse failed one or more times but eventually was completed; and (4) warehouse failed with no effort to revive it.

Carbone provided examples and identified a number of reasons for failures (which are typical for many other large information systems). These are summarized in Table 11.4. Suggestions on how to avoid data warehouse failure are provided at datawarehouse.com, at bitpipe.com, and at teradatauniversitynetwork.com. Suggestions on how to properly implement data mining are provided by Hirji (2001).

TABLE 11.4 The Reasons Data Warehouses and Mining Fail

● Unrealistic expectations: overly optimistic time schedule or underestimation of cost.
● Inappropriate architecture.
● Vendors overselling capabilities of products.
● Lack of training and support for users.
● Omitted information.
● Lack of coordination (or requires too much coordination).
● Cultural issues were ignored.
● Using the warehouse only for operational, not informational, purposes.
● Not enough summarization of data.
● Poor upkeep of technology.
● Improperly managing multiple users with various needs.
● Failure to align data marts and data warehouses.
● Unclear business objectives; not knowing the information requirements.
● Lack of effective project sponsorship.
● Lack of data quality.
● Lack of user input.
● Using data marts instead of data warehouses (and vice versa).
● Inexperienced/untrained/inadequate number of personnel.
● Interfering corporate politics.
● Insecure access to data manipulation (users should not have the ability to change any data).
● Inappropriate format of information: not a single, standard format.
● Poor upkeep of information (e.g., failure to keep information current).

Source: Compiled from Carbone (1999).

11.5 DATA VISUALIZATION TECHNOLOGIES

Data Visualization

Once data have been processed, they can be presented to users as text, graphics, tables, and so on, via several data visualization technologies. A variety of methods and software packages are available to do visualization for supporting decision making (e.g., see I/S Analyzer, 2002 and Li, 2001).

Visual technologies make pictures worth a thousand numbers and make IT applications more attractive and understandable to users. Data visualization refers to presentation of data by technologies such as digital images, geographical information systems, graphical user interfaces, multidimensional tables and graphs, virtual reality, three-dimensional presentations, videos, and animation. Visualization is becoming more and more popular on the Web not only for entertainment, but also for decision support (see spss.com, microstrategy.com). Visualization software packages offer users capabilities for self-guided exploration and visual analysis of large amounts of data. By using visual analysis technologies, people may spot problems that have existed for years, undetected by standard analysis methods. Data visualization can be supported in a dynamic way (e.g., by video clips). It can also be done in real time (e.g., Bates, 2003). Visualization technologies can also be integrated among themselves to create a variety of presentations, as demonstrated by the IT At Work 11.1.

Data visualization is easier to implement when the necessary data are in a data warehouse. Our discussion here will focus mainly on the data visualization techniques of multidimensionality, geographical information systems, visual interactive modeling, and virtual reality. Related topics, such as multimedia (see informatica.com) and hypermedia, are presented in Technology Guide 2.

Multidimensionality Visualization

Modern data and information may have several dimensions. For example, management may be interested in examining sales figures in a certain city by product, by time period, by salesperson, and by store (i.e., in five dimensions). The common tool for such situations is OLAP, and it often includes a visual presentation. The more dimensions involved, the more difficult it is to present multidimensional information in one table or in one graph. Therefore, it is important to provide the user with a technology that allows him or her to add, replace, or change dimensions quickly and easily in a table and/or graphical presentation. Such changes are known as “slicing and dicing” of data. The technology of slicing, dicing, and similar manipulations is called OLAP multidimensionality, and it is available in most business intelligence packages (e.g., brio.com).

Figure 11.5 shows three views of the same data, organized in different ways, using multidimensional software, usually available with spreadsheets. Part a shows travel hours of a company’s employees by means of transportation and by country. The “next year” column gives projections automatically generated by an embedded formula. In part b the data are reorganized, and in part c they are reorganized again and manipulated as well. All this is easily done by the end user with one or two clicks of the mouse.

The major advantage of multidimensionality is that data can be organized the way managers like to see them rather than the way that the system analysts do. Furthermore, different presentations of the same data can be arranged and rearranged easily and quickly.

IT At Work 11.1 DATA VISUALIZATION HELPS HAWORTH TO COMPETE

Manufacturing office furniture is an extremely competitive business. Haworth Corporation (haworth.com) operates in this environment and has been able to survive and even excel with the help of IT. To compete, Haworth allows its customers to customize what they want to buy (see demo at haworth.com). It may surprise you to learn that an office chair can be assembled in 200 different ways. The customization of all Haworth’s products resulted in 21 million potential product combinations, confusing customers who were not able to visualize, until the item was delivered, how the customized furniture would look.

The solution was computer visualization software that allowed sales representatives with laptop computers to show customers exactly what they were ordering. Thus, the huge parts catalogs became more easily understood, and sales representatives were able to configure different options by entering the corporate database, showing what a product would look like, and computing its price. The customers can now make changes until the furniture design meets their needs. The salesperson can do all this from the customer’s office by connecting to the corporate intranet via the Internet and using Web tools to allow customers to make the desired changes. As of 2000 the company operates, on the Web, a design studio in which customers can interact with the designers.

These programs allow the company to reduce cycle time. After the last computer-assisted design (CAD) mockup of an order has been approved, the CAD software is used to create a bill of materials that goes to Haworth’s factory for manufacturing. This reduces the time spent between sales reps and CAD operators, increasing the time available for sales reps to make more sales calls and increasing customer satisfaction with quicker delivery. By using this visualization computer program, Haworth has increased its competitive advantage.

Sources: Compiled from Infoworld, 1997, p. 92 and from haworth.com (2003).

For Further Exploration: How can the intranet be used to improve the process? How does this topic relate to car customization at jaguar.com? How can this case be related to wireless?

FIGURE 11.5 Multidimensionality views. The same travel data (hours, by country and by means of transportation) are shown in three arrangements: (a) countries as rows, with This Year and Next Year columns under Planes, Trains, and Automobiles; (b) the data regrouped by means of transportation, with the countries listed under each; (c) the data regrouped again, with the software adding a Europe subtotal (France plus Germany) and a Total item. Next Year values come from the embedded formula Next Year = (This Year)*1.2.

Data for view (a), in hours (This Year / Next Year):

Country    Planes       Trains       Automobiles
Canada     740 / 888    140 / 168    640 / 768
Japan      430 / 516    290 / 348    150 / 180
France     320 / 384    460 / 552    210 / 252
Germany    425 / 510    430 / 516    325 / 390

In view (c), the Europe subtotals are 745 / 894 for planes, 890 / 1068 for trains, and 535 / 642 for automobiles.
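The rearrangements in Figure 11.5 amount to re-pivoting one set of cells. The following sketch uses the This Year hours from the figure and its embedded formula, Next Year = (This Year)*1.2; the function names and the dictionary layout are illustrative assumptions, not the spreadsheet software's own mechanism.

```python
# This Year travel hours from Figure 11.5, keyed by (country, means of transportation).
THIS_YEAR = {
    ("Canada", "Planes"): 740, ("Japan", "Planes"): 430,
    ("France", "Planes"): 320, ("Germany", "Planes"): 425,
    ("Canada", "Trains"): 140, ("Japan", "Trains"): 290,
    ("France", "Trains"): 460, ("Germany", "Trains"): 430,
    ("Canada", "Automobiles"): 640, ("Japan", "Automobiles"): 150,
    ("France", "Automobiles"): 210, ("Germany", "Automobiles"): 325,
}
EUROPE = {"France", "Germany"}


def next_year(hours):
    """Embedded formula from the figure: Next Year = (This Year) * 1.2."""
    return round(hours * 1.2)


def pivot(cells, row_axis):
    """Arrange the same cells with country (row_axis=0) or means of transportation (row_axis=1) as rows."""
    table = {}
    for key, hours in cells.items():
        row, col = key[row_axis], key[1 - row_axis]
        table.setdefault(row, {})[col] = (hours, next_year(hours))
    return table


def europe_subtotal(cells, mode):
    """Subtotal added in view (c): France plus Germany for one means of transportation."""
    this = sum(h for (country, m), h in cells.items() if m == mode and country in EUROPE)
    return this, next_year(this)


view_a = pivot(THIS_YEAR, row_axis=0)  # rows are countries
view_b = pivot(THIS_YEAR, row_axis=1)  # rows are means of transportation
print(view_a["Canada"]["Planes"], view_b["Trains"]["Japan"])
print(europe_subtotal(THIS_YEAR, "Planes"))
```

Because both views are derived from the same cells, switching between them changes only the presentation, which is the point of slicing and dicing.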


Three factors are considered in multidimensionality: dimensions, measures, and time.

1. Examples of dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, countries, industries

2. Examples of measures: money, sales volume, head count, inventory profit, actual versus forecasted results

3. Examples of time: daily, weekly, monthly, quarterly, yearly

For example, a manager may want to know the sales of product M in a certain geographical area, by a specific salesperson, during a specified month, in terms of units. Although the answer can be provided regardless of the database structure, it can be provided much faster, and by the user himself or herself, if the data are organized in multidimensional databases (or data marts), or if the query tools are designed for multidimensionality (e.g., via OLAP). In either case, users can navigate through the many dimensions and levels of data via tables or graphs and then conduct a quick analysis to find significant deviations or important trends.

Multidimensionality is available with different degrees of sophistication and is especially popular in business intelligence software (e.g., see Campbell, 2001). There are several types of software from which multidimensional systems can be constructed, and they often work in conjunction with OLAP tools.

Geographical Information Systems

A geographical information system (GIS) is a computer-based system for capturing, storing, checking, integrating, manipulating, and displaying data using digitized maps. Its most distinguishing characteristic is that every record or digital object has an identified geographical location. By integrating maps with spatially oriented databases and other databases (called geocoding), users can generate information for planning, problem solving, and decision making, increasing their productivity and the quality of their decisions.

GIS SOFTWARE. GIS software varies in its capabilities, from simple computerized mapping systems to enterprisewide tools for decision support data analysis (see Minicase 2). As a high-quality graphics display and high computation and search speeds are necessary, most early GIS implementations were developed for mainframes. Initially, the high cost of GISs prevented their use outside experimental facilities and government agencies. Since the 1990s, however, the cost of GIS software and its required hardware has dropped dramatically. Now relatively inexpensive, fully functional PC-based packages are readily available. Representative GIS software vendors are ESRI, Intergraph, and MapInfo.

GIS DATA. GIS data are available from a wide variety of sources. Government sources (via the Internet and CD-ROM) provide some data, while vendors provide diversified commercial data as well. Some are free (see CD-ROMs from MapInfo, and downloadable material from esri.com and gisdatadepot.com).

The field of GIS can be divided into two major categories: functions and applications. There are four major functions: design and planning, decision modeling, database management, and spatial imaging. These functions support six areas of applications as shown in Figure 11.6. Note that the functions (shown as pillars) can support all the applications. The applications they support the most are shown closest to each pillar.

GIS AND DECISION MAKING. GISs provide a large amount of extremely useful information that can be analyzed and utilized in decision making. Their graphical format makes it easy for managers to visualize the data. For example, as Janet M. Hamilton, market research administrator for Dow Elanco, a $2 billion maker of agricultural chemicals based in Indianapolis, Indiana, explains, “I can put 80-page spreadsheets with thousands of rows into a single map. It would take a couple of weeks to comprehend all of the information from the spreadsheet, but in a map, the story can be told in seconds” (Hamilton, 1996, p. 21).

There are countless applications of GISs to improve decision making in the public or private sector (see Nasirin and Birks, 2003). They include the dispatch of emergency vehicles, transit management (see Minicase 2), facility site selection, and wildlife management. GISs are extremely popular in local governments, where the tools are used not only for mapping but for many decision-making applications (see O’Looney, 2000). States, cities, and counties are using GIS applications related to property assessment, mapping, and flood control (e.g., Hardester, 2002). Banks have been using GIS for over a decade to support expansion and marketing decision making. (See Online File W11.8.)

For many companies, the intelligent organization of data within a GIS can provide a framework to support the process of decision making and of designing alternative strategies, especially when location decisions are involved (Church, 2002). Some examples of successful GIS applications are provided by Korte (2000) and Hamilton (1996). Other examples of successful GIS applications are summarized in A Closer Look 11.3.

GIS AND THE INTERNET OR INTRANET. Most major GIS software vendors are providing Web access, such as embedded browsers, or a Web/Internet/intranet server that hooks directly into their software. Thus, users can access dynamic maps and data via the Internet or a corporate intranet.

A number of firms are deploying GISs on the Internet for internal use or for use by their customers. For example, Visa Plus, which operates a network of automated teller machines, has developed a GIS application that lets Internet users call up a locator map for any of the company’s 300,000 ATMs worldwide. A common application on the Internet is a store locator. Not only do you get an address near you, but you also are told how to get there in the shortest way (e.g., try frys.com). As GIS Web server software is deployed by vendors, more applications will be developed. Maps, GIS data, and information about GISs are available over the Web through a number of vendors and public agencies. (For design issues see Sikder and Gangopadhyay, 2002.)

FIGURE 11.6 GIS functions and applications. [Figure labels. GIS applications: surveying and mapping; facilities management; demographics and market analysis; transportation and logistics; strategic planning and decision making; design and engineering. Function: design and planning; spatial imaging; decision modeling; database management.]

A CLOSER LOOK 11.3 GIS SAMPLE APPLICATIONS

COMPANY: APPLICATION OF GIS FOR DECISION SUPPORT

Pepsi Cola Inc., Super Value, Acordia Inc.: Use GIS in site selection for new Taco Bell and Pizza Hut restaurants; combine demographic data and traffic patterns in deciding on new facilities and on marketing strategy.

CIGNA (health insurance): Uses GIS to answer such questions as “How many CIGNA-affiliated physicians are available within an 8-mile radius of a business?”

Western Auto (a subsidiary of Sears): Integrates data with GIS to create a detailed demographic profile of a store’s neighborhood to determine the best product mix to offer at the store (Narisin and Birks, 2003).

Sears, Roebuck & Co.: Uses GIS to support planning of truck routes.

(Company not given): Uses GIS to support drought risk management (Goddard et al., 2003).

Health maintenance organizations: Track cancer rates and other diseases to determine expansion strategy and allocation of expensive equipment in their facilities.

Wood Personnel Services (employment agencies): Maps neighborhoods where temporary workers live, for locating marketing and recruiting cities.

(Company not given): Maps real estate for tax assessments, property appraisals, flood control, land surveys, and related applications (see Hardester, 2002).

Wilkening & Co. (consulting services): Designs optimal sales territories and routes for its clients, reducing travel costs by 15 percent.

CellularOne Corp.: Maps its entire cellular network to identify clusters of call disconnects and to dispatch technicians accordingly.

(Company not given): Makes decisions regarding location of facilities (Church, 2002).

Sun Microsystems: Manages leased property in dozens of places worldwide.

Consolidated Rail Corp.: Monitors the condition of 20,000 miles of railroad track and thousands of parcels of adjoining land.

Federal Emergency Management Agency: Assesses the damage of hurricanes, floods, and other natural disasters by relating videotapes of the damage to digitized maps of properties.

Toyota (other car manufacturers): Combines GIS and GPS as a navigation tool. Drivers are directed to destinations in the best possible way.
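The radius question in A Closer Look 11.3 (how many affiliated physicians are within 8 miles of a business?) and the store locator both reduce to a distance filter over geocoded points. The following is a minimal Python sketch under stated assumptions: the coordinates, the physician names, and the function names are hypothetical, and the distance is the straight-line (great-circle) distance. A production GIS would typically use a spatial index and road-network distances instead.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(center, points, radius_miles):
    """Return the names of points no farther than radius_miles from center."""
    lat0, lon0 = center
    return [name for name, (lat, lon) in points.items()
            if haversine_miles(lat0, lon0, lat, lon) <= radius_miles]


# Hypothetical geocoded data, for illustration only
business = (41.8781, -87.6298)
physicians = {
    "Dr. A": (41.8881, -87.6298),  # under 1 mile away
    "Dr. B": (41.9500, -87.6500),  # roughly 5 miles away
    "Dr. C": (42.3000, -87.9000),  # well outside the radius
}
nearby = within_radius(business, physicians, 8)
print(len(nearby), "physicians within 8 miles:", nearby)
```

The same filter, sorted by distance instead of thresholded, gives the "nearest store" behavior of a Web store locator.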

EMERGING GIS APPLICATIONS. The integration of GISs and global positioning systems (GPSs) has the potential to help restructure and redesign the aviation, transportation, and shipping industries. It enables vehicles or aircraft equipped with a GPS receiver to pinpoint their location as they move (Steede-Terry, 2000). Emerging applications of GPSs include personal automobile mapping systems, vehicle tracking (Terry and Kolb, 2003), and earth-moving equipment tracking. The price of these applications is dropping with improvements in hardware, increased demand, and the availability of more competing vendors. (A simple GPS cost less than $50 in 2003.) GPSs have also become a major source of new GIS data (see Group Assignment 1). Some researchers have developed intelligent GISs that link a GIS to an expert system or to intelligent agents (Tang et al., 2001).

L-Commerce. In Chapter 6 we introduced the concept of location-based commerce (l-commerce), a major part of mobile commerce (m-commerce). In l-commerce, advertising is targeted to an individual whose location is known (via a GPS and GIS combination). Similarly, emergency medical systems identify the location of a car accident in a second, and the attached GIS helps in directing ambulances to the scene. For other interesting applications, see Sadeh (2002).

CONCLUSIONS. Improvements in the GIS user interface have substantially altered the GIS “look” and “feel.” Advanced visualization (three-dimensional graphics) is increasingly integrated with GIS capabilities, especially in animated and interactive maps. GISs can provide information for virtual reality engines, and they can display complex information to decision makers. Multimedia and hypermedia play a growing role in GISs, especially in help and training systems. Object linking and embedding is allowing users to import maps into any document. More GISs will be deployed to provide data and access data over the Web and organizational intranets as “Web-ready” GIS software becomes more affordable. See Korte (2000) for an overview of GISs, their many capabilities, and potential advances.

Visual Interactive Models and Simulation

Visual interactive modeling (VIM) uses computer graphic displays to represent the impact of different management or operational decisions on goals such as profit or market share. VIM differs from regular simulation in that the user can intervene in the decision-making process and see the results of the intervention. A visual model is much more than a communication device; it is an integral part of decision making and problem solving.

A VIM can be used both for supporting decisions and for training. It can represent a static or a dynamic system. Static models display a visual image of the result of one decision alternative at a time. (With computer windows, several results can be compared on one screen.) Dynamic models use animation or video clips to show systems that evolve over time. These are also used in real-time simulations.

VIM has been used with DSSs in several operations management decision processes (see Beroggi, 2001). The method loads the current status of a plant (or a business process) into a virtual interactive model. The model is then run rapidly on a computer, allowing management to observe how a plant is likely to operate in the future.

One of the most developed areas in VIM is visual interactive simulation (VIS), a method in which the end user watches the progress of the simulation model in an animated form using graphics terminals. The user may interact with the simulation and try different decision strategies. (See Pritsker and O’Reilly, 1999.) VIS is an approach that has, at its core, the ability to allow decision makers to learn about their own subjective values and about their mistakes. Therefore, VIS can be used for training as well as in games, as in the case of flight simulators, and as shown in IT At Work 11.2.
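The what-if loop described above (load the current status of a plant, run the model forward quickly, then change a decision and run it again) can be illustrated without any graphics. The sketch below is a deliberately simple Python model, not a method taken from the cited sources: the single-stage plant, the parameter names, and all numbers are assumptions chosen for illustration. A real VIM tool would animate these results rather than print them.

```python
def simulate_backlog(initial_backlog, arrivals_per_hour, machines,
                     rate_per_machine, hours):
    """Return the order backlog at the end of each hour for a one-stage plant."""
    backlog = initial_backlog
    history = []
    for _ in range(hours):
        backlog += arrivals_per_hour            # new orders arrive
        capacity = machines * rate_per_machine  # units the plant can finish
        backlog -= min(backlog, capacity)       # cannot process more than exists
        history.append(backlog)
    return history


# Current status of the (hypothetical) plant: 40 orders waiting, 2 machines.
current = simulate_backlog(40, 30, machines=2, rate_per_machine=12, hours=4)

# Decision alternative the user wants to try: add a third machine.
alternative = simulate_backlog(40, 30, machines=3, rate_per_machine=12, hours=4)

print("2 machines:", current)
print("3 machines:", alternative)
```

Comparing the two histories side by side is the textual equivalent of the static VIM display mentioned above, where several decision alternatives are shown on one screen.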

IT At Work 11.2 COMPUTER TRAINING IN COMPLEX LOGGING MACHINES AT PARTEK FOREST

HRM

The forestry industry is extremely competitive, and countries with high labor costs must automate tasks such as moving, cutting, delimbing, and piling logs. A new machine, called the “Harvester,” can replace 25 lumberjacks.

The Harvester is a highly complex machine that takes six months to learn how to operate. The trainee destroys a sizeable amount of forest in the process, and terrain suitable to practice on is decreasing. In unskilled hands, this expensive machine can also be damaged. Therefore, extensive and expensive training is needed. Sisu Logging of Finland (a subsidiary of Partek Forest) found a solution to the training problem by using a real-time simulation (partekforest.com).

In this simulation, the chassis, suspension, and wheels of the vehicles have to be modeled, together with the forces acting on them (inertia and friction), and the movement equations linked with them have to be solved in real time. This type of simulation is mathematically complex, and until recently it required equipment investment running into millions of dollars. However, with the help of a visual simulation program, simulation training can now be carried out for only 1 percent of the cost of the traditional method.

Inside the simulator are the Harvester’s actual controls, which are used to control a virtual model of a Harvester plowing its way through a virtual forest. The machine sways back and forth on uneven terrain, and the grapple of the Harvester grips the trunk of a tree, fells it, delimbs it, and cuts it into pieces very realistically in real time.

The simulated picture is very sharp: even the structures of the bark and annual growth rings are clearly visible where the tree has been cut. In traditional simulators, the traveling path is quite limited beforehand, but in this Harvester simulator, you are free to move in a stretch of forest covering two hectares (about 24,000 square yards).

In addition, the system can be used to simulate different kinds of forest in different parts of the world, together with the different tree species and climatic conditions. An additional advantage of this simulator is that the operations can be videotaped so that training sessions can be studied afterward. Moreover, it is possible to practice certain dangerous situations that cannot be practiced on a real machine.

Source: Condensed from Finnish Business Report, April 1997.

For Further Exploration: Why is the simulated training time shorter? Why is visualization beneficial?

Animation systems that produce realistic graphics are available from many simulation software vendors (e.g., see SAS.com and vissim.com). The latest visual simulation technology is tied in with the concept of virtual reality, where an artificial world is created for a number of purposes, from training to entertainment to viewing data in an artificial landscape.

Virtual Reality

There is no standard definition of virtual reality. The most common definitions usually imply that virtual reality (VR) is interactive, computer-generated, three-dimensional graphics delivered to the user through a head-mounted display. Defined technically, virtual reality is an environment and/or technology that provides artificially generated sensory cues sufficient to engender in the user some willing suspension of disbelief. So in VR, a person “believes” that what he or she is doing is real even though it is artificially created.

More than one person and even a large group can share and interact in the same artificial environment. VR thus can be a powerful medium for communication, entertainment, and learning. Instead of looking at a flat computer screen, the VR user interacts with a three-dimensional computer-generated environment. To see and hear the environment, the user wears stereo goggles and a headset. To interact with the environment, control objects in it, or move around within it, the user wears a computerized display and hand position sensors (gloves). Virtual reality displays achieve the illusion of a surrounding medium by updating the display in real time. The user can grasp and move virtual objects.

VIRTUAL REALITY AND DECISION MAKING. Most VR applications to date have been used to support decision making indirectly. For example, Boeing has developed a virtual aircraft mockup to test designs. Several other VR applications for assisting in manufacturing and for converting military technology to civilian technology are being utilized at Boeing. At Volvo, VR is used to test virtual cars in virtual accidents; Volvo also uses VR in its new model-designing process. British Airways offers the pleasure of experiencing first-class flying to its Web site visitors. For a comprehensive discussion of virtual reality in manufacturing, see Banerjee and Zetu (2001).

Another VR application area is data visualization. VR helps financial decision makers make better sense of data by using visual, spatial, and aural immersion virtual systems. For example, some stock brokerages have a VR application in which users surf over a landscape of stock futures, with color, hue, and intensity indicating deviations from current share prices. Sound is used to convey other information, such as current trends or the debt/equity ratio. VR allows side-by-side comparisons with a large assortment of financial data. It is easier to make intuitive connections with three-dimensional support. Morgan Stanley & Co. uses VR to display the results of risk analyses.

VIRTUAL REALITY AND THE WEB. A platform-independent standard for VR called Virtual Reality Modeling Language (VRML, originally “virtual reality markup language”) (vrmlsite.com, and Kerlow, 2000) makes navigation through online supermarkets, museums, and stores as easy as interacting with textual information. VRML allows objects to be rendered as an Internet user “walks” through a virtual room. At the moment, users can utilize regular browsers, but VRML browsers will soon be in wide circulation.

Extensive use is expected in e-commerce marketing (see Dalgleish, 2000). For example, Tower Records offers a virtual music store on the Internet where customers can “meet” each other in front of the store, go inside, and preview CDs and videos. They select and purchase their choices electronically and interactively from a sales associate. Applications of virtual reality in other areas are shown in Table 11.5.

Virtual supermarkets could spark greater interest in home grocery shopping. In the future, shoppers will enter a virtual supermarket, walk through the virtual aisles, select virtual products, and put them in their virtual carts. This could help remove some of the resistance to virtual shopping. Virtual malls, which can be delivered even on a PC (synthonics.com), are designed to give the user a feeling of walking into a shopping mall.

Virtual reality is just beginning to move into many business applications. An interactive, three-dimensional world on the Internet should prove popular because it is a metaphor to which everyone can relate.


TABLE 11.5 Examples of Virtual Reality Applications

Applications in Manufacturing: Training; design testing and interpretation of results; safety analysis; virtual prototyping; engineering analysis; ergonomic analysis; virtual simulation of assembly, production, and maintenance.

Applications in Business: Real estate presentation and evaluation; advertising; presentation in e-commerce; presentation of financial data.

Applications in Medicine: Training surgeons (with simulators); interpretation of medical data; planning surgeries; physical therapy.

Applications in Research and Education: Virtual physics lab; representation of complex mathematics; galaxy configurations.

Amusement Applications: Virtual museums; three-dimensional race car games (on PCs); air combat simulation (on PCs); virtual reality arcades and parks; ski simulator.

Applications in Architecture: Design of buildings and other structures.

11.6 MARKETING DATABASES IN ACTION

Data warehouses and data marts serve end users in all functional areas. However, the most dramatic applications of data warehousing and mining are in marketing, as seen in the Harrah’s case, in what is referred to as marketing databases (also referred to as database marketing).

In this section we examine how data warehouses, their extensions, and data mining are used, and what role they play in new marketing strategies, such as the use of Web-based marketing transaction databases in interactive marketing.

The Marketing Transaction Database

Most current databases are static: They simply gather and store information about customers. They appear in the following categories: operations databases, data warehouses, and marketing databases. Success in marketing today requires a new kind of database, oriented toward targeting and personalizing marketing messages in real time. Such a database provides the most effective means of capturing information on customer preferences and needs. In turn, enterprises can use this knowledge to create new and/or personalized products and services. Such a database is called a marketing transaction database (MTD). The MTD combines many of the characteristics of the current databases and marketing data sources into a new database that allows marketers to engage in real-time personalization and target every interaction with customers.

MTD’S CAPABILITIES. The MTD provides dynamic, or interactive, functions not available with traditional types of marketing databases. In marketing terms, a transaction occurs with the exchange of information. With interactive media, each exposure to the customer becomes an opportunity to conduct a marketing “transaction.” Exchanging information (whether gathered actively through registration or user requests, or passively by monitoring customer behavior) allows marketers to refine their understanding of each customer continuously and to use that information to target him or her specifically with personalized marketing messages. This is done most frequently on the Web.

Comparing various characteristics of MTDs with other marketing-related databases shows the advantages of MTDs. For example, marketing databases focus on understanding customers’ behavior, but do not target the individual customer nor personalize the marketing approach, as do MTDs. Additionally, data in MTDs can be updated in real time, as opposed to the periodic (weekly, monthly, or quarterly) updates that are characteristic of data warehouses and marketing databases. Also, the data quality in an MTD is focused, and is verified by the individual customers. It thus is of much higher quality than data in many operations databases, where legacy systems may offer only poor assurance of data quality. Further, MTDs can combine various types of data (behavioral, descriptive, and derivative); other types of marketing databases may offer only one or two of these types. Note that MTDs do not eliminate the need for traditional databases. They complement them by providing additional capabilities (see Online Minicase 1, Dell Computers).
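To make the MTD idea concrete, the following Python sketch keeps a per-customer profile that is updated on every interaction and is used immediately to personalize the next message. It is an illustration under stated assumptions, not a description of any vendor's product: the class names, fields, and message wording are all hypothetical, and a real MTD would sit on a persistent database rather than an in-memory dictionary.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CustomerProfile:
    # Descriptive data: supplied (and so verified) by the customer, e.g., at registration
    descriptive: dict = field(default_factory=dict)
    # Behavioral data: gathered passively, here as product category -> number of views
    behavioral: Counter = field(default_factory=Counter)

    @property
    def top_interest(self):
        """Derived data: the most-viewed category, or None if nothing was viewed."""
        if not self.behavioral:
            return None
        return self.behavioral.most_common(1)[0][0]


class MarketingTransactionDB:
    """Hypothetical in-memory MTD: every interaction updates the profile at once."""

    def __init__(self):
        self._profiles = {}

    def record(self, customer_id, *, viewed=None, **registered):
        profile = self._profiles.setdefault(customer_id, CustomerProfile())
        profile.descriptive.update(registered)
        if viewed is not None:
            profile.behavioral[viewed] += 1
        return profile

    def message_for(self, customer_id):
        profile = self._profiles.get(customer_id)
        if profile is None or profile.top_interest is None:
            return "Welcome! Browse our catalog."
        name = profile.descriptive.get("name", "there")
        return f"Hi {name}, see today's offers on {profile.top_interest}."


mtd = MarketingTransactionDB()
mtd.record("c1", name="Ana")          # active: registration
mtd.record("c1", viewed="laptops")    # passive: monitored behavior
mtd.record("c1", viewed="printers")
mtd.record("c1", viewed="laptops")
print(mtd.message_for("c1"))
```

The contrast with a periodically refreshed data warehouse is that `message_for` reflects the customer's latest click, with no batch update in between.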

THE ROLE OF THE INTERNET. Data mining, data warehousing, and MTDs are delivered on the Internet and intranets. The Internet does not simply represent another advertising venue or a different medium for catalog sales. Rather, it contains new attributes that smart marketers can exploit to their fullest degree. Indeed, the Internet promises to revolutionize sales and marketing. Dell Computer (see Online Minicase 1) offers an example of how marketing professionals can use the Internet’s electronic sales and marketing channels for market research, advertising, information dissemination, product management, and product delivery. For an overview of marketing databases and the Web, see Grossnickle and Raskin (2000).

Implementation Examples

Fewer and fewer companies can afford traditional marketing approaches, which include big-picture strategies and expensive marketing campaigns. Marketing departments are being scaled down (and so are the traditional marketing approaches), and new approaches such as one-to-one marketing, speed marketing, interactive marketing, and relationship marketing are being employed (see Strauss et al., 2003). The following examples illustrate how companies use data mining and warehousing to support the new marketing approaches. For other examples, see Online File W11.9.

● Through its online registry for expectant parents, Burlington Coat Factory tracks families as they grow. The company then matches direct-mail material to the different stages of a family’s development over time. Burlington also identifies, on a daily basis, top-selling styles and brands. By digging into reams of demographic data, historical buying patterns, and sales trends in existing stores, Burlington determines where to open its next store and what to stock in each store.

● Au Bon Pain Company, Inc., a Boston-based chain of cafes, discovered that the company was not selling as much cream cheese as planned. When it analyzed point-of-sale data, the firm found that customers preferred small, one-serving packaging (like butter). As soon as the package size of the cream cheese was changed, sales shot up.

● Bank of America gets more than 100,000 telephone calls from customers every day. Analyzing customers’ banking activities, the bank determines what may be of interest to them. So when a customer calls to check on a balance, the bank tries to sell the customer something in which he or she might be interested.

● Supermarket chains regularly analyze reams of cash register data to discover what items customers are typically buying at the same time. These shopping patterns are used for issuing coupons, designing floor layouts and products’ location, and creating shelf displays.

● In its data warehouse, the Chicago Tribune stores information about customer behavior as customers move through the various newspaper Web sites. Data mining helps to analyze volumes of data ranging from what browsers are used to what hyperlinks are clicked on most frequently.
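The supermarket example above (discovering which items customers typically buy at the same time) is the core of market-basket analysis. A minimal Python sketch follows, assuming a handful of made-up baskets; it counts how often each pair of items appears together and reports the pair's support, that is, the fraction of baskets containing both items. Production tools use more scalable algorithms and additional measures, so this should be read as an illustration of the idea only.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Map each unordered item pair to the fraction of baskets containing both."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items()}


# Hypothetical cash register data: one list of items per checkout
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["bread", "butter", "cereal"],
]
support = pair_support(baskets)
best_pair = max(support, key=support.get)
print(best_pair, support[best_pair])
```

A pair with high support is a candidate for a joint coupon, adjacent shelf placement, or a combined display, which are the uses named in the example above.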

The data warehouses in some companies include several terabytes or more of data (e.g., at Sears, see Minicase 1). They need to use supercomputing to sift quickly through the data. Wal-Mart, the world’s largest discount retailer, has a gigantic database, as shown in IT At Work 11.3.

IT At Work 11.3 DATA MINING POWERS AT WAL-MART

MKT

With more than 50 terabytes of data (in 2003) on two NCR (National Cash Register) systems, Wal-Mart (walmart.com) manages one of the world’s largest data warehouses. Besides the two NCR Teradata databases, which handle most decision-support applications, Wal-Mart has another 6 terabytes of transaction processing data on IBM and Hitachi mainframes.

Wal-Mart’s formula for success (getting the right product on the appropriate shelf at the lowest price) owes much to the company’s multimillion-dollar investment in data warehousing. “Wal-Mart can be more detailed than most of its competitors on what’s going on by product, by store, by day, and act on it,” says Richard Winter, a database consultant in Boston. “That’s a tremendously powerful thing.”

The systems house data on point of sale, inventory, products in transit, market statistics, customer demographics, finance, product returns, and supplier performance. The data are used for three broad areas of information discovery and decision support: analyzing trends, managing inventory, and understanding customers. What emerges are “personality traits” for each of Wal-Mart’s 3,500 or so outlets, which Wal-Mart managers can use to determine product mix and inventory levels for each store.

Wal-Mart is using a data mining-based demand-forecasting application that employs neural networking software and runs on a 4,000-processor parallel computer. The application looks at individual items for individual stores to decide the seasonal sales profile of each item. The system keeps a year’s worth of data on the sales of 100,000 products and predicts which items will be needed in each store and when.

Wal-Mart is expanding its use of market-basket analysis. Data are collected on items that comprise a shopper’s total purchase so that the company can analyze relationships and patterns in customer purchases. The data warehouse is available over an extranet to store managers and suppliers. In 2003, 6,000 users made over 40,000 database queries each day.

“What Wal-Mart is doing is letting an army of people use the database to make tactical decisions,” says consultant Winter. “The cumulative impact is immense.”

Sources: This information is courtesy of NCR Corp. (2000) and walmart.com.

For Further Exploration: Since small retailers cannot afford data warehouses and data mining, will they be able to compete?

11.7 WEB-BASED DATA MANAGEMENT SYSTEMS

Data management and business intelligence activities, from data acquisition (e.g., Atzeni et al., 2002) through warehousing to mining, are often performed with Web tools, or are interrelated with Web technologies and e-business (see Liautaud, 2001). Users with browsers can log onto a system, make inquiries, and get reports in a real-time setting. This is done through intranets, and for outsiders via extranets (see remedy.com).

E-commerce software vendors are providing Web tools that connect the data warehouse with EC ordering and cataloging systems. Hitachi’s EC tool suite, Tradelink (at hitachi.com), combines EC activities such as catalog management, payment applications, mass customization, and order management with data warehouses and marts and ERP systems. Oracle (see Winter, 2001) and SAP offer similar products.

Data warehousing and decision support vendors are connecting their products with Web technologies and EC. Examples include Brio’s Brio One, Web Intelligence from Business Objects, and Cognos’s DataMerchant. Hyperion’s Appsource “wired for OLAP” product connects OLAP with Web tools. IBM’s Decision Edge makes OLAP capabilities available on the intranet from anywhere in the corporation using browsers, search engines, and other Web technologies. MicroStrategy offers DSS Agent and DSS Web for help in drilling down for detailed information, providing graphical views, and pushing information to users’ desktops. Oracle’s Financial Analyzer and Sales Analyzer, Hummingbird’s Bi/Web and Bi/Broker, and several of the products cited above bring interactive querying, reporting, and other OLAP tasks to many users (both company employees and business partners) via the Web. Also, for a comprehensive discussion of business intelligence on the Web, see the white paper at businessobjects.com.

The systems described in the previous sections of this chapter can be integrated on Web-based platforms, such as the one shown in Figure 11.7. The Web-based system is accessed via a portal, and it connects the following parts: the business intelligence (BI) services, the data warehouse and marts, the corporate applications, and the data infrastructure. A security system protects the corporate proprietary data. Let’s examine how all of these components work together via the corporate portal.

FIGURE 11.7 Web-based data management system. (Source: cognos.com. Platform for Enterprise Business Intelligence. © Cognos Inc. 2001.) [Figure labels: Portal; BI Services (data mining, ad hoc query, scorecarding, analysis, visualization, managed reporting); Metadata; BI Data Mart Creation; BI-Ready Data Infrastructure; E-Applications (e-business, ERP, CRM, SCM, legacy); Security.]

Enterprise BI Suites and Corporate Portals

Enterprise BI suites (EBISs) integrate query, reporting, OLAP, and other tools. They are scalable, and offered by many vendors (e.g., IBM, Oracle, Microsoft, Hyperion Solution, Sagent Technology, AlphaBlox, MicroStrategy, and Crystal Decisions). EBISs are usually offered via enterprise portals.

In Chapter 4 we introduced the concept of corporate portals as a Web-based gateway to data, information, and knowledge. As seen in Figure 11.8, the portal integrates data from many sources. It provides end users with a single Web-based point of personalized access to BI and other applications. Likewise, it provides IT with a single point of delivery and management of this content. Users are empowered to access, create, and share valuable information.

FIGURE 11.8 Sources of content for an enterprise information portal. (Source: Merrill Lynch, 1998.) [Figure labels. Structured data: ERP application; online transaction database; point-of-sale data; ETL and data quality; database; warehouse management; OLAP analysis; analytical application; query and reporting; data mining. Unstructured data: Web documents; e-mail; product plans; legal contracts; word processing documents; advertisements; audio files; product specifications; purchase orders; paper invoices; video files; research reports; graphics/graphic designs; content management repository; content management application. Output: enterprise information portals.]

Intelligent Data Warehouse Web-based Systems

The amount of data in the data warehouse can be very large. While the organization of data is done in a way that permits easy search, it still may be useful to have a search engine for specific applications. Liu (1998) describes how an intelligent agent can improve the operation of a data warehouse in the pulp and paper industry. This application supplements the monitoring and scanning of external strategic data. The intelligent agent application can serve both managers’ ad-hoc query/reporting information needs and the external data needs of a strategic management support system for forest companies in Finland.

Clickstream Data Warehouse

Large and ever-increasing amounts of B2C data about consumers, products, and so on can be collected. Such data come from several sources: internal data (e.g., sales data, payroll data), external data (e.g., government and industry reports), and clickstream data. Clickstream data are generated inside the Web environment when customers visit a Web site. They provide a trail of the users’ activities in the Web site, including user behavior and browsing patterns. By looking at clickstream data, an e-tailer can find out such things as which promotions are effective and which population segments are interested in specific products.

According to Inmon (2001), clickstream data can reveal information to answer questions such as the following: What goods has the customer looked at or purchased? What items did the customer buy in conjunction with other items? What ads and promotions were effective? Which were ineffective? Are certain products too hard to find? Are certain products too expensive? Is there a substitute product that the customer finds first?

The Web is an incredibly rich source of business intelligence, and many enterprises are scrambling to build data warehouses that capture the knowledge contained in the clickstream data from their Web sites. By analyzing the user behavior patterns contained in these clickstream data warehouses, savvy businesses can expand their markets, improve customer relationships, reduce costs, streamline operations, strengthen their Web sites, and hone their business strategies. One has two options: incorporate Web-based data into preexisting data warehouses, or build new clickstream data warehouses that are capable of showing both e-business activities and the non-Web aspects of the business in an integrated fashion (see Sweiger et al., 2002).
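One of the questions above, which ads and promotions were effective, can be answered directly from a clickstream log. The Python sketch below assumes a simplified, hypothetical event format (a session identifier, an action, a product, and an optional promotion tag on the click that brought the visitor in); real Web server logs would first need parsing and sessionizing. It reports, for each promotion, the fraction of sessions it started that ended in a purchase.

```python
from collections import defaultdict


def promo_conversion(events):
    """Return {promo: fraction of sessions arriving via that promo that purchased}."""
    session_promo = {}   # session id -> first promotion seen in that session
    purchased = set()    # session ids with at least one purchase
    for e in events:
        sid = e["session"]
        if e.get("promo"):
            session_promo.setdefault(sid, e["promo"])
        if e["action"] == "purchase":
            purchased.add(sid)

    totals = defaultdict(int)
    converted = defaultdict(int)
    for sid, promo in session_promo.items():
        totals[promo] += 1
        if sid in purchased:
            converted[promo] += 1
    return {promo: converted[promo] / totals[promo] for promo in totals}


# Hypothetical clickstream events, for illustration only
events = [
    {"session": "s1", "action": "view", "product": "tent", "promo": "email"},
    {"session": "s1", "action": "purchase", "product": "tent"},
    {"session": "s2", "action": "view", "product": "tent", "promo": "banner"},
    {"session": "s3", "action": "view", "product": "stove", "promo": "email"},
    {"session": "s4", "action": "view", "product": "stove", "promo": "banner"},
]
rates = promo_conversion(events)
print(rates)
```

Grouping the same events by product instead of by promotion would address the other questions in the list, such as which goods were looked at but not bought.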

➥ MANAGERIAL ISSUES

1. Cost-benefit issues and justification. Some of the data management solutions discussed in this chapter are very expensive and are justifiable only in large corporations. Smaller organizations can make the solutions cost effective if they leverage existing databases rather than create new ones. A careful cost-benefit analysis must be undertaken before any commitment to the new technologies is made.

2. Where to store data physically. Should data be distributed close to their users? This could potentially speed up data entry and updating, but adds replication and security risks. Or should data be centralized for easier control, security, and disaster recovery? This has communications and single point of failure risks.

3. Legal issues. Data mining may suggest that a company send catalogs or promotions to only one age group or one gender. A man sued Victoria’s Secret Corp. because his female neighbor received a mail order catalog with deeply discounted items and he received only the regular catalog (the discount was actually given for volume purchasing). Settling discrimination charges can be very expensive.

4. Internal or external? Should a firm invest in internally collecting, storing, maintaining, and purging its own databases of information? Or should it subscribe to external databases, where providers are responsible for all data management and data access?

5. Disaster recovery. Can an organization’s business processes, which have become dependent on databases, recover and sustain operations after a natural or other type of information system disaster? (See Chapter 15.) How can a data warehouse be protected? At what cost?

6. Data security and ethics. Are the company’s competitive data safe from external snooping or sabotage? Are confidential data, such as personnel details, safe from improper or illegal access and alteration? A related question is, Who owns such personal data? (See Smith, 1997.)

7. Ethics: Paying for use of data. Compilers of public-domain information, such as Lexis-Nexis, face a problem of people lifting large sections of their work without first paying royalties. The Collection of Information Antipiracy Act (Bill HR 2652 in the U.S. Congress) will provide greater protection from online piracy. This, and other intellectual property issues, are being debated in Congress and adjudicated in the courts.

8. Privacy. Collecting data in a warehouse and conducting data mining may result in the invasion of individual privacy. What will companies do to protect individuals? What can individuals do to protect their privacy? (See Chapter 16.)

9. The legacy data problem. One very real issue, often known as the legacy data acquisition problem, is what to do with the mass of information already stored in a variety of systems and formats. Data in older, perhaps obsolete, databases still need to be available to newer database management systems. Many of the legacy application programs used to access the older data simply cannot be converted into new computing environments without considerable expense. Basically, there are three approaches to solving this problem. One is to create a database front end that can act as a translator from the old system to the new. The second is to integrate applications with the new system, so that data can be seamlessly accessed in the original format. The third is to migrate the data into the new system by reformatting it. A promising new approach is the use of Web Services (see Chapter 14).

10. Data delivery. Moving data efficiently around an enterprise is often a major problem. The inability to communicate effectively and efficiently among different groups in different geographical locations is a serious roadblock to implementing distributed applications properly, especially given the many remote sites and mobility of today’s workers. Mobile and wireless computing (Chapter 6) are addressing some of these difficulties.
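The third approach to the legacy data problem (issue 9 above), migrating data by reformatting it, can be pictured with a small sketch. Everything specific here is hypothetical: the fixed-width layout, the field names, and the sample record are invented for illustration, and a real migration would also need validation, error handling, and character-set conversion.

```python
# Hypothetical fixed-width layout of a legacy record: (field name, start, end).
LEGACY_LAYOUT = [
    ("customer_id", 0, 6),
    ("name", 6, 26),
    ("balance_cents", 26, 35),
]


def migrate_record(line):
    """Reformat one fixed-width legacy line into a dictionary for the new system."""
    record = {
        name: line[start:end].strip() for name, start, end in LEGACY_LAYOUT
    }
    # Convert text fields to the types the new system is assumed to expect.
    record["customer_id"] = int(record["customer_id"])
    record["balance"] = int(record.pop("balance_cents")) / 100
    return record


# Hypothetical legacy line: 6-digit ID, 20-character name, 9-digit balance in cents.
legacy_line = "000042" + "Jane Doe".ljust(20) + "000012550"
migrated = migrate_record(legacy_line)
print(migrated)
```

A front-end translator (the first approach) would apply the same kind of mapping at query time instead of once during migration.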


KEY TERMS

Analytical processing
Business intelligence (BI)
Clickstream data
Data integrity
Data mart
Data mining
Data quality (DQ)
Data visualization
Data warehouse
Document management
Document management system (DMS)
Geographical information system (GIS)
Knowledge discovery
Knowledge discovery in databases (KDD)
Marketing database
Marketing transaction database (MTD)
Metadata
Multidimensional database
Multidimensionality
Multimedia database
Object-oriented database
Online analytical processing (OLAP)
Operational data store
Text mining
Virtual reality (VR)
Virtual reality markup language (VRML)
Visual interactive modeling (VIM)
Visual interactive simulation (VIS)

CHAPTER HIGHLIGHTS (Numbers Refer to Learning Objectives)

● Data are the foundation of any information system and need to be managed throughout their useful life cycle, which converts data into useful information, knowledge, and a basis for decision support.

● Data exist in internal and external sources. Personal data and knowledge are often stored in people’s minds.

● The Internet is a major source of data and knowledge. Other sources are databases, paper documents, videos, maps, pictures, and more.

● Many factors that impact the quality of data must be recognized and controlled.

● Today data and documents are managed electronically. They are digitized, stored, and used in electronic management systems.

● Electronic document management, the automated control of documents, is a key to greater efficiency in handling documents in order to gain an edge on the competition.

● Multidimensional presentation enables quick and easy multiple viewing of information in accordance with people’s needs.

● Data warehouses and data marts are necessary to support effective information discovery and decision making. Relevant data are indexed and organized for easy access by end users.

● Online analytical processing is a data discovery method that uses analytical approaches.

● Business intelligence is an umbrella name for a large number of methods and tools used to conduct data analysis.

● Data mining for knowledge discovery is an attempt to use intelligent systems to scan volumes of data to locate necessary information and knowledge.

● Visualization is important for better understanding of data relationships and compression of information. Several computer-based methods exist.

● A geographical information system captures, stores, manipulates, and displays data using digitized maps.

● Virtual reality is 3-D, interactive, computer-generated graphics that provides users with a feeling that they are inside a certain environment.

● Marketing databases provide the technological support for new marketing approaches such as interactive marketing.

● Marketing transaction databases provide dynamic interactive functions that facilitate customized advertisement and services to customers.

● Web-based systems are used extensively in supporting data access and data analysis. Also, Web-based systems are an important source of data. Finally, data visualization is frequently combined with Web systems.

ON THE WEB SITE… Additional resources, including quizzes; online files of additional text, tables, figures, and cases; and frequently updated Web links to current articles and information can be found on the book’s Web site (wiley.com/college/turban).


QUESTIONS FOR REVIEW

1. List the major sources of data.

2. List some of the major data problems.

3. What is a terabyte? (Write the number.)

4. Review the steps of the data life cycle and explain them.

5. List some of the categories of data available on the Internet.

6. Define data quality.

7. Define document management.

8. Describe a data warehouse.

9. Describe a data mart.

10. Define business intelligence.

11. Define online analytical processing (OLAP).

12. Define data mining and describe its major characteristics.

13. Define data visualization.

14. Explain the properties of multidimensionality and its visualization.

15. Describe GIS and its major capabilities.

16. Define visual interactive modeling and simulation.

17. Define a marketing transaction database.

18. Define virtual reality.

QUESTIONS FOR DISCUSSION

1. Compare data quality to data integrity. How are they related?

2. Discuss the relationship between OLAP and multidimensionality.

3. Discuss business intelligence and distinguish between decision support and information and knowledge discovery.

4. Discuss the factors that make document management so valuable. What capabilities are particularly valuable?

5. Relate document management to imaging systems.

6. Describe the process of information and knowledge discovery, and discuss the roles of the data warehouse, data mining, and OLAP in this process.

7. Discuss the major drivers and benefits of data warehousing to end users.

8. Discuss how a data warehouse can lessen the stovepipe problem. (See Chapter 9.)

9. A data mart can substitute for a data warehouse or supplement it. Compare and discuss these options.

10. Why is the combination of GIS and GPS becoming so popular? Examine some applications.

11. Discuss the advantages of terabyte marketing databases to a large corporation. Does a small company need a marketing database? Under what circumstances will it make sense to have one?

12. Discuss the benefits managers can derive from visual interactive simulation in a manufacturing company.

13. Why might the mass-marketing approach not be effective? What is the logic of targeted marketing?

14. Distinguish between operational databases, data warehouses, and marketing data marts.

15. Relate the Sears minicase to the phases of the data life cycle.

16. Discuss the potential contribution of virtual reality to e-commerce.

17. Discuss the interaction between marketing and management theories and IT support in the Harrah’s case.

EXERCISES

1. Review the list of data management difficulties in Section 11.1. Explain how a combination of data warehousing and data mining can solve or reduce these difficulties. Be specific.

2. Interview a knowledge worker in a company you work for or to which you have access. Find the data problems they have encountered and the measures they have taken to solve them. Relate the problems to Strong’s four categories.

3. Ocean Spray Cranberries is a large cooperative of fruit growers and processors. Ocean Spray needed data to determine the effectiveness of its promotions and its advertisements and to make itself able to respond strategically to its competitors’ promotions. The company also wanted to identify trends in consumer preferences for new products and to pinpoint marketing factors that might be causing changes in the selling levels of certain brands and markets. Ocean Spray buys marketing data from InfoScan (infores.com), a company that collects data using bar code scanners in a sample of 2,500 stores nationwide, and from A. C. Nielsen. The data for each product include sales volume, market share, distribution, price information, and information about promotions (sales, advertisements).

The amount of data provided to Ocean Spray on a daily basis is overwhelming (about 100 to 1,000 times more data items than Ocean Spray used to collect on its own). All the data are deposited in the corporate marketing data mart. To analyze this vast amount of data, the company developed a DSS. To give end users easy access to the data, the company uses an expert system–based data-mining process called CoverStory, which summarizes information in accordance with user preferences. CoverStory interprets data processed by the DSS, identifies trends, discovers cause-and-effect relationships, presents hundreds of displays, and provides any information required by the decision makers. This system alerts managers to key problems and opportunities.

a. Find information about Ocean Spray by entering Ocean Spray’s Web site (oceanspray.com).

b. Ocean Spray has said that it cannot run the business without the system. Why?

c. What data from the data mart are used by the DSS?

d. Enter infores.com or scanmar.nl and review the marketing decision support information. How is the company related to a data warehouse?

e. How does InfoScan collect data? (Check the Data Wrench product.)

GROUP ASSIGNMENTS

1. Several applications now combine GIS and GPS.

a. Survey such applications by conducting literature and Internet searches and by querying GIS vendors.

b. Prepare a list of five applications, including at least two in e-commerce (see Chapter 6).

c. Describe the benefit of such integration.

2. Prepare a report on the topic of “data management and the intranet.” Specifically, pay attention to the role of the data warehouse, the use of browsers for query, and data mining. Also explore the issue of GIS and the Internet. Finally, describe the role of extranets in support of business partner collaboration. Each student will visit one or two vendors’ sites, read the white papers, and examine products (Oracle, Red Bricks, Brio, Siemens Mixdorf IS, NCR, SAS, and Information Advantage). Also, visit the Web site of the Data Warehouse Institute (dw-institute.org).

3. Companies invest billions of dollars to support database marketing. The information systems departments’ (ISD) activities that have supported accounting and finance in the past are shifting to marketing. According to Tucker (1997), some people think that the ISD should report to marketing. Do you agree or disagree? Debate this issue.

4. In 1996, Lexis-Nexis, the online information service, was accused of permitting access to sensitive information on individuals. Using data mining, it is possible not only to capture information that has been buried in distant courthouses, but also to manipulate and cross-index it. This can benefit law enforcement but invade privacy. The company argued that the firm was targeted unfairly, since it provided only basic residential data for lawyers and law enforcement personnel. Should Lexis-Nexis be prohibited from allowing access to such information or not? Debate the issue.

INTERNET EXERCISES

1. Conduct a survey on document management tools and applications by visiting dataware.com, documentum.com, and aiim.org/aim/publications.

2. Access the Web sites of one or two of the major data management vendors, such as Oracle, Informix, and Sybase, and trace the capabilities of their latest products, including Web connections.

3. Access the Web sites of one or two of the major data warehouse vendors, such as NCR or SAS; find how their products are related to the Web.

4. Access the Web site of the GartnerGroup (gartnergroup.com). Examine some of their research notes pertaining to marketing databases, data warehousing, and data management. Prepare a report regarding the state of the art.

5. Explore a Web site for multimedia database applications. Visit such sites as leisureplan.com, illustra.com, and adb.fr. Review some of the demonstrations, and prepare a concluding report.

6. Enter microsoft.com/solutions/BI/customer/biwithinreach_demo.asp and see how BI is supported by Microsoft’s tools. Write a report.

7. Enter teradatauniversitynetwork.com. Prepare a summary on resources available there. Is it valuable to a student?

8. Enter visualmining.com and review the support they provide to business intelligence. Prepare a report.

9. Survey some GIS resources such as geo.ed.ac.uk/home/hiswww.html and prenhall.com/stratgis/sites.html. Identify GIS resources related to your industry, and prepare a report on some recent developments or applications. See http://nsdi.usgs.gov/nsdi/pages/what_is_gis.html.

10. Visit the sites of some GIS vendors (such as mapinfo.com, esri.com, autodesk.com, or bently.com). Join a newsgroup and discuss new applications in marketing, banking, and transportation. Download a demo. What are some of the most important capabilities and new applications?

11. Enter websurvey.com, clearlearning.com, and tucows.com/webforms, and prepare a report about data collection via the Web.

12. Enter infoscan.com. Find all the services related to dynamic warehouse and explain what it does.

13. Enter ibm.com/software and find their data mining products, such as DB2 Intelligent Miner. Prepare a list of products and their capabilities.

14. Enter megapuker.com. Read “Data Mining 101,” “Text Analyst,” and “WebAnalyst.” Compare the two products. (Look at case studies.)


Minicase 1: Precision Buying, Merchandising, and Marketing at Sears

The Problem. Sears, Roebuck and Company, the largest department store chain and the third-largest retailer in the United States, was caught by surprise in the 1980s as shoppers defected to specialty stores and discount mass merchandisers, causing the firm to lose market share rapidly. In an attempt to change the situation, Sears used several response strategies, ranging from introducing its own specialty stores (such as Sears Hardware) to restructuring its mall-based stores. Recently, Sears has moved to selling on the Web. It discontinued its paper catalog. Accomplishing the transformation and restructuring required the retooling of its information systems.

Sears had 18 data centers, one in each of 10 geographical regions as well as one each for marketing, finance, and other departments. The first problem was created when the reorganization effort produced only seven geographical regions. Frequent mismatches between accounting and sales figures and information scattered among numerous databases forced users to query multiple systems, even when they needed an answer to a simple query. Furthermore, users found that data that were already summarized made it difficult to conduct analysis at the desired level of detail. Finally, errors were virtually inevitable when calculations were based on data from several sources.

The Solution. To solve these problems Sears constructed a single sales information data warehouse. This replaced the 18 old databases, which were packed with redundant, conflicting, and sometimes obsolete data. The new data warehouse is a simple repository of relevant decision-making data such as authoritative data for key performance indicators, sales inventories, and profit margins. Sears, known for embracing IT on a dramatic scale, completed the data warehouse and its IT reengineering efforts in under one year, a perfect IT turnaround story.

Using an NCR enterprise server, the initial 1.7 terabyte (1.7 trillion bytes) data warehouse is part of a project dubbed the Strategic Performance Reporting System (SPRS). By 2003, the data warehouse had grown to over 70 terabytes. SPRS includes comprehensive sales data; information on inventory in stores, in transit, and at distribution centers; and cost per item. This has enabled Sears to track sales by individual items (SKUs) in each of its 1,950 stores (including 810 mall-based stores) in the United States and 1,600 international stores and catalog outlets. Thus, daily margin by item per store can be easily computed, for example. Furthermore, Sears now fine-tunes its buying, merchandising, and marketing strategies with previously unattainable precision.

SPRS is open to all authorized employees, who now can view each day’s sales from a multidimensional perspective (by region, district, store, product line, and individual item). Users can specify any starting and ending dates for special sales reports, and all data can be accessed via a highly user-friendly graphical interface. Sears managers can now monitor the precise impact of advertising, weather, and other factors on sales of specific items. This means that buyers and other specialists can examine and, if needed, adjust inventory quantities, merchandising, and order placement, along with myriad other variables, almost immediately, so they can respond quickly to environmental changes. SPRS users can also group together widely divergent kinds of products, for example, tracking sales of items marked as “gifts under $25.” Advertising staffers can follow so-called “great items,” drawn from vastly different departments, that are splashed on the covers of promotional circulars. SPRS enables extensive data mining, but only for SKU- and location-related analysis.
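The minicase says that daily margin by item per store can be easily computed and that sales can be viewed from a multidimensional perspective. The sketch below illustrates both ideas on a tiny in-memory data set. It is not Sears’ SPRS logic: the field names, the sample rows, and the use of integer cents are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows; prices and costs are in cents to avoid rounding issues.
sales = [
    {"date": "2003-10-09", "store": "S001", "sku": "paint-01", "qty": 3, "price": 1500, "cost": 1000},
    {"date": "2003-10-09", "store": "S001", "sku": "drill-07", "qty": 1, "price": 8000, "cost": 6500},
    {"date": "2003-10-09", "store": "S002", "sku": "paint-01", "qty": 2, "price": 1500, "cost": 1000},
]


def daily_margin(rows):
    """Sum (price - cost) * quantity for each (date, store, sku) combination."""
    margin = defaultdict(int)
    for row in rows:
        key = (row["date"], row["store"], row["sku"])
        margin[key] += (row["price"] - row["cost"]) * row["qty"]
    return dict(margin)


def roll_up(margin, dimension):
    """Aggregate the detailed margins along one dimension: date, store, or sku."""
    index = {"date": 0, "store": 1, "sku": 2}[dimension]
    totals = defaultdict(int)
    for key, value in margin.items():
        totals[key[index]] += value
    return dict(totals)


margin = daily_margin(sales)
by_store = roll_up(margin, "store")
by_sku = roll_up(margin, "sku")
print(by_store)
print(by_sku)
```

Keeping the detailed (date, store, SKU) figures and summarizing on demand reflects the point made in the case that pre-summarized data made analysis at the desired level of detail difficult.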

In 1998 Sears created a large customer database, dubbed LCI (Leveraging Customer Information), which contained customer-related sales information (which was not available on SPRS). The LCI enables hourly records of transactions, for example, guiding hourly promotions (such as 15% discounts for early birds).

In the holiday season of 2001 Sears decided to replace its regular 10% discount promotion by offering deep discounts during early shopping hours. This new promotion, which was based on SPRS, failed, and the problem was corrected only when LCI was used. This motivated Sears to combine LCI and SPRS in a single platform (in 2002), which enables sophisticated analysis.

By 2001 Sears also had the following Web initiatives: an e-commerce home improvement center, a B2B supply exchange for the retail industry, a toy catalog (wishbook.com), an e-procurement system, and much more. All of these Web-marketing initiatives feed data into the data warehouse, and their planning and control are based on accessing the data in the data warehouse.

The Results. The ability to monitor sales by item per store enables Sears to create a sharp local market focus. For example, Sears keeps different shades of paint colors in different cities to meet local demands. Therefore, sales and market share have improved. Also, Web-based data monitoring of sales at LCI helps Sears to plan marketing and Web advertising.

At its inception, the data warehouse was used daily by over 3,000 buyers, replenishers, marketers, strategic planners, logistics and finance analysts, and store managers. By 2003, there were over 5,000 users, since users found the system very beneficial. Response time to queries has dropped from days to minutes for typical requests. Overall, the strategic impact of the SPRS-LCI data warehouse is that it offers Sears employees a tool for making better decisions, and Sears retailing profits have climbed more than 20 percent annually since SPRS was implemented.

Sources: Compiled from Amato-McCoy (2002), Beitler and Leary (1997), and press releases of Sears (2001–2003).

Questions for Minicase 1

1. What were the drivers of SPRS?

2. How did the data warehouse solve Sears’s problems?

3. Why was it beneficial to integrate the customers’ database with SPRS?


Public transportation in Dallas and its neighboring communities is provided by Dallas Area Rapid Transit (DART), which operates buses, vans, and a train system. The service area has grown very fast. By the mid-1980s, the agency was no longer able to respond properly to customer requests, make rapid changes in scheduling, plan properly, or manage security.

The solution to these problems was discovered using GISs. A GIS digitizes maps and maplike information, integrates it with other database information, and uses the combined information for planning, problem solving, and decision making. DART (dart.org) maintains a centralized graphical database of every object for which it is responsible.

The GIS presentation makes it possible for DART’s managers, consultants, and customers to view and analyze data on digitized maps. Previously, DART created service maps on paper showing bus routes and schedules. The maps were updated and redistributed several times a year, at a high cost. Working with paper maps made it difficult to respond quickly and accurately to the nearly 6,000 customer inquiries each day. For example, to answer a question concerning one of the more than 200 bus routes or a specific schedule, it was often necessary to look at several maps and routes. Planning a change was also a time-consuming task. Analysis of the viability of bus route alternatives made it necessary to photocopy maps from map books, overlay tape to show proposed routes, and spend considerable time gathering information on the demographics of the corridors surrounding the proposed routes.

The GIS includes attractive and accurate maps that interface with a database containing information about bus schedules, routes, bus stops (in excess of 15,000), traffic surveys, demographics, and addresses on each street in the database. The system allows DART employees to:

● Respond rapidly to customer inquiries (reducing response time by at least 33 percent).

● Perform the environmental impact studies required by the city.

● Track where the buses are at any time using a global positioning system.

● Improve security on buses.

● Monitor subcontractors quickly and accurately.

● Analyze the productivity and use of existing routes.

For instance, a customer wants to know the closest bus stop and the schedule of a certain bus to take her to a certain destination. The GIS automatically generates the answer when the caller says where she is by giving an address, a name of an intersection, or a landmark. The computer can calculate the travel time to the desired destination as well.
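The closest-stop lookup described above can be sketched in a few lines. This is an illustration under stated assumptions, not DART’s system: the stop names and coordinates are invented, the caller’s address is assumed to have already been converted to latitude and longitude, and straight-line (great-circle) distance computed with the haversine formula stands in for the street-network distance a real GIS would use.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def closest_stop(caller, stops):
    """Return the stop nearest to the caller's (lat, lon) position."""
    return min(
        stops,
        key=lambda stop: haversine_km(caller[0], caller[1], stop["lat"], stop["lon"]),
    )


# Hypothetical bus stops and caller location, for illustration only.
stops = [
    {"name": "Stop A", "lat": 32.7767, "lon": -96.7970},
    {"name": "Stop B", "lat": 32.8000, "lon": -96.8200},
    {"name": "Stop C", "lat": 32.7500, "lon": -96.7500},
]
caller = (32.7780, -96.7990)
nearest = closest_stop(caller, stops)
print(nearest["name"])
```

With more than 15,000 stops, a production system would typically use a spatial index rather than scanning every stop, but the linear scan keeps the idea visible.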

Analyses that previously took days to complete are now executed in less than an hour. Special maps, which previously took up to a week to produce at a cost of $13,000 to $15,000 each, are produced in 5 minutes at the cost of 3 feet of plotter paper.

In the late 1990s, the GIS was combined with a GPS. The GPS tracks the location of the buses and computes the expected arrival time at each bus stop. Many maps are on display at the Web site, including transportation lines and stops superimposed on maps.

Sources: Condensed from GIS World, July 1993; updated with information compiled from dart.org (2003).

Questions for Minicase 2

1. Describe the role of data in the DART system.

2. What are the advantages of computerized maps?

3. Comment on the following statement: “Using GIS, users can improve not only the inputting of data but also their use.”

4. Speculate on the type of information provided by the GPS.

Minicase 2: GIS at Dallas Area Rapid Transit


Virtual Company Assignment: Data Management

You were intrigued by the opening story of Harrah’s use of CRM data to improve the customer experience, and it seems to you that there are many parallels between satisfying Harrah’s customers and The Wireless Café’s customers (aside from the slot machines, of course). You have noticed that Jeremy likes to review The Wireless Café’s financial and customer data using Excel charts and graphs, so you think he might be interested in discussing ways to exploit existing The Wireless Café data for improving the customer experience in the diner.

1. As you’ve grown quite fond of Marie-Christine’s cooking, you would like to help her develop even more popular menu hits. The idea of using business intelligence and data mining to better predict customer ordering trends and behaviors strikes you as something that would help. For example, do customers buy more desserts on Friday and Saturday than during the week? Do some research on business intelligence software and prepare a brief report for Jeremy and Marie-Christine to describe the kinds of business intelligence that could be used to prepare smash-hit menus.

2. The Wireless Café does not have a significant marketing budget; however, some inexpensive Web-based marketing strategies can be effective. Earlier, Barbara expressed interest in creating a CRM system to better know and serve The Wireless Café’s customers. Describe a strategy whereby you could combine CRM data and Web-based marketing to increase the business at The Wireless Café.

3. Business intelligence, data warehousing, data mining, and visualization are activities that could benefit The Wireless Café’s bottom line. Many CIOs these days expect an IT project to pay for itself within 12 months or less. How would you measure the benefits of introducing new data management software to determine which competing projects to implement and whether you could expect a 12-month payback on the software?

REFERENCES

Adjeroh, D. A., and K. C. Nwosu, “Multimedia Database Management: Requirements and Issues,” IEEE Multimedia, July–September 1997.

Alter, S. L., Decision Support Systems, Reading, MA: Addison Wesley, 1980.

Amato-McCoy, D. M., “Sears Combines Retail Reporting And Customer Databases on a Single Platform,” Stores, November 2002.

Amato-McCoy, D. M., “Movie Gallery Mines Data to Monitor Associate Activities,” Stores, May 2003a.

Amato-McCoy, D. M., “AAFES Combats Fraud with Exception Reporting Solution,” Stores, May 2003b.

Apte, C., et al., “Business Application of Data Mining,” Communications of the ACM, August 2002.

Asprev, L., and M. Middleton (eds.), Integrative Document and Content Management, Hershey, PA: The Idea Group, 2003.

Atzeni, P., et al., “Managing Web-Based Data,” IEEE Internet Computing, July–August 2002.

Banerjee, P., and D. Zetu, Virtual Manufacturing: Virtual Reality and Computer Vision Techniques. New York: Wiley, 2001.

Bates, J., “Business in Real Time–Realizing the Vision,” DM Review, May 2003.

Baumer, D., “Innovative Web Use to Learn about Consumer Behavior and Online Privacy,” Communications of the ACM, April 2003.

Becker, S. A. (ed.), Effective Database for Text and Document Management, Hershey, PA: IRM Press, 2003.

Beitler, S. S., and R. Leary, “Sears’ Epic Transformation: Converting from Mainframe Legacy Systems to OLAP,” Journal of Data Warehousing, April 1997.

Beroggi, G. E., “Visual Interactive Decision Modeling in Policy Management,” European Journal of Operational Research, January 2001.

Berry, M., Survey of Text Mining: Clustering, Classification and Retrieval. Berlin: Springer–Verlag, 2002.

Best’s Review Magazine, May 1993.

BIXL, Business Intelligence for Excel, a white paper, Business Intelligence Technologies, Inc. 2002 (BIXL.com).

Brauer, J. R., “Data Quality Is the Cornerstone of Effective Business Intelligence,” DM Review, October 5, 2001.

Brown, D. E., and S. Hagen, “Data Association Methods with Applications to Law Enforcement,” Decision Support Systems, March 2003.

Campbell, D., “Visualization for the Real World,” DM Review, September 7, 2001.

Canada NewsWire, “European Court of Human Rights Saves Time and Money for a News Wire,” April 29, 2003 NAICS#922110.

Carbone, P. L., “Data Warehousing: Many of the Common Failures,” Presentation, mitre.org/support/papers/tech…9_00/d-warehoulse_presentation.htm (May 3, 1999).

Church, R. L., “Geographical Information Systems and Location Science,” Computers and Operations Research, May 2002.

Chopoorian, J. A., et al., “Mind Your Business by Mining Your Data,” SAM Advanced Management Journal, Spring 2001.

Civic.com/pubs (accessed March 2001).

Codd, E. F., et al., “Beyond Decision Support,” Computerworld, July1993.

Cognos.com. “Platform for Enterprise Business Intelligence,” Cog-nos Inc., 2001.

Cohen, J. B., “‘Virtual Newsstand’ Debuts Online,” Editor and Pub-lisher, Vol. 129, No. 24, June 15, 1996, pp. 86–87.

Cole, B., “Document Management on a Budget,” Network World,Vol. 13, No. 8, September 16, 1996.

Creese, G., and A. Veytsel, “Data Quality at a Real-Time Tempo,”InSight, An Aberdeen Group publication, January 9, 2003,aberdeen.com/2001/research/01030003.asp.

Dalgleish, J., Customer-Effective Web Sites. Upper Saddle River, NJ:Pearson Technology Group, 2000.

Datamation, May 15, 1990.

Delcambre, L., et al., “Harvesting Information to Sustain Forests,” Communications of the ACM, January 2003.

Dimensional Insight, Business Intelligence and OLAP Terms: An Online Glossary, dimins.com/Glossory1.html (accessed June 15, 2003).

Etzioni, O., “The WWW: Quagmire or Gold Mine,” Communications of the ACM, November 1996.

Fayyad, U. M., et al., “The KDD Process for Extracting Useful Knowledge from Volumes of Data,” Communications of the ACM, November 1996.

Finnish Business Report, April 1997.

Fong, A. C. M., et al., “Data Mining for Decision Support,” IT Pro, March/April 2002.

Fortune, “1994 Information Technology Guide,” August 1993.

GIS World, July 1993.

Goddard, S., et al., “Geospatial Decision Support for Drought Risk Management,” Communications of the ACM, January 2003.

Grant, G., ed., ERP and Datawarehousing in Organizations: Issues and Challenges. Hershey, PA: IRM Press, 2003.

Gray, P., and H. J. Watson, Decision Support in the Data Warehouse. Upper Saddle River, NJ: Prentice-Hall, 1998.

Grossnickle, J., and O. Raskin, The Handbook of Marketing Research. New York: McGraw-Hill, 2000.

Hamilton, J. M., “A Mapping Feast,” CIO, March 15, 1996.

Hardester, K. P., “An Enterprise GIS Solution for Integrating GIS and CAMA,” Assessment Journal, November/December 2002.

Hasan, B., “Assessing Data Authenticity with Benford’s Law,” Information Systems Control Journal, July 2002.

Hirji, K. K., “Exploring Data Mining Implementation,” Communications of the ACM, July 2001.

IBM Systems Journal, Vol. 29, No. 3, 1990.

Information Week (May 14, 2001).

Infoworld, January 27, 1997.

Inmon, W. H., Building the Data Warehouse, 3rd ed. New York: Wiley, 2002.

Inmon, W. H., “Why Clickstream Data Counts,” e-Business Advisor, April 2001.

I/S Analyzer, “Visualization Software Aids in Decision Making,” I/S Analyzer, July 2002.

Kimball, R., and M. Ross, The Data Warehouse Toolkit, 2nd ed. New York: Wiley, 2002.

Kerlow, I. V., The Art of 3D, 2nd ed. New York: Wiley, 2000.

Korte, G. B., The GIS Book, 5th ed. Albany, NY: Onward Press, 2000.

Lau, H. C. W., et al., “Development of an Intelligent Data-Mining System for a Dispersed Manufacturing Network,” Expert Systems, September 2001.

Levinson, M., “Jackpot! Harrah’s Entertainment,” CIO Magazine, February 1, 2001.

Li, T., et al., “Information Visualization for Intelligent DSS,” Knowledge-Based Systems, August 2001.

Liautaud, B., E-Business Intelligence, New York: McGraw-Hill, 2001.

Linoff, G. S., and J. A. Berry, Mining the Web: Transforming Customer Data. New York: Wiley, 2002.

Liu, S., “Data Warehousing Agent: To Make the Creation and Maintenance of Data Warehouse Easier,” Journal of Data Warehousing, Spring 1998.

Loveman, G., “Diamonds in the Data,” Harvard Business Review, May 2003.

Markus, M. L., et al., “A Design Theory for Systems that Support Emergent Knowledge Processes,” MIS Quarterly, September 2002.

Mattison, R., Winning Telco Customers Using Marketing Databases. Norwood, MA: Artech House, 1999.

Merrill Lynch, 1998.

MIS Quarterly, December 1991.

Moad, J., “Mining a New Vein,” PC Week, January 5, 1998.

Moss, L. R., and S. Atre, Business Intelligence Road Map. Boston: Addison-Wesley, 2003.

Nasirin, S., and D. F. Birks, “DSS Implementation in the UK Retail Organizations: A GIS Perspective,” Information and Management, March 2003.

NCR Corp., 2000.

Nemati, H. R., and C. D. Barko, “Enhancing Enterprise Decision through Organizational Data Mining,” Journal of Computer Information Systems, Summer 2002.

Nwosu, K. C., et al., “Multimedia Database Systems: A New Frontier,” IEEE Multimedia, July–September 1997.

O’Looney, J. A., Beyond Maps: GIS Decision Making in Local Governments. Redlands, CA: ESRI Press, 2000.

Oguz, M. T., “Strategic Intelligence: Business Intelligence in Competitive Strategy,” DM Review, May 31, 2003.

Park, Y. T., “Strategic Uses of Data Warehouses,” Journal of Data Warehousing, April 1997.

Pritsker, A. A. B., and J. J. O’Reilly, Simulation with Visual SLAM and Awesim, 2nd ed. New York: Wiley, 1999.

Radding, A., “Going with GIS,” Bank Management, December 1991; updated February 2003.

Ray, N., and S. W. Tabor, “Cyber-Surveys Come of Age,” Marketing Research, Spring 2003.

Redman, T. C., “The Impact of Poor Data Quality on the Typical Enterprise,” Communications of the ACM, February 1998.

Rundensteiner, E. A., et al., “Maintaining Data Warehousing over Changing Information Sources,” Communications of the ACM, June 2000.

Sadeh, N., M-Commerce. New York: Wiley, 2002.

Sears (2001–2003).

Sikder, I., and A. Gangopadhyay, “Design and Implementation of a Web-Based Collaborative Spatial Decision Support System: Organizational and Managerial Implications,” Information Resources Management Journal, October–December 2002.

Smith, H. J., “Who Owns Personal Data?” Beyond Computing, November–December 1997.

Southam.com, “Southam on the Web,” 2001.

Steede-Terry, K., Integrating GIS and GPS. Redlands, CA: Environmental Systems Research (eSRI.com), 2000.

Strauss, J., et al., E-Marketing. Upper Saddle River, NJ: Prentice Hall, 2003.

Strong, D. M., et al., “Data Quality in Context,” Communications of the ACM, May 1997.

Sweiger, M., et al., Clickstream Data Warehousing. New York: John Wiley and Sons, 2002.

Tang, C., et al., “An Agent-Based Geographical Information System,” Knowledge-Based Systems, Vol. 14, 2001.

Terry, K., and D. Kolb, “Integrated Vehicle Routing and Tracking Using GIS-Based Technology,” Logistics, March/April 2003.

Tucker, M. J., “Poppin’ Fresh Dough,” (Database Marketing), Datamation, May 1997.

Turban, E., et al., Electronic Commerce: A Managerial Perspective, 3rd ed. Upper Saddle River, NJ: Prentice-Hall, 2004.

usaa.com (2001).

Watson, H. J., et al., “The Effects of Technology-Enabled Business Strategy at First American Corporation,” Organizational Dynamics, Winter 2002.

Weiss, T. R., “Online Retail Sales On the Rise,” PC World, January 2003.

Winter, R., Large Scale Data Warehousing with Oracle 9i Database, Special Report, Waltham, MA: Winter Corp., 2001.
