agile data warehousing - the agilist

28
Agile Data Warehousing: Incorporating Agile Principles by Dr. Ken Collier, Senior Consultant, Cutter Consortium; with Jim Highsmith, Director, Cutter Consortium Agile Product & Project Management Practice This Executive Report targets organizations that are considering a data warehousing project, organizations that have struggled to implement a data warehouse, data warehouse developers, and IT executives who are responsible for overseeing such projects. The report recommends a leaner approach than that of traditional methods; it advocates a working system that meets users’ business needs rather than heavily process-centric development approaches that emphasize documentation and contractual agreements. Armed with the agile data warehousing approach, organizations can increase the likelihood of a successful data warehouse implementation on time and within budget. Business Intelligence REPRINT

Upload: others

Post on 12-Sep-2021

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Agile Data Warehousing - The Agilist

Agile Data Warehousing:Incorporating Agile Principles

by Dr. Ken Collier, Senior Consultant, CutterConsortium; with Jim Highsmith, Director, Cutter Consortium Agile Product & ProjectManagement Practice

This Executive Report targets organizations that are considering a

data warehousing project, organizations that have struggled to

implement a data warehouse, data warehouse developers, and IT

executives who are responsible for overseeing such projects. The

report recommends a leaner approach than that of traditional

methods; it advocates a working system that meets users’ business

needs rather than heavily process-centric development approaches

that emphasize documentation and contractual agreements. Armed

with the agile data warehousing approach, organizations can increase

the likelihood of a successful data warehouse implementation on time

and within budget.

Business Intelligence

REPRINT

Page 2: Agile Data Warehousing - The Agilist

The Business Intelligence AdvisoryService Executive Report is publishedby the Cutter Consortium, 37 Broadway,Suite 1, Arlington, MA 02474-5552,USA. Client Services: Tel: +1 781 6419876; Fax: +1 781 648 8707; E-mail:[email protected]; Web site: www.cutter.com. Group Publisher: Chris Generali,E-mail: [email protected]. ManagingEditor: Cindy Swain, E-mail: [email protected]. Print ISSN: 1540-7403(Executive Report, ExecutiveSummary, and Executive Update);online/electronic ISSN: 1554-7078.

©2010 Cutter Consortium. All rightsreserved. Unauthorized reproduction inany form, including photocopying, down-loading electronic copies, posting on theInternet, image scanning, and faxing isagainst the law. Reprints make an excel-lent training tool. For more informationabout reprints and/or back issues of CutterConsortium publications, call +1 781 6488700 or e-mail [email protected].

This article originally appeared asExecutive Report Vol. 4, No. 12 of theBusiness Intelligence Advisory Service.

Cutter Consortium is a truly unique ITadvisory firm, comprising a group of morethan 100 internationally recognized expertswho have come together to offer content,consulting, and training to our clients.These experts are committed to deliveringtop-level, critical, and objective advice.They have done, and are doing, groundbreak-ing work in organizations worldwide, helpingcompanies deal with issues in the core areasof software development and agile projectmanagement, enterprise architecture, busi-ness-technology trends and strategies, inno-vation, enterprise risk management, metrics,and sourcing.

Cutter offers a different value propositionthan other IT research firms: We give youAccess to the Experts. You get practitioners’points of view, derived from hands-on experi-ence with the same critical issues you arefacing, not the perspective of a desk-boundanalyst who can only make predictions andobservations on what’s happening in themarketplace. With Cutter Consortium, you getthe best practices and lessons learned fromthe world’s leading experts — experts whoare implementing these techniques at compa-nies like yours right now.

Cutter’s clients are able to tap into its exper-tise in a variety of formats, including printand online advisory services and journals,mentoring, workshops, training, and consult-ing. And by customizing our informationproducts, training, and consulting services,you get the solutions you need while stayingwithin your budget.

Cutter Consortium’s philosophy is that thereis no single right solution for all enterprises,or all departments within one enterprise, oreven all projects within a department. Cutterbelieves that the complexity of the business-technology issues confronting corporationstoday demands multiple detailed perspectivesfrom which a company can view its opportu-nities and risks in order to make the rightstrategic and tactical decisions. The simplisticpronouncements other analyst firms make donot take into account the unique situation ofeach organization. This is another reason topresent the several sides to each issue: toenable clients to determine the course ofaction that best fits their unique situation.

Expert ConsultantsCutter Consortium products and servicesare provided by the top thinkers in ITtoday — a distinguished group of inter-nationally recognized experts committedto providing top-level, critical, objectiveadvice. They create all the written deliver-ables and perform all the consulting. That’swhy we say Cutter Consortium gives youAccess to the Experts.

For more information, contact CutterConsortium at +1 781 648 8700or [email protected].

ABOUT CUTTER CONSORTIUM

Access to the Experts

Rob Austin Ron Blitstein Christine Davis Tom DeMarco Lynne Ellyn Tim Lister Lou Mazzucchelli Ken Orr Robert Scott Mark Seiden Ed Yourdon

Cutter Business Technology Council

Business Intelligence

Page 3: Agile Data Warehousing - The Agilist

INTRODUCTION

This Executive Report targetsorganizations considering a datawarehousing project, organiza-tions that have struggled to get adata warehouse implemented,data warehouse developers, andIT executives whose job it is tooversee such projects. By combin-ing experience in data warehous-ing (Dr. Ken Collier) and in agiledevelopment practices (JimHighsmith), the authors havedeveloped a new approach todata warehousing projects calledagile data warehousing (ADW).

While the ADW method incorpo-rates the tried-and-tested funda-mentals of data warehousing, itadapts many of the successfulagile software development

practices and principles to datawarehousing techniques. Thereport doesn’t suggest a revolu-tionary change in data architec-tures, modeling methods, orwarehousing technologies;instead it recommends a leanapproach that emphasizes aworking system that meets users’business needs rather than heavilyprocess-centric developmentapproaches that emphasizedocumentation and contractualagreements.

The goal of this report is to changethe ways that data warehousingprojects are implemented. Datawarehousing is difficult, and thereare many accounts of failed proj-ects. Our experience suggeststhat an ADW approach will

greatly increase the likelihoodof successful implementation ofa data warehouse on time andwithin budget.

This approach has strong roots inthe work of AgileAlliance mem-bers. We have adapted publishedtechniques such as Test-DrivenDevelopment (TDD) from CutterConsortium Senior ConsultantKent Beck, the Framework forIntegrated Test (FIT) from WardCunningham, Agile Modelingfrom Cutter Consortium SeniorConsultant Scott Ambler, andothers to the unique characteris-tics of data warehouse develop-ment. We have also incorporatedexperience and lessons learnedfrom real ADW projects and thework of other experts.

by Dr. Ken Collier, Senior Consultant, Cutter Consortium; with Jim Highsmith, Director,

Cutter Consortium Agile Product & Project Management Practice

Agile Data Warehousing:Incorporating Agile Principles

BUSINESS INTELLIGENCEADVISORY SERVICEExecutive Report, REPRINT

Page 4: Agile Data Warehousing - The Agilist

REPRINT www.cutter.com

22 BUSINESS INTELLIGENCE ADVISORY SERVICE

WHEN DATA WAREHOUSINGGOES BAD

Traditional data warehouse proj-ects follow a typical waterfalldevelopment model in which rig-orous efforts are made to collectcomplete requirements, designcomprehensive architectures anddata models, develop and popu-late repositories, and, ultimately,develop the analytical reports andartifacts that users want. Theseprojects are complex affairs,involving a project manager lead-ing a team of specialists includingbusiness analysts, data architects,and so forth. Depending on theirmagnitude, these projects gener-ally run at least six months andcan easily exceed US $1 million.

I have worked with some talentedand experienced data warehousedevelopers. I’ve also had the bene-fit of working with savvy clientswho have reasonable expectationsand a pretty stable set of require-ments. Sounds like a recipe fortotal success, right? But often,business users are less than ecsta-tic about the first data warehouseproduction rollout. User reactionscommonly range from “This isn’twhat I was told I would get” to “Ican see where this might be usefulwith some refinement.”

Most data warehouse developershave participated in projectsthat were less than successful.

I recently worked with a midsizecompany seeking to replace itsexisting homegrown reportingapplication with a properly archi-tected data warehouse. At theoutset, the project seemed poisedfor success and user satisfaction.Despite the best efforts of devel-opers, project managers, andstakeholders, however, the proj-ect ran over budget and pastdeadline, and users were lessthan thrilled with the outcome.Because this project in particularmotivated the development ofADW, the following offers a briefretrospective to highlight the prin-ciples and practices presentedlater in this report.

Project Attributes

Existing application. Internally,the company’s existing reportingapplication was referred to as a“data warehouse.” In reality,though, the data model was areplication of parts of the legacyoperational database. This repli-cated database did not includedata scrubbing and was wrappedin a significant amount of customJava code to produce the speci-fied reports required. At vari-ous times, users had requestednew custom reports, thus overbur-dening the application with highlyspecialized and seldom-usedreporting features. All the reportscould be classified as canned

reports. No advanced analyticalcapabilities were provided.

Project motivation. Because theexisting “data warehouse” wasnot architected according to datawarehousing best practices, it hadreached the practical limits ofmaintainability and scalabilityneeded to continue meeting userrequirements. Additionally, withthe new billing system comingonline, it was evident that theexisting system could not easilybe adapted to accommodate thenew data. Therefore, at the execu-tive level, there was strong sup-port for a properly designed datawarehouse.

External drivers. A sales teamfrom a leading worldwide vendorof data warehousing and businessintelligence (BI) software initiallyenvisioned the data warehousingproject. Providing guidance andpre-sales support, the sales teamhelped project sponsors under-stand the value of eliciting thehelp of experienced BI consul-tants with knowledge of industrybest practices. But as with manysales efforts, initial estimates ofproject scope, cost, and schedulewere too ambitious.

Development team. The devel-opment team comprised exclu-sively external data warehousingconsultants. Because the

Page 5: Agile Data Warehousing - The Agilist

company’s existing IT staff hadother high-priority responsibilities,the development team did nothave extensive knowledge of thebusiness or existing operationalsystems. The development teamdid, however, have open accessto business and technical expertswithin the company as well astechnology experts from thecompany’s software vendor.While initial discovery effortswere challenging, all stakeholdersparticipated extensively.

Customer role. The company’sfinance division filled the roleof the primary “customer” forthe new data warehouse, andthe CFO sponsored the project.Together, the customer team andthe CFO had a relatively focusedbusiness goal of gaining more reli-able access to revenue and prof-itability information. They also hada substantial amount of existingreports used in business analysison a routine basis, offering a rea-sonable basis for requirementsanalysis.

Project management. CorporateIT handled project managerresponsibilities using traditionalProject Management Institute(PMI) practices. The IT group wassimultaneously involved in twoother large development projects,both of which had direct or indi-rect impact on the scope of thedata warehouse project.

Hosted environment. Due to lim-ited resources and infrastructure,the company’s IT leadership had

recently decided to partner withan ASP to provide hosting servicesfor newly developed productionsystems. The data warehouse wasexpected to reside at the hostingfacility located on the West Coastof the US, while company head-quarters were located on the EastCoast. Because legacy systemsremained on the corporate infra-structure, separate geographiclocations for the warehouse andheadquarters created complica-tions — though not insurmount-able ones — for the movement oflarge volumes of data.

Project Outcome

The original project plan called foran initial data warehouse launchwithin 90 days. As any experi-enced data warehouse developerknows, such a deadline imposesan ambitious, but not impossible,schedule (assuming appropriatescope definition). But the datawarehouse launch for this projectcame a full eight months afterproject start. User acceptancetesting did not go well. Userswere already annoyed with proj-ect delays, and when they finallysaw the promised features, therewas a significant gap betweenuser expectations and the endproduct. As is common with lateprojects, staff members wereadded to the development teammidstream to try and get the proj-ect back on track. So project costsfar exceeded the planned budget,and the project was placed onhold until further planning could

be done to justify continueddevelopment.

Retrospective

So who was to blame? Usersthought that developers hadmissed the mark and failed toimplement all their requirements.Developers believed that userexpectations had not been prop-erly managed and thus projectscope had grown out of control.Project sponsors thought that thevendor and the consulting firmhad overpromised and underdeliv-ered. The vendor and consultingfirm believed that internal politicsand organizational issues were toblame. Finally, many members ofthe company’s IT staff felt threat-ened by their own lack of owner-ship on the project and secretlycelebrated the failure.

The project degenerated into aseries of meetings to review con-tracts and project documents todetermine who should be heldresponsible. And guess what?Everyone involved was partiallyto blame. In addition to the com-mon technical challenges of dataextraction, integration, and cleans-ing, the following were identifiedas root causes of project failure:

The project contract did notsufficiently define the projectscope.

Requirements were incom-plete, vague, and open-ended.

There were conflicting inter-pretations of the previously

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 33

Page 6: Agile Data Warehousing - The Agilist

approved requirements anddesign documents.

Developers put in long nightsand weekends in a chaoticattempt to respond to userchanges and new demands.

Developers did not fully under-stand user requirements orexpectations and did notmanage requirementschanges well.

Users had significant miscon-ceptions about the purpose ofa data warehouse since theprevailing understanding of awarehouse was based on theprevious reporting application(which was not a good model).

Vendors made ambitiouspromises that developers couldnot fulfill in the time available.

The project manager did notmanage user expectations.

IT staff withheld importantinformation from developers.

The ASP partner did not pro-vide the level of connectivityand technical support thatdevelopers expected.

Hindsight truly is 20/20, and in thewaning days of this project, sev-eral realities became apparent:

1. A higher degree of interactionbetween developers, users,stakeholders, and internal ITexperts would have ensuredaccurate understandingbetween all participants.

2. Early and frequent workingsoftware, no matter how

simplistic, would have greatlyreduced users’ misconceptionsand increased the accuracy oftheir expectations.

3. Greater emphasis on usercollaboration would havehelped to avoid conflictinginterpretations of requirements.

4. A project plan that focusedon adapting to changes ratherthan on meeting a fixed andarbitrary deadline would havegreatly improved user satisfac-tion with the end product.

In the end, and regardless of whois to blame, for this data ware-housing project and many otherfailures, the root cause is a dis-connect between user and devel-oper expectations.

ADW BACKGROUND (Ken Collier)

Until 2003, upon meetingAgileAglliance cofounder JimHighsmith, I was only superficiallyaware of the Alliance, a group ofthought leaders from the softwareengineering community. I hadfocused on advancements in datawarehousing and BI rather thanon software engineering. Butafter meeting Jim, I began to seethe connection between agileprinciples and data warehousingpractices.

Jim and I, along with another col-league, met weekly to share cof-fee, experiences, and ideas. Jimwould talk about his Agile ProjectManagement (APM) principles asthey unfolded, and I would lamentmy most recent data warehouseproject for being understaffed andoverscoped and for having ever-changing user requirements. Mydevelopment team would workovertime, and yet users werenever satisfied.

But during a weekly meeting,I had an “aha!” experience: itdawned on me that all this agilestuff that Jim and others havetalked and written about has adirect application to data ware-housing and BI in general. I real-ized it made sense to establish aset of values, principles, and prac-tices for infusing agility into all BIprojects, from canned reportingto data visualization, data miningand quantitative analytics, and ofcourse, data warehousing. I havesince put ADW principles intopractice and am enthusiasticabout the results.

Using ADW practices, I was ableto lead three developers with littleexperience in data warehousingto complete a data warehouse intwo and a half months withoutour team having to pull a singleall-nighter. Although our resultingdata warehouse will mature overtime, the complete end-to-endarchitecture has been established,and multiple OLAP reports areavailable for use. Author LukeHohmann likens early-stage agile

REPRINT www.cutter.com

44 BUSINESS INTELLIGENCE ADVISORY SERVICE

Regardless of who is to blame,

the root cause of data warehousing

project failure is a disconnect

between user and developer

expectations.

Page 7: Agile Data Warehousing - The Agilist

projects to babies: they are com-plete but immature. This is a use-ful way to think about ADW. Ourgoal is to get a working systemin the hands of users as early aspossible so that we can start get-ting feedback and improving thesystem.

In the spirit of full disclosure,ADW borrows unabashedly fromthe works of agile pioneers suchas Ambler, Beck, Martin Fowler,Highsmith, Hohmann, CutterConsortium Senior ConsultantAlistair Cockburn, and others. Mycontribution is the adaptation ofthese thought leaders’ ideas toindustry-standard data warehous-ing best practices. Using my ownknowledge and experience, andwith much coaching from Jim,my aim is to establish a practical,applicable, and more successfuldata warehouse developmentalternative.

ADW AND APM (Jim Highsmith)

Ken and I met less than two yearsago and were introduced by amutual friend. Many of our earlyconversations concerned politics,restaurants, and especially skiing,but eventually discussion turnedto the project Ken was workingon and my work in agile softwaredevelopment and project manage-ment. As Ken talked about hisdata warehousing project and oth-ers that exhibited similar waterfallcharacteristics, we discussed howan agile approach might havemitigated some of the problems.

Since then, and as discussed inthis report, Ken began to use agilepractices — and in particular,APM practices — with successfulresults. We have also workedtogether implementing agiledevelopment and project man-agement with a joint client on adata warehouse–like project. As Ilearned more about data ware-housing and BI from Ken, and helearned more about agile develop-ment from me, we came to themutual conclusion that agile prac-tices could solve some of the illsplaguing traditional data ware-house and BI projects.

Historically, these projects arethe epitome of big design up front.Even worse, the usual implemen-tation sequence for these projectsleaves user reporting until late inthe project — after all the datastaging, cleansing, reporting database development, and other“technically” challenging portionsof the project are complete. Inmany cases, unfortunately, whenusers finally see the results, theydon’t like what they see. Doublyunfortunate, by the time userssee the results that fall short oftheir expectations, most of theproject’s resources have beenspent and options have becomelimited. With the noose on theproject tightening, the blamegame begins, because everyonehas already lost.

Ken and I believe that combiningagile practices with data ware-house development has tremen-dous potential to ensure that

these projects deliver value tocustomers early and often in aproject. In place of a huge projectthat goes on for 12 to 18 monthsbefore users have access to valu-able results, this approach has thepotential to deliver results in justa few months, with subsequentquarterly deliveries.

Note: Ken has authored the bulkof this report; I’ve chipped in hereand there, especially in the APMsection below.

APM

An Agile Synopsis

In 2001, the AgileAlliance outlineda set of core values, principles,and practices for developing soft-ware that better meets users’needs and expectations. The mostsignificant outgrowth of the firstAgileAlliance workshop in 2001was a statement of shareddevelopment values called theManifesto for Agile SoftwareDevelopment [2], which states:

We are uncovering betterways of developing softwareby doing it and helping oth-ers do it. Through this workwe have come to value:

Individuals and inter-actions over processesand tools

Working software over com-prehensive documentation

Customer collaborationover contract negotiation

Responding to change overfollowing a plan

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 55

Page 8: Agile Data Warehousing - The Agilist

That is, while there is valuein the items on the right[roman typeface], we valuethe items on the left [boldtypeface] more. [2]

For practitioners like me withtechnical roots in the rigors of pre-scriptive methodologies, the AgileManifesto is as powerful as it issimplistic. After years of practicing“heavyweight process” develop-ment, I find these agile values lib-erating. In my experience, whenpush comes to shove, we instinc-tively adopt the values expressedin the Agile Manifesto. Unfortu-nately, when push comes toshove, we have often crossedthe boundary between order andchaos. By adopting these valuesfrom the outset of a project, wecan maintain order from thebeginning and gain satisfactionin the knowledge that our time isspent productively on creatingworking software.

From Agile Principles toBusiness Objectives

APM is characterized by envision-ing and exploring processes ratherthan planning and doing proc-esses. APM addresses projectsin which the solution is unknownin the beginning and the projectteam must explore the problemand solution space to createa product that meets the cus-tomer’s vision and deliverscustomer value. When a projecthas significant risk and uncer-tainty, it requires an envisionexplore process. Explorationincludes experimenting with

different solutions to see whichones work.

Early on in high exploration–factorprojects, the team may not knowthe complete feature list. It mayhave a reasonable product vision,various business constraints (e.g.,target schedule, target costs), ageneral idea of key features, andsome ideas about the overallarchitectural skeleton, but thedetails will emerge as the teamdelivers completed features everycouple of weeks. The team’sprocess is one of evolution andadaptation, not planning and opti-mization; the process embraceschange.

While most people recognizeindustry volatility, they have nottruly modified either their mindsetor their management practices toacknowledge change. For exam-ple, the perceived high cost ofchange has driven methodologiesto strive to limit change by engag-ing in extensive front-end planningand analysis. The problem is thatno matter how thorough this earlywork, stuff happens; change hap-pens. So rather than allow thehigh cost of change to dictateprocess, agile developers focuson reducing costs; low-costchange enables a developmentcycle of minimum up-front workfollowed by a succession of shortdevelopment iterations. When weadequately reduce the cost ofchange, the entire economics ofsoftware development changes;the process shifts from one based

on anticipation (define, design,and build) to one based on adap-tation (envision, explore, andrefine).

While plan do processes workfor low exploration–factor proj-ects, they are no match for today’svolatile business environment. Sosoftware development and projectmanagement processes must begeared toward mobility, experi-mentation, and speed. But first,they must be geared toward busi-ness objectives.

Reliable Innovation

There are five key businessobjectives for agile softwaredevelopment and APM: (1) contin-uous innovation (delivering oncurrent customer requirements);(2) product adaptability (deliver-ing on future customer require-ments); (3) reduced deliveryschedules (meeting market win-dows and improving ROI); (4)people and process adaptability(responding rapidly to productand business change); and (5)reliable results (supporting busi-ness growth and profitability).

Continuous Innovation

Developing new software applica-tions, such as those for data ware-housing and BI, requires amindset that fosters innovation.Business value isn’t derived fromthese applications by followingprescribed paths but by blazingnew trails.

REPRINT www.cutter.com

66 BUSINESS INTELLIGENCE ADVISORY SERVICE

Page 9: Agile Data Warehousing - The Agilist

Product Adaptability

No matter how well a data ware-house application is developed,the future always brings surprises.As an enterprise changes, as newexecutives use the application,as business processes evolve, asnew markets are pursued, theneeds of data warehouse and BIapplications change as well. Theonly way to survive is to strive foradaptability — a critical design cri-terion for a data warehouse appli-cation. Agile technical practicesfocus on reducing the cost ofchange (or adaptation) as anintegral part of the developmentprocess.

Reduced Delivery Schedules

Reducing delivery schedulesremains a high-priority businessgoal for executives, but providingincremental business value is justas important. Would you ratherhave a complete data warehouseapplication in 18 months or incre-mental data marts every threemonths? Further, incrementaldelivery can significantly increasethe ROI of projects as well as gen-erate valuable feedback from thedevelopment process throughincremental releases. APM con-tributes to reducing deliveryschedules in two key ways:focus and streamlining.

First, the constant focus on fea-tures and the priority for theirrelease in short, iterative time-boxes forces teams (comprising

customers and developers) toconsider both the number of fea-tures to include and the depth ofthose features. This reduces theoverall workload by eliminatingmarginally beneficial features.Second, APM streamlines thedevelopment process by concen-trating on value-adding activitiesand eliminating overhead.

People and Process Adaptability

Have you ever worked on twoprojects that were identical: thatis, they shared the same people,same problem, same constraints,same customers, and the sameexecutives? Why, then, do manyorganizations insist that the sameprocesses and practices shouldapply to every project? Whilemany organizations strive forstandardization of process andpractices, they should strive forconsistency. What we need inmost instances is a consistentframework with local variationsin process and practices toaccommodate differences amongprojects. We must also buildadaptable teams whose membersare comfortable with change andview change as integral to thrivingin a dynamic project environmentrather than as an obstacle.

Reliable Results

Manufacturing processes aredesigned to be repeatable and todeliver the same result time aftertime. They are predictable andtherefore repeatable. However,because of the uncertainty andrisk, exploration processes aredifferent. They can deliver on acustomer’s vision and accordingto a customer’s requirements asthese requirements evolve, butthey can’t deliver a completelyprespecified result. APM deliversconsistent, reliable results.

The APM Framework

As shown in Figure 1, the APMframework supports the businessobjectives of continuous innova-tion, product adaptability, reduceddelivery schedules, people andprocess adaptability, and reliableresults through its five phases:envision, speculate, explore,adapt, and close.

In practice, projects have twoiterative cycles of collaborativeplanning and collaborative devel-opment, as shown in Figures 2and 3. In APM, as in all agile meth-ods, both planning and develop-ment are done collaboratively.The collaborative planning cycle,which focuses on product vision,project scope and boundaries,and the overall project releaseplan, can — and should — beused throughout the project whenenough new information has beengathered to significantly alter theplanning effort. The collaborative

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 77

While many organizations strive

for standardization of process

and practices, they should

strive for consistency.

Page 10: Agile Data Warehousing - The Agilist

development cycle defines whatis accomplished during each two-to-four-week iteration: detailedplanning, development, andadaptation.

Envision

During the envision phase,customers and the project team

create a product and project com-munity vision that covers who,what, and how. Visioning is visual;the team must create a picturethat focuses developers and cus-tomers on the critical aspects ofthe product. One technique fordoing so is the product visionbox. The team designs a “box”in which the application will be

packaged, which forces the teamto focus on exactly who the cus-tomer is and the key three orfour features that will sell theapplication. The second aspect ofvisioning is to create a picture ofthe entire project community —of customers, product managers,project team members, and stake-holders — and how these partiesintend to work together. Otherspecific practices in the envisionphase include (1) generatingproduct architecture and guidingprinciples, (2) creating the projectdata sheet, and (3) process andpractice tailoring.

Speculate

While the word “speculate” maycreate an image of reckless risktaking, one dictionary definition is“to conjecture something basedon incomplete facts or informa-tion.” As mentioned previously,APM builds on an envisionexplore rather than a plan domodel, and therefore the impliedcertainty usually attributed to theword “plan” must be overcome.APM may reliably deliver value,but recognizing that value mayrequire a few twists and turns.

The two critical characteristicsof the release-milestone-iterationplanning practice used in thisphase are short, timeboxed itera-tions and feature-based planning.Iterations are generally two-to-six-week timeboxes. Features, nottasks, are assigned to each itera-tion. Features are represented byindex cards on which sufficient

REPRINT www.cutter.com

88 BUSINESS INTELLIGENCE ADVISORY SERVICE

Requirements

Cards

F e a tu re /C o m p o ne n t Req u ir e m en ts Car d

F e at ur e/ Co m pone nt ID : P lann ed Cy cle : F e at ur e/ Co m pone nt N am e :

F e at ur e/ Co m pone nt T yp e:

F e at ur e/ Co m pone nt D es crip tio n :

Es t. Wo rk E f fort :

R equ irem ents Un certain ty (H ,M ,L):

D epen dencie s w ith o ther F ea tu re s:

Architecture1.1

Architecture1.2

F e a tu re /C o m p o ne n t Req u ir e m en ts Car d

F e at ur e/ Co m pone nt ID : P lann ed Cy cle :

F e at ur e/ Co m pone nt N am e :F e at ur e/ Co m pone nt T yp e:

F e at ur e/ Co m pone nt D es crip tio n :

Es t. Wo rk E f fort :

R equ irem ents Un certain ty (H ,M ,L):

D epen dencie s w ith o ther F ea tu re s:

F ea tu r e/ Com po n en t R eq u ire m e n ts C ard

Fe ature /Co m po n en t I D: Pl ann ed C ycle :

Fe ature /Co m po n en t N am e :

Fe ature /Co m po n en t T ype :

Fe ature /Co m po n en t D es cr ipt i on:

Est . Wo rk E f fo rt:

Re qui re m ents Un ce rtain ty (H ,M ,L ): D ep end en ci es with other Fea tu res:

F ea tu r e/ Com po n en t R eq u ire m e n ts C ard

Fe ature /Co m po n en t I D: Pl ann ed C ycle : Fe ature /Co m po n en t N am e :

Fe ature /Co m po n en t T ype :

Fe ature /Co m po n en t D es cr ipt i on:

Est . Wo rk E f fo rt:

Re qui re m ents Un ce rtain ty (H ,M ,L ): D ep end en ci es with other Fea tu res:

F e at u re /Co m p o n en t R e qu ir e m e nt s C a rd

Fe atur e/Co m pon ent ID: P la nn ed C ycl e:

Fe atur e/Co m pon ent N am e :

Fe atur e/Co m pon ent T ype :

Fe atur e/Co m pon ent D es cript i on :

E st. Wo rk E f fo rt : Re qu ire m e nt s U nc er ta inty (H,M ,L) :

Dep e nd encie s with o th er F e atu r es :

F e at u re /Co m p o n en t R e qu ir e m e nt s C a rd

Fe atur e/Co m pon ent ID: P la nn ed C ycl e:

Fe atur e/Co m pon ent N am e :

Fe atur e/Co m pon ent T ype :

Fe atur e/Co m pon ent D es cript i on :

E st. Wo rk E f fo rt : Re qu ire m e nt s U nc er ta inty (H,M ,L) :

Dep e nd encie s with o th er F e atu r es :

F ea tu r e/ C om p o n e nt R e q ui re m e n ts C a rd

F ea tu re /Com p onen t ID: P la nne d C ycle :

F ea tu re /Com p onen t Na m e:

F ea tu re /Com p onen t T ype :

F ea tu re /Com p onen t De scri pt ion :

Est. W ork E ffort :

Re qui reme nts U nc erta inty ( H,M ,L): De pend enci es with o th er F e at ur es :

F e a tu re /C o m p o ne n t Req u ir e m en ts Car d

F e at ur e/ Co m pone nt ID : P lann ed Cy cle : F e at ur e/ Co m pone nt N am e :

F e at ur e/ Co m pone nt T yp e:

F e at ur e/ Co m pone nt D es crip tio n :

Es t. Wo rk E f fort :

R equ irem ents Un certain ty (H ,M ,L):

D epen dencie s w ith o ther F ea tu re s:

F e a tu re /C o m p o ne n t Req u ir e m en ts Car d

F e at ur e/ Co m pone nt ID : P lann ed Cy cle :

F e at ur e/ Co m pone nt N am e :F e at ur e/ Co m pone nt T yp e:

F e at ur e/ Co m pone nt D es crip tio n :

Es t. Wo rk E f fort :

R equ irem ents Un certain ty (H ,M ,L):

D epen dencie s w ith o ther F ea tu re s:

F ea tu r e/ Com po n en t R eq u ire m e n ts C ard

Fe ature /Co m po n en t I D: Pl ann ed C ycle :

Fe ature /Co m po n en t N am e :

Fe ature /Co m po n en t T ype :

Fe ature /Co m po n en t D es cr ipt i on:

Est . Wo rk E f fo rt:

Re qui re m ents Un ce rtain ty (H ,M ,L ): D ep end en ci es with other Fea tu res:

F ea tu r e/ Com po n en t R eq u ire m e n ts C ard

Fe ature /Co m po n en t I D: Pl ann ed C ycle : Fe ature /Co m po n en t N am e :

Fe ature /Co m po n en t T ype :

Fe ature /Co m po n en t D es cr ipt i on:

Est . Wo rk E f fo rt:

Re qui re m ents Un ce rtain ty (H ,M ,L ): D ep end en ci es with other Fea tu res:

F e at u re /Co m p o n en t R e qu ir e m e nt s C a rd

Fe atur e/Co m pon ent ID: P la nn ed C ycl e:

Fe atur e/Co m pon ent N am e :

Fe atur e/Co m pon ent T ype :

Fe atur e/Co m pon ent D es cript i on :

E st. Wo rk E f fo rt : Re qu ire m e nt s U nc er ta inty (H,M ,L) :

Dep e nd encie s with o th er F e atu r es :

F e at u re /Co m p o n en t R e qu ir e m e nt s C a rd

Fe atur e/Co m pon ent ID: P la nn ed C ycl e:

Fe atur e/Co m pon ent N am e :

Fe atur e/Co m pon ent T ype :

Fe atur e/Co m pon ent D es cript i on :

E st. Wo rk E f fo rt : Re qu ire m e nt s U nc er ta inty (H,M ,L) :

Dep e nd encie s with o th er F e atu r es :

Project D ata Shee tP ro jec t Na m e: Pro je ct Le ad er :

P ro jec t S ta rt Da te : Executive Sp onsor:

C lie nts: C lien t B enefits :

P ro jec t O b jec tive S tatem en t:

Per fo rm an ce /Q ua lity A tt ributes :

F ocus M atrix : E xc e l Im pr o ve Ac ce p t Ta rg e t

S c op e

S c he d ule

D e fec ts

R e so urc e

F eatu res : (Abil ity to S tatem ents ) Arch ite cture :

M ajor M ile sto nes: Issues/R isks :

Date

1 .

2 .

3 .

4 .

1 9 92 -9 7 Kn o w led ge S t ru c tures , In c.

Project D ata Shee tP ro jec t Na m e: Pro je ct Le ad er :

P ro jec t S ta rt Da te : Executive Sp onsor:

C lie nts: C lien t B enefits :

P ro jec t O b jec tive S tatem en t:

Per fo rm an ce /Q ua lity A tt ributes :

F ocus M atrix : E xc e l Im pr o ve Ac ce p t Ta rg e t

S c op e

S c he d ule

D e fec ts

R e so urc e

F eatu res : (Abil ity to S tatem ents ) Arch ite cture :

M ajor M ile sto nes: Issues/R isks :

Date

1 .

2 .

3 .

4 .

1 9 92 -9 7 Kn o w led ge S t ru c tures , In c.

F ea tu r e/ C om p o n e nt R e q ui re m e n ts C a rd

F ea tu re /Com p onen t ID: P la nne d C ycle :

F ea tu re /Com p onen t Na m e:

F ea tu re /Com p onen t T ype :

F ea tu re /Com p onen t De scri pt ion :

Est. W ork E ffort :

Re qui reme nts U nc erta inty ( H,M ,L): De pend enci es with o th er F e at ur es :

Productvision

Project scopeand

boundariesRelease plan

Cycle 0 Cycle 1 Cycle 2 Cycle 3 Cycle 4

Architecture

1.0

Architecture

1.1

Architecture

1.2

Requirements

cards

Feature 1

Feature 2

Feature 3

Feature 4

Feature 5

Feature 6

Feature 7

Figure 2 — The collaborativeenvision cycle. (Source: [7].)

culate

Close

Envision

Speculate Explore

Adapt

Close

Feature

list

Adaptive

action

Final

product

Release

plan

Completed

features

Vision

Figure 1 — Highsmith’s Agile Project Management framework. (Source: [7].)

Iteration

plan

DevelopReview

and adapt

Figure 3 — The collaborativedevelopment cycle. (Source: [7].)

Page 11: Agile Data Warehousing - The Agilist

information is recorded to con-duct the planning. At the end ofthese iterations, features are com-plete only if they are “done-done”:that is, if they have been devel-oped and unit-tested (done)and then user acceptance–tested(done-done). If eight features arecompleted during an iteration, thevelocity of development is eightfeatures per iteration, and thisdelivery rate is used for planningthe next iteration. The two criteriafor assigning features to iterationsare (1) delivering customer valu-able features (as prioritized by thecustomer representatives on theteam) and (2) implementing fea-tures that reduce project risk.

As shown in Figures 2 and 3, spec-ulating is done at two levels: forthe entire project and for the nextiteration. The overall envisionplanning cycle may be repeatedevery two to six iterations depend-ing on how much has changedsince completion of the previousrelease plan. Specific practices inthis phase include creating theproduct feature list, feature (orstory) cards, and the release-milestone-iteration plans.

Explore

The explore phase develops prod-uct features. During this phase,the project team delivers the fea-tures identified in the releaseand iteration plans, and projectmanagers must focus on threecritical activity areas: (1) deliver-ing features by managing the

workload and using appropriatetechnical practices; (2) creating acollaborative, self-organizing proj-ect community; and (3) managingthe interactions among develop-ers, customers, executives, andother stakeholders.

In the explore phase, specificpractices include workload man-

agement (the team manages thedistribution of its work); technicalpractices, such as ruthless testingand refactoring; coaching andteam development; and partici-patory decision making.

Adapt

Project managers usually refer tothe adapt phase as monitoringand correcting and to the resultsof this activity as “correctiveaction.” The phrase “correctiveaction,” which actually meansto correct according to the plan,implies that the plan was correctand that the reason for variationfrom the plan is poor team per-formance. APM uses the term“adaptive action,” which impliessomething quite different. Adap-tive action implies modification orchange rather than success or fail-ure. Agile projects are guided by

the philosophy that responding tochange is more important thanfollowing a plan; so while teamperformance may be an issue,the change may be attributable topoor planning or new information.

In an agile project, every itera-tion’s conclusion provides aforum for the team to reflect onthe project through various lenses:customer issues, technical con-cerns, project status, as well aspersonnel, process, and perfor-mance issues. The analysis exam-ines actual versus planned results;but even more important, it con-siders new information. Theresults of adaptation are fed intoa replanning effort to begin thenext iteration.

Close

The final phase of the APMframework is closing the project.Projects have a beginning and anend, and each must be recog-nized. The failure to identify aproject’s end point can createproblems of perception amongcustomers. The end of a projectshould include cleanup activitiesand a complete project retrospec-tive to pass lessons learned alongto the next project team.

The Essence of APM

In the end, affirmative answersto these two questions form theessence of APM and agile soft-ware development: (1) are youdelivering innovative products toyour customers?; and (2) are you

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 99

Projects have a beginning and an

end, and each must be recognized.

The failure to identify a project’s end

point can create problems of

perception among customers.

Page 12: Agile Data Warehousing - The Agilist

excited about going to work everyday? Agilists want to build innova-tive products — products that testthe limits of our abilities — andto create a work environment inwhich people, as individuals andas teams, can thrive.

Just like Broadway plays alwaysdeliver on opening night, APMdelivers on time and in accor-dance with the customer’s vision— more reliably than any otherapproach for high exploration–factor projects. Given the highdegree of uncertainty in manynew product efforts, given techno-logical change, and given the ebband flow of staff, reliable resultsare still obtainable. Given all themaybes, agile methods still deliver— a tribute to the passion, drive,persistence, and ingenuity of proj-ect team members.

TRADITIONAL DATAWAREHOUSING

If you have taken part in a datawarehousing project, you areaware of the numerous chal-lenges, perils, and pitfalls. RalphKimball, Bill Inmon, and otherdata warehousing pioneers havedone an excellent job of develop-ing reusable architectural patternsfor data warehouse implementa-tion. Software vendors have donea good job of creating tools andtechnologies to support the con-cepts. Nonetheless, data ware-housing is just plain hard, and forseveral reasons:

Most organizations have notpreviously built a data ware-

house or have only limitedexperience in doing so.

Most organizations don’t buildmultiple data warehouses,and therefore developmentprocesses don’t get a chanceto mature.

Organizations often set out tobuild an enterprise data ware-house, or at least a broad-reaching data mart, whichmakes the process morecomplex.

Data warehouse consultantshave extensive expertise indata warehousing, but not inthe organization’s businessdomain.

Business users typically don’tunderstand what a data ware-house does and doesn’tprovide.

Business users often thinkof data warehousing as atechnology-based plug-and-play application.

As users gain a better under-standing of data warehousing,their needs and wishes change.

Business data always has qual-ity problems that surface in thedata warehouse, causing usersto question the warehouserather than the source systems.

Organizations often view a datawarehouse as an IT applicationrather than a joint venturebetween business stakeholdersand IT developers.

Data warehousing requiresan entirely different skill set

than that of typical databaseadministrators (DBAs) anddevelopers.

Data warehousing requires amultitude of unique skills suchas multidimensional modeling;data cleansing; extraction,transformation, and loading(ETL) development; OLAPdesign; application develop-ment; and so forth.

Figure 4 depicts an example of theclassic data warehousing architec-ture, as first described by Kimball[8]. It consists of the followingcomponents:

Operational source systems.These are one or more existingsystems from which data isextracted, transformed, andloaded into the data warehousestaging database. They are opti-mized for the daily transactionalprocessing required to run busi-ness operations.

Staging database. This data-base often mirrors the datastructures in the source sys-tems and provides a “holdingpen” where data can beprocessed, manipulated,transformed, cleansed, andvalidated without placing anundue burden on the opera-tional systems.

Presentation repository.Data is extracted from thestaging database, transformed,and loaded into appropriatestructures for optimizedmultidimensional reporting.This system is designed to sup-

REPRINT www.cutter.com

1100 BUSINESS INTELLIGENCE ADVISORY SERVICE

Page 13: Agile Data Warehousing - The Agilist

port the data slicing and dicingthat defines the power of a datawarehouse.

Data access server. This con-ceptual server represents thevarious tools that provide userswith access to data, includingreport writers, ad hoc querying,OLAP, data visualization, datamining, statistical analysis, andso on.

Data Warehousing Technical Skills

Inherent in this architecture arethe following aspects of develop-ment, each of which requires aunique set of development skills:

Data modeling. This involvesdesign and implementation ofdata models for both the stag-ing database and presentation

repository. Each has uniqueproperties.

ETL development. ETL refersto the extraction of data fromsource systems into staging, thetransformations necessary torecast source data for analysis,and the loading of transformeddata into the presentation repos-itory. ETL includes the selectioncriteria to extract data fromsource systems, performing anynecessary data transformationsor derivations needed, data-quality audits, and cleansing.

Data cleansing. Sourcedata is typically imperfect.Furthermore, merging datafrom multiple sources caninject new data quality issues.As an important aspect of a

data warehouse, achieving datahygiene requires specific skillsand techniques.

OLAP design. Typically datawarehouses support somevariety of OLAP techniques(HOLAP, MOLAP, or ROLAP).Each OLAP technique is differ-ent and requires special designskills to balance reportingrequirements and performanceconstraints.

Application development.Users commonly require anapplication interface with thedata warehouse that providesan easy-to-use front end com-bined with comprehensiveanalytical capabilities and thatis tailored to the way the userswork. This often requires some

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 1111

Operational

source systems

AP/AR

Sales

HR

Operations

Inventory

ETL processesETL processes

Data

access

server

Staging

database

Presentation

repository

Executive

reporting

Management

reporting

Field

reporting

• Data integrity checking

• Data cleansing

• Exception handling

• Data transformations

• Data preparation

Optimized for:

• Analytical reporting

• Flexible querying

• Multidimensional analysis

• “Large” data for historical

analysis

Figure 4 — An example data warehouse architecture.

Page 14: Agile Data Warehousing - The Agilist

customized programmingor commercial applicationcustomization.

Production automation.Data warehouses are generallydesigned for periodic auto-mated updates where new andmodified data is added into thewarehouse so that users canview the most recent dataavailable. These automatedupdate processes must havebuilt-in failover strategies andmust ensure data consistencyand correctness.

General network and DBAskills. Data warehouse devel-opers must have many of thesame skills as those of the typi-cal network administrator andDBA. They must understandthe implications of efficientlymoving potentially largevolumes of data across the net-work and the issues of effec-tively storing changing data.

Data warehousing is also hardbecause of the need for these spe-cialized skills. Most organizationsdo not have staff members withadequate expertise in these areas.

The Traditional DataWarehousing Process

Data warehousing projects typi-cally follow some variant of thewaterfall development approach(see Figure 5). Waterfall andrelated approaches observe aplan do model in which exhaus-tive planning is followed by com-prehensive design, development,and testing.

This process is driven by a rigor-ous up-front requirements analysiswith an eye toward collecting anddocumenting comprehensive userrequirements that establish a“contractual” agreement betweenthe developers and users. In thisstage, the challenge is to ensurethat the users have an accurateunderstanding of what a datawarehouse can and cannot pro-vide and that they have a clearunderstanding of their ownrequirements.

Once consensus on requirementsis reached, these requirementsdrive a thorough and detailed dataarchitecture and data modelingeffort. This is the core of the datawarehouse design cycle, alongwith other design activities such

as volumetric and network loadanalyses.

By this point in the traditionalapproach, developers have mini-mal interaction with users sincethe parties have already signed offon requirements. Instead, devel-opment effort is spent developingformal and detailed data modelsusing tools such as ERwin andVisio. During the design cycle, thedesign document and data dictio-nary are typical artifacts thatdemonstrate progress.

The remainder of the develop-ment effort is spent implementingthe design; developing ETL code,OLAP cubes, and data warehouseupdate scripts; and finally, inblack-box, white-box, and system-level testing. Final testing mayeven be handed off to a dedicatedQA team to verify that all require-ments have been met withoutintroducing new data anomalies.

Finally, when the developers,testers, and DBAs are confidentthat the data warehouse meetsrequirements (or, more com-monly, when the schedule runsout), users are treated to reviewsand user acceptance testing. Atthis point, the following conditionsare most common:

Users have become more edu-cated about data warehousing.

User requirements havechanged or become morerefined.

Users’ memories of earlyrequirements reviews are fuzzy.

REPRINT www.cutter.com

1122 BUSINESS INTELLIGENCE ADVISORY SERVICE

Design

Implement

Test

Release

RequirementsRequirements

Design

Implement

Test

Release

User reviewNo user interactionUser input

Six to nine months of development

Figure 5 — The typical data warehousing approach.

Page 15: Agile Data Warehousing - The Agilist

Users have high expectations interms of anticipating a new anduseful tool.

Developers continue to buildthe system based on the initialsnapshot of user understandingand requirements definition.

All these factors lead to a naturalgap between what is built andwhat is needed.

Data Warehouse Failures

In November 2004, TechTargetpolled its subscribers asking, “Haveyou ever seen a DW project fail?”Among respondents, 52% replied,“More than once”; 10% said,“Once”; 13% said, “Almost, but wemade it”; and the remaining 23%said, “Not even close” [10].

A quick Google search on thephrase “data warehouse failure”results in a small library of casestudies, postmortems, and assess-ment articles. While there is noclear definition of what constitutes“failure,” Sid Adelman and CutterConsortium Senior ConsultantLarissa Moss classify the followingsituations as characteristic of proj-ect failures [1]:

The project is over budget.

The schedule has slipped.

Some expected functionalitywas not implemented.

Users are unhappy.

Performance is unacceptable.

Availability of the warehouseapplications is poor.

There is no ability to expand.

The data and/or reportsare poor.

The project is not cost-justified.

Management does not recog-nize the benefits of the project.

In other words, simply completingthe technical implementation of adata warehouse doesn’t constitutesuccess. Take another look at thislist. Nearly every situation is “cus-tomer” focused; that is, primarilyend users determine whether aproject is successful.

Assessment

There are literally hundreds ofsimilar evaluations of project fail-ures, and they exhibit a great dealof overlap in terms of root causes:incorrect requirements, weakprocesses, inability to adapt tochanges, project scope misman-agement, unrealistic schedules,inflated expectations, and so forth.

Unfortunately, the traditionaldevelopment model does little touncover these deficiencies earlyin the project. As Jeff DeLuca,one of the founders of FeatureDriven Development (FDD), toldHighsmith, “We should try to

break the back of the project asearly as possible to avoid the highcost of change later downstream.”In a traditional approach, it is pos-sible for developers to plow aheadin the blind confidence that theyare building the right product, onlyto discover at the end of the proj-ect that they were sadly mistaken.This is true even when one usesall the best practices, processes,and methodologies.

What is needed is an approachthat promotes early discovery ofproject peril. Such an approachshould place the responsibilityof success equally on the users,stakeholders, and developers andshould reward a team’s ability toadapt to new directions and sub-stantial requirements changes.

ADW FRAMEWORK

ADW is characterized by a highlyiterative approach with substan-tial collaboration between devel-opers, users, and stakeholders.By adapting the Highsmithenvision explore cycle of projectmanagement to data warehousingmethods, we avoid the disconnectbetween developers and usersas depicted in Figure 5. Figure 6captures the essence of theenvision explore cycle for datawarehousing. It is critical fordevelopers and users to collabo-rate extensively during the envi-sion cycle and to collaboratefrequently and periodically duringthe explore cycle.

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 1133

Simply completing the

technical implementation of

a data warehouse doesn’t

constitute success. Primarily

end users determine whether

a project is successful.

Page 16: Agile Data Warehousing - The Agilist

ADW Principles

It’s useful to follow a set of guid-ing principles during data ware-house design and development.When data warehousing posesdifficult tradeoffs, these principlesoften serve as the tiebreaker.Similarly, the AgileAlliance hasestablished a set of principles forsoftware development. The fol-lowing ADW principles borrowliberally from AgileAllianceprinciples [3]:

1. Our highest priority is to satisfythe data warehouse user com-munity through early and con-tinuous delivery of workinguser features.

2. We welcome changingrequirements, even late in thecourse of development. Agileprocesses harness change fordata warehouse users’ com-petitive advantage.

3. We deliver working softwarefrequently, providing userswith new data warehousefeatures every few weeks.

4. Data warehouse users, stake-holders, and developers must

share project ownership andwork together daily throughoutthe project.

4. We value the importance oftalented and experienced BIexperts. We give them theenvironment and support theyneed and trust them to get thejob done.

5. The most efficient and effec-tive method of conveyinginformation to and within adevelopment team is face-to-face conversation.

6. A working data warehousesystem is the primary measureof progress.

7. We recognize the balance ofproject scope, timeline, andresources. The data ware-housing team must work ata sustainable pace.

8. Continuous attention to thebest data warehousing prac-tices enhances agility.

9. The best architectures,requirements, and designsemerge from self-organizingteams.

10. At regular intervals, the teamreflects on how to becomemore effective, then adjustsits behavior accordingly.

Take a minute to reflect on theseprinciples. How many are presentin your organization’s projects? Dothey make sense for your organi-zation? Are they realistic goals foryour organization? They are notonly commonsense principles butalso effective and achievable inreal projects. Furthermore, adher-ence to these principles ratherthan reliance on a prescriptiveand exhaustive process model isvery liberating.

ADW Practices

I believe that ADW meshes wellwith existing data warehousingarchitectures and practices. I amnot advocating ADW as an alter-native to the mature practicesdeveloped by Kimball, Inmon,Cutter Consortium SeniorConsultant Claudia Imhoff, andother data warehousing thoughtleaders; rather, ADW is largely aframework that redefines theprocess of implementing thosepractices.

Some of these practices aredirectly tied to the APM conceptsthat Jim discussed previously.Some are adaptations of agilesoftware development practicesto data warehousing techniques.Others are based on my experi-ence in trying to tackle the prob-lem of showing users workingsoftware early and frequently.

REPRINT www.cutter.com

1144 BUSINESS INTELLIGENCE ADVISORY SERVICE

Cycle 0 Cycle 1 Cycle 3 Cycle 4Cycle 2Architecture

1.0

Requirements

Cards

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Architecture

1.1

Architecture

1.2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 1

Feature 2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 3

Feature 4

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 5

Feature 6

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 7

Cycle 0 Cycle 1 Cycle 3 Cycle 4Cycle 2Architecture

1.0

Requirements

Cards

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Architecture

1.1

Architecture

1.2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 1

Feature 2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Cycle 0 Cycle 1 Cycle 3 Cycle 4Cycle 2Architecture

1.0

Requirements

Cards

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Architecture

1.1

Architecture

1.2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 1

Feature 2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 3

Feature 4

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 5

Feature 6

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature 7

Envision cycle Explore cycle

Releaseplanning

Businessintelligence (BI)

visionProject scopeand

boundaries

Reviewand

adapt

Iterationplanning

Develop

DevelopmentPlanning Collaborative

Figure 6 — The agile data warehousing (ADW) collaborative development cycle.

Page 17: Agile Data Warehousing - The Agilist

Practice 1: Frequent, Short Iterations

ADW is marked by a highlyiterative development cyclewhereby each iteration producesa working and demonstrabledata warehouse. The workingdata warehouse instances arearchitecturally complete even ifthey are immature relative to thetotal set of project requirements.Although the length of iterationsvaries, the aim is to establish anagile rhythm of two-week iterationcycles. This time frame is longenough to build something mean-ingful and to undo mistakes with-out it being too costly. But there isnothing wrong with iterations thatare as long as 30 days. An iterationthat lasts longer than a month,however, runs the risk of theproject losing its essential user-developer collaboration.

Presenting users with a workingdata warehouse early and oftenaccomplishes several goals:

Users are directly involved inthe development cycle.

Users and stakeholderscan see the real progressof development and canappreciate the challengesthat developers face.

Developers can verify that theyare on the right track.

After each iteration, a user reviewtakes place; these highly informalmeetings involve developers andusers. Other than developmentactivities, these meetings should

require minimal preparationand planning. The goal of suchreviews is to demonstrate themost current working system andto elicit feedback from users. Iprefer to follow a customer focusgroup approach in which the rulesare relatively simple. Developersdemonstrate, users provide feed-back, and a scribe takes notes onuser feedback. Developers shouldnot solve problems, troubleshoot,or make commitments based onuser feedback. The action takenas a result of feedback is priori-tized based on an assessmentof the effort required to makechanges. Others may observe butshould not participate in thesemeetings.

Periodically, but less frequentlythan user reviews, executivereviews should be conducted. Iprefer to schedule an executivereview after every third or fourthiteration, with the goal of demon-strating to executive sponsors andother key stakeholders that theproject is progressing and thatusers’ needs are being met.Participants should include thesponsoring executive, stakehold-ers, and representation from theusers and development team.Whenever possible, the executivereview should immediately followthe user review. In this case, exec-utives and stakeholders shouldquietly observe the user reviewmeeting. Then additional projectstatus information can be pre-sented with the active involve-ment of all parties.

Practice 2: The Conceptual Phase

One exception to the two-weekiteration goal is the first iteration,which Highsmith calls iterationzero. This iteration is earmarkedfor establishing the entire infra-structure necessary for sustain-able agile development. Iterationzero includes the following datawarehousing activities:

Capturing user requirements

Acquiring and configuring thedevelopment environment,including hardware, software,and databases

Identification, acquisition, andunderstanding of data sourcesand data quality issues

Establishing an automated test-ing framework (which is dis-cussed further later in thisreport)

Establishing team roles,responsibilities, and workingagreements

Other activities needed toensure productive develop-ment cycles

Practice 3: Develop Features,Not Functions

Agile development is driven by thecompletion of customer featuresrather than functions. The aim isto produce demonstrable featuresthat users can review and evalu-ate in light of their businessrequirements.

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 1155

Page 18: Agile Data Warehousing - The Agilist

Like software, many complexitiesof data warehousing are invisibleto users. Typical data warehouseusers care about the reports pro-vided by the application, the valid-ity of the values in the reports, andthe usefulness of the intelligencein users’ daily decision making.

All too often, developers try to“show” users the underlyingarchitectural structures, the ETLlogic, and data validation andcleansing code. After all, this iswhere much of our hard worklies. But despite this hard work,what users truly care about is thereporting application. So the focusof ADW development is on archi-tectural feature spikes: that is, fea-tures that are demonstrable to,and thus resonate with, users,such as reports or analytical mod-els. A data warehouse feature isarchitecturally complete when itpulls data through each of thearchitectural components and allthe functionality at each stage isimplemented and tested.

So how do you develop a featurespike in just two weeks? Afterall, what about all the requisitedesign, logical and physical datamodeling, ETL programming, datacleansing, and other developmenttasks that are part of data ware-housing? My response is twofold:

1. Define the feature to be smallenough to be manageable. Ifa single star schema in yourwarehouse supports severalreports, choose a single report.If a single report has a high

degree of complexity andunderlying logic, identify ameaningful subset of therequirements for that reportto implement. Remember, afeature must be architecturallycomplete, not necessarilymature.

2. Do the minimum amount ofdesign, modeling, and devel-opment required to completethe feature spike. It is notnecessary to fully architect ormodel the staging databaseand data warehouse repositoryto complete the feature. Notonly is it acceptable to itera-tively evolve the data modelsand system design, but doingso helps ensure that these arti-facts more accurately reflectwhat you build.

Figure 7 conceptually depicts howI think of FDD as it relates to datawarehousing. I coach ADW devel-opers to focus on the aspects ofeach architectural component thatare related to the feature they aredeveloping. For example, supposeyou are developing an OLAPreport for product revenue andprofitability. The feature spikemight include the followingactivities:

Develop the star schema modelbased on a fact table contain-ing net profit, gross profit, andrevenue.

Identify existing dimensionsthat can be conformed to thenew star schema.

Identify the source tables nec-essary to populate your factand dimension tables.

Develop the data model foryour staging database to cap-ture the requisite source datafor this feature.

Determine the data integrityaudits, data cleansing logic,and any data merging require-ments for the source data.

Implement your logical datamodels for both staging andwarehouse repositories.

Implement the ETL code nec-essary to source the data intothe staging database.

Fully test the data in the relatedstaging tables according to yourQA requirements.

Implement the ETL code nec-essary to transform your stag-ing data into your physical starschema tables.

Fully test the data in your starschema according to your QArequirements.

Design, build, and populate theOLAP cube.

Fully test the base table andcalculated measures accordingto your QA requirements.

REPRINT www.cutter.com

1166 BUSINESS INTELLIGENCE ADVISORY SERVICE

Developers must design and model

the feature-related aspects of the

system with an eye toward how their

design will integrate with the

completed system and how it

will affect other developers.

Page 19: Agile Data Warehousing - The Agilist

Develop the report(s) in theuser application.

Fully test the reportedvalues according to yourQA requirements.

Note that it is important for agiledevelopers to maintain a globalperspective while taking a local-ized approach. In other words,developers must design and modelthe feature-related aspects of thesystem with an eye toward howtheir design will integrate with thecompleted system and how it willaffect other developers.

This challenge of FDD emphasizesthe importance of developer col-laboration. There is no substitutefor discussing a feature-specificdesign decision with other devel-opers whose features may beaffected by that decision. Noris this feature-driven approacha substitute for good design.Occasionally a feature may evolveto meet user acceptance criteriabut may not be built on the best

underlying data models anddesign. Refactoring is the agilepractice of redeveloping anaccepted feature using betterdesign principles.

Practice 4: Develop Production-Quality Features

An architectural feature spikerefers to a completed feature: thatis, a feature that is production-ready and “done-done.” Thismeans that all user requirementsrelative to the feature have beenimplemented and fully tested.Highsmith likes the guideline thatdeems features complete when,even if project funding werestopped immediately, users wouldat least have a working productthat meets some requirements.I like the idea that once usersreview and accept the feature,it requires no further action —it’s done!

As additional architectural featurespikes are completed, the systemevolves to a fully production-ready

state of maturity. In fact, usersshould already be using com-pleted features to the extent thatuse does not interfere with ongo-ing development. As users activelyutilize completed features, theyare in a better position to providevaluable feedback. Some of thisfeedback can be immediatelyimplemented with minimal effort,while other feedback may be pri-oritized for later implementation.

Practice 5: Test-Driven Development

Created by Beck, TDD is “the craftof producing automated tests forproduction code, and using thatprocess to drive design and pro-gramming” [4]. Beck advocatesdeveloping tiny bits of functional-ity based on tiny test cases. Thevalue of TDD is that requirementsdetermine the tests: as soon asyour code passes the test, youhave met the requirementsrelated to the test.

We can adopt a similar strat-egy in data warehousing.

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 1177

Source

systems

Staging

databaseWarehouse

repository

Data access

servers

User

applications Feature spike 1

Feature spike 2

Feature spike 3

Feature spike 4

Feature spike 5

QAfrom sourceto staging

QAfrom staging

to warehouse

QAfrom warehouse to access server

QA from access server to

user apps

©2004 Ken Collier

Figure 7 — Architectural feature spikes.

Page 20: Agile Data Warehousing - The Agilist

Unfortunately it is somewhat morechallenging when dealing withpotentially large volumes of data.Here is a list of the TDD practicesthat are proving effective in datawarehousing:

1. Manufacture a test bed.Contrive a replica of yoursource databases that containa handpicked cross section ofdata in the live systems. Thetest bed should include signifi-cant data volume but shouldnot be unmanageable. Thecross section should includerecords that exemplify all thedifferent types of data found inthe source systems, includingdata that reflects all knownquality issues as well as“good” data.

2. Develop a test-bed catalog.This refers to a detaileddescription of the groupings ofrecords found in the test bed.The description should includeknown data anomalies andother characteristics reflectedby the data values and datagroupings. It provides a refer-ence for developers to ensurethat test cases accommodateall possibilities.

3. Develop test cases against thetest bed. Since the test bed is astatic and contrived database,developers can anticipate theresults of each developmenteffort. These anticipated resultsbecome the test of success.

4. Develop tiny and test tiny. Inthe spirit of incremental devel-opment, small test cases

should be developed and thenthe code should be writtenincrementally to pass the test.For example, if ETL logicthat will replace null revenuevalues with a 0.00 in the stagingdatabase is implemented,the developer can easily andquickly verify whether all nullsare successfully replaced with-out affecting non-null values.

5. Test each stage in the archi-tecture. This is a standard datawarehousing QA practice thatinvolves ensuring that the datais correct at each layer in thearchitecture. By combining thispractice with the others listedhere, a developer can developa feature that is of productionquality when it is complete.

Practice 6: Automated Testing

Given all the books and articles ondata warehousing, surprisingly lit-tle has been written about soundtesting practices. Data warehousetesting typically involves somedegree of integration testing, sys-tem testing, data validation, loadtesting, and user acceptance test-ing. Unfortunately, practitionersare left to define their own testingmethods.

As any experienced developerknows, manual testing is tediousand time-consuming. While tradi-tional developers are in the habitof unit-testing, typical systemtesting is often handled by adedicated QA team near the endof the project cycle — a modelthat is not well suited to the

agile practice of testing duringdevelopment.

As mentioned in the introduction,Ward Cunningham has developedthe Framework for Integrated Test,which is an open source frame-work for automated testing in anagile software development envi-ronment [6]. The FIT frameworkincludes code libraries for auto-mated testing using HTML todefine test cases and expectedresults.

While FIT is not directly applica-ble to data warehouse testing, theprinciples and approach underly-ing FIT can be adapted to datawarehouse testing with relativeease. Additionally, I am currentlyworking with a development teamthat automates many of the TDDpractices outlined previously.

We have developed scripts thathelp automate the creation of testcases based on the “source testbed” database. As code is imple-mented that performs data cleans-ing, data transformation, and dataderivations, the developer canspecify an expected result set. Asecond set of scripts comparesthe actual result set with theexpected result set and producesa report detailing the mismatchesfound. It is then incumbent onthe developer to resolve themismatches.

We are using this approach fortesting each table in the stagingdatabase and in the warehouserepository as well as the validation

REPRINT www.cutter.com

1188 BUSINESS INTELLIGENCE ADVISORY SERVICE

Page 21: Agile Data Warehousing - The Agilist

of data in the final reports andmodels. Coupled with user accep-tance testing, this automatedapproach enables developers touse TDD without the challengesinherent in manual testing.

Practice 7: Incremental Design

Like testing, design and modelingshould be integrated into the agiledevelopment iterations. Whileconceptual design largely occursduring the envision cycle or duringiteration zero, ADW practitionersshould not shirk the responsibilityof detailed design. Critics oftenargue that agile methods short-circuit rigorous and valuabledesign practices. In fact, agilemethods value the importance ofgood design. Working code, how-ever, is the ultimate test of design.

In one case, I worked on a projectin which the company had initi-ated the project two full yearsprior to my involvement. The proj-ect team spent the first 18 monthseliciting user requirements anddeveloping volumes of use-casescenarios to describe all possibili-ties. It had spent the next sixmonths in system design and wasin the process of developingexhaustive design models. By thetime the team was ready to begindevelopment, the user communityhad grown tired of waiting forresults and was skeptical that any-thing would ever be developed.

Ambler has developed AgileModeling, including a set of princi-ples and practices for effectively

applying leading software model-ing techniques in an agile devel-opment environment [5]. Amblerdoes not reinvent modelingmethodologies. Instead, his prac-tices are based on a set of guidingprinciples and practices that guidedevelopers to model in smallincrements and then prove theirmodels with working code, anapproach that is inherently agile.

Figure 8 extends the envisionexplore cycle by adding a buildcycle that describes the develop-ment phase of exploration. Thebuild cycle includes model build

test. During a single ADW fea-ture development effort, the buildcycle may be repeated multipletimes.

An approach that meshes particu-larly well with the testing practiceof developing tiny and testing tinyis to model a few aspects of the

feature and then implement andtest just those aspects. Using ourproduct revenue and profit OLAPreport as an example, thisapproach might start by simplyreplicating source tables in thestaging database, building a sim-ple star schema to contain onlythe basic revenue measure, andpopulating the OLAP cube withuncleansed data and minimaltransformation logic. The model-ing activity for staging might sim-ply be the identification of relatedsource tables and their join condi-tions. The modeling activity for thewarehouse repository will simplycontain the product revenue facttable and the dimensions that arespecified. The next build cyclemight add in net and gross profitmeasures or data cleansing logic.

The agility in this approach isclear. The models evolve as thefeature matures, and testing is

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 1199

Cycle 0 Cycle 1 Cycle 3 Cycle 4Cycle 2Ar chi t ectur e

1.0

Requir ement s

Car ds

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Ar chit ect ur e

1.1

Ar chi t ectur e

1.2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 1

Feat ure 2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 3

Feat ure 4

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 5

Feat ure 6

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 7

Cycle 0 Cycle 1 Cycle 3 Cycle 4Cycle 2Ar chi t ectur e

1.0

Requir ement s

Car ds

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Ar chit ect ur e

1.1

Ar chi t ectur e

1.2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 1

Feat ure 2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Cycle 0 Cycle 1 Cycle 3 Cycle 4Cycle 2Ar chi t ectur e

1.0

Requir ement s

Car ds

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Ar chit ect ur e

1.1

Ar chi t ectur e

1.2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 1

Feat ure 2

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 3

Feat ure 4

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 5

Feat ure 6

Feature/Component Requirements Card

Feature/Component ID: Planned Cycle:

Feature/Component Name:

Feature/Component Type:

Feature/Component Description:

Est. Work Effort:

Requirements Uncertainty (H,M,L):

Dependencies with other Features:

Feat ure 7

Envision cycle Explore cycle

Releaseplanning

BI vision

Project scopeand

boundaries

Reviewand

adapt

Iterationplanning

Develop

Build cycle

Model in smallincrements

Prove it with code

Inspectand test

Figure 8 — Modeling in ADW.

Page 22: Agile Data Warehousing - The Agilist

tightly integrated into the develop-ment cycle. It’s important to notethat the value of modeling liesprimarily in the act of creating themodel, not in the artifact itself;and this value increases if themodeling activity is a collaborativeand open discussion. I’m sure youcan recall various whiteboard ses-sions with team members wheregood ideas emerged through theact of discussing a problem andsketching alternative approaches.

Note that modeling is not neces-sarily equivalent to documenta-tion, although models can bemaintained as archival documentsfor future reference. It is importantto understand the cost involvedin maintaining your models.Modeling serves one of two pur-poses: it helps you understandwhat you are working on or com-municate your ideas to others.Once the goal is accomplished,the model should be discarded orbe kept and maintained. Toooften, we are afraid to discard a

model once it has served its pur-pose. Unfortunately, keeping amodel without maintaining it cancause unintended problems. Itdoesn’t take long before themodel no longer accuratelyreflects the working code. Ambleroutlines the following principlesfor modeling:

Discard temporary models.These models have servedtheir purpose.

Formalize contract models.These models are needed tocapture agreements.

Update only when it hurts.While it is important to main-tain formalized models, thisshould not be the highestpriority.

Practice 8: Barely Sufficient Modeling

This practice is most likely towake the sleeping giants. Formore than a decade, the devel-opment mantra has been thatsuccess lies in comprehensive

requirements analysis andexhaustive design. The inherentproblem with this approach is thatit is difficult and time-consumingand offers false security, sincethings inevitably change and initialunderstandings are often wrong.

There are actually two extremesin data warehouse and databasemodeling: (1) nonexistent model-ing, which leads to patchwork sys-tems; and (2) excessive modeling,which leads to overburdeneddevelopment and documentation.The aim of Agile Modeling is tofind the sweet spot between theseextremes that is appropriate foreach ADW project (see Figure 9).Highsmith uses the following anal-ogy: If you are trekking throughthe desert, you will benefit froma map, a hat, good boots, and acanteen of water. But you won’tmake it if you burden yourselfwith hundreds of gallons of water,too much gear, and a collectionof books about the desert. But itwould be foolish to try to makethe journey without a minimumof supplies.

While modeling is critical to suc-cess, effective Agile Modelingrequires producing only themodels necessary to achieve theintended goal of either under-standing or communicating. Oneof the goals of Agile Modeling isbare sufficiency. That is, agilemodels should accomplish butnot exceed their intended goals.Agile models have the followingcharacteristics:

REPRINT www.cutter.com

2200 BUSINESS INTELLIGENCE ADVISORY SERVICE

Sweet spot

• Nonexistent modeling

• Poorly thought-out software

• Significant rework

• No documentation

• Excessive models

• Overburdened development

• Loss of focus on working software

• Excessive model maintenance

Agile Modeling goal:

Model enough to explore and document your system effectively,

but not so much that it becomes a burden.

Figure 9 — The goal of Agile Modeling.

Page 23: Agile Data Warehousing - The Agilist

1. Agile models fulfill theirpurpose. If the purpose is tocommunicate, then the drivingmodeling questions are “Towhom?” and “To communicatewhat?” If the purpose is under-standing, then the drivers are“What is the question?” and“Who needs to be involved?”

2. Agile models are understand-able. They are developed usingthe correct “language” for theintended audience, and theyprovide just enough detail toachieve their purpose.

3. Agile models are sufficientlyaccurate. Models do not needto be 100% accurate. Modelsmust be accurate enough toserve their purpose. If a logicaldata model refers to the cus-tomer as a “client” rather thana “customer,” it may still be suf-ficiently accurate to serve itsintended purpose.

4. Agile models are sufficientlyconsistent. Similarly, modelsneed not be 100% consistent.Models must, however, be con-sistent enough to serve theirpurpose. If an employee’s firstname appears as a varchar(30)in the source system modelbut as a varchar(50) in thestaging data model, it is clearlyinconsistent. Is this a show-stopper? Probably not.

5. Agile models are sufficientlydetailed. Sufficient detaildepends on the audience andpurpose. For example, a con-ceptual architecture like theone in Figure 4 (on page 11) is

perfectly fine for a high-leveldiscussion about the flow ofdata. However, the developersneed much more detailed logi-cal and physical data modelsfor the implementation of thatarchitecture.

6. Agile models provide positivevalue. Does the value of themodel outweigh the cost ofcreating and maintaining it? Inmany cases, flipchart diagramsand digital snapshots are justas valuable as a more detailedERwin or Visio diagram, andthe flipchart diagrams are fasterand easier to create.

7. Agile models are as simpleas possible. Limit the level ofdetail to only what is neededto serve the purpose. Limit thenotational symbols to only whatis necessary to serve the pur-pose. Sometimes a rough blockdiagram will achieve the samepurpose as a more formalentity-relationship diagramwith all of its notational power.

Practice 9: Build the RightProject Team

The Agile Manifesto value “Indi-viduals and interactions over

processes and tools” is a recogni-tion that no formalized develop-ment process can substitute forthe right people. We have learnedthat people are not plug-and-playresources. As Highsmith states,“Getting the right people (whichof course implies getting rid of thewrong ones) and the right man-agers determines project successmore than any other factor” [7].My own experience supports thisclaim.

Data warehousing requires abroad and varied set of technicalskills. Additionally, strong linkagebetween technical developers andthe business needs of users is crit-ical. Finally, executive supportfor an ADW approach is essential.

Whenever possible, I advocate theADW project organizational struc-ture that is depicted in Figure 10.In this structure, the overall proj-ect has a specific executive spon-sor. The BI manager must have aclear understanding of the needsof users as well as an understand-ing of data warehousing funda-mentals to help manage userexpectations. The agile projectmanager must be well versed inthe APM methods summarized

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 2211

User

communityDevelopment

team

Executive

sponsor

BI managerAgile

projectmanager

Figure 10 — An ADW organizational structure.

Page 24: Agile Data Warehousing - The Agilist

previously. The role of the agileproject manager is different fromthat of a traditional project man-ager. This group makes up thecore project steering committee,which may involve other appropri-ate participants as well. Projectparticipant roles are as follows:

1. The executive sponsor:

Provides vision and direc-tion to the agile team

Conducts periodic execu-tive reviews

2. The agile project manager:

Serves as the primary pointof contact between therest of the stakeholdercommunity and the devel-opment team

Facilitates collaborativedesign and development

Ensures that agile develop-ers have the necessary“tools”

Communicates and collab-orates with the BI manager

3. The BI manager:

Represents the usercommunity

Liaisons with select usersfor reviews and feedback

Communicates andcollaborates with theagile project manager

The development team mustcomprise the best developerswith the appropriate mix of data

warehousing skills for the project.Additionally, ADW developersmust have the willingness andability to work in an agile environ-ment. Some extremely talenteddata warehouse developers areuncomfortable with the lack ofcomprehensive up-front require-ments and exhaustive designthat mark traditional development.In my experience, however, oncethey participate in a few iterations,developers warm to the approach,because it more accurately reflectshow developers naturally work.

The selection of ADW users isanother critical success factor. Itshould be apparent by now thatthe role of the user in ADW is any-thing but passive. Users must becommitted to the project. Theymust have a sense of ownershipof project success or failure. Theymust also be willing to contributetheir time and effort to the envi-sioning phase and the frequentreviews in the exploration phase.The user community shouldinclude a good cross section ofthe entire user population.

In this report, I periodicallyreferred to “project stakeholders.”All projects have some involve-ment by people who are not nec-essarily developers or users,but who have some “skin inthe game.” Author and CutterConsortium Senior Consultant RobThomsett categorizes stakeholdersinto three levels of participants,each with a different potentialimpact on the project [11]:

1. Critical. These participants canprevent project success beforeor after implementation. Theyare showstoppers.

2. Essential. These participantscan delay your project fromachieving success, but you canwork around them if necessary.

3. Nonessential. These are inter-ested third parties. They haveno direct impact on projectsuccess, but if they are notincluded, they can becomeessential or critical.

WRAP-UP AND FINALTHOUGHTS

This report discusses how totransform agile principles intoagile practices that enable devel-opment teams to create a higher-quality data warehouse that betteraccommodates the needs of theuser as these needs evolve. Byproviding users with working fea-tures early and frequently, agiledata warehouse developers canavoid many of the pitfalls inherentin data warehousing projects.Users’ expectations will naturallyalign with what is being devel-oped and demonstrated. Devel-opers’ understanding of userneeds will naturally become morerefined and more accurate. Thesystem will evolve into a high-value business tool that serves itsintended purpose. Responsibilityfor success will be shared equallyamong all stakeholders. It is evenpossible that users may determinethat the system is sufficientlycomplete before the end of the

REPRINT www.cutter.com

2222 BUSINESS INTELLIGENCE ADVISORY SERVICE

Page 25: Agile Data Warehousing - The Agilist

scheduled project cycle, reducingdevelopment cost and increas-ing ROI.

ADW establishes an environmentthat promotes continuous innova-tion by satisfying the requirementsof users. It also promotes systemadaptability by establishing theability to rapidly respond tochange and accommodatingfuture user requirements. ADWcan also increase ROI by deliver-ing working applications early aswell as reducing developmentschedules.

In 2002, IEEE Software reportedthat agile methods demonstratedsignificant improvements in pro-ductivity, cost, and cycle time rela-tive to industry benchmarks. Thefindings indicated an increase inproductivity of 15%-23%; a reduc-tion in development cost of 5%-7%; and a reduction in time tomarket of 25%-50% [9]. Currently,Cutter Consortium metrics expertshave conducted similar assess-ments of agile methods, and thefindings were quite favorable(download the results here:“How Agile Projects Measure Up,and What This Means to You” —www.cutter.com/offers/measureup.html).

As I mentioned at the beginningof this report, I have experiencedboth data warehousing successesand failures. In my assessmentof project failures or struggles, Iconsistently come to the sameconclusions. To be successful,data warehousing projects require

a high degree of user/developerinteraction, adaptation, and theright people. Thus far, myexperiences with applying agilityin data warehousing have been aresounding success. Perhaps thebest testimonial to the success ofADW is a recent conversation Ihad with my wife. During the daysleading up to the final executivereview and project wrap-up, mywife wondered how things weregoing. When I told her that theproject would be finished in twodays, she asked, “If this is the endof the project, then why aren’t youall stressed out and staying up allnight?” She’s obviously grownfamiliar with the traditional proj-ect home stretch. She alreadyprefers the new approach.

As with anything new, the princi-ples and practices that I havebeen incorporating into ADW areevolving and maturing with expe-rience. I recently started a newADW project, and so far, so good.However, I fully expect to incorpo-rate the lessons learned duringthis project into the refinement ofADW practices. So stay tuned, Imay have an update for you in mynext report. Meanwhile, I hopeyou will consider running an ADWproject in your organization. Try iton a smaller data mart or a revi-sion of your current data ware-housing system. If you do it right,

I know you’ll be pleased withthe outcome.

ABOUT THE AUTHORS

Ken Collier is a Senior Consultantwith Cutter Consortium’s BusinessIntelligence and Agile Product &Project Management practices,where he brings more than 20years’ experience in advancedcomputing and technology. Withexpertise in data warehousing,business intelligence (BI), soft-ware engineering, and agile meth-ods, Dr. Collier extends his skillsacross many industries. He wasa former VP at KSolutions, Inc.,and former head of the BusinessIntelligence Solutions practiceat KPMG Consulting (nowBearingPoint). In both roles, Dr.Collier was responsible for BIsolutions and services, includingoverseeing and managing multiplesoftware development projects.

Dr. Collier spent 10 years as atenured Associate Professor ofcomputer science engineering atNorthern Arizona University (NAU)specializing in software engineer-ing, database theory, and machinelearning. At NAU, he cofoundedthe Center for Data Insight, a lead-ing center of excellence in datamining and advanced analytics.Dr. Collier has been an invitedspeaker on various BI and agiledevelopment topics. He holdsa PhD in computer scienceengineering from Arizona StateUniversity. He can be reachedat [email protected].

©2010 CUTTER CONSORTIUM REPRINT

EXECUTIVE REPORT 2233

Agile data warehousing establishes

an environment that promotes

continuous innovation by satisfying

the requirements of users.

Page 26: Agile Data Warehousing - The Agilist

Jim Highsmith is Director ofCutter Consortium’s Agile Product& Project Management practiceand a frequent keynoter at CutterSummits. He is the recipient of the2005 Stevens Award in recognitionof his work in adaptive softwaredevelopment and agile processes.Mr. Highsmith has 25-plus yearsexperience as an IT manager,product manager, project man-ager, consultant, and softwaredeveloper. He has consulted withIT and product development orga-nizations and software companiesin the US, Europe, Canada, SouthAfrica, Australia, Japan, India, andNew Zealand to help them adaptto the accelerated pace of devel-opment in increasingly complex,uncertain environments. His areasof consulting include agile soft-ware development, project man-agement, and collaboration.

Mr. Highsmith is author of AgileProject Management: CreatingInnovative Products, in itssecond edition, and AgileSoftware DevelopmentEcosystems. His first book,Adaptive Software Development:A Collaborative Approach toManaging Complex Systems,won the prestigious Jolt Award.He is coeditor (with Cutter SeniorConsultant Alistair Cockburn) of

the Agile Software DevelopmentSeries. Mr. Highsmith is alsocoauthor of the Agile Manifesto,coauthor of the Declaration ofInterdependence for project lead-ers, founding member of the AgileAlliance, and cofounder and firstPresident of the Agile ProjectLeadership Network. A frequentspeaker at conferences world-wide, he has written dozens ofarticles for major industry publica-tions. He holds a BS in electricalengineering and an MS in man-agement. He can be reached [email protected].

REFERENCES

1. Adelman, Sid, and Larissa Moss.“Data Warehouse Failures.” TheData Administration Newsletter,Issue 14.0, October 2000.

2. The AgileAlliance. TheAgile Manifesto, 2001 (www.agilemanifesto.org).

3. The AgileAlliance. “PrinciplesBehind the Agile Manifesto,”2001 (www.agilemanifesto.org/principles.html).

4. The AgileAlliance. “Test-DrivenDevelopment.” Version 1.2, June2003.

5. Ambler, Scott W. AgileModeling: Effective Practicesfor Extreme Programming andthe Unified Process. John Wiley& Sons, 2002.

6. Cunningham, Ward, and JimShore. “Framework for IntegratedTest.” December 2004 (http://fit.c2.com).

7. Highsmith, Jim. Agile ProjectManagement: Creating InnovativeProducts. 2nd edition. Addison-Wesley, 2009.

8. Kimball, Ralph. The DataWarehouse Toolkit: The CompleteGuide to Dimensional Modeling.2nd ed. John Wiley & Sons, 2002.

9. Reifer, Donald J. “How GoodAre Agile Methods?” IEEESoftware, Vol. 19, No. 4, July/August 2002.

10. TechTarget. Data WarehouseFailures survey, 2004.

11. Thomsett, Rob. RadicalProject Management. PrenticeHall PTR, 2002.

REPRINT www.cutter.com

2244 BUSINESS INTELLIGENCE ADVISORY SERVICE

Page 27: Agile Data Warehousing - The Agilist

CUTTER CONSORTIUM

Consulting

Agile Assessment

Receive a complete assessment of your

organization’s use of agile methods, its

software engineering practices, and its

project management skills and capabilities,

performed by Cutter’s team of experts and

led by Practice Director Jim Highsmith.

Technical Debt Assessmentand Valuation

Technical debt is a real cost. Cutter’s

Technical Debt Assessment and Valuation,

led by Israel Gat, helps you determine

how much money is required to “pay back”

your software’s technical debt and answer

the question, “Is your software an asset

or a liability?”

Agile Enablement

Which agile practices will bring the

greatest ROI to your organization? Cutter’s

Agile Enablement engagement helps you

implement agile techniques that are right

for your organization, shorten your product

development schedule, and increase the

quality of the resultant product.

Training

Advanced Agile Project Management:Creating the Flexibility to Thrive

In this two-day workshop, Jim Highsmith

explores a variety of advanced topics —

from the Agile Triangle to release planning

to phase-gate project governance, and

more — giving your organization the tools

it needs to enhance its ability to deliver

successful agile projects and fully embrace

the agile ethic.

Agile Business Intelligence

Bring Cutter’s Ken Collier into your firm

and discover the values, principals, and

practices behind Agile Business Intelligence.

Learn how you can use this highly iterative

and evolutionary approach to building

any business intelligence system to help

you generate early and frequent analytical

results for review and feedback by stake-

holders.

Agile Metrics

If you want to ensure your gains from agile

are above industry norms, Cutter’s Agile

Metrics techniques provides real-world

metrics of agile projects and how they

measure up against waterfall and other

projects in terms of productivity, schedule,

and quality.

Research and Analysis

Become a client of Cutter’s Agile Practice

and your team will immediately benefit

from access to our online resource library,

including Webinars, podcasts, white papers,

reports, and articles.

Popular Research:

n “The Agile Triangle — Quality Today

and Tomorrow” by Jim Highsmith

n “How Agile Projects Measure Up,

and What This Means to You” by

Michael Mah

n “To Release No More or To ‘Release’

Always: Parts I-IV” by Israel Gat

n “Pitfalls of Agile V: Quality Assurance?”

by Jens Coldewey

n “The Siren Song of Improving

Productivity” by Jim Highsmith

n “Making Agile ‘Sticky’: Strategies for

Long-Term Success with Agile Adoption”

by Amr Elssamadisy

Test-Drive the Service

To gain a free trial to Cutter’s Agile Product

& Project Management resource center or

discuss how Cutter’s consultants can help

you successfully deploy agile practices at

your organization, call us at +1 781 648

8700, send e-mail to [email protected],

or visit www.cutter.com.

CUTTER CONSORTIUM

Agile Product & ProjectManagement Practice Cutter’s Agile Practice unites agile methods such as iterative development, test-

first design, project chartering, and onsite customers with cutting-edge ideas on

collaboration, governance, and measurement to help your firm respond quickly

to market changes, increase customer satisfaction, innovate, and deliver high ROI.

Jim Highsmith

Page 28: Agile Data Warehousing - The Agilist

Abou

t the

Pra

ctice Business Intelligence

PracticeThe strategies and technologies of business intelligence and knowledgemanagement are critical issues enterprises must embrace if they are to remaincompetitive in the e-business economy. It’s more important than ever to makethe right strategic decisions the first time.

Cutter Consortium’s Business Intelligence Practice helps companies take all theirenterprise data, augment it if appropriate, and turn it into a powerful strategicweapon that enables them to make better business decisions. The practice is uniquein that it provides clients with the full picture: technology discussions, productreviews, insight into organizational and cultural issues, and strategic advice acrossthe full spectrum of business intelligence. Clients get the background they need tomanage technical issues like data cleansing as well as management issues such ashow to encourage employees to participate in knowledge sharing and knowledgemanagement initiatives. From tactics that will help transform your company to aculture that accepts and embraces the value of information, to surveys of the toolsavailable to implement business intelligence initiatives, the Business IntelligencePractice helps clients leverage data into revenue-generating information.

Through Cutter’s subscription-based service and consulting, mentoring, and training,clients are ensured opinionated analyses of the latest data warehousing, datamining, knowledge management, CRM, and business intelligence strategies andproducts. You’ll discover the benefits of implementing these solutions, as wellas the pitfalls companies must consider when embracing these technologies.

Products and Services Available from the Business Intelligence Practice

• The Business Intelligence Advisory Service• Consulting• Inhouse Workshops• Mentoring• Research Reports

Other Cutter Consortium PracticesCutter Consortium aligns its products and services into the nine practice areasbelow. Each of these practices includes a subscription-based periodical service,plus consulting and training services.

• Agile Product & Project Management • Business Intelligence• Business-IT Strategies• Business Technology Trends & Impacts• Enterprise Architecture• Enterprise Risk Management & Governance• Innovation & Enterprise Agility• IT Management• Measurement & Benchmarking Strategies• Social Networking• Sourcing & Vendor Relationships

Senior ConsultantTeamThe Senior Consultants on Cutter’s BusinessIntelligence team are thought leaders in themany disciplines that make up businessintelligence. Like all Cutter ConsortiumSenior Consultants, each has gained a stellarreputation as a trailblazer in his or her field.They have written groundbreaking papers andbooks, developed methodologies that havebeen implemented by leading organizations,and continue to study the impact thatbusiness intelligence strategies and tactics arehaving on enterprises worldwide. The teamincludes:

• Verna Allee• David Coleman• Ken Collier• Michael Dressler• Lance Dublin• Clive Finkelstein• Bob Furniss• Curt Hall• David C. Hay• Dave Higgins• Vince Kellen• Larissa T. Moss• Ciaran Murphy• Ken Orr• Gabriele Piccoli• Thomas C. Redman• Ricardo Rendón• Karl M. Wiig