the final frontier v3
![Page 1: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/1.jpg)
Agile Data Warehouse The Final Frontier
![Page 2: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/2.jpg)
@tbunio
bornagainagilist.wordpress.com
www.protegra.com
Terry Bunio
![Page 3: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/3.jpg)
Who Am I?
• Data Base Administrator
– Oracle, SQL Server, ADABAS
• Data Architect
– Investors Group, LPL Financial, Manitoba
Blue Cross, Assante Financial
• Agilist
– Innovation Gamer, Team Member, Project
Manager, PMO on SAP Implementation
![Page 4: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/4.jpg)
![Page 5: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/5.jpg)
![Page 6: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/6.jpg)
![Page 7: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/7.jpg)
Learning Objectives
• Learn about how a Data Warehouse Project
can be Agile
• Introduce Agile practices that can help to be
DWAgile
• Introduce DW practices that can help to be
DWAgile
![Page 8: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/8.jpg)
What is Agile?
• Deliver as frequently as possible
• Minimize Inventory
– All work that doesn’t directly contribute to
delivering value to the client
– Typically value is realized by code
![Page 9: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/9.jpg)
Enterprise Models
Spock Method
Visualization
Spectre of the Agility
Database/Data Warehouse Architecture
DWAgile Practices
![Page 10: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/10.jpg)
Data Warehouse
• Definition
– “a database used for reporting and data analysis.
It is a central repository of data which is created
by integrating data from multiple disparate
sources. Data warehouses store current as well
as historical data and are commonly used for
creating trending reports for senior management
reporting such as annual and quarterly
comparisons.” – Wikipedia.org
![Page 11: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/11.jpg)
Data Warehouse
• Can refer to:
– Reporting Databases
– Operational Data Stores
– Data Marts
– Enterprise Data Warehouse
– Cubes
– Excel?
– Others
![Page 12: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/12.jpg)
Two sides of Database Design
![Page 13: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/13.jpg)
Two design methods
• Relational – “Database normalization is the process of organizing
the fields and tables of a relational database to
minimize redundancy and dependency. Normalization
usually involves dividing large tables into smaller (and less
redundant) tables and defining relationships between them.
The objective is to isolate data so that additions, deletions,
and modifications of a field can be made in just one table
and then propagated through the rest of the database via
the defined relationships.”
![Page 14: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/14.jpg)
Two design methods
• Dimensional – “Dimensional modeling always uses the concepts of facts
(measures), and dimensions (context). Facts are typically
(but not always) numeric values that can be aggregated,
and dimensions are groups of hierarchies and descriptors
that define the facts.”
![Page 15: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/15.jpg)
Relational
• Relational Analysis
– Database design is usually in Third Normal
Form
– Database is optimized for transaction
processing. (OLTP)
– Normalized tables are optimized for
modification rather than retrieval
![Page 16: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/16.jpg)
Normal forms
• 1st - Under first normal form, all occurrences of a
record type must contain the same number of fields.
• 2nd - Second normal form is violated when a non-
key field is a fact about a subset of a key. It is only
relevant when the key is composite
• 3rd - Third normal form is violated when a non-key
field is a fact about another non-key field
Source: William Kent - 1982
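A minimal sketch of the third normal form rule above, using Python's built-in sqlite3 module. The tables and rows (employee_flat, department, employee) are hypothetical illustrations and do not come from the project. In the flat table, location is a fact about department (a non-key field), so one change of location has to touch every employee row in that department. In the normalized pair it touches one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Violates 3NF: location is a fact about department, a non-key field.
CREATE TABLE employee_flat (
    employee   TEXT PRIMARY KEY,
    department TEXT,
    location   TEXT
);
INSERT INTO employee_flat VALUES
    ('kirk',  'command',     'bridge'),
    ('spock', 'command',     'bridge'),
    ('scott', 'engineering', 'engine room');

-- 3NF: the department's location is stored once, in its own table.
CREATE TABLE department (
    department TEXT PRIMARY KEY,
    location   TEXT
);
CREATE TABLE employee (
    employee   TEXT PRIMARY KEY,
    department TEXT REFERENCES department(department)
);
INSERT INTO department VALUES
    ('command', 'bridge'),
    ('engineering', 'engine room');
INSERT INTO employee VALUES
    ('kirk', 'command'), ('spock', 'command'), ('scott', 'engineering');
""")

# Moving the command department: count how many rows each design must modify.
flat_rows_touched = conn.execute(
    "UPDATE employee_flat SET location = 'battle bridge' "
    "WHERE department = 'command'"
).rowcount
normalized_rows_touched = conn.execute(
    "UPDATE department SET location = 'battle bridge' "
    "WHERE department = 'command'"
).rowcount
```

This is the "modification in just one table" property that the normalization definition describes.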
![Page 17: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/17.jpg)
Dimensional
• Dimensional Analysis
– Star Schema/Snowflake
– Database is optimized for analytical
processing. (OLAP)
– Facts and Dimensions optimized for retrieval
• Facts – Business events – Transactions
• Dimensions – context for Transactions
– Accounts
– Products
– Date
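A small sketch of the fact and dimension split, again with Python's sqlite3. All table names, column names, and rows (dim_account, dim_product, dim_date, fact_transaction) are assumptions made for illustration. The fact table holds one row per business event with a numeric measure, and each dimension supplies context that a report can group by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: context for the transactions.
CREATE TABLE dim_account (account_key INTEGER PRIMARY KEY, account_name TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

-- Fact: one row per business event, with a measure that can be aggregated.
CREATE TABLE fact_transaction (
    account_key INTEGER REFERENCES dim_account(account_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    amount      REAL
);

INSERT INTO dim_account VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO dim_product VALUES (1, 'Fund A'), (2, 'Fund B');
INSERT INTO dim_date VALUES (20130101, 2013, 1), (20130401, 2013, 2);
INSERT INTO fact_transaction VALUES
    (1, 1, 20130101, 100.0),
    (1, 2, 20130401, 50.0),
    (2, 1, 20130401, 25.0);
""")

def total_by_quarter(connection):
    """Aggregate the fact measure using the date dimension as context."""
    return connection.execute("""
        SELECT d.year, d.quarter, SUM(f.amount)
        FROM fact_transaction AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY d.year, d.quarter
        ORDER BY d.year, d.quarter
    """).fetchall()

quarter_totals = total_by_quarter(conn)
```

The same fact table can be grouped by account or product by joining the other dimensions, which is what makes the layout convenient for retrieval.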
![Page 18: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/18.jpg)
Relational
![Page 19: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/19.jpg)
Dimensional
![Page 20: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/20.jpg)
![Page 21: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/21.jpg)
Kimball-lytes
• Bottom-up - incremental
– Operational systems feed the Data
Warehouse
– Data Warehouse is a corporate dimensional
model that Data Marts are sourced from
– Data Warehouse is the consolidation of Data
Marts
– Sometimes the Data Warehouse is generated
from Subject area Data Marts
![Page 22: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/22.jpg)
Inmon-ians
• Top-down
– Corporate Information Factory
– Operational systems feed the Data
Warehouse
– Enterprise Data Warehouse is a corporate
relational model that Data Marts are sourced
from
– Enterprise Data Warehouse is the source of
Data Marts
![Page 23: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/23.jpg)
The gist…
• Kimball’s approach is easier to implement as
you are dealing with separate subject areas,
but can be a nightmare to integrate
• Inmon’s approach has more upfront effort to
avoid these consistency problems, but takes
longer to implement.
![Page 24: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/24.jpg)
Spectre of the Agility
![Page 25: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/25.jpg)
Data Warehouse Project
• Incremental - Kimball
– In Segments
– Detailed Analysis
– Development
– Deploy
– Long Feedback loop
– Considerable changes
– Rework
– Defects
• Waterfall - Inmon
– Detailed Analysis
– Large Development
– Large Deploy
– Long Feedback loop
– Extensive changes
– Many Defects
![Page 26: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/26.jpg)
![Page 27: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/27.jpg)
Popular Agile Data Warehouse Pattern
• Son’a method
– Analyze data requirements department by
department
– Create Reports and Facts and Dimensions for
each
– Integrate when you do subsequent
departments
![Page 28: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/28.jpg)
The two problems
• Conforming Dimensions
– A Dimension conforms when it has
equivalent structure and content
– Is a client defined by Marketing the same as
Finance?
• Probably not
– If the Dimensions do not conform, this
severely hampers the Data Warehouse
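One way to make the conformance question concrete is a sketch like the following, which applies the slide's definition (equivalent structure and content) to two departmental versions of a client dimension. The Dimension class, the column names, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dimension:
    columns: tuple    # column names, in order
    rows: frozenset   # each row is a tuple matching `columns`

def conformance(a, b):
    """Return (same_structure, same_content) for two dimensions."""
    same_structure = a.columns == b.columns
    # Content can only be compared meaningfully when the structure matches.
    same_content = same_structure and a.rows == b.rows
    return same_structure, same_content

# Marketing and Finance each define "client" with different attributes.
marketing_client = Dimension(
    ("client_id", "name", "segment"),
    frozenset({(1, "Acme", "retail")}),
)
finance_client = Dimension(
    ("client_id", "name", "billing_account"),
    frozenset({(1, "Acme", "B-17")}),
)

# Two copies with the same columns but different members.
shared_client = Dimension(
    ("client_id", "name"),
    frozenset({(1, "Acme"), (2, "Globex")}),
)
stale_client = Dimension(
    ("client_id", "name"),
    frozenset({(1, "Acme")}),
)
```

A check of this kind could run whenever a new department's dimensions are added, so a mismatch surfaces before reports are built on it.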
![Page 29: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/29.jpg)
The two problems
• Modeling the use of the data versus the data
– By using reporting needs as the primary
foundation for the data model, you are modeling
the use of the data rather than the data
– This will cause more rework in the future as the
use of the data is more likely to change than the
data itself.
![Page 30: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/30.jpg)
Where is she?
![Page 31: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/31.jpg)
Where is the true Agility?
• Iterations not Increments
• Brutal Visibility/Visualization
• Short Feedback loops
• Just enough requirements
• Working on enterprise priorities – not just for
an individual department
![Page 32: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/32.jpg)
Fact
• True iterative development on a Data
Warehouse project is hard – perhaps harder
than a traditional Software Development
project
– ETL, Data Models, and Business Intelligence
stories can have a high impact on other
stories
– Can be difficult to create independent stories
– Stories can have many prerequisites
![Page 33: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/33.jpg)
Fiction
• True iterative development on a Data
Warehouse project is impossible
– ETL, Data Models, and Business Intelligence
stories can be developed iteratively
– Independent stories can be developed
– Stories can have many prerequisites – but
this can be limited
![Page 34: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/34.jpg)
Agile Mindset
• We need to implement an Agile Mindset to
Data Modelling
– What is just enough Data Modelling?
– And do no more…
![Page 35: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/35.jpg)
Our Mission
• “Data... the Final Frontier. These are the
continuing voyages of the starship Agile.
Her on-going mission: to explore strange
new projects, to seek out new value and
new clients, to iteratively go where no
projects have gone before.”
![Page 36: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/36.jpg)
The Prime Directive
![Page 37: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/37.jpg)
![Page 38: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/38.jpg)
![Page 39: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/39.jpg)
The Prime Directive
• Is a vision or philosophy that binds the
actions of Starfleet
• Can a Data Warehouse project truly be
Agile without a Vision of either the Business
Domain or Data Domain?
– Essentially it is then just an Ad Hoc Data
Warehouse. Separate components that may fit
together.
– How do we ensure we are working on the right
priorities for the entire enterprise?
![Page 40: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/40.jpg)
Enterprise Data Model?
![Page 41: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/41.jpg)
Torture
• Why does the creation of Enterprise Data
Models feel like torture?
– Interrogation
– Coercion
– Agreement on Excessive detail without direct
alignment to business value
![Page 42: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/42.jpg)
Enterprise Models
![Page 43: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/43.jpg)
Enterprise Models
![Page 44: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/44.jpg)
Two new models
![Page 45: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/45.jpg)
Agile Enterprise Normalized Data Model
• Confirms the major entities and the
relationships between them
– 30-50 entities
• Confirms the Data Domain
• Starts the definition of a Normalized Data
Model that will be refined over time
– Completed in 1 – 4 weeks
![Page 46: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/46.jpg)
Agile Enterprise Normalized Data Model
• Is just enough to understand the data
domain so that the iterations can proceed
• Is not mapping all the attributes
– Is not BDUF
• Is an Information Map for the Data Domain
• Contains placeholders for refinement
– Like a User Story Map
![Page 47: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/47.jpg)
Agile Enterprise Dimensional Data Model
• Confirms the Business Objects and the
relationships between them
– 10-15 entities
• Confirms the Business Domains
• Starts the definition of a Dimensional Data
Model that will be refined over time
– Completed in 1 – 2 weeks
![Page 48: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/48.jpg)
Agile Enterprise Dimensional Data Model
• Is just enough to understand the business
domain so that the iterations can proceed
– And to validate the understanding of the data
domain
• Is not mapping all the attributes
– Is not BDUF
• Is an Information Map for the Business Domain
• Contains placeholders for refinement
– Like a User Story Map
![Page 49: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/49.jpg)
Agile Information Maps
![Page 50: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/50.jpg)
Agile Information Maps
• Agile Information Maps allow for:
– Efficient Navigation of the Data and Business
Domains
– Ability to set up ‘Neutral Zones’ for areas that
need more negotiation
– Visual communication of the topology of the
Data and Business Domains
• Easier and more accurate to validate than text
• ‘feels right’
![Page 51: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/51.jpg)
Agile Information Maps
• Are
– Our vision
– Our Maps for the Data and Business Domains
– A guide for our solution
– Minimizes rework and refactoring
– Our Prime Directive
– Data Models
![Page 52: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/52.jpg)
Kimball or Inmon?
![Page 53: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/53.jpg)
Spock
• Hybrid approach
– It is only logical
– Needs of the many outweigh the needs of the
few – or the one
![Page 54: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/54.jpg)
Spock Approach
[Diagram: Business Domain Spike; Agile Normalized Data Model; ODS; Agile Dimensional Data Model; DW; three DMs]
![Page 55: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/55.jpg)
Spock Approach
• Business Domain Spike
• Agile Information Maps
– Agile Enterprise Normalized Data Model
– Agile Enterprise Dimensional Data Model
• Implement
– Operational Data Store
– Dimensional Data Warehouse
• Reporting can then be done from either
![Page 56: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/56.jpg)
Business Domain Spike
• Needs to precede work on Agile Information
Maps
• Need to understand the business and
industry before you can create Data or
Business Information Maps
• Can take 1-2 weeks for an initial
understanding
– Constant refinement
![Page 57: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/57.jpg)
Benefits of Spock Approach
• Agile Enterprise Normalized Data Model
– Validates knowledge of Data Domain
– Ensure later increments don’t uncover data
that was previously unknown and hard to
integrate
• Minimizes rework and refactoring
– True iterations
• Confirm at high level and then refine
![Page 58: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/58.jpg)
Benefits of Spock Approach
• Agile Enterprise Dimensional Data Model
– Validates knowledge of Business Domain
– The process of ‘cooking down’ to a
Dimensional Model validates design and
identifies areas of inconsistencies or errors
• Especially true when you need to design how
changes and history will be handled
– True iterations
• Confirm at high level and then refine
![Page 59: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/59.jpg)
Benefits of Spock Approach
• Operational Data Store
– Model data relationally to provide enterprise
level operational reports
– Consolidate and cleanse data before it is
visible to end-users
– Is used to refine the Agile Enterprise
Normalized Data Model
– Start creating reports to validate data model
immediately!
![Page 60: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/60.jpg)
Benefits of Spock Approach
• Dimensional Data Warehouse
– Model data dimensionally to provide
enterprise level analytical reports
– Provide full historical data and context for
reports
– Is used to refine the Agile Enterprise
Dimensional Data Model
– Clients can start creating reports to validate
data model immediately!
![Page 61: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/61.jpg)
Do we need an ODS and DW?
• Relational Analysis provides
– Validation of the Data domain
• Dimensional Analysis provides
– Validation of the Business domain
– Additional level of confirmation of the Data
domain as the relational model is translated
into a dimensional one
• Much easier for inconsistencies and errors to
hide in 300+ tables as opposed to 30+
![Page 62: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/62.jpg)
Most Importantly..
• Operational Data Store
– Minimal Data Latency
– Current state
– Allows for efficient Operational Reporting
• Data Warehouse
– Moderate Data Latency
– Full history
– Allows for efficient Analytical Reporting
![Page 63: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/63.jpg)
Agile Approach
• With an Agile approach you can deliver just
enough of an Operational Data Store or Data
Warehouse based on needs
– No longer do they need to be a huge deliverable
• Neither presumes a complete implementation
is required
• The Information Models allow for iterative
delivery of value
![Page 64: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/64.jpg)
How do we work iteratively on
a Data Warehouse?
![Page 65: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/65.jpg)
Increments versus iterations
• Increments
– Series by series – department by department
• Iterations
– Story by story – episode by episode
• Enterprise prioritization
– Work on the highest priority for the enterprise
– Not just within each series/department
![Page 66: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/66.jpg)
![Page 67: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/67.jpg)
Iterative Focus
• Instead of focusing on trying to have a
complete model, we focused on creating
processes that allow us to deliver changes
within 30 minutes from model to deployment
![Page 68: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/68.jpg)
Captain, we need more Visualization!
![Page 69: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/69.jpg)
The View Screen
![Page 70: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/70.jpg)
The View Screen
• Enabled bridge to bridge communications
• Provided visual images in and around the
ship
– From different angles
– How did that work?
• Allowed for more understanding of the
situation
![Page 71: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/71.jpg)
Visualization
![Page 72: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/72.jpg)
Visualization
• Is required to:
– Report Project status
– Provide a visual report map
![Page 73: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/73.jpg)
Kanban Board
• We used a standard Kanban board to track
stories as we worked on them
– These stories resulted in ETL, Data Model,
and Reporting tasks
– We had a Data Model/ETL board and a Report
board
– ETL and Data Model required a foundation
created by the Information Maps before we
could start on stories
![Page 74: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/74.jpg)
Report Visualization
• We also used thermometer imagery to report
how we were progressing according to the
schedule
– Milestones were on the thermometer along
with the number of reports that we had
completed every day
![Page 75: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/75.jpg)
![Page 76: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/76.jpg)
Cardassian Union
![Page 77: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/77.jpg)
Be careful how you spell that…
![Page 78: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/78.jpg)
Data Modeling Union
• For too long the Data Modellers have not
been integrated with Software Developers
• Data Modellers have been like the
Cardassian Union, not integrated with the
Federation
![Page 79: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/79.jpg)
Issues
• This has led to:
– Holy wars
– Each side expecting the other to follow their
schedule
– Lack of communication and collaboration
• Data Modellers need to join the ‘United
Federation of Projects’
![Page 80: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/80.jpg)
How were we Agile?
![Page 81: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/81.jpg)
Tools of the trade
![Page 82: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/82.jpg)
Tools of the Trade
• Version Control and Refactoring
• Test Automation
• Communication and Governance
• Adaptability and Change Tolerance
• Assimilation
![Page 83: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/83.jpg)
Version Control
![Page 84: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/84.jpg)
Version Control
• If you don’t control versions, they will control
you
• Data Models must become integrated with
the source control of the project
– In the same repository of project trunk and
branches
• You can’t just version a series of SQL files
separate from your data model
![Page 85: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/85.jpg)
Our Version Experience
• We are using Subversion
• We are using Oracle Data Modeler as our
Modeling tool.
– It has very good integration with Subversion
– Our DBMS is SQL Server 2012
• Unlike other modeling tools, the data model
was able to be integrated in Subversion with
the rest of the project
![Page 86: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/86.jpg)
ODM Shameless plug
• Free
• Subversion Integration
• Supports Logical and Relational data models
• Since it is free, the data models can be
shared and refined by all members of the
development team
• Currently on version 2685
![Page 87: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/87.jpg)
How do we roll out versions?
• Create Data Model changes
• Use Red Gate SQL Compare to generate
alter script
– Generate a new DB and compare to the last
version to generate alter script
• 95% of changes deployed in less than 10
minutes
![Page 88: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/88.jpg)
How do we roll out versions?
• We build on the Farley and Humble Blue-
Green Deployment model
– Blue – Current Version and Revision – Database
Name will be ‘ODS’
– Green – 1 Revision Old – Database Name will be
‘ODS-GREEN’
– Brown – 1 Major Version Old – Database Name
will be ‘ODS-BROWN’
![Page 89: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/89.jpg)
Versioning
• SQL Change scripts are generated for all
changes
• A full script is generated for every major
version
– A new folder is created for every major
version
– Major version folders are named after the
Greek alphabet (alpha, beta, gamma)
![Page 90: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/90.jpg)
SQL Script version naming standards
• [revision number]-[ODS/DW]-[I/A]-[version number]-
[subversion revision number of corresponding Data
model].sql
– Revision number – auto-incrementing
– Version Number – A999
• Alphabetic character represents major version – corresponds
with folder named after Greek alphabet
• 999 indicates minor versions
– Subversion revision number of corresponding Data model – allows
for an exact synchronization between Data Model and SQL Scripts
• All objects are stored within one Subversion repository
– They all share the same revision numbering
![Page 91: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/91.jpg)
SQL Script version naming standards
• For example:
– 0-ODS-I-A001-745.sql – initial db and table
creation for current ODS version (includes
reference data)
– 1-ODS-A-A001-1574.sql – revision 1 ODS alter
script that corresponds to data model subversion
revision 1574
– 2-ODS-A-A001-1590.sql - revision 2 ODS alter
script that corresponds to data model subversion
revision 1590
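The naming standard is regular enough to parse mechanically. The sketch below is one possible parser, not the project's tooling. The regular expression is inferred from the three example file names, reading I as an initial creation script and A as an alter script is also an inference from those examples, and the ScriptName fields are hypothetical names.

```python
import re
from typing import NamedTuple

class ScriptName(NamedTuple):
    revision: int        # auto-incrementing revision number
    target: str          # "ODS" or "DW"
    kind: str            # "I" or "A" (assumed: initial / alter)
    major: str           # alphabetic major version, e.g. "A"
    minor: int           # three-digit minor version, e.g. 1 for "001"
    model_revision: int  # Subversion revision of the matching data model

PATTERN = re.compile(r"^(\d+)-(ODS|DW)-([IA])-([A-Z])(\d{3})-(\d+)\.sql$")

def parse_script_name(filename):
    """Split a script file name into its parts, or raise ValueError."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a valid script name: {filename!r}")
    revision, target, kind, major, minor, model_revision = match.groups()
    return ScriptName(int(revision), target, kind, major,
                      int(minor), int(model_revision))

examples = [
    "0-ODS-I-A001-745.sql",
    "1-ODS-A-A001-1574.sql",
    "2-ODS-A-A001-1590.sql",
]
parsed = [parse_script_name(name) for name in examples]
```

Because the revision number leads the name, sorting the parsed tuples gives the order in which the scripts are meant to run.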
![Page 92: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/92.jpg)
SQL Script error handling
• Validation is done to prevent
– Scripts being run out of sequence
– Revision being applied without addressing
required refactoring
– Scripts being run on any environment but the
Blue environment
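The three validations above could be expressed as a pre-flight check along the following lines. This is a sketch under stated assumptions: the function name, its parameters, and the idea of tracking the last applied revision and a pending-refactoring flag are hypothetical, and the slides do not say how the real scripts implement the checks. The only detail taken from the slides is that the Blue ODS database is named 'ODS'.

```python
def preflight_problems(script_revision, last_applied_revision,
                       database_name, refactoring_pending=False,
                       blue_database="ODS"):
    """Return a list of reasons the script must not run (empty means OK)."""
    problems = []
    # Scripts may only run against the Blue (current) database.
    if database_name != blue_database:
        problems.append(
            f"{database_name} is not the Blue database {blue_database}")
    # Scripts must be applied strictly in sequence.
    if script_revision != last_applied_revision + 1:
        problems.append(
            f"revision {script_revision} is out of sequence after "
            f"{last_applied_revision}")
    # A revision must wait until required refactoring has been addressed.
    if refactoring_pending:
        problems.append("required refactoring has not been addressed")
    return problems
```

Returning every problem at once, rather than stopping at the first, gives whoever runs the deployment the full picture in one pass.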
![Page 93: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/93.jpg)
But what about Refactoring?
• Having Agile Information Maps has
significantly reduced refactoring
– This was an entirely new data domain for the
team
• Using the Blue-Green-Brown deployment
model has simplified required refactoring
• We have used the methods described by
Scott Ambler on the odd occasion
![Page 94: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/94.jpg)
Good Start
![Page 95: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/95.jpg)
Create the plan for how you
will re-factor
![Page 96: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/96.jpg)
Refactoring Experience
• We haven’t needed to refactor much
• Since we are iteratively refining we haven’t had
to re-define much
– Just adding more detail
– Main Information Maps have held together
![Page 97: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/97.jpg)
Test Automation
![Page 98: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/98.jpg)
Test Automation
• Enterprise was saved due to constantly
running tests on the warp engine
• Allowed for quick decision making
![Page 99: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/99.jpg)
Automated Test Suite
• Leveraged the tSQLt Open Source
Framework
• Purchased SQL Test from Red Gate for an
enhanced interface
• Enhanced the framework to execute tests
from four custom tables we defined
![Page 100: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/100.jpg)
Automated Test Suite
• Leveraged Data Mapping spreadsheet that
the automated tests used
– Two database tables were loaded from the
spreadsheet
– Two additional tables contained ETL test
cases
– 13 Stored Procedures executed the tests
– 3300+ columns mapped!
![Page 101: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/101.jpg)
Table Tests
• TstTableCount: Compares record counts between source
data and target data.
• TstTableColumnDistinct: Compares counts on distinct values
of columns.
• TstTableColumnNull: Generates a report of all columns
where every value in the field is null.
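The three table-level checks can be sketched roughly as follows. This is an illustration in Python against SQLite, not the tSQLt stored procedures described in the deck; the function names and the `stg_member`/`dw_member` tables are hypothetical.

```python
import sqlite3


def table_count(conn, source, target):
    """Compare record counts between a source table and a target table."""
    src = conn.execute(f"SELECT COUNT(*) FROM {source}").fetchone()[0]
    tgt = conn.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    return src == tgt, src, tgt


def column_distinct(conn, source, source_col, target, target_col):
    """Compare the number of distinct values in a source and a target column."""
    src = conn.execute(
        f"SELECT COUNT(DISTINCT {source_col}) FROM {source}").fetchone()[0]
    tgt = conn.execute(
        f"SELECT COUNT(DISTINCT {target_col}) FROM {target}").fetchone()[0]
    return src == tgt, src, tgt


def all_null_columns(conn, table):
    """List columns holding no non-null value (every column, if the table is empty)."""
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    report = []
    for column in columns:
        # COUNT(column) counts only non-null values.
        non_null = conn.execute(
            f"SELECT COUNT({column}) FROM {table}").fetchone()[0]
        if non_null == 0:
            report.append(column)
    return report


def build_demo():
    """Create a small in-memory source/target pair (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE stg_member (member_id INTEGER, plan_code TEXT, fax TEXT);
        CREATE TABLE dw_member (member_key INTEGER, plan_code TEXT, fax TEXT);
        INSERT INTO stg_member VALUES (1, 'A', NULL), (2, 'B', NULL), (3, 'A', NULL);
        INSERT INTO dw_member VALUES (10, 'A', NULL), (11, 'B', NULL), (12, 'A', NULL);
    """)
    return conn
```

Table and column names are interpolated directly into the SQL, so they should come from trusted metadata such as the mapping tables, never from user input.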
![Page 102: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/102.jpg)
Column Tests
• TstColumnDataMapping: Compares columns directly
assigned from a source column on a field by field basis for 5-10
rows in the target table.
• TstColumnConstantMapping: Compares columns assigned a
constant on a field by field basis for 5-10 rows in the target
table.
• TstColumnNullMapping: Compares columns assigned a Null
value on a field by field basis for 5-10 rows in the target table.
• TstColumnTransformedMapping: Compares transformed
columns on a field by field basis for 5-10 rows in the target
table.
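As a rough sketch of the sampled field-by-field comparison, the Python below checks a directly mapped column and a constant column for the first few rows of a target table. SQLite stands in for SQL Server; the function names, the `stg_claim`/`dw_claim` tables, and the idea of sampling by ascending key are assumptions, not the project's stored procedures.

```python
import sqlite3


def direct_mapping_mismatches(conn, source, source_col, target, target_col,
                              key, sample_size=5):
    """Return (key, source value, target value) for sampled rows that differ."""
    sql = (
        f"SELECT t.{key}, s.{source_col}, t.{target_col} "
        f"FROM {target} AS t JOIN {source} AS s ON s.{key} = t.{key} "
        f"ORDER BY t.{key} LIMIT ?"
    )
    # Two NULLs compare as equal here because None != None is False in Python.
    return [(k, sv, tv) for k, sv, tv in conn.execute(sql, (sample_size,))
            if sv != tv]


def constant_mapping_mismatches(conn, target, target_col, key, constant,
                                sample_size=5):
    """Return (key, value) for sampled rows whose value is not the constant."""
    sql = f"SELECT {key}, {target_col} FROM {target} ORDER BY {key} LIMIT ?"
    return [(k, v) for k, v in conn.execute(sql, (sample_size,))
            if v != constant]


def build_demo():
    """Create a tiny source/target pair with one bad row per check."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE stg_claim (claim_id INTEGER, amount_cents INTEGER);
        CREATE TABLE dw_claim (claim_id INTEGER, amount_cents INTEGER,
                               source_system TEXT);
        INSERT INTO stg_claim VALUES (1, 1000), (2, 5000), (3, 750);
        INSERT INTO dw_claim VALUES (1, 1000, 'CLAIMS'), (2, 5500, 'CLAIMS'),
                                    (3, 750, 'claims');
    """)
    return conn
```

Returning the mismatching rows, rather than a bare pass/fail flag, gives the developer the keys needed to trace a failure back to the ETL step.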
![Page 103: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/103.jpg)
Data Quality Tests
• TstInvalidParentFKColumn: Tests that an Invalid Parent FK
value results in the records being logged and bypassed. This
record will be added to the staging table to test the process.
• TstInvalidFKColumn: Tests that an Invalid FK value results in
the value being assigned a default value or Null. This record
will be added to the staging table to test the process.
• TstInvalidColumn: Tests that an Invalid value results in the
value being assigned a default value or Null. This record will be
added to the staging table to test the process.
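One way to picture these tests: add a deliberately bad record to staging, run the load, and assert on what happened to it. The sketch below is illustrative only. `load_claims` is a toy, hypothetical ETL step written for this example, and the tables, the `etl_reject` log, and the `UNKNOWN_KEY` default of -1 are all assumptions.

```python
import sqlite3

UNKNOWN_KEY = -1  # assumed default assigned to an unresolved foreign key


def build_demo():
    """Create hypothetical staging, warehouse, and reject-log tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dw_member (member_id INTEGER PRIMARY KEY);
        CREATE TABLE dw_provider (provider_id INTEGER PRIMARY KEY);
        CREATE TABLE stg_claim (claim_id INTEGER, member_id INTEGER,
                                provider_id INTEGER);
        CREATE TABLE dw_claim (claim_id INTEGER, member_id INTEGER,
                               provider_id INTEGER);
        CREATE TABLE etl_reject (claim_id INTEGER, reason TEXT);
        INSERT INTO dw_member VALUES (1);
        INSERT INTO dw_provider VALUES (7);
        INSERT INTO stg_claim VALUES (100, 1, 7);
    """)
    return conn


def load_claims(conn):
    """Toy ETL step: bypass and log bad parents, default other bad FKs."""
    staged = conn.execute(
        "SELECT claim_id, member_id, provider_id FROM stg_claim").fetchall()
    for claim_id, member_id, provider_id in staged:
        parent = conn.execute(
            "SELECT 1 FROM dw_member WHERE member_id = ?", (member_id,)).fetchone()
        if parent is None:
            conn.execute("INSERT INTO etl_reject VALUES (?, ?)",
                         (claim_id, "invalid parent member_id"))
            continue  # record is logged and bypassed
        provider = conn.execute(
            "SELECT 1 FROM dw_provider WHERE provider_id = ?",
            (provider_id,)).fetchone()
        if provider is None:
            provider_id = UNKNOWN_KEY  # invalid FK falls back to the default
        conn.execute("INSERT INTO dw_claim VALUES (?, ?, ?)",
                     (claim_id, member_id, provider_id))


def check_invalid_parent_fk(conn):
    """Stage a claim with an unknown member; expect it logged, not loaded."""
    conn.execute("INSERT INTO stg_claim VALUES (101, 999, 7)")
    load_claims(conn)
    loaded = conn.execute(
        "SELECT COUNT(*) FROM dw_claim WHERE claim_id = 101").fetchone()[0]
    logged = conn.execute(
        "SELECT COUNT(*) FROM etl_reject WHERE claim_id = 101").fetchone()[0]
    return loaded == 0 and logged == 1


def check_invalid_fk(conn):
    """Stage a claim with an unknown provider; expect the default key."""
    conn.execute("INSERT INTO stg_claim VALUES (102, 1, 888)")
    load_claims(conn)
    row = conn.execute(
        "SELECT provider_id FROM dw_claim WHERE claim_id = 102").fetchone()
    return row is not None and row[0] == UNKNOWN_KEY
```

Each check is meant to run against a fresh database from `build_demo()`, since the toy load step is not idempotent.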
![Page 104: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/104.jpg)
Process Integrity Tests
• TstRestartTask: Tests that a Task can be restarted from the
beginning and that subsequent steps run in sequence.
• TstRecoverTask: Tests that a Task can be re-started in the
middle, that records are processed correctly, and that subsequent
steps run in sequence.
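The restart and recover behaviours can be sketched with a small step runner. This is a hypothetical model in Python, not the SSIS task design used on the project: `run_task`, `StepFailed`, `make_steps`, and the three step names are invented for illustration, and the load step is written as an upsert so that re-running it is safe.

```python
class StepFailed(Exception):
    """Raised when a step fails; records the index to resume from."""

    def __init__(self, index, name):
        super().__init__(f"step {index} ({name}) failed")
        self.index = index
        self.name = name


def run_task(steps, start_at=0):
    """Run steps in order from start_at and return the names completed."""
    completed = []
    for index in range(start_at, len(steps)):
        name, action = steps[index]
        try:
            action()
        except Exception as exc:
            raise StepFailed(index, name) from exc
        completed.append(name)
    return completed


def make_steps(state):
    """Build a three-step task that works on the shared state dict."""

    def extract():
        state["staged"] = [(1, "a"), (2, "b")]

    def load():
        if state.get("fail_load"):
            raise RuntimeError("simulated outage")
        for key, value in state["staged"]:
            state["loaded"][key] = value  # upsert by key, safe to repeat

    def audit():
        state["audited"] = len(state["loaded"])

    return [("extract", extract), ("load", load), ("audit", audit)]


# Recover scenario: the load step fails once, then the task resumes there.
state = {"loaded": {}, "fail_load": True}
steps = make_steps(state)
try:
    run_task(steps)
except StepFailed as failure:
    state["fail_load"] = False
    resumed = run_task(steps, start_at=failure.index)
```

A restart test calls `run_task(steps)` with the default `start_at=0`; a recover test resumes from the failed index and then checks that the loaded records are complete and not duplicated.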
![Page 105: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/105.jpg)
Interested?
• Leave me a business card and I’ll send you
the design document and stored procedures
![Page 106: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/106.jpg)
Communication
![Page 107: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/107.jpg)
Team Communication
• Frequent Data Model walkthroughs with
application teams
• Full access to the Data model through the
Data Modeling development tool
• Data Models posted in every room for
developers to mark up with suggestions
• Database deployment to play with for every
release
![Page 108: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/108.jpg)
Client Communication
• Frequent Conceptual Data Model
walkthroughs with clients
– Includes presentation of scenarios with data
to confirm and validate understanding
• Collaboration on the iterative plan to ensure
they agree on the process and support it
![Page 109: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/109.jpg)
Monthly Governance Meeting
– Visual Kanban boards reviewed
– Reports developed in the prior iterations were
demonstrated
– Business Areas were asked to submit a ranked
list of their top 10-20 data requirements/reports for
the next iteration.
![Page 110: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/110.jpg)
Adaptability
![Page 111: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/111.jpg)
Be Nimble
• Already discussed how we can roll out new
versions quickly
![Page 112: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/112.jpg)
Change Tolerant Data Model
• Only add tables and columns when they are
absolutely required
• Leverage Data Domains so that attributes
are created consistently and can be changed
in unison
– Use a limited number of standard domains
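To illustrate why domains make a model change tolerant, here is a small Python sketch in which every column refers to a named domain and the DDL is generated from that. The domain names, SQL types, tables, and `create_table_sql` helper are all hypothetical; a data modeling tool would normally manage this.

```python
# A small, fixed set of standard domains (names and types are assumptions).
DOMAINS = {
    "identifier": "INT NOT NULL",
    "short_name": "VARCHAR(50)",
    "money": "DECIMAL(18, 2)",
}

# Columns refer to a domain instead of repeating a raw data type.
COLUMNS = {
    "claim": [("claim_id", "identifier"),
              ("billed_amount", "money"),
              ("paid_amount", "money")],
    "member": [("member_id", "identifier"),
               ("last_name", "short_name")],
}


def create_table_sql(table, columns, domains):
    """Generate CREATE TABLE DDL, resolving each column's domain to a type."""
    body = ",\n".join(f"    {name} {domains[domain]}"
                      for name, domain in columns)
    return f"CREATE TABLE {table} (\n{body}\n);"


# Changing one domain definition changes every column that uses it.
widened = dict(DOMAINS, money="DECIMAL(19, 4)")
claim_ddl = create_table_sql("claim", COLUMNS["claim"], widened)
```

Because both amount columns use the `money` domain, a single edit to that domain updates them together.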
![Page 113: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/113.jpg)
Change Tolerant Data Model
• Data Model needs to be loosely coupled and
have high cohesion
– Need to model the data and business and not
the applications or reports!
![Page 114: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/114.jpg)
Change Tolerant Data Model
• Don’t model the data according to the
application’s Object Model
• Don’t model the data according to source
systems
• These items will change more frequently
than the actual data structure
• Your Data Model and Object Model should
be different!
![Page 115: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/115.jpg)
Assimilate
![Page 116: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/116.jpg)
Assimilate
• Assimilate Version Control, Communication,
Adaptability, Refinement, and Re-Factoring
into core project activities
– Stand ups
– Continuous Integration
– Check outs and Check Ins
• Make them part of the standard activities –
not something on the side
![Page 117: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/117.jpg)
Our experience
![Page 118: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/118.jpg)
Our Mission
• These practices and methods are being
used to redevelop an entire Business
Intelligence platform for a major ‘Blue’ Health
Benefits company
– Operational and Analytical Reports
• 100+ integration projects
• SAP Claims solution
![Page 119: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/119.jpg)
Our Mission
• Integration projects are being run Agile
• 100+ team members across all projects
• SAP project is being run in a more
traditional manner
– ‘big-bang’ SAP implementation
• I’m now also fulfilling the role of an Agile PMO
![Page 120: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/120.jpg)
Our Challenge
• How can we deploy to production early and
often when the system is a ‘big-bang’
implementation?
– We were ready to deploy ahead of clients and
other projects
– We were dependent on other conversion
projects
![Page 121: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/121.jpg)
Our Challenge
• We are now exploring alternate ways to
deploy to production before the ‘big-bang’
implementation
– To allow the clients to use the reports and
iteratively refine them and the solution
– Also allows our team to validate data integrity
and quality iteratively
– We are now executing iterations to make this
possible
![Page 122: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/122.jpg)
Our BI Solution
• SQL Server 2012
– Integration Services
– Reporting Services
• SharePoint 2010 Foundation
– SharePoint Integrated Reporting Solution
![Page 123: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/123.jpg)
Our team
• Integrated team of
– 2 enterprise DBAs from the ‘Blue’
– 5 Data Analysts/DBAs/SSIS/SSRS developers
• Governance team comprised of
– Business Areas
– Systems Areas
– Stakeholders
![Page 124: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/124.jpg)
Current Stardate
• We have completed the initial ODS and DW
development – including ETL
• We have completed a significant revision of
ODS, DW, and ETL – without major issues
• We are now finishing Report development –
reports have required database changes and
ETL changes – but no major changes!
– 300+ reports developed
![Page 125: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/125.jpg)
Summary
• Use Agile Enterprise Data Models to provide
the initial vision and allow for refinements
• Strive for Iterations over Increments
• Align governance and prioritization with
iterations
• Plan and Integrate processes for Versioning,
Test Automation, Adaptability, Refinement
![Page 126: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/126.jpg)
What doesn’t change?
![Page 127: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/127.jpg)
Leadership
![Page 128: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/128.jpg)
Leadership
• “If you want to build a ship, don't drum up
people together to collect wood and don't
assign them tasks and work, but rather teach
them to long for the endless immensity of the
sea.” ~ Antoine de Saint-Exupery
![Page 129: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/129.jpg)
Leadership
• “[A goalie's] job is to stop pucks, ... Well, yeah, that's
part of it. But you know what else it is? ... You're
trying to deliver a message to your team that things
are OK back here. This end of the ice is pretty well
cared for. You take it now and go. Go! Feel the
freedom you need in order to be that dynamic,
creative, offensive player and go out and score. ...
That was my job. And it was to try to deliver a
feeling.” ~ Ken Dryden
![Page 130: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/130.jpg)
Three awesome books
![Page 131: The final frontier v3](https://reader033.vdocuments.site/reader033/viewer/2022051210/54c632844a7959c9388b457c/html5/thumbnails/131.jpg)