Education case study (data warehouse)
TRANSCRIPT
EDUCATION
PRESENTED BY:
SHIVAM NAGPAL (13ITU038)
SONAL BISLA (13ITU042)
UNIVERSITY CASE STUDY
We will pretend that we work for a university or other type of educational institution.
There is a need to understand what our student customers are buying
in terms of courses each term.
CHARACTERISTICS OF AN ACCUMULATING SNAPSHOT FACT TABLE
A single row represents the complete history of something.
Such a fact table is most appropriate for short-lived processes.
Each row carries multiple dates, one for each standard milestone in the process.
Each row is revisited and changed whenever something happens.
Both foreign keys and measured facts may be changed during the revisit.
SNAPSHOT FOR ADMISSION TRACKING
STUDENT APPLICANT PIPELINE
The process is much like a funnel, where many applicants enter the pipeline but fewer progress through to the final stage.
This grain, one row per prospect, represents the lowest level of detail captured when the prospect enters the pipeline.
As more information is collected while the prospect progresses toward application, acceptance, and admission, we continue to revisit and update the prospect’s status in the fact table row.
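The revisit-and-update behaviour described above can be sketched with an in-memory SQLite table. This is only an illustration: the table name `admission_fact`, the four milestone columns, the helper `record_milestone`, and the sample dates are all assumptions and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical accumulating snapshot: one row per prospect,
# one date column per milestone in the admission pipeline.
conn.execute("""
    CREATE TABLE admission_fact (
        prospect_key     INTEGER PRIMARY KEY,
        inquiry_date     TEXT,
        application_date TEXT,
        acceptance_date  TEXT,
        admission_date   TEXT
    )
""")

# The row is created when the prospect enters the pipeline.
conn.executemany(
    "INSERT INTO admission_fact (prospect_key, inquiry_date) VALUES (?, ?)",
    [(1, "2015-01-10"), (2, "2015-01-12"), (3, "2015-01-15")],
)

def record_milestone(prospect_key, milestone, date):
    """Revisit the prospect's existing row and fill in one milestone date."""
    allowed = {"application_date", "acceptance_date", "admission_date"}
    if milestone not in allowed:
        raise ValueError(f"unknown milestone: {milestone}")
    conn.execute(
        f"UPDATE admission_fact SET {milestone} = ? WHERE prospect_key = ?",
        (date, prospect_key),
    )

record_milestone(1, "application_date", "2015-02-01")
record_milestone(2, "application_date", "2015-02-03")
record_milestone(1, "acceptance_date", "2015-03-01")
record_milestone(1, "admission_date", "2015-04-01")

# COUNT(column) skips NULLs, so each count is the number of
# prospects who have reached that stage of the funnel.
funnel = conn.execute("""
    SELECT COUNT(inquiry_date), COUNT(application_date),
           COUNT(acceptance_date), COUNT(admission_date)
    FROM admission_fact
""").fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM admission_fact").fetchone()[0]
print(funnel, row_count)
```

The number of rows never grows as prospects advance; only the milestone columns of existing rows are filled in, which is what gives the funnel shape when the columns are counted.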
FACTLESS FACT TABLE
There are no measured facts, i.e., there are no measurement metrics. We will discuss two kinds: event tracking and coverage factless fact tables.
Events are often modeled as a fact table containing a series of keys, each representing a participating dimension in the event. Event tables that have no obvious numeric facts associated with them are called factless fact tables.
STUDENT REGISTRATION EVENTS
STUDENT REGISTRATION EVENT AS A FACTLESS FACT TABLE
The fact table has the following dimensionality: term, student, student major and attainment, course, and faculty.
Here we are dealing with the fact data at the term level rather than at the more typical calendar day, week, or month granularity.
Term is the lowest level for the registration events.
Each date in a daily calendar dimension should identify the term.
In the example above we don’t have a numeric fact tied to the registration data, so analysis of this data will be based largely on counts.
The SQL for performing counts in this factless fact table is asymmetric because of the absence of any facts. When counting the number of registrations for a faculty member, any key can be used as the argument to the COUNT function. For example:
SELECT FACULTY, COUNT(TERM_KEY)… GROUP BY FACULTY
A more expressive syntax is possible if the fact table carries a dummy fact, REGISTRATION_COUNT, that always has the value 1:
SELECT FACULTY, SUM(REGISTRATION_COUNT)… GROUP BY FACULTY
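That the two counting styles give the same answer can be checked on a small in-memory table. The sketch below makes several assumptions: the table name `registration_fact`, the sample rows, and the faculty labels are invented; the faculty name is stored directly in the fact table instead of being joined from a faculty dimension; and REGISTRATION_COUNT is taken to be a dummy column that always holds 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified registration fact table. A real star schema
# would hold a faculty key and join to a faculty dimension.
conn.execute("""
    CREATE TABLE registration_fact (
        term_key           INTEGER,
        student_key        INTEGER,
        course_key         INTEGER,
        faculty            TEXT,
        registration_count INTEGER DEFAULT 1  -- dummy fact, always 1
    )
""")
conn.executemany(
    "INSERT INTO registration_fact (term_key, student_key, course_key, faculty) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 101, 10, "Faculty A"),
        (1, 102, 10, "Faculty A"),
        (1, 103, 10, "Faculty A"),
        (1, 101, 11, "Faculty B"),
    ],
)

# Style 1: count any key column, since there is no fact to aggregate.
by_count = conn.execute("""
    SELECT faculty, COUNT(term_key)
    FROM registration_fact
    GROUP BY faculty
    ORDER BY faculty
""").fetchall()

# Style 2: sum the dummy fact, which reads like any other fact query.
by_sum = conn.execute("""
    SELECT faculty, SUM(registration_count)
    FROM registration_fact
    GROUP BY faculty
    ORDER BY faculty
""").fetchall()

print(by_count)
print(by_sum)
```

The trade-off is an extra stored column in exchange for queries that look the same as queries against ordinary fact tables.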
FACILITIES UTILIZATION COVERAGE
In this case we include one row in the fact table for each facility for standard hourly time blocks during each day of the week during the term, regardless of whether the facility is being used or not.
The facility dimension would include all types of attributes about the facility, such as building, facility type, square footage, and capacity.
The utilization status dimension includes a text descriptor with values of “Available” or “Utilized”.
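Because every facility and time block has a row whether or not it was used, utilization reduces to counting rows by status. The following is a minimal sketch under assumed names: `facility_utilization_fact`, its columns, and the two sample facilities are hypothetical, and the status text is stored in the fact table directly instead of in a separate utilization status dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical coverage table: one row per facility per hourly block,
# present regardless of whether the facility was in use.
conn.execute("""
    CREATE TABLE facility_utilization_fact (
        facility           TEXT,
        day_of_week        TEXT,
        hour_block         INTEGER,
        utilization_status TEXT   -- 'Available' or 'Utilized'
    )
""")
conn.executemany(
    "INSERT INTO facility_utilization_fact VALUES (?, ?, ?, ?)",
    [
        ("Room 101", "Mon", 9, "Utilized"),
        ("Room 101", "Mon", 10, "Utilized"),
        ("Room 101", "Mon", 11, "Available"),
        ("Lab 1", "Mon", 9, "Available"),
        ("Lab 1", "Mon", 10, "Available"),
        ("Lab 1", "Mon", 11, "Utilized"),
    ],
)

# Utilized blocks versus total blocks, per facility.
utilization = conn.execute("""
    SELECT facility,
           SUM(CASE WHEN utilization_status = 'Utilized' THEN 1 ELSE 0 END),
           COUNT(*)
    FROM facility_utilization_fact
    GROUP BY facility
    ORDER BY facility
""").fetchall()

for facility, used, total in utilization:
    print(f"{facility}: {used} of {total} blocks utilized")
```

The denominator is only meaningful because the unused blocks are stored explicitly; a table holding only the utilized blocks could not report availability without an outside list of all blocks.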
STUDENT ATTENDANCE EVENTS
In this factless fact table the granularity is by calendar date rather than merely term.
This model allows us to answer questions such as: Which courses were the most heavily attended? Which courses suffered the least attendance over the term? Which students attended which courses? Which faculty member taught the most students?
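Two of these questions can be sketched as counting queries over a daily-grain attendance table. The table name `attendance_fact`, its columns, and the sample rows are assumptions for illustration; course and faculty names are stored inline instead of as dimension keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical attendance fact table at calendar-date grain:
# one row per student per course per day attended.
conn.execute("""
    CREATE TABLE attendance_fact (
        date_key    TEXT,
        student_key INTEGER,
        course      TEXT,
        faculty     TEXT
    )
""")
conn.executemany(
    "INSERT INTO attendance_fact VALUES (?, ?, ?, ?)",
    [
        ("2015-09-01", 101, "Databases", "Faculty A"),
        ("2015-09-01", 102, "Databases", "Faculty A"),
        ("2015-09-02", 101, "Databases", "Faculty A"),
        ("2015-09-01", 103, "Networks", "Faculty B"),
    ],
)

# Which courses were the most heavily attended?
attendance_by_course = conn.execute("""
    SELECT course, COUNT(*) AS attendances
    FROM attendance_fact
    GROUP BY course
    ORDER BY attendances DESC
""").fetchall()

# Which faculty member taught the most students?
# DISTINCT avoids counting a student once per day attended.
students_by_faculty = conn.execute("""
    SELECT faculty, COUNT(DISTINCT student_key) AS students
    FROM attendance_fact
    GROUP BY faculty
    ORDER BY students DESC
""").fetchall()

print(attendance_by_course)
print(students_by_faculty)
```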
EXPLICIT ROWS FOR WHAT DIDN’T HAPPEN
We can also add explicit rows to the fact table for attendance events that didn’t occur.
Creating rows for events that didn’t happen is ridiculous in many situations.
Alternatively, we can ask what doesn’t exist by framing a NOT EXISTS subquery within a larger query that defines what was expected to happen.
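A NOT EXISTS query of this kind might look like the sketch below, where the outer query lists the expected attendances (every registered student on every class date) and the subquery removes those that actually occurred. The tables `registration`, `class_date`, and `attendance_fact` and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: who is registered, when each course meets,
# and which attendance events actually happened.
conn.executescript("""
    CREATE TABLE registration (student_key INTEGER, course TEXT);
    CREATE TABLE class_date (course TEXT, date_key TEXT);
    CREATE TABLE attendance_fact (date_key TEXT, student_key INTEGER, course TEXT);

    INSERT INTO registration VALUES (101, 'Databases'), (102, 'Databases');
    INSERT INTO class_date VALUES
        ('Databases', '2015-09-01'), ('Databases', '2015-09-02');
    INSERT INTO attendance_fact VALUES
        ('2015-09-01', 101, 'Databases'),
        ('2015-09-01', 102, 'Databases'),
        ('2015-09-02', 101, 'Databases');
""")

# Outer query: what was expected (registered student x class date).
# NOT EXISTS: keep only the expected events with no attendance row.
absences = conn.execute("""
    SELECT r.student_key, d.course, d.date_key
    FROM registration r
    JOIN class_date d ON d.course = r.course
    WHERE NOT EXISTS (
        SELECT 1
        FROM attendance_fact a
        WHERE a.student_key = r.student_key
          AND a.course = d.course
          AND a.date_key = d.date_key
    )
    ORDER BY r.student_key, d.date_key
""").fetchall()

print(absences)
```

No absence rows are stored here; the absences are derived at query time, at the cost of a more complex query than a simple count over explicit rows.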
PROMOTION COVERAGE AS A FACTLESS FACT TABLE
MULTIDIMENSIONAL HANDLING OF WHAT DIDN’T HAPPEN
OLAP databases do an excellent job of helping users understand what didn’t happen. When the data cube is constructed, the multidimensional database handles the sparsity of the transaction data while minimizing the overhead burden of storing explicit zeroes.
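The idea of answering "what didn’t happen" without storing zeroes can be illustrated with a toy sparse cube. This is a conceptual sketch only, not how any particular OLAP product stores data; the dictionary `cube`, the helper `cell`, and the sample values are assumptions.

```python
from itertools import product

# Hypothetical sparse cube: only (course, student) cells with at least
# one attendance are stored; everything else is implicitly zero.
cube = {
    ("Databases", 101): 2,
    ("Databases", 102): 1,
    ("Networks", 103): 1,
}
courses = ["Databases", "Networks"]
students = [101, 102, 103]

def cell(course, student):
    """Read a cell, treating a missing entry as zero attendances."""
    return cube.get((course, student), 0)

# Walk the full cross product of dimension members and report empty cells.
never_attended = [
    (course, student)
    for course, student in product(courses, students)
    if cell(course, student) == 0
]

print(len(cube), "stored cells out of", len(courses) * len(students))
print(never_attended)
```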