olap cube

Upload: khalid-saleh

Post on 09-Apr-2018


  • 8/7/2019 OLAP cube

    1/4

    OLAP cube

    From Wikipedia, the free encyclopedia

    An OLAP (online analytical processing) cube is a data structure that allows fast analysis of data. [1] It can also be defined as the capability of manipulating and analyzing data from multiple perspectives. The arrangement of data into cubes overcomes a limitation of relational databases: they are not well suited for near-instantaneous analysis and display of large amounts of data. [citation needed] Instead, they are better suited for creating records from a series of transactions, known as OLTP or online transaction processing. [2] Although many report-writing tools exist for relational databases, these are slow when the whole database must be summarized. [citation needed]

    Background

    OLAP cubes can be thought of as extensions to the two-dimensional array of a spreadsheet. For example, a company might wish to analyze some financial data by product, by time period, by city, by type of revenue and cost, and by comparing actual data with a budget. These additional methods of analyzing the data are known as dimensions. [3] Because there can be more than three dimensions in an OLAP system, the term hypercube is sometimes used.

    Functionality

    The OLAP cube consists of numeric facts called measures, which are categorized by dimensions. The cube metadata (structure) may be created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table, and dimensions are derived from the dimension tables.
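The relationship between fact rows, dimensions and measures can be sketched in a few lines of Python. This is only an illustration: the column layout, the sample rows and the `build_cube` helper are assumptions made for the example and do not come from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical fact-table rows: (product, city, period, revenue).
# The first three columns are dimension keys; revenue is the measure.
fact_rows = [
    ("Widget", "Paris", "2005-05", 120.0),
    ("Widget", "Paris", "2005-05", 30.0),
    ("Widget", "Lyon", "2005-05", 80.0),
    ("Gadget", "Paris", "2005-06", 200.0),
]


def build_cube(rows):
    """Sum the measure into cells keyed by a tuple of dimension members."""
    cube = defaultdict(float)
    for product, city, period, revenue in rows:
        cube[(product, city, period)] += revenue
    return dict(cube)


cube = build_cube(fact_rows)
print(cube[("Widget", "Paris", "2005-05")])  # 150.0
```

Each key in `cube` identifies one cell by its dimension members, and the stored value is the aggregated measure for that cell.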

    Pivot

    A financial analyst might want to view or "pivot" the data in various ways, such as displaying all the cities down the page and all the products across the page. This could be for a specified period, version and type of expenditure. Having seen the data in this particular way, the analyst might then immediately wish to view it in another way. The cube could effectively be re-oriented so that the data displayed now has periods across the page and types of cost down the page. Because this re-orientation involves re-summarizing very large amounts of data, the new view has to be generated efficiently to avoid wasting the analyst's time, i.e. within seconds, rather than the hours a relational database and conventional report writer might have taken. [4]
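The re-orientation described above amounts to re-summarizing the same cells onto a different pair of dimensions. The following sketch assumes a toy cube stored as a dictionary; the `DIMS` ordering, the sample cells and the `view` function are hypothetical names chosen for this example.

```python
from collections import defaultdict

DIMS = ("product", "city", "period")  # assumed order of members in each cell key

cells = {
    ("Widget", "Paris", "2005-05"): 150.0,
    ("Widget", "Lyon", "2005-05"): 80.0,
    ("Gadget", "Paris", "2005-05"): 50.0,
    ("Gadget", "Paris", "2005-06"): 200.0,
}


def view(cells, row_dim, col_dim):
    """Summarize the cube onto two chosen dimensions, summing over the rest."""
    r, c = DIMS.index(row_dim), DIMS.index(col_dim)
    table = defaultdict(float)
    for key, value in cells.items():
        table[(key[r], key[c])] += value
    return dict(table)


by_city_product = view(cells, "city", "product")  # cities down, products across
by_period_city = view(cells, "period", "city")    # the re-oriented view
```

A real OLAP engine would typically rely on precomputed aggregates or indexes rather than rescanning every cell, which is what makes the new view available within seconds.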

    Hierarchy

    Each of the elements of a dimension could be summarized using a hierarchy. [5] The hierarchy is a series of parent-child relationships, typically where a parent member represents the consolidation of the members which are its children. Parent members can be further aggregated as the children of another parent. [6]

    For example, May 2005 could be summarized into Second Quarter 2005, which in turn would be summarized into the Year 2005. Similarly, the cities could be summarized into regions, countries and then global regions; products could be summarized into larger categories; and cost headings could be grouped into types of expenditure. Conversely, the analyst could start at a highly summarized level, such as the total difference between the actual results and the budget, and drill down into the cube to discover which locations, products and periods had produced this difference.
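A parent-child hierarchy of this kind can be sketched as a simple lookup table. The member names, the `PARENT` mapping and both helper functions below are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical parent-child links for a time dimension.
PARENT = {
    "2005-04": "2005-Q2",
    "2005-05": "2005-Q2",
    "2005-06": "2005-Q2",
    "2005-07": "2005-Q3",
    "2005-Q2": "2005",
    "2005-Q3": "2005",
}

monthly = {"2005-04": 10.0, "2005-05": 20.0, "2005-06": 30.0, "2005-07": 5.0}


def roll_up(values, parent):
    """Consolidate each member into its parent; members without a parent are skipped."""
    totals = defaultdict(float)
    for member, value in values.items():
        if member in parent:
            totals[parent[member]] += value
    return dict(totals)


def drill_down(member, parent):
    """List the children that were consolidated into the given member."""
    return sorted(child for child, p in parent.items() if p == member)


quarterly = roll_up(monthly, PARENT)  # {"2005-Q2": 60.0, "2005-Q3": 5.0}
yearly = roll_up(quarterly, PARENT)   # {"2005": 65.0}
```

Calling `drill_down("2005-Q2", PARENT)` returns the three months of that quarter, which is the reverse navigation from a summarized level toward detail.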

    OLAP operations

    The analyst can understand the meaning contained in the databases using multi-dimensional analysis. By aligning the data content with the analyst's mental model, the chances of confusion and erroneous interpretations are reduced. The analyst can navigate through the database and screen for a particular subset of the data, changing the data's orientations and defining analytical calculations. [6] The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up, is sometimes called "slice and dice". Common operations include slice and dice, drill down, roll up, and pivot.

    Slice: A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset. [6]

    Dice: The dice operation is a slice on more than two dimensions of a data cube (or more than two consecutive slices). [7]

    Drill Down/Up: Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down). [6]

    Roll-up: A roll-up involves computing all of the data relationships for one or more dimensions. To do this, a computational relationship or formula might be defined. [6]

    Pivot: This operation is also called the rotate operation. It rotates the data in order to provide an alternative presentation of the data: the report or page display takes a different dimensional orientation. [6]
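Slice and dice can be illustrated on a toy cube held in a dictionary. This sketch takes one common reading of the two terms: a slice fixes a single member of one dimension and drops that dimension, while a dice restricts several dimensions to chosen sets of members and keeps the dimensionality. The `DIMS` ordering, the sample cells and the function names are assumptions for the example.

```python
DIMS = ("product", "city", "period")  # assumed order of members in each cell key

cells = {
    ("Widget", "Paris", "2005-05"): 150.0,
    ("Widget", "Lyon", "2005-05"): 80.0,
    ("Gadget", "Paris", "2005-05"): 50.0,
    ("Gadget", "Paris", "2005-06"): 200.0,
}


def slice_cube(cells, dim, member):
    """Fix one dimension to a single member and drop it from the cell keys."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cells.items() if k[i] == member}


def dice_cube(cells, allowed):
    """Keep cells whose members fall in the allowed sets (assumed reading of 'dice')."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[DIMS.index(d)] in members for d, members in allowed.items())
    }


may_2005 = slice_cube(cells, "period", "2005-05")
widgets_in_france = dice_cube(
    cells, {"product": {"Widget"}, "city": {"Paris", "Lyon"}}
)
```

Here `may_2005` is a two-dimensional table keyed by (product, city), whereas `widgets_in_france` is still a three-dimensional sub-cube.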

    Linking cubes and sparsity

    Commercial OLAP products have different methods of creating and linking cubes and hypercubes (see Types of OLAP).

    Linking cubes is a method of overcoming sparsity. Sparsity arises when not every cell in the cube is filled with data, so valuable processing time is taken by effectively adding up zeros. For example, revenues may be available for each customer and product, but cost data may not be available at this level of detail. Instead of creating a sparse cube, it is sometimes better to create another separate, but linked, cube in which a subset of the data can be analyzed in great detail. The linking ensures that the data in the cubes remain consistent.
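The cost of "adding up zeros" can be made concrete with a small sketch. It compares a total computed by visiting every possible cell with one computed over only the populated cells; the dimension sizes, sample data and function names are made up for illustration, and the sketch does not model cube linking itself.

```python
products = [f"P{i}" for i in range(100)]
customers = [f"C{i}" for i in range(100)]

# Only a few (product, customer) pairs actually carry cost data.
sparse_costs = {("P1", "C7"): 12.5, ("P42", "C3"): 4.0}


def dense_total(cells, products, customers):
    """Visit every possible cell, treating missing ones as zero."""
    total, visited = 0.0, 0
    for p in products:
        for c in customers:
            total += cells.get((p, c), 0.0)
            visited += 1
    return total, visited


def sparse_total(cells):
    """Visit only the cells that actually hold data."""
    return sum(cells.values()), len(cells)


print(dense_total(sparse_costs, products, customers))  # (16.5, 10000)
print(sparse_total(sparse_costs))                      # (16.5, 2)
```

Both functions return the same total, but the dense scan touches 10,000 cells to find two values.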

    Variance in products

    The data in cubes may be updated at times, perhaps by different people. Techniques are therefore often needed to lock parts of the cube while one of the users is writing to it and to recalculate the cube's totals. Other facilities may raise an alert that previously calculated totals are no longer valid after new data have been added, but some products only calculate the totals when they are needed.
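The two ideas in this paragraph, locking during writes and computing totals only when needed, can be sketched with a toy class. `CachedCube` is a hypothetical name; it uses one coarse lock for the whole cube, whereas the text describes locking only parts of a cube, and real products differ in how they handle this.

```python
import threading


class CachedCube:
    """Toy cube that caches its grand total and marks it stale on every write."""

    def __init__(self):
        self._cells = {}
        self._total = None  # None means "stale: recompute when next asked"
        self._lock = threading.Lock()  # assumption: a single lock for the whole cube

    def write(self, key, value):
        with self._lock:
            self._cells[key] = value
            self._total = None  # previously calculated total is no longer valid

    def is_stale(self):
        with self._lock:
            return self._total is None

    def total(self):
        with self._lock:
            if self._total is None:  # calculate only when needed
                self._total = sum(self._cells.values())
            return self._total


cube = CachedCube()
cube.write(("Widget", "Paris"), 150.0)
first_total = cube.total()
cube.write(("Widget", "Lyon"), 80.0)
stale_after_write = cube.is_stale()
second_total = cube.total()
```

The `is_stale` check plays the role of the alert mentioned above: it reports that the cached total predates the latest write.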

    Technical definition

    In database theory, an OLAP cube is [8] an abstract representation of a projection of an RDBMS relation. Given a relation of order N, consider a projection that subtends X, Y, and Z as the key and W as the residual attribute. Characterizing this as a function,

    W : (X, Y, Z) → W,

    the attributes X, Y, and Z correspond to the axes of the cube, while the W value into which each (X, Y, Z) triple maps corresponds to the data element that populates each cell of the cube.

    Insofar as two-dimensional output devices cannot readily characterize four dimensions, it is more practical to project "slices" of the data cube (we say project in the classic vector-analytic sense of dimensional reduction, not in the SQL sense, although the two are conceptually similar), perhaps

    W : (X, Y) → W

    which may suppress a primary key, but still have some semantic significance, perhaps a slice of the triadic functional representation for a given Z value of interest.

    The motivation [8] behind OLAP displays harks back to the cross-tabbed report paradigm of 1980s DBMS. One may wish for a spreadsheet-style display, where values of X populate row $1; values of Y populate column $A; and values of W : (X, Y) → W populate the individual cells "southeast of" $B2, so to speak, $B2 itself included. While one can certainly use the DML (Data Manipulation Language) of traditional SQL to display (X, Y, W) triples, this output format is not nearly as convenient as the cross-tabbed alternative: the former requires one to hunt linearly for a given (X, Y) pair in order to determine the corresponding W value, while the latter enables one to more conveniently scan for the intersection of the proper X column with the proper Y row.
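The contrast between a list of (X, Y, W) triples and a cross-tabbed display can be sketched as follows. The sample triples and both function names are assumptions for illustration; the layout follows the description above, with X values across the first row and Y values down the first column.

```python
# Hypothetical (X, Y, W) triples, as a plain SQL query might return them.
triples = [
    ("x1", "y1", 10),
    ("x2", "y1", 20),
    ("x1", "y2", 30),
    ("x2", "y2", 40),
]


def lookup_linear(triples, x, y):
    """Hunt through the triples one by one for the requested (X, Y) pair."""
    for tx, ty, w in triples:
        if (tx, ty) == (x, y):
            return w
    return None


def cross_tab(triples):
    """Lay the triples out as a grid: X across the top row, Y down the first column."""
    xs = sorted({x for x, _, _ in triples})
    ys = sorted({y for _, y, _ in triples})
    cell = {(x, y): w for x, y, w in triples}
    header = [""] + xs
    rows = [[y] + [cell.get((x, y)) for x in xs] for y in ys]
    return [header] + rows


grid = cross_tab(triples)
for row in grid:
    print("\t".join("" if v is None else str(v) for v in row))
```

In the grid, the W value for a given pair sits at the intersection of its X column and Y row, so no linear search is needed to read it off.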