dwh-ahsan abdullah 1 data warehousing lab lect-2 lab data set virtual university of pakistan ahsan...

Post on 25-Dec-2015

223 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

11

Data Warehousing Data Warehousing

Lab Lect-2Lab Lect-2Lab Data SetLab Data Set

Virtual University of PakistanVirtual University of Pakistan

Ahsan AbdullahAssoc. Prof. & Head

Center for Agro-Informatics Researchwww.nu.edu.pk/cairindex.asp

FAST National University of Computers & Emerging Sciences, IslamabadFAST National University of Computers & Emerging Sciences, Islamabad

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

22

Multi-Campus UniversityMulti-Campus University

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

33

Degree ProgramsDegree Programs

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

44

Disciplines for BSDisciplines for BS

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

55

Disciplines for MSDisciplines for MS

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

66

The need The need

Head Office wants a central data Head Office wants a central data repository for decision support i.e. a DWHrepository for decision support i.e. a DWH

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

77

Students Record Keeping & Mgmt. Students Record Keeping & Mgmt.

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

88

Data from Lahore CampusData from Lahore Campus

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

99

Data from Lahore Campus: SampleData from Lahore Campus: Sample

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1010

Lahore: Header of Student TableLahore: Header of Student Table

SIDSID St_NameSt_Name Father_NameFather_Name

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1111

Lahore: Header of Student TableLahore: Header of Student Table

Gender Gender AddressAddress [Date of Birth][Date of Birth] [Reg Date][Reg Date]

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1212

Lahore: Header of Student TableLahore: Header of Student Table

[Reg Status][Reg Status] [Degree Status][Degree Status] [Last Degree][Last Degree]

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1313

Lahore: Header of Course Reg. TableLahore: Header of Course Reg. Table

SID SID Degree Degree Semester Semester CourseCourse MarksMarks DisciplineDiscipline

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1414

Lahore: Facts About DataLahore: Facts About Data

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1515

Data from Karachi CampusData from Karachi Campus

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1616

Data from Karachi Campus: SampleData from Karachi Campus: Sample

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1717

Karachi: Header of Student TableKarachi: Header of Student Table

St_IDSt_ID NameName FatherFather DoBDoB M/FM/F DoRegDoReg RStatusRStatus DStatusDStatus AddressAddress QualificationQualification

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1818

Karachi: Header of Course Reg. TableKarachi: Header of Course Reg. Table

SID: SID: CoursesCourses ScoreScore SemSem DispDisp

Degree (BS/MS) is missing because Degree (BS/MS) is missing because separate books are maintained, but the separate books are maintained, but the issue is critical while loading dataissue is critical while loading data

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

1919

Karachi: Facts About DataKarachi: Facts About Data

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2020

Data from Islamabad CampusData from Islamabad Campus

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2121

Data from Islamabad Campus: SampleData from Islamabad Campus: Sample

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2222

Islamabad: Header of Student TableIslamabad: Header of Student Table

Roll NumRoll Num NameName FatherFather Reg DateReg Date Reg StatusReg Status Degree StatusDegree Status Date of BirthDate of Birth EducationEducation GenderGender AddressAddress

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2323

Islamabad: Header of Course Reg. TableIslamabad: Header of Course Reg. Table

Roll Num:Roll Num: CourseCourse MarksMarks DisciplineDiscipline SessionSession

Degree (BS/MS) is missing, whereas Degree (BS/MS) is missing, whereas same table contains records for both. same table contains records for both. Only way to differentiate is through Only way to differentiate is through discipline attribute.discipline attribute.

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2424

Islamabad: Facts About DataIslamabad: Facts About Data

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2525

ExerciseExercise

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2626

Problems with Adhoc ApproachProblems with Adhoc Approach

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2727

LAHORE

KARACHI

ISLAMABAD

PESHAWAR

Text Files

Excel Book

MS-ACCESS

Text Files

Uses

Uses

Uses

Uses

Problem-1: Non-Standard data sourcesProblem-1: Non-Standard data sources

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2828

Problem-2: Non-standard attributes Problem-2: Non-standard attributes

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

2929

Problem-3: Non Normalized database Problem-3: Non Normalized database

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3030

Notepad: IssuesNotepad: Issues

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3131

MS-Excel: IssuesMS-Excel: Issues

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3232

MS-Access: IssuesMS-Access: Issues

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3333

Problem StatementProblem Statement

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3434

Data from Peshawar Campus Data from Peshawar Campus

Data at Peshawar campus is stored in Data at Peshawar campus is stored in Text filesText files

To store data regarding one complete To store data regarding one complete batch 2 text files are usedbatch 2 text files are used Lhr_Student_batch (Student record)Lhr_Student_batch (Student record) Lhr_Detail_batch (Course Reg. record)Lhr_Detail_batch (Course Reg. record)

22 text files for 11 BS batches22 text files for 11 BS batches 8 text files for 4 MS batches8 text files for 4 MS batches

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3535

Data from Peshawar Campus: SampleData from Peshawar Campus: Sample

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3636

Peshawar: Header of Student TablePeshawar: Header of Student Table

Reg#: Reg#: Student identityStudent identity Name: Name: Student nameStudent name Father: Father: Father nameFather name Address: Address: Permanent addressPermanent address Date of Birth: Date of Birth: Date of BirthDate of Birth lastDeg: lastDeg: Last degree achieved Last degree achieved Reg Date:Reg Date: Date of Enrollment Date of Enrollment Reg Status: Reg Status: Status of Enrollment (A/T)Status of Enrollment (A/T) Degree Status: Degree Status: Status of Degree (C/I)Status of Degree (C/I)

DWH-Ahsan AbdullahDWH-Ahsan Abdullah

3737

Peshawar: Header of Course Reg. TablePeshawar: Header of Course Reg. Table

Reg#: Reg#: Courses: Courses: Course codeCourse code Score: Score: Out of 100Out of 100 Program: Program: CS/TC/SE/CECS/TC/SE/CE Sem: Sem: Fall/SpringFall/Spring Year: Year: YYYY e.g. 1999YYYY e.g. 1999

We need to identify semester session We need to identify semester session (fall04) through combination of Sem and (fall04) through combination of Sem and YearYear

top related