data checking

21
Data Checking Applied Database II

Upload: mohawk

Post on 07-Jan-2016

38 views

Category:

Documents


0 download

DESCRIPTION

Data Checking. Applied Database II. File structure. A database consists of a number of related files . A file consists of a number of related records . A record consists of a number of related fields . A field is a single data item consisting of a number of characters. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Data Checking

Data Checking

Applied Database II

Page 2: Data Checking

File structure

• A database consists of a number of related files.

• A file consists of a number of related records.

• A record consists of a number of related fields.

• A field is a single data item consisting of a number of characters.

Page 3: Data Checking

The database we will use on this course is

Microsoft Access, Microsoft FoxProTable structure

StudentID Surname Forename TYear TGroup Gender

2814 AHUJA Vishal 10 E M

2817 ASIJA Vandana 10 J F

2822 BENZIE Nicholas 10 E M

2823 BIBBY Carol 10 M F

2824 BOSWOOD Peter 10 G M

2827 CHAN Annjeanette 10 S F

2828 CHAN Anthony 10 G M

2829 CHAN Benson 10 S M

NB: In Access-

File = Table, Record = Row, Field = Column, Value = Cell

Page 4: Data Checking

Types of Files• A Master file is a permanent file of all the data

needed by a business. Some of the fields may be regularly updated from transactions

• A Transaction file is a temporary file of all the transactions (items, bought sold, added, deleted etc) which have taken place in a given period.

• An Archive file is a file of data in long-term storage (usually for legal reasons). Archive files will contain all information about the past dealings of the business and would normally be stored on a different site in case of fire etc

Page 5: Data Checking

The Data/Information Processing Cycle

• Acquisition/Origination, The collection of raw data from the outside world so it can be put into an information system

• Input, Putting the acquired data into the information system.

• Validation, Checking the quality of the data before it is processed

• Processing (manipulation), This is a key point. At this stage DATA is converted into INFORMATION.

• Storage, Unless you want to input the data every time you process it, it's sensible to store the data

• Retrieval, The reverse of storage

• Output, All the previous steps are useless unless you can see the results

• Communication, Sending data or information to another place. • Disposal, Deleting data or archiving data.

Page 6: Data Checking

Data Validation

• Data validation checks that the data is sensible before it is processed.

• Methods used for validation are...– range check; This checks that the data lies within a

specified range of values.

Eg the month of a person's DOB should lie between 1 and 12

– presence check; This checks that important data is actually there and has not been missed out.

Eg Customers may be required to have their telephone numbers.

Page 7: Data Checking

Data Validation (Cont.)

– check digits-- Used for numerical data. An extra digit is added to a number which is calculated from the digits

Eg The ISBN number on a book. The last digit is a check digit.

– batch totals--This checks for missing records. Numerical fields may be added together for all records in a batch

Eg Add the 'Total Cost' field of a number of transactions together.

– hash totals--This is just a batch total done on a meaningless field.

Eg Add the Telephone Numbers together for a number of Customers.

Page 8: Data Checking

– Reasonableness, does it seem logical? Seventy year old women are not likely to be pregnant

– Consistency, died before being born?

– Format, numeric name?

– Text check – spelling, grammar – Completeness, all digits present

– Sequence, data is ordered?

– Timeliness, is the data out of date?

Data Validation (Cont.)

Page 9: Data Checking

Data Verification

• Verification is used to prevent errors occurring when data is copied from one medium to another. (eg paper to disk, disk to disk, memory to disk)

• Methods used for verification...– Double keying-- Used to check for transcription errors.

The data is entered twice (by two different people).

– Visual check-- Checking for errors by looking through the data. Eg Proof-reading a typed document.

– Parity-- Used to check for transmission errors over networks or between memory and disk.

Page 10: Data Checking

Validation in Spreadsheet

• Provide a Drop-down List of Options

(1) (2) (3)

Page 11: Data Checking

• Validation Criteria Examples » Whole Number

– Decimal

» Date

Validation in Spreadsheet (Cont.)

Page 12: Data Checking

• Validation Criteria Examples – Time

– Text Length

Validation in Spreadsheet (Cont.)

Page 13: Data Checking

Validation on database

• Validation cannot ensure that the data on the database is correct, only that it is possible.

• What validation does is ensure that the data on the database follows certain business rules (semantics) of the database.

• Some of these rules are both general and absolute, ie

– the primary key of each row in a table must be non-null and unique (entity integrity)

– all foreign keys (references to other tables) must be valid (ie correspond to an existing primary key on the referenced table) or (sometimes, it depends on the semantics) wholly null

Page 14: Data Checking

• Some of these rules are general about fields

– a numeric field must contain a number

– a date/time field must contain a valid date/time

• Some are specific to the logic of a particular field

– the quantity ordered on an order line must be a positive integer

– the date of an order cannot be earlier than the date on which it is created

– a gender field must be male or female

– the county field must contain one of the values from a valid list of counties (but beware if you ever have overseas customers!)

Validation on database (Cont.)

Page 15: Data Checking

• Some of these field rules may be 'reasonableness' rather than absolute

– an active employee with a date born indicating an age of 120 or 3 is clearly wrong

– an active employee with a date born indicating an age of 85 or 13 is probably wrong (and should generate a warning rather than preventing the input)

• Some depend on more than one field, possibly on more than one table

– An order's delivery date cannot be any earlier than the order date

– If a customer is credit class 'cash only' then the customer's order payment type must be 'cash'

Validation on database (Cont.)

Page 16: Data Checking

Validation in MS Access

• field type-- often forgotten but it does ensure number fields contain numeric values

• input mask-- The input mask is a regular expression describing valid input

• Default value-- This makes sure that there is a value

• Validation Rule-- This allows you to specify an expression for valid values (this must be a simple expression such as '>12 and < 66' and cannot involve other fields

• Validation Text The text displayed in a message box if the validation rule is broken

Page 17: Data Checking

• Required-- Does the user have to input a value

• Allow Zero Length-- (Strings only) Can the field contain an empty string

• Indexed-- Is there an index on this column and, if so, does it allow duplicate values. Primary keys will be Indexed, no duplicates

Validation in MS Access (Cont.)

Page 18: Data Checking

• Field level validation– In Visual Basic, TextBoxes can respond to the

following events (amongst others):

• KeyPress (generated when the user types an ascii character into the TextBox)

• LostFocus (generated when the user transfers the focus outside of the TextBox - e.g., to another TextBox, or to a button)

• Record level validation– Immediately prior to writing data to the database,

Visual Basic generates a "Validate" event on the data control. This happens whenever the current record is changed (e.g., by using MoveNext) and when the Update method is invoked.

Validation in VB/MS Access

Page 19: Data Checking

Private Sub txtTitle_KeyPress (KeyAscii As Integer)

Dim char As Stringchar = Chr(KeyAscii)If char >= "0" And char <= "9"

ThenKeyAscii = 0

End IfEnd Sub

Validation in VB/MS Access (Cont.)

Page 20: Data Checking

Private Sub txtDoB_LostFocus () Dim msg As String If txtDoB.Text <> "" And Not

IsDate(txtDoB.Text) Then msg = "Please enter a valid date,

or leave the field empty."

msgBox msg txtDoB.SetFocus

End IfEnd Sub

Validation in VB/MS Access (Cont.)

Page 21: Data Checking

Record Level ValidationPrivate Sub Address_Validate (Action As Integer, Save As Integer) Dim msg As String If Not Address.RecordSet.EOF Then ' RecordSet is not empty, so validate current record

If Address.RecordSet.Fields("Surname") = "" Then msg = "Surname is a required field.

Please enter a Surname" MsgBox msg Action = vbDataActionCancel ' Cancel whatever action generated the event Save = False ' Do not write the record

to the databaseEnd If

End IfEnd Sub