data checking
DESCRIPTION
Data Checking. Applied Database II. File structure. A database consists of a number of related files . A file consists of a number of related records . A record consists of a number of related fields . A field is a single data item consisting of a number of characters. - PowerPoint PPT PresentationTRANSCRIPT
Data Checking
Applied Database II
File structure
• A database consists of a number of related files.
• A file consists of a number of related records.
• A record consists of a number of related fields.
• A field is a single data item consisting of a number of characters.
The database we will use on this course is
Microsoft Access, Microsoft FoxProTable structure
StudentID Surname Forename TYear TGroup Gender
2814 AHUJA Vishal 10 E M
2817 ASIJA Vandana 10 J F
2822 BENZIE Nicholas 10 E M
2823 BIBBY Carol 10 M F
2824 BOSWOOD Peter 10 G M
2827 CHAN Annjeanette 10 S F
2828 CHAN Anthony 10 G M
2829 CHAN Benson 10 S M
NB: In Access-
File = Table, Record = Row, Field = Column, Value = Cell
Types of Files• A Master file is a permanent file of all the data
needed by a business. Some of the fields may be regularly updated from transactions
• A Transaction file is a temporary file of all the transactions (items, bought sold, added, deleted etc) which have taken place in a given period.
• An Archive file is a file of data in long-term storage (usually for legal reasons). Archive files will contain all information about the past dealings of the business and would normally be stored on a different site in case of fire etc
The Data/Information Processing Cycle
• Acquisition/Origination, The collection of raw data from the outside world so it can be put into an information system
• Input, Putting the acquired data into the information system.
• Validation, Checking the quality of the data before it is processed
• Processing (manipulation), This is a key point. At this stage DATA is converted into INFORMATION.
• Storage, Unless you want to input the data every time you process it, it's sensible to store the data
• Retrieval, The reverse of storage
• Output, All the previous steps are useless unless you can see the results
• Communication, Sending data or information to another place. • Disposal, Deleting data or archiving data.
Data Validation
• Data validation checks that the data is sensible before it is processed.
• Methods used for validation are...– range check; This checks that the data lies within a
specified range of values.
Eg the month of a person's DOB should lie between 1 and 12
– presence check; This checks that important data is actually there and has not been missed out.
Eg Customers may be required to have their telephone numbers.
Data Validation (Cont.)
– check digits-- Used for numerical data. An extra digit is added to a number which is calculated from the digits
Eg The ISBN number on a book. The last digit is a check digit.
– batch totals--This checks for missing records. Numerical fields may be added together for all records in a batch
Eg Add the 'Total Cost' field of a number of transactions together.
– hash totals--This is just a batch total done on a meaningless field.
Eg Add the Telephone Numbers together for a number of Customers.
– Reasonableness, does it seem logical? Seventy year old women are not likely to be pregnant
– Consistency, died before being born?
– Format, numeric name?
– Text check – spelling, grammar – Completeness, all digits present
– Sequence, data is ordered?
– Timeliness, is the data out of date?
Data Validation (Cont.)
Data Verification
• Verification is used to prevent errors occurring when data is copied from one medium to another. (eg paper to disk, disk to disk, memory to disk)
• Methods used for verification...– Double keying-- Used to check for transcription errors.
The data is entered twice (by two different people).
– Visual check-- Checking for errors by looking through the data. Eg Proof-reading a typed document.
– Parity-- Used to check for transmission errors over networks or between memory and disk.
Validation in Spreadsheet
• Provide a Drop-down List of Options
(1) (2) (3)
• Validation Criteria Examples » Whole Number
– Decimal
» Date
Validation in Spreadsheet (Cont.)
• Validation Criteria Examples – Time
– Text Length
Validation in Spreadsheet (Cont.)
Validation on database
• Validation cannot ensure that the data on the database is correct, only that it is possible.
• What validation does is ensure that the data on the database follows certain business rules (semantics) of the database.
• Some of these rules are both general and absolute, ie
– the primary key of each row in a table must be non-null and unique (entity integrity)
– all foreign keys (references to other tables) must be valid (ie correspond to an existing primary key on the referenced table) or (sometimes, it depends on the semantics) wholly null
• Some of these rules are general about fields
– a numeric field must contain a number
– a date/time field must contain a valid date/time
• Some are specific to the logic of a particular field
– the quantity ordered on an order line must be a positive integer
– the date of an order cannot be earlier than the date on which it is created
– a gender field must be male or female
– the county field must contain one of the values from a valid list of counties (but beware if you ever have overseas customers!)
Validation on database (Cont.)
• Some of these field rules may be 'reasonableness' rather than absolute
– an active employee with a date born indicating an age of 120 or 3 is clearly wrong
– an active employee with a date born indicating an age of 85 or 13 is probably wrong (and should generate a warning rather than preventing the input)
• Some depend on more than one field, possibly on more than one table
– An order's delivery date cannot be any earlier than the order date
– If a customer is credit class 'cash only' then the customer's order payment type must be 'cash'
Validation on database (Cont.)
Validation in MS Access
• field type-- often forgotten but it does ensure number fields contain numeric values
• input mask-- The input mask is a regular expression describing valid input
• Default value-- This makes sure that there is a value
• Validation Rule-- This allows you to specify an expression for valid values (this must be a simple expression such as '>12 and < 66' and cannot involve other fields
• Validation Text The text displayed in a message box if the validation rule is broken
• Required-- Does the user have to input a value
• Allow Zero Length-- (Strings only) Can the field contain an empty string
• Indexed-- Is there an index on this column and, if so, does it allow duplicate values. Primary keys will be Indexed, no duplicates
Validation in MS Access (Cont.)
• Field level validation– In Visual Basic, TextBoxes can respond to the
following events (amongst others):
• KeyPress (generated when the user types an ascii character into the TextBox)
• LostFocus (generated when the user transfers the focus outside of the TextBox - e.g., to another TextBox, or to a button)
• Record level validation– Immediately prior to writing data to the database,
Visual Basic generates a "Validate" event on the data control. This happens whenever the current record is changed (e.g., by using MoveNext) and when the Update method is invoked.
Validation in VB/MS Access
Private Sub txtTitle_KeyPress (KeyAscii As Integer)
Dim char As Stringchar = Chr(KeyAscii)If char >= "0" And char <= "9"
ThenKeyAscii = 0
End IfEnd Sub
Validation in VB/MS Access (Cont.)
Private Sub txtDoB_LostFocus () Dim msg As String If txtDoB.Text <> "" And Not
IsDate(txtDoB.Text) Then msg = "Please enter a valid date,
or leave the field empty."
msgBox msg txtDoB.SetFocus
End IfEnd Sub
Validation in VB/MS Access (Cont.)
Record Level ValidationPrivate Sub Address_Validate (Action As Integer, Save As Integer) Dim msg As String If Not Address.RecordSet.EOF Then ' RecordSet is not empty, so validate current record
If Address.RecordSet.Fields("Surname") = "" Then msg = "Surname is a required field.
Please enter a Surname" MsgBox msg Action = vbDataActionCancel ' Cancel whatever action generated the event Save = False ' Do not write the record
to the databaseEnd If
End IfEnd Sub