M150: Data, Computing and Information
Unit Five: Storing, Getting and Sending Your Data

Post on 20-Dec-2015



1- Introduction

The aims of this unit are to:

Describe the notion of persistent data, how it is created, and how it is stored and accessed (logically and physically) on various types of storage device.

Explain how the internet and the applications that use it work, and address some of the issues that arise from transmitting data between computer systems

Explain how databases facilitate the storage, access and protection of data, and how metadata is important in providing access to multimedia databases

Explore the issues of privacy and ownership of data and analyze some of the risks arising from storing data on computers and transmitting it across networks


2- Storing and accessing data in documents

Storing text-based data in documents

Persistent data: for documents to persist (to exist after you close the application that created them or switch off your computer), they need to be saved.

To facilitate subsequent retrieval, you store your documents in some logical arrangement on a suitable storage medium for holding persistent data, such as your computer’s hard disk.

An organized arrangement is a strategy for retrieving your documents quickly. Consider a filing cabinet: it has several drawers, and each drawer may hold a large number of files. Each file contains a number of related documents.

If you access your hard disk from the computer desktop, you will find a number of icons there. Many of these icons contain other objects (documents and folders). Inspecting the contents of the disk this way reveals what is termed a flat structure.

Document: the lowest level of the hierarchy, at which there are no further folders to open.

Hierarchical (nested) folder structure: a structure in which each folder may contain other folders. At each stage you can see only those documents which are stored at that level.

You can use Windows Explorer to inspect the contents of a disk. An Explorer window has two panes. The right-hand pane is essentially the same as an ordinary folder window; its contents can be displayed as icons or small icons or as a list.


The left-hand pane does not show any documents, but it does show hard disks, folders and any icon which holds other items (such as a ‘network share’). Some folders have a small + or – next to them, indicating that the folder can be expanded or collapsed.

You can think of the folder structure loosely as a tree lying on its side. The desktop is the root of the tree, and each folder is a branch. The leaves of the tree correspond to documents.

Any similar hierarchical arrangement of objects is frequently called a tree structure or just a tree.

The number of documents and folders that can be stored in a single folder is not limited, but it is best not to put too many items in a single folder, since the human mind does not easily comprehend huge collections. (Preferably, do not exceed 20.)

The path of a document or folder contains the names of all the folders (branches of the tree) that lead to it from the root.

A path allows you to identify a folder or document unambiguously, and is often referred to as its full name or full path name (documents in the same folder must have different names).

The Search/Find function

Operating systems come with a search function which allows you to find items you have ‘lost’. To find a document, click on Search and a dialog box will appear.

Consider the Windows path name:

C:\Projects\M150\Assignments\TMA02.doc

‘C:’ is the root. ‘Projects’ is the name of a folder at the top level of the hard disk, which contains a folder called ‘M150’, which in turn contains an ‘Assignments’ folder. The document ‘TMA02.doc’ is in the ‘Assignments’ folder.
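As a minimal sketch, the path name C:\Projects\M150\Assignments\TMA02.doc can be taken apart with Python's standard pathlib module. The variable names are illustrative only; PureWindowsPath is used so that the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# The Windows path name used in the example.
path = PureWindowsPath(r"C:\Projects\M150\Assignments\TMA02.doc")

root = path.drive                 # the root of the tree
folders = list(path.parts[1:-1])  # the branches leading from the root to the document
document = path.name              # the leaf: the document itself

print(root, folders, document)
```

Reading the folders from left to right traces the route from the root of the tree to the leaf.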


Directories

Each folder has a list, or directory, of the folders and documents that it contains. Part of the directory for a given folder can be displayed in a number of ways to aid human identification of the contents:

alphabetically by name (an obvious ordering);
in order of last modification date;
by size (sometimes useful);
by type.

The directory of a folder also lists the address or physical location on the disk of each document and subfolder in that folder. This address is internal to the operating system and cannot be seen in a user window. In the case of a subfolder, the address given is the location of the directory of that subfolder.


Storage technologies

There are various technologies for data storage. Data can be stored on the hard disk, or on removable storage media such as CDs, DVDs, Zip disks and high-capacity tape cartridges.

There are various measures of storage size (capacity).

Hard disk storage

A hard disk is large enough to store several multi-volume encyclopedias. Hard disks with a capacity of up to a terabyte are available.

The general principle of hard-disk storage is that the disk surfaces are coated with a magnetic material that can be magnetized into a pattern representing a sequence of bits. The surface consists of millions of tiny magnets, each of which can be magnetized in one of two possible directions, representing 1 or 0. (Computerized data is held and transmitted as sequences of bits.)

A hard disk is 1 to 3 inches in diameter, and consists of one or more circular plates, each having two surfaces. The plates may be aluminum, ceramic or glass, and the surfaces are coated with a magnetisable ferrite coating. Data is recorded on each surface by magnetizing a series of concentric circles called tracks.


The disk surface is divided into a number of equal-sized wedge-shaped regions called sectors. Within a sector each track holds the same amount of data, usually 512 bytes.

Block: the basic unit of data handled by the disk control mechanism (each block of data is guaranteed to be the same size).

Bucket: the unit of transfer when several blocks are moved between disk storage and the computer’s memory in a single operation.

A sector holds the same amount of data on each track even though the outer tracks are longer than the inner tracks. This is because the bytes are packed more closely on the small inner circles than on the larger outer circles.

The actual reading from and writing to the disk surface is performed by a read/write head, which is attached to an arm that moves to and from the centre. The disk is kept spinning continuously, so each sector is under the head at some time.

The head hovers close to the spinning surface, which needs to be engineered carefully to avoid physical contact between the head and the surface (disk crash).

Another cause of a disk crash is if a particle of dust gets in the tiny gap (5 microns or less) between the head and the surface.

For each plate in a disk there are two read/write heads, one for each surface. In a read operation the head detects a magnetized pattern. In a write operation, the head magnetizes the relevant pattern of bits on to the surface.

The heads associated with all surfaces move in and out together; at any one time they can read from the corresponding tracks on both surfaces of every plate in the disk. This set of tracks is called a cylinder (a disk may have 16000 cylinders i.e. each plate would have 16000 tracks on it).


Typical rotation speeds for hard disks are 5,000–15,000 revolutions per minute. A disk which rotates at 10,000 rev/min and has 60 sectors takes 0.1 millisecond (ms) for a 512-byte block of data to pass under the head.

On average it takes half a rotation (3ms) for the desired sector to reach the head – this is called the average latency. You also need to add to this the seek time (time taken for the head to move to the relevant track ~3–10ms)
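The timing figures above can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are made up for illustration, and the rotation speed and sector count are the ones quoted in the text.

```python
# Figures taken from the text: 10,000 rev/min and 60 sectors per track.
REV_PER_MIN = 10_000
SECTORS_PER_TRACK = 60

ms_per_revolution = 60_000 / REV_PER_MIN               # there are 60,000 ms in a minute
ms_per_sector = ms_per_revolution / SECTORS_PER_TRACK  # time for one block to pass the head
average_latency_ms = ms_per_revolution / 2             # half a rotation on average


def average_access_ms(seek_ms: float) -> float:
    """Average time to reach a block: seek time plus average latency."""
    return seek_ms + average_latency_ms


print(ms_per_revolution, ms_per_sector, average_latency_ms)
```

With the quoted seek times of 3 to 10 ms, the average access time therefore falls between 6 and 13 ms, which is far longer than the 0.1 ms needed to read the block once it arrives under the head.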

What is the storage capacity in GB of a disk with 16 read/write heads, 40,000 cylinders, and each surface formatted into 50 sectors?

The number of track sectors = 16 x 40 000 x 50 = 32 000 000

Since each sector holds 512 bytes:

capacity = 32 000 000 x 512 bytes = 16 384 000 000 bytes = 15GB (approx, taking 1GB as 2^30 bytes)
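The same calculation can be sketched in code. The names are illustrative; the only assumption beyond the text is the choice of how many bytes make a gigabyte, which is why two results are shown.

```python
HEADS = 16              # one read/write head per surface
CYLINDERS = 40_000      # tracks per surface
SECTORS = 50            # sectors per track
BYTES_PER_SECTOR = 512

track_sectors = HEADS * CYLINDERS * SECTORS
capacity_bytes = track_sectors * BYTES_PER_SECTOR

capacity_decimal_gb = capacity_bytes / 10**9   # taking 1 GB as 10^9 bytes
capacity_binary_gb = capacity_bytes / 2**30    # taking 1 GB as 2^30 bytes

print(track_sectors, capacity_bytes)
print(round(capacity_decimal_gb, 2), round(capacity_binary_gb, 2))
```

The approximate answer of 15GB corresponds to the binary convention; with 10^9 bytes per gigabyte the same disk would be described as roughly 16GB.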


Removable storage media devices

Zip drive: accepts removable (portable) disks of 100MB or 250MB capacity. It works on exactly the same principle as a fixed hard disk, but you can change the disk in the drive.

Read/write floppy disks: these work in the same way as a Zip drive, but their capacity is limited to 1.4MB.

Memory card: a removable medium which is very popular.

Optical discs, which may be CDs (compact discs) or DVDs (digital versatile discs), store documents using a different technology based on the optical properties of the surface. The capacity of a CD is 650MB, and data is stored on only one side of it in a single spiral groove which winds round the disc 22,188 times.

Outer turns of the groove hold more data than inner ones (data is packed uniformly along the groove), so the disc spins more slowly when accessing data near the outer edge and faster when accessing data near the centre.


Conventional CDs are called CD-ROMs (Read-Only Memory), and have bits of data stored as ‘pits’ in their groove. Beams of laser light are used to burn the pits on the disc. A CD drive works by shining a low-power laser beam on the disc, which detects the presence or absence of a pit (the pits do not reflect the light).

DVDs (DVD-ROMs) pack the data more tightly, using:

smaller pits;
a narrower groove;
less overhead for error correction.

Together these increase the capacity of a simple DVD to 4.7GB.

DVDs can also be manufactured to use both sides of the disc, and each side can have one or two layers, yielding a theoretical maximum capacity of 19GB.

One important difference between CD/DVD-ROM discs and magnetic disks is the ability to write to them. Hard disks allow you to rewrite data, but standard optical discs do not (once a pit has been burned, it cannot be erased).


There are two kinds of writable CD:

Recordable CDs (CD-R): instead of burning pits on the CD, the writing process dyes the relevant parts of the groove. When read by a CD drive these dye spots are indistinguishable from pits on a conventional CD. The process is not reversible.

Rewritable CDs (CD-RW): these use a different technology altogether. A CD-RW writer can heat a point on the disc to one of two temperatures, corresponding to different states of the material. This process is reversible.

Not all CD drives can read CD-RW discs.

Labeling volumes

Typically a Zip disk or CD is called a volume. A hard disk is likewise called a volume. It is possible to partition a disk so that its contents appear to occupy more than one volume.

Identifying a CD or other removable medium is necessary. It is important that each is given a label with a title. The volume can be stored in a rack with many similar looking ones but can be identified by its external label.

Besides its physical label a volume should also have an electronic label, which, for consistency, should be the same as the physical label. This electronic label is the name of the volume, and it will be displayed when you search the contents of your computer.


Sensible organization of storage

Each volume contains a large number of documents, so there has to be a means of locating the one you want.

On a magnetic disk, three numbers are required to identify a block of data: the cylinder number, the surface number and the sector number. This set of three numbers is called the address of the block.

Each volume has a volume table of contents or VTOC. The VTOC is a table with one line for each document.


A single document might occupy one or more blocks on the disk. At the end of each block there is a marker which either indicates that this is the final block for the document or gives the address of the block that holds the next portion of the document.
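The chain of blocks described above behaves like a linked list. The following is a minimal sketch under simplifying assumptions: the disk is modelled as a hypothetical Python dictionary, block addresses are plain integers rather than cylinder, surface and sector numbers, and None stands in for the ‘final block’ marker.

```python
# Hypothetical disk: block address -> (data, address of the next block or None).
# None plays the role of the "final block" marker.
disk = {
    7: ("Dear ", 2),
    2: ("Sir, ", 9),
    9: ("thank you.", None),
    4: ("unrelated data", None),
}


def read_document(disk, first_block):
    """Follow the chain of blocks from first_block and join their data."""
    parts = []
    address = first_block
    while address is not None:
        data, next_address = disk[address]
        parts.append(data)
        address = next_address
    return "".join(parts)


text = read_document(disk, 7)
print(text)
```

Note that the blocks of one document need not be next to each other on the disk; only the address of the first block has to be recorded elsewhere.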

Moving documents

Moving a document between folders on a disk is really an illusion, because the document does not move at all! The document’s physical location remains unchanged, but the directories change (as illustrated on page 21): the individual items are stored on the volume exactly where they were before the move.
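A rough sketch of this idea: if each directory is modelled as a table mapping document names to physical addresses, a move only edits two tables. The dictionaries, the address 1042 and the folder name ‘Archive’ are all invented for illustration.

```python
# Hypothetical volume: physical address -> stored data.
volume = {1042: "contents of TMA02.doc"}

# Hypothetical directories: folder name -> {document name: physical address}.
directories = {
    "Assignments": {"TMA02.doc": 1042},
    "Archive": {},
}


def move(directories, name, source, destination):
    """Move a document by editing directory entries only."""
    address = directories[source].pop(name)
    directories[destination][name] = address


move(directories, "TMA02.doc", "Assignments", "Archive")
print(directories)
```

After the call the volume dictionary is untouched, which mirrors the point that the data stays where it was on the disk.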


Deleting documents

Modern operating systems usually have mechanisms to protect users against themselves. When you delete a document, the operating system does not obey your instruction immediately but instead moves the document to a special folder called ‘Recycle Bin’ or ‘Trash’, from which it can be retrieved.

When your hard disk is becoming too full, you can decide to ‘empty’ the bin.

When documents are moved to the Recycle Bin, they do not go anywhere; they remain in the same physical position on the disk. It is the directory entry for the document that is removed, with a new directory entry being created in the Recycle Bin.

What you perceive when you navigate through the folders on your computer is not where the documents are located physically, but where they are located logically. That is, you are given a logical view of your documents which shows their relationship to each other in a hierarchical (nested) structure.

The operating system hides from you where items are located physically. When you empty the bin, the document does not need to be moved; it is simply marked for deletion.

So the document may remain on your disk for a long time without being overwritten. However, it is inaccessible since its directory entries have disappeared.

It may be possible to recover the deleted document using a disk recovery utility.

Other storage media

You cannot afford to rely on the single copies held on your hard disk. Instead you need a strategy for backing up your work.

Magnetic tape is a linear storage medium which is slow and difficult to access: there is no direct access as there is with disks. The main strengths of tape are its high capacity, its reusability and its cheapness. Magnetic tape is ideal for data backup and archiving.

Holographic storage: a hologram is a three-dimensional image made with the aid of a laser, and this technique can be used to store much higher volumes of data.

Biological storage media: one idea is to represent 0s and 1s using two color states of a suitable form of synthetic DNA. A number of such memory units would be attached to a support substrate to form a memory cell.

3- Transmitting data

Computer networking

The internet has become a part of society, like the telephone, radio and television.

The web, which is based on the internet, has become the platform on which all kinds of information are disseminated. Examples include education and e-commerce (buying and selling goods and services on the web, which has its own computing practices and legal framework).

Besides the internet, many other computer networks exist (for banks, the police, travel agents, airlines and so on).

A network of computers is linked together by communication links. These links may be:

dedicated cable links;
public telephone networks;
radio or microwave links.

Any organization using more than one computer is likely to have a local area network (LAN) to exploit the benefits of resource sharing. A LAN may be contained within one building, or it may span several buildings on the same site.

Pocket-sized computers known as PDAs (personal digital assistants) can communicate with each other and with desktop computers using infra-red / Bluetooth signals. They form a small local network.

Many resources can be shared across a LAN, such as data, a laser printer and a connection to the internet.

The internet

The internet comprises a huge collection of computers (called hosts) with telecommunications links between them.

The internet has its roots in the American military-funded research community of the early 1970s. The first applications to use the internet were based purely on text.

The internet then began to be used for email and for file transfer. Modern graphical tools for accessing the internet (like Netscape Navigator) are much more recent, dating back to the early 1990s.

ARPANET was a network of just four computers at four universities, linked together as a project of the Advanced Research Projects Agency (ARPA) in the United States.

By 1996 that figure had grown to 15 million host computers; by 2002 it had multiplied ten times to 150 million.

The internet links together not just one type of computer but any type of computer running any operating system. By adopting the internet protocol each of these computers can become an internet host.

The telephone system (which uses analogue signals consisting of a continuously varying voltage) was designed for voice transmission.

As your computer communicates using digital signals (consisting of discrete bit patterns), a piece of equipment called a modem (modulator-demodulator) sits between your computer and the telephone socket.

The modem converts the digital signals from the computer into analogue signals, and a modem at the other end converts the signals back into digital form (a typical modem downloads data at 56kbps).

ADSL (Asymmetric Digital Subscriber Line) is a technology which allows data to be transmitted digitally at high speed (typically 400kbps) over conventional copper telephone wires.
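To get a feel for the two data rates quoted above, the sketch below estimates how long a file would take to download at each. The 1,000,000-byte file size and the function name are assumptions chosen for illustration, and the calculation ignores protocol overhead and line quality.

```python
def download_seconds(size_bytes: int, rate_kbps: float) -> float:
    """Time to transfer size_bytes at rate_kbps (1 kbps = 1000 bits per second)."""
    return size_bytes * 8 / (rate_kbps * 1000)


ONE_MEGABYTE = 1_000_000  # assumed file size, for illustration only

modem_time = download_seconds(ONE_MEGABYTE, 56)   # dial-up modem rate from the text
adsl_time = download_seconds(ONE_MEGABYTE, 400)   # ADSL rate from the text

print(round(modem_time, 1), adsl_time)
```

Under these assumptions the modem needs well over two minutes, while the ADSL line needs about 20 seconds.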


Browsing the web

The web is a collection of hypertext documents distributed worldwide and linked by the internet.

The value of the web is that trillions of pages of web content are linked together via multiple hyperlinks.

A web browser is the software you use to access and view documents on the web.

The basic unit of web content is the web page, which is an HTML document.

The browser accesses the page, held on a remote computer (the web server), and downloads it to your computer (the client).

The speed of download depends on the amount of data, the speed of the modem, the quality of the phone line, the speed of your computer and the amount of traffic on the internet.

Internet addressing

A message sent across the internet must have an address, like a letter sent via the postal system. The address has several levels to it, enabling the item of mail to be routed successfully to the correct destination.

A domain is a collection of internet hosts. At the highest level, the internet has two types of top-level domain:

Codes of three letters or more, which group users by category:
edu: education
org: organization
com: commercial
name: individuals

Two-letter codes, which normally identify countries:
uk: United Kingdom
ca: Canada
de: Germany

Country code domains are usually subdivided, for example into ac.uk (academic community), co.uk (commercial) and gov.uk (national and local government).

Many individual domain names are available within each top-level domain. Within ac.uk there is open.ac.uk, the OU domain (the central address for the OU on the internet).

Once the domain name open.ac.uk has been approved by an external agency, the OU is free to allocate subdomains and host names within this naming scheme.

The address associated with a hyperlink is given in the form of a URI (uniform resource identifier), which specifies the service requested and the full address of the document.

Consider the URI:

http://mcs.open.ac.uk/mcsexternal/courses/m150.htm

http:// identifies the protocol (HTTP).

mcs.open.ac.uk specifies the server. The host address is in two parts: ‘mcs’, which identifies a particular computer, and ‘open.ac.uk’, which identifies its domain.

The rest of the address is the path within ‘mcs’ that leads to the required document.
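As an illustrative sketch, the same breakdown of http://mcs.open.ac.uk/mcsexternal/courses/m150.htm can be produced with urlsplit from Python's standard library. The variable names are arbitrary, and splitting the host name at its first dot is simply the convention described in the text.

```python
from urllib.parse import urlsplit

uri = "http://mcs.open.ac.uk/mcsexternal/courses/m150.htm"
parts = urlsplit(uri)

protocol = parts.scheme               # the service requested
server = parts.hostname               # the full host address
host, domain = server.split(".", 1)   # the particular computer and its domain
path = parts.path                     # the route to the document on that server

print(protocol, host, domain, path)
```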

Naming hosts

It is usually convenient to assign a name to each computer on a network so that users can identify it easily. In a large network it is common to use a systematic naming scheme.

A typical naming system might use names like ‘anemone’ or ‘buttercup’. These hosts in the OU domain would then be known to the internet as ‘anemone.open.ac.uk’ and ‘buttercup.open.ac.uk’


IP numbers

The naming scheme (convenient for humans) is not actually used by the messages that travel across the internet. Instead, each host has a 4-byte number associated with it, called its IP (internet protocol) number.

The IP number carried by a message ensures that it reaches the correct destination. Special directories, called domain name servers, hold this information and are used to discover the IP number of a message’s destination host.

The first thing that happens when a URI is followed is that the host name is sent to a domain name server to be resolved.

New hosts are constantly being added, so domain name servers need to be kept up to date.

Within each domain there is a domain name server which knows the address of each host at the next lower level in the hierarchy within its own domain.

Each domain name server sends updates to neighboring domain name servers

If a domain name server is asked for a host name which does not appear in its directory, it broadcasts a message across the internet to other domain name servers until the IP number is located.

IP numbers consist of four bytes (32 bits). This can accommodate 2^32 = 4,294,967,296 distinct hosts.
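The four-byte structure can be seen with Python's standard ipaddress module. This is a small sketch only: the address 192.0.2.17 is a made-up example taken from a range reserved for documentation, not a real host.

```python
import ipaddress

# A made-up address from a range reserved for documentation (192.0.2.0/24).
address = ipaddress.IPv4Address("192.0.2.17")

as_bytes = address.packed   # the four bytes of the IP number
as_number = int(address)    # the same 32 bits read as a single integer

total_addresses = 2 ** 32   # number of distinct 32-bit values

print(list(as_bytes), as_number, total_addresses)
```

The familiar dotted form is just the four bytes written in decimal, separated by dots.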


Logical and physical names

Suppose that the computer hosting your website, ‘orchid.open.ac.uk’, has crashed, so you acquire an up-to-date computer, which you name ‘peony.open.ac.uk’, and move your website there.

Problem: Now no one can find your web pages any more, as they are no longer at the same address.

A good solution is to avoid reliance on named physical machines. The way to do this is to identify the web server to the internet not by the name of the physical computer it resides on, but by a logical name.

This means that an index must be kept which associates the logical name of a server with its current physical host.
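The index described above can be sketched as a simple lookup table. The logical name ‘mysite.open.ac.uk’ is hypothetical and chosen only for illustration; the two physical host names are those from the example.

```python
# Index associating each logical server name with its current physical host.
# 'mysite.open.ac.uk' is a hypothetical logical name.
index = {"mysite.open.ac.uk": "orchid.open.ac.uk"}


def physical_host(logical_name):
    """Look up the machine currently serving a logical name."""
    return index[logical_name]


before = physical_host("mysite.open.ac.uk")

# The website moves to the new computer: only the index entry changes.
index["mysite.open.ac.uk"] = "peony.open.ac.uk"
after = physical_host("mysite.open.ac.uk")

print(before, after)
```

Visitors keep using the same logical name throughout, so the change of machine is invisible to them.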


Email

Email is a very popular use of computer networking. It combines high speed (immediacy) with a permanent record. Email is asynchronous: it does not depend on the receiving party being available at the time you send the message.

Organizations can have their own internal email system that is independent of the internet.

Email achieves its universality by using text messages comprising ASCII-coded text only.

Unlike other URIs, which identify hosts on the internet, an email address identifies a user: [email protected]

The part before the @ symbol is the user name.
The part after the @ symbol is the domain name.
Within each domain, user names must be unique.

The internet works using a standard protocol. When a message arrives, you do not know which route it took or which countries it passed through.

When data travels across the internet, it is broken up into units of a standard size called packets. Each packet carries the address information so that it will reach its intended destination. The packets are re-assembled into a single item on arrival.
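A toy sketch of splitting data into packets and re-assembling it follows. It is not a real internet protocol: the 8-byte packet size, the dictionary layout and the sequence number used to restore the order are all assumptions made for illustration.

```python
import random

PACKET_SIZE = 8  # bytes of data per packet; an arbitrary small size for illustration


def to_packets(data: bytes, destination: str):
    """Split data into packets, each carrying the destination and a sequence number."""
    return [
        {"to": destination, "seq": i // PACKET_SIZE, "data": data[i:i + PACKET_SIZE]}
        for i in range(0, len(data), PACKET_SIZE)
    ]


def reassemble(packets):
    """Rebuild the original data, whatever order the packets arrived in."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["data"] for p in ordered)


message = b"Storing, getting and sending your data"
packets = to_packets(message, "buttercup.open.ac.uk")
random.shuffle(packets)   # packets may take different routes and arrive out of order
received = reassemble(packets)

print(len(packets), received)
```

Because every packet carries its own address information, each one can be routed independently of the others.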

Along with the actual data, an email also carries transmission information in a number of lines, called headers.

The headers include the email address of the recipient, the date and time of dispatch, the subject of the message, and a ‘Reply-to’ field giving the reply address of the sender.

Headers whose names begin with ‘X-’ are used to convey additional information. The header X-Mailer reveals that the message was composed using version 4.01 of the Pegasus
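Headers can be inspected with Python's standard email package, as in the sketch below. The message is invented: every address and header value in it, including the X-Mailer string, is hypothetical.

```python
from email import message_from_string

# A made-up message; every address and header value here is hypothetical.
raw = """\
To: student@example.org
Reply-To: tutor@example.org
Date: Mon, 06 Oct 2003 09:30:00 +0100
Subject: TMA02 deadline
X-Mailer: ExampleMailer 1.0

The deadline for TMA02 is next Friday.
"""

message = message_from_string(raw)

recipient = message["To"]
reply_to = message["Reply-To"]
# Headers whose names begin with 'X-' carry additional information.
extra = {name: value for name, value in message.items() if name.startswith("X-")}
body = message.get_payload()

print(recipient, reply_to, extra)
```

The blank line in the raw text is what separates the headers from the actual data of the message.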


Sending attachments

Although email transmission is restricted to text, it is possible to attach documents of any kind to an email message. This is done by encoding the attached file as a series of alphabetic characters and appending them to the end of the message.

An arbitrary attachment can be converted into ASCII code. In order to enable the receiving mail client to decode the attachment, the encoding scheme must conform to a standard. One of the internet standards for encoding mail attachments is MIME (Multipurpose Internet Mail Extensions).

Currently MIME also allows: the use of non-ASCII character sets in email messages, an extensible set of formats for handling non-text parts of messages (pictures) and non-ASCII text in email headers.
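Base64 is one of the encodings that MIME uses to turn arbitrary bytes into ASCII text. The sketch below shows the round trip with Python's standard base64 module; the eight bytes standing in for an attachment are arbitrary.

```python
import base64

# A few arbitrary bytes standing in for a binary attachment such as an image.
attachment = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF, 0x10, 0x80])

encoded = base64.b64encode(attachment).decode("ascii")   # safe to send as text
decoded = base64.b64decode(encoded)                      # what the receiver recovers

print(encoded)
```

The encoded form is longer than the original (about four characters for every three bytes), which is the price paid for being able to send it through a text-only channel.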


How does data travel?

The transmission medium could be a wire carrying electrical signals, an optical fiber carrying light signals, or a wireless connection such as an infra-red, radio or microwave link. In each case a sequence of bits is transmitted through the medium.

Communication takes place in the form of serial transmission: a single channel carries a stream of bits in sequence.

Protocols are needed to ensure that, on arrival, the receiving computer interprets the stream of bits with its original meaning.

The telephone and the radio are both modes of communication, but they differ greatly. Radio communication is one-way but broadcast, whereas telephone communication takes place in both directions (duplex communication) but is normally one-to-one.

The receiver needs to know when the transmission begins. A simple protocol is to include start and stop bits with each byte.

If something has gone wrong, the receiver sends a simple message to the transmitter indicating a failed transmission.
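A simple start/stop-bit scheme can be sketched as follows. The choice of 0 for the start bit and 1 for the stop bit is an assumption made for the example, and the bit stream is represented as a string of characters purely for readability.

```python
START_BIT = "0"   # assumed convention: a 0 marks the start of each byte
STOP_BIT = "1"    # and a 1 marks the end


def frame(data: bytes) -> str:
    """Wrap each byte's 8 bits in a start bit and a stop bit."""
    return "".join(START_BIT + format(byte, "08b") + STOP_BIT for byte in data)


def unframe(bits: str) -> bytes:
    """Recover the bytes, raising ValueError if any frame is malformed."""
    if len(bits) % 10 != 0:
        raise ValueError("incomplete frame")
    out = bytearray()
    for i in range(0, len(bits), 10):
        chunk = bits[i:i + 10]
        if chunk[0] != START_BIT or chunk[-1] != STOP_BIT:
            raise ValueError("failed transmission")   # the receiver would report this
        out.append(int(chunk[1:9], 2))
    return bytes(out)


stream = frame(b"Hi")
print(stream, unframe(stream))
```

Each byte costs ten bits on the wire, and a frame whose start or stop bit is wrong is the kind of error the receiver would report back to the transmitter.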


4- Accessing data

Databases

A database (db) is a collection of data stored in a computer system according to a set of rules, and organized to facilitate access involving complex searches and selection.

Databases may be particular to an organization or may cover a particular area of knowledge.

The primary purpose of Microsoft Word is to create and modify documents; you can make them persistent by saving them.

The primary focus of database applications is on making the data persistent, and structuring it so as to minimize redundancy, avoid inconsistency and maximize the usefulness of the data for the purposes of access and updating.

A query (a request that specifies what the user wants) is used to get specific information from the database. The response to the query ideally extracts from the database all the relevant information. So a database is part of an information system.


Dedicated software is available to help users express what they want (their requirements) and find it. A well-structured database will facilitate the retrieval of data meeting these requirements.

A database can be much more versatile than a text document or spreadsheet. Typically, a text document is one-dimensional: it is intended to be read in sequence from beginning to end. A spreadsheet is basically two-dimensional: it presents data in a grid going across and down.

Field: specifies a property or attribute in the table (the column headings in the table are the fields).

Record: identifies a particular row in the table.

Databases consist of many tables holding vast amounts of data, designed with great care in order to be able to provide answers to (possibly complex) queries.


A typical industrial database system will consist of:

a collection of tables;
data (called metadata) which describes the tables (what each column means, and how many tables there are in the database);
facilities for backing up the tables;
facilities for ensuring security (for example, of credit card details);
a query facility.

Use these tables to answer the following queries: What is the engine size of Tom Cobbley’s car? How many white Vauxhalls are owned by members of staff?
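Queries of this kind can be expressed in SQL. The sketch below uses Python's built-in sqlite3 module; the table layout, column names and all rows except the names taken from the two questions are invented for illustration and are not the tables from the unit.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cars  (registration TEXT PRIMARY KEY, make TEXT, colour TEXT,
                        engine_cc INTEGER, owner_id INTEGER REFERENCES staff(staff_id));

    -- Invented rows for illustration only; not the tables from the unit.
    INSERT INTO staff VALUES (1, 'Tom Cobbley'), (2, 'Ann Other'), (3, 'Sam Sample');
    INSERT INTO cars VALUES
        ('AAA 111', 'Vauxhall', 'white', 1600, 1),
        ('BBB 222', 'Ford',     'blue',  1800, 2),
        ('CCC 333', 'Vauxhall', 'white', 1400, 3),
        ('DDD 444', 'Vauxhall', 'red',   2000, 2);
""")

# Query 1: the engine size of Tom Cobbley's car.
engine_size = db.execute(
    "SELECT engine_cc FROM cars JOIN staff ON owner_id = staff_id WHERE name = ?",
    ("Tom Cobbley",),
).fetchone()[0]

# Query 2: how many white Vauxhalls are owned by members of staff.
white_vauxhalls = db.execute(
    "SELECT COUNT(*) FROM cars JOIN staff ON owner_id = staff_id "
    "WHERE make = 'Vauxhall' AND colour = 'white'"
).fetchone()[0]

print(engine_size, white_vauxhalls)
```

Both queries need data from two tables at once, which is why each one joins the cars table to the staff table through the owner's identifier.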


Object database

An object database is a database that may contain video, voice and music along with more traditional forms of data.

A data object (sound, image, video) may be contained within a database as ‘an object in a box’. The box has a name and the database can access the object using this name.

An object stored in this way is called a BLOB (binary large object). This is an enhancement to a conventional database and provides facilities for the storage and retrieval of additional data types apart from text. The weakness of this approach is that there is no provision for querying the content of a BLOB, because as far as the database is concerned the BLOB has no structure.
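The ‘object in a box’ idea can be sketched with SQLite's BLOB column type. The table name media, the file name and the bytes are placeholders invented for the example; the bytes are not a real audio clip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: each "box" has a name and holds an opaque run of bytes.
conn.execute("CREATE TABLE media (name TEXT PRIMARY KEY, content BLOB)")

# Placeholder bytes standing in for a sound file.
fake_audio = bytes([0x52, 0x49, 0x46, 0x46, 0x00, 0x01, 0x02, 0x03])
conn.execute(
    "INSERT INTO media (name, content) VALUES (?, ?)", ("welcome.wav", fake_audio)
)

# The database can hand the object back when asked for it by name.
stored = conn.execute(
    "SELECT content FROM media WHERE name = ?", ("welcome.wav",)
).fetchone()[0]

print(stored == fake_audio)
print(len(stored))
```

The database can return the bytes and report their length, but it has no way to answer a question such as "which clips contain speech?", because it knows nothing about what the bytes mean.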

Solution: Metadata


Page 41: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

4- Accessing data

Metadata

It is necessary to provide some form of explanatory data (i.e. metadata) about the data. Email headers, which are used to describe an email message, are good examples.
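To illustrate email headers as metadata, the sketch below uses Python's standard email package to separate the headers of a message from its body. The addresses, subject and date are made up.

```python
from email import message_from_string

# An invented message: headers first, then a blank line, then the body.
raw_message = (
    "From: sender@example.com\n"
    "To: recipient@example.com\n"
    "Subject: Unit Five notes\n"
    "Date: Mon, 01 Jan 2024 09:00:00 +0000\n"
    "\n"
    "The body is the content of the message.\n"
)

message = message_from_string(raw_message)
headers = dict(message.items())   # metadata: who, to whom, about what, when
body = message.get_payload()      # the data that the headers describe

for name, value in headers.items():
    print(f"{name}: {value}")
print(body)
```

A mail program can sort or search messages by sender, subject or date using the headers alone, without reading the body.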

Web pages have a rudimentary form of metadata in the form of keywords that can be used by search engines to locate web pages on a particular topic.

The title of your HTML document is an example of metadata. HTML even has a <META> tag for including a variety of metadata.

Each item in the <HEAD> section of an HTML document is an example of metadata. It is not part of the content of the document; rather, it says something about the content.
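As a rough sketch of how a program might read this metadata, the following uses Python's standard html.parser module to collect the title and the named meta tags of a page. The page, its keywords and its author value are invented, and the class name HeadMetadataParser is hypothetical.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects the title text and the name/content pairs of meta tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and "name" in attributes:
            self.meta[attributes["name"]] = attributes.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


# An invented page used only to exercise the parser.
page = """<html><head>
<title>Storing your data</title>
<meta name="keywords" content="databases, metadata, storage">
<meta name="author" content="Example Author">
</head><body><p>Page content goes here.</p></body></html>"""

parser = HeadMetadataParser()
parser.feed(page)
print(parser.title)
print(parser.meta)
```

The parser never looks at the paragraph in the body: everything it collects says something about the page rather than being part of its content.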


Page 42: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

4- Accessing data

An adequate collection of metadata (hooks or pointers) will identify the various features of the items in a multimedia database, making the database searchable.

Each still picture, audio clip and video will have a number of associated items of metadata.

MPEG-7 provides visual descriptors (color, texture, shape, position, motion and face recognition) and audio descriptors (key, mood, tempo and tempo changes).

Using metadata to describe multimedia content will allow users to retrieve the information from databases.
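A small sketch of retrieval by metadata follows. The field names are loosely modelled on the audio descriptors listed above, but they are illustrative and are not the MPEG-7 schema; the clips and their values are invented, and find_clips is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class AudioClip:
    # Illustrative descriptor fields, not the MPEG-7 schema.
    name: str
    key: str
    mood: str
    tempo_bpm: int


# Invented catalogue entries.
catalogue = [
    AudioClip("clip_a", "C major", "calm", 70),
    AudioClip("clip_b", "A minor", "tense", 140),
    AudioClip("clip_c", "G major", "calm", 95),
]


def find_clips(clips, mood, max_tempo_bpm):
    """Return the names of clips whose metadata matches the request.

    Only the descriptors are examined; the audio itself is never opened.
    """
    return [
        clip.name
        for clip in clips
        if clip.mood == mood and clip.tempo_bpm <= max_tempo_bpm
    ]


matches = find_clips(catalogue, mood="calm", max_tempo_bpm=100)
print(matches)
```

The search works only because someone, or some tool, has already attached the descriptors to each clip.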


Page 43: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

Privacy, data rights and obligations

Privacy, keeping some things removed from general or public knowledge, is central to our way of thinking.

There is a lot of data about us in a semi-public domain that we may not even be aware of (for example, a supermarket loyalty card allows all your purchases to be tracked).

Computer systems provide the facility for the government, the local authority, the tax authorities, your bank and credit card companies to have control over vast, and detailed, amounts of data about you. They can store it for indefinite periods and use it in a variety of ways.


Page 44: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

Data and the law

Data protection may vary from jurisdiction to jurisdiction, and depends on the willingness of the relevant authorities to enforce it, and of most individuals and organizations to adhere to it.

Data protection laws in any jurisdiction are likely to have some or all of the following characteristics:
  A legal definition of data (whether the law is limited to electronic form or also covers handwritten documents, photographs, audio recordings and so on).
  A description of how data may be acquired lawfully.
  What uses the data may be put to.
  Any time limits on storage.
  Who may lawfully access and use the data, and for what purpose.
  A description of what legal protection the subject of the data may have with regard to type, means of gathering, correctness, access and use.


Page 45: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

Computer ethics

Ethics is defined as a set of moral principles that should guide our actions as citizens. You should:
  1. not use a computer to harm other people.
  2. not interfere with other people’s computer work.
  3. not snoop around in other people’s computer files.
  4. not use a computer to steal.
  5. not use a computer to bear false witness.
  6. not copy or use software for which you have not paid.
  7. not use other people’s computer resources without authorization or proper compensation.
  8. not appropriate other people’s intellectual output.
  9. think about the social consequences of the program you are writing or the system you are designing.
  10. always use a computer in ways that ensure consideration and respect for your fellow humans.

Page 46: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

Linking and theft

Web pages often contain links to web pages developed by other users (gateways).

Gateways: sites that contain links to associated sites.

Consider a site which sets itself up as an internet newspaper and contains links to individual stories stored at other online newspaper sites. What should the ethical position be on this?
  It could be regarded as an example of intellectual property theft.
  It could be argued that the material has not been stolen, because the text has not been cut and pasted but simply linked to.


Page 47: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

Security

Legislation to protect data, and in particular computerized data, is desirable. However, the law itself is never sufficient (just as you would not leave your home unlocked when you go out).

Once you link your computer to the internet, you need to think about ways of making it less accessible to unwanted visitors who, in computer jargon, are termed hackers.

Some ways to prevent others from accessing your computer:
  Do not allow it to be shared.
  Allow access using a password only.
  Use a firewall (a software system which controls data traffic entering and leaving the network) to secure a whole network of computers from unauthorized outside access.
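The idea of controlling traffic entering and leaving a network can be sketched as a list of rules checked in order. This is a toy model, not how any particular firewall product works: the Rule class, the port numbers chosen and the deny-by-default policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str   # "in" for traffic entering the network, "out" for leaving
    port: int
    allow: bool


# Hypothetical rule set: permit web traffic, block inbound telnet.
RULES = [
    Rule("in", 443, True),
    Rule("out", 443, True),
    Rule("out", 80, True),
    Rule("in", 23, False),
]


def is_allowed(direction, port, rules=RULES):
    """Return the verdict of the first matching rule; deny when nothing matches."""
    for rule in rules:
        if rule.direction == direction and rule.port == port:
            return rule.allow
    return False  # assumed default: anything not explicitly allowed is blocked


print(is_allowed("in", 443))
print(is_allowed("in", 23))
print(is_allowed("in", 8080))
```

Denying by default means that a connection nobody thought about is blocked rather than let through.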


Page 48: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

Ownership and rights over data

The concept of data ownership is legally very unclear in most countries. Who actually ‘owns’ medical records? (Patients, doctors, pharmacists or health departments?)

Intellectual Property Rights (IPR): the right to gain financially from the products one creates

Moral rights: the right to say how one’s products can be used (the content of letters that you write, and even the content of assignments that you prepare …)

Copyright laws: afford some sort of protection for intellectual property (databases are subject to copyright)


Page 49: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

Worms, viruses and Trojan horses

These are specific attempts to access your computer-held data.

Junk mail

A simple form of intrusion is unsolicited email. It is a nuisance but will not normally harm your computer.

However, if you receive a document on your computer, it could cause damage to the contents of your storage media.

A worm

One form of malicious behavior is due to a worm, which is a program intended to subvert a whole network of computers. It transfers copies of itself to other machines on the network.


Page 50: M150: Data, Computing and Information 1 Unit Five: Storing, getting and Sending your data

5- Ethical, legal and security issues

A virus

It is a program designed to cause specific damage to your software by attaching itself to documents (for example, deleting important documents from your hard disk).

It may make your system totally unusable.

Viruses appear in many forms.

Some viruses run macros which damage your software or your operating system.

Macros are small programs which can improve the way in which Word documents and Excel spreadsheets function by simplifying repetitive actions.

Trojan horse

It is code which looks legitimate but attempts to do something quite different (modify documents on your hard disk, collect passwords, convey network information). Typically the name of the document will be misleading.

Anti-virus software

It protects your system but needs regular updates because new viruses appear daily.

Software is available both commercially and in free versions.
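Why regular updates matter can be sketched with a toy scanner that recognizes only what is already on its list. The sketch assumes that a signature is simply the SHA-256 hash of a known malicious file; real anti-virus products use considerably more than this, and the sample byte strings here are harmless placeholders.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Placeholder "malicious" content, invented so the sketch has something to match.
known_bad_sample = b"pretend this is a malicious macro"
signatures = {fingerprint(known_bad_sample)}


def is_flagged(data: bytes, known_signatures: set) -> bool:
    """Flag data only if its fingerprint is already in the signature list."""
    return fingerprint(data) in known_signatures


new_threat = b"a threat that appeared after the last update"
before_update = is_flagged(new_threat, signatures)   # not yet on the list
signatures.add(fingerprint(new_threat))              # the "regular update"
after_update = is_flagged(new_threat, signatures)

print(before_update, after_update)
```

Until the list is updated, the scanner has no way to recognize the new threat.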