

The Conundrum of Optical Storage

Barrie T. Stern
ADONIS Project Director, Elsevier Science Publishers, Sara Burgerhartstraat 25, 1055 KV Amsterdam, The Netherlands

The euphoria surrounding the 12 cm diameter CD-ROM (Compact Disc Read Only Memory) within publishing and other information technology circles is brought about perhaps by three factors:

- the ability to have locally accessible in-house databases which can be accessed without relying on costly and sometimes unreliable telecommunications networks;
- the likely availability of low cost microcomputer workstations that can be used for a variety of other applications;
- the existence of standards that (hopefully) avoid system incompatibility.

The large majority of CD-ROM based information products are attempts to launch existing print or online products using a new form of information carrier: optical disc. As such it is almost a technology in search of a market, and it is appropriate to look at some of the technology constraints of the medium so as to judge its suitability for different types of product.

The Technology

Optical discs mostly have a very thin film of material that can be burned by a high intensity laser beam. The material, such as tellurium suboxide, is irreversibly changed when it is burned. When the laser beam is very thin and is carefully focussed, a pit or hole can be 'written' that is 5 microns (5 x 10^-3 mm) in diameter. These pits can be arranged in groups to represent a code of '0's and '1's used as computer-readable character codes to represent letters, numbers, etc. (e.g. the American Standard Code for Information Interchange, ASCII), where the letter 'A' is represented as 1000001.

Most of the optical disc products being developed or offered are such text-based systems, storing ASCII codes as a series of pits on the surface of the disc. To read the disc (which has been indelibly written by the producer so the information cannot be changed) a laser beam similar to that used 'to write' the disc is passed across the disc surface. Where there is a pit, the resulting electrical impulse is read as a '1' in the ASCII code, and where there is no pit, a '0' is read.
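The character coding described above can be sketched in a few lines of Python. This is only an illustration of the article's simplified model (one pit per '1', no pit per '0'); the function names are hypothetical, and a real CD-ROM adds channel coding and error correction on top of the raw character codes.

```python
def char_to_bits(ch: str) -> str:
    """Return the 7-bit ASCII code of one character as a string of 0s and 1s."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is outside the 7-bit ASCII character set")
    return format(code, "07b")


def text_to_pits(text: str) -> str:
    """Simplified disc surface: 'o' marks a pit (1), '.' marks no pit (0)."""
    bits = "".join(char_to_bits(ch) for ch in text)
    return bits.replace("1", "o").replace("0", ".")


def pits_to_text(pits: str) -> str:
    """Read the surface back: every group of 7 marks is one ASCII character."""
    bits = pits.replace("o", "1").replace(".", "0")
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, len(bits), 7))


if __name__ == "__main__":
    print(char_to_bits("A"))                     # the article's example code
    print(text_to_pits("A"))
    print(pits_to_text(text_to_pits("CD-ROM")))
```

The `ValueError` branch reflects the limitation taken up next in the text: a symbol with no code in the character set cannot be stored this way at all.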

Whereas such a system can be used for text containing standard letters and numbers, it is unable to handle special characters (outside the character set of ASCII or similar systems) such as mathematical or chemical symbols, or graphs and illustrations such as halftone photographs. To accommodate these it is necessary for optical disc systems to 'bit map' the information. In this system the same fine focus laser beam scans the printed symbol or part of a picture many times, each time moving very slightly 'down the page'. For example, if the image is the letter 'A', the beam can be moved from left to right and register a 'miss' or a 'hit' each time it encounters white space or the black printed image:

000010000
000101000
001111100
010000010
100000001

Clearly the finer the laser beam focus the more often it can scan from left to right and the greater the detail it can pick up. This brings us to the question of resolution. Each time the laser scans a page from left to right it produces a bit stream of '0's and '1's: the finer the detail needed, the more times it must scan across the page.
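The link between scan fineness and detail can be illustrated with the 9 x 5 bit map of the letter 'A' given above. In this minimal sketch the `step` parameter is a hypothetical stand-in for a coarser beam: it keeps only every `step`-th scan line and every `step`-th sample along each line.

```python
# The article's bit map of the letter 'A' (1 = hit on black, 0 = miss on white).
GLYPH_A = [
    "000010000",
    "000101000",
    "001111100",
    "010000010",
    "100000001",
]


def scan(image, step=1):
    """Keep every `step`-th scan line and every `step`-th sample along each line."""
    return [row[::step] for row in image[::step]]


def bit_stream(rows):
    """Concatenate the scan lines into one stream of bits."""
    return "".join(rows)


def render(rows):
    """Show hits as '#' and misses as '.' for inspection."""
    return "\n".join(row.replace("1", "#").replace("0", ".") for row in rows)


if __name__ == "__main__":
    for step in (1, 2):
        rows = scan(GLYPH_A, step)
        print(f"step={step}: {len(bit_stream(rows))} bits")
        print(render(rows))
```

Halving the sampling density cuts the bit stream from 45 bits to 15, but the two legs of the 'A' in the second scan line are lost, which is the trade-off between storage and detail that the resolution question turns on.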

Scanning resolutions are expressed as either 'lines per mm' or 'pels'. The first refers to the number of times the beam scans a character theoretically 1 mm high. The second refers to picture elements: a resolution of n pel gives n x n picture elements per square inch, which is equivalent to the number of 'misses and hits' that would occur in scanning a page area 1 inch high and 1 inch wide.

For our purposes we can use the equivalents of 8 lines/mm equals 200 pel and 12 lines/mm equals 300 pel.
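These equivalents follow from the 25.4 mm in an inch. A minimal sketch, assuming the pel figure counts picture elements along one inch in each direction (the function name is illustrative):

```python
MM_PER_INCH = 25.4


def lines_per_mm_to_pel(lines_per_mm: float) -> float:
    """Convert a scan density in lines per millimetre to picture elements per inch."""
    return lines_per_mm * MM_PER_INCH


if __name__ == "__main__":
    for density in (8, 12):
        exact = lines_per_mm_to_pel(density)
        print(f"{density} lines/mm = {exact:.1f} per inch")
```

The exact values are a little over 200 and 300 respectively, so the round figures used in the text are close approximations rather than exact conversions.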

For fineness of detail we can consider the size of typeface used in a printed page or for halftone illustrations the coarseness of the screen used.

In a recent study of journals that might be included in ADONIS (a European project concerned with document delivery), type sizes down to 1 millimeter (approximately 0.04 inches or just under 3 point type) and line widths down to 0.1 millimeters (4 mils or about 0.004 inches) were revealed. In addition the same study of 41 journals revealed a preponderance of graphic materials of all types as well as photographs. These results are not particularly surprising and give credence to the establishment of a minimum standard of 300 x 300 pel per square inch.
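One way to see why these measurements point to 300 rather than 200 pel is to count how many picture elements fall across the smallest features found. The sketch below assumes evenly spaced picture elements and ignores beam spot size and alignment; the function name is hypothetical.

```python
MM_PER_INCH = 25.4


def pels_across(feature_mm: float, pel: int) -> float:
    """Number of picture elements spanning a feature of the given size."""
    return feature_mm * pel / MM_PER_INCH


if __name__ == "__main__":
    for pel in (200, 300):
        line = pels_across(0.1, pel)   # finest line width in the study
        glyph = pels_across(1.0, pel)  # smallest type size in the study
        print(f"{pel} pel: 0.1 mm line = {line:.2f} pels, 1 mm type = {glyph:.1f} pels")
```

At 200 pel a 0.1 mm line is narrower than a single picture element and may be missed altogether, while at 300 pel it is just over one element wide.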

Capacity

Compact discs have a capacity of about 600 Mbytes, which is equivalent to 200,000 average A4 typewritten pages or 80,000 printed pages in character mode. (For information that is to be searched via inverted file structures rather than merely displayed or printed, the effective capacity is halved.) The equivalent in bit mapped pages depends on the 'density' of the information on the page. The more halftones or the smaller the type size, the smaller the storage capacity as expressed in page equivalents.

For bit mapped information it is possible to omit the white unprinted space on a page by using a software system that compresses the bits at a ratio of (commonly) 8-12:1, thereby increasing the disc storage capacity. For purposes of storage estimates, it is reasonable to use a capacity of 5,000 printed pages scanned at 300 pel resolution for a CD-ROM. The larger 30 cm diameter optical discs (such as the gigadisc) have 2-4 times this capacity, depending on the number of illustrations, the compression ratio, and whether the disc is single or double sided, with similarly increased capacity ratios for character-coded material (although of course the compression ratio is not then a factor).
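The 5,000-page estimate can be checked with a rough calculation. The sketch below rests on assumptions not stated in the article: an A4 page of 8.27 x 11.69 inches scanned edge to edge, one bit per picture element (black or white only), and 600 Mbytes taken as 600 million bytes. All names are illustrative.

```python
DISC_BYTES = 600 * 10**6      # "about 600 Mbytes", taken here as decimal megabytes
A4_INCHES = (8.27, 11.69)     # assumed page size, scanned edge to edge


def bitmap_page_bits(pel: int, size_in=A4_INCHES) -> int:
    """Uncompressed bits for one page at one bit per picture element."""
    width_in, height_in = size_in
    return round(width_in * pel) * round(height_in * pel)


def bitmap_pages_per_disc(pel: int, compression_ratio: int) -> int:
    """Whole bit mapped pages that fit on the disc at a given compression ratio."""
    return DISC_BYTES * 8 * compression_ratio // bitmap_page_bits(pel)


def bytes_per_character_page(pages: int) -> int:
    """Bytes available per page when the disc holds `pages` character-coded pages."""
    return DISC_BYTES // pages


if __name__ == "__main__":
    for ratio in (8, 12):
        print(f"{ratio}:1 compression: {bitmap_pages_per_disc(300, ratio)} pages at 300 pel")
    print(bytes_per_character_page(200_000), "bytes per typewritten page")
    print(bytes_per_character_page(80_000), "bytes per printed page")
```

Under these assumptions the 8:1 and 12:1 ratios bracket the 5,000-page figure, and the character-mode capacities correspond to about 3,000 and 7,500 characters per page.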

The output equipment to read and print information stored on the discs costs $10,000-15,000 for CD-ROM and $200,000-250,000 for the larger discs. The extra capacity of the larger disc is therefore greatly offset by this cost differential, to which must be added the much lower cost of preparing copies of CD-ROM discs by pressing, compared with laser rewriting to make copies of larger discs.
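The size of this offset can be put in rough numbers by dividing equipment cost by page capacity. The sketch below uses only the figures quoted in the text (5,000 bit mapped pages per CD-ROM, 2-4 times that for the larger disc); treating a single disc's capacity as the denominator is a simplifying assumption, and the function name is hypothetical.

```python
def cost_per_page_range(cost_range, pages_range):
    """Best and worst case equipment cost per page of single-disc capacity."""
    best = min(cost_range) / max(pages_range)
    worst = max(cost_range) / min(pages_range)
    return best, worst


CD_ROM_PAGES = 5_000
cd_rom = cost_per_page_range((10_000, 15_000), (CD_ROM_PAGES, CD_ROM_PAGES))
large_disc = cost_per_page_range((200_000, 250_000),
                                 (2 * CD_ROM_PAGES, 4 * CD_ROM_PAGES))

if __name__ == "__main__":
    print(f"CD-ROM:     ${cd_rom[0]:.2f} to ${cd_rom[1]:.2f} per page")
    print(f"30 cm disc: ${large_disc[0]:.2f} to ${large_disc[1]:.2f} per page")
```

Even in the most favourable case for the larger disc, the equipment cost per page of capacity comes out several times that of CD-ROM.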

The Publisher's Choice

In selecting how to represent his information in print, the publisher gives consideration to different typefaces, page layout, positioning of graphs, tables and halftones, as well as binding, paper weight and size, etc. For optical disc applications the criteria are much simpler:

- is it necessary to keep the typeface characteristics and other aspects of page layout or can the information be presented only in ASCII-coded character format?
- are the tables and illustrations vital?

If the answer to these two questions is no, then ASCII mode can be suitable. If the answer is yes, then bit mapping will need consideration, but then the volume of information (so many pages) will be less.

- as disc mastering costs at the moment are at least $3,000 for CD-ROM, how many new discs per year should be issued? (This is a function of the need to have current information or only reference type material.)

In the future hybrid or 'mixed mode' systems will be available where the text is in ASCII and tables and illustrations are bit mapped. Similarly, colour at output (rather than a black and white image of an original four colour halftone) will be possible, but as always the trade-off between quality and quantity is part of the cost calculations.

What emerges from this is that most current CD-ROM applications are of ASCII-coded character representations of reference type material that does not age quickly and where the intrinsic value of the information per printed page equivalent is high (e.g. data of collections of abstracts or indexes rather than full text journal articles).

This picture will change as more libraries have workstations installed that can also be used for functions other than document delivery, as the cost of disc mastering drops (it should fall by 30-50% in 2-3 years) and as higher compression ratios possibly become achievable.
