
Media Types

Text Image Graphics Audio Video

Text

Representation: ASCII, ISO character sets, marked-up text, structured text, hypertext
Operations: character operations, string operations, editing, formatting, pattern-matching and searching, sorting, compression, encryption, language-specific operations

ASCII
  7-bit code: 128 values in the ASCII character set
  use of the 8th bit in text editors/word processors creates incompatibility
ISO character sets
  extend ASCII to support non-English text
  ISO Latin provides support for accented characters (à, ö, ø, etc.)
  ISO sets include Chinese, Japanese, Korean & Arabic
UNICODE
  16-bit format (as originally designed): 65,536 different symbols

Text - Representation

Marked-up text
  nroff, troff
  LaTeX
  SGML
  HTML
  HyTime
  XML, XSL, XLL
Structured text
  structure of the text is represented in a data structure, usually tree-based
  ODA: structure embedded in the byte-stream with the content
Hypertext
  non-linear graph or “web” structure: nodes and links
  currently the subject of intensive ISO standards activity

Text - Operations

Character operations
  basic data type with an assigned value
  permits direct character comparison (a < b)
String operations
  comparison
  concatenation
  substring extraction and manipulation
Editing
  perhaps the most familiar set of operations on text
  cut/copy/paste
  strings v. blocks, dependent on document structure

Text - Operations

Formatting
  interactive or non-interactive (WYSIWYG v. LaTeX)
  formatted output
    bitmap
    page description language (PostScript, PDF)
  font management
    typeface
    point size (1 point = 1/72 of an inch)
    TrueType fonts: geometric description + kerning
Pattern-matching and searching
  search and replace
  wildcards
  regular expressions
  for large bodies of text, or text databases: use of inverted indices, hashing techniques and clustering
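As an illustration of the inverted indices mentioned for searching large bodies of text, here is a minimal sketch in Python. The function names, the tokenising rule (lower-cased runs of letters and digits) and the sample documents are assumptions made for this example only; real text databases add ranking, stemming and on-disk storage.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lower-cased word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


docs = ["Huffman coding of text", "Sorting text databases", "Huffman trees"]
index = build_inverted_index(docs)
print(search(index, "huffman"))
print(search(index, "text huffman"))
```

The point of the structure is that a query touches only the entries for its own words, rather than scanning every document.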


Sorting
  numerous varieties of sort, all of them extensively studied in basic programming
  sort complexity is a major factor in data handling performance
Compression
  ASCII uses 7 bits per character, though most word processors actually use the 8th bit, taking up a full byte per character
  information theory estimates 1-2 bits per character to be sufficient for natural language text
  this redundancy can be removed by encoding:
    Huffman: varies the number of bits used to represent characters, with the shortest codes for the highest-frequency characters
    Lempel-Ziv: identifies repeating strings and replaces them by pointers to a table
  both techniques compress English text at a ratio of between 2:1 and 3:1
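The Huffman idea described above (shortest codes for the most frequent characters) can be sketched as follows. This is a minimal illustration, not a production codec: the function names and the sample sentence are assumptions, and the code table itself would also have to be stored or transmitted in practice.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character in text to a Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # Degenerate case: one distinct symbol still needs a 1-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {char: code_so_far}).
    # The unique tie_breaker stops Python from ever comparing the dicts.
    heap = [(weight, i, {ch: ""}) for i, (ch, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code of every character into one bit string."""
    return "".join(codes[ch] for ch in text)


sample = "this is an example of a huffman tree"
codes = huffman_codes(sample)
bits = encode(sample, codes)
print(len(sample) * 8, "bits at one byte per character;", len(bits), "bits Huffman-coded")
```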


Encryption
  text encryption is widely used in electronic mail and networked information systems
  most widely-used techniques:
    DES
    RSA public-key
    PGP
  subject of major controversy:
    key escrow systems
    Clipper chip
    “strong” encryption now being legally outlawed in a number of countries
Language-specific operations
  spell-checking
  parsing and grammar checking
  style analysis


Image

Representation: colour model, alpha channels, number of channels, channel depth, interlacing, indexing, pixel aspect ratio, compression
Operations: editing, point operations, filtering, compositing, geometric transformations, conversion

Image - Representation

Colour model
  2 main types:
    colour production on an output device
    theory of human colour perception
CIE colour space
  international standard used to calibrate other colour models
  developed in 1931, as CIE XYZ, based on the tristimulus theory of colour specification

RGB
  numeric triple specifying red, green and blue intensities
  convenient for video display drivers, since the numbers can be easily mapped to voltages for the RGB guns in colour CRTs
HSB
  Hue - dominant colour of the sample; an angular value varying from red to green to blue at 120° intervals
  Saturation - the purity of the colour, i.e. how far it is from gray
  Brightness - the overall intensity of the colour, from black up to full intensity

CMYK
  displays emit light, so produce colours by adding red, green and blue intensities
  paper reflects light, so to produce a colour on paper one uses inks that subtract all colours other than the one desired
  printers use inks corresponding to the subtractive primaries: cyan, magenta and yellow (complements of RGB)
  additionally, since inks are not pure, a special black ink is used to give better blacks and grays
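The complement relationship between RGB and CMY, with black pulled out as a separate ink, can be sketched as below. This is the common naive, device-independent formula and is an assumption of this example; real printing pipelines use measured colour profiles instead. Components are taken to be floats in the range 0 to 1.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK conversion; all components are floats in 0..1."""
    k = 1.0 - max(r, g, b)            # black ink replaces the common dark part
    if k == 1.0:                      # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)     # cyan is the complement of red
    m = (1.0 - g - k) / (1.0 - k)     # magenta is the complement of green
    y = (1.0 - b - k) / (1.0 - k)     # yellow is the complement of blue
    return c, m, y, k


print(rgb_to_cmyk(1.0, 0.0, 0.0))  # pure red needs magenta and yellow ink
print(rgb_to_cmyk(0.5, 0.5, 0.5))  # mid gray is printed with black ink only
```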

YUV
  colour model used in the television industry
  also YIQ, YCbCr and YPbPr
  Y represents luminance, effectively the black-and-white portion of a video signal
  U and V are colour difference signals; they form the colour portion of a video signal and are called chrominance or chroma
  YUV makes efficient use of bandwidth: the human eye has greater sensitivity to changes in luminance than in chrominance, so bandwidth is better utilised by allocating more to luminance and less to chrominance
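To make "luminance plus two colour difference signals" concrete, here is a sketch of an RGB to YUV conversion. The weights are the widely published ITU-R BT.601 values, which the slides do not state, so treat them as an assumption of this example; inputs are floats in the range 0 to 1.

```python
def rgb_to_yuv(r, g, b):
    """Convert RGB (floats in 0..1) to YUV using BT.601 weights (assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance: the black-and-white picture
    u = 0.492 * (b - y)                     # blue colour difference
    v = 0.877 * (r - y)                     # red colour difference
    return y, u, v


print(rgb_to_yuv(0.5, 0.5, 0.5))  # any gray has (near) zero chrominance
print(rgb_to_yuv(0.0, 1.0, 0.0))  # green contributes most to luminance
```

Because gray levels carry no chrominance, a black-and-white receiver can simply use Y and ignore U and V.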

Alpha channels
  images may have one or more alpha channels defining regions of full or partial transparency
  can be used to store selections and to create masks and blends
Number of channels
  the number of pieces of information associated with each pixel
  usually the dimensionality of the colour model plus the number of alpha channels
Channel depth
  number of bits per pixel used to encode the channel values
  commonly 1, 2, 4 or 8 bits, less commonly 5, 6, 12 or 16 bits
  in a multiple-channel image, different channels can have different depths
Interlacing
  storage layout of a multiple-channel image
  could separate channel values (all R values, followed by all G, followed by all B)
  or could use interlacing (all RGB for pixel 1, all RGB for pixel 2, ...)
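The two storage layouts can be shown with a small sketch that converts between separate channel planes and the interlaced form. The function names are assumptions for illustration, and plain Python lists stand in for the byte buffers a real image format would use.

```python
def interleave(r, g, b):
    """Separate channel planes -> interlaced layout (R,G,B of pixel 1, then pixel 2, ...)."""
    out = []
    for pixel in zip(r, g, b):
        out.extend(pixel)
    return out


def deinterleave(data, channels=3):
    """Interlaced layout -> one list per channel."""
    return [data[c::channels] for c in range(channels)]


planes = ([1, 2], [3, 4], [5, 6])   # all R values, all G values, all B values
packed = interleave(*planes)
print(packed)                       # -> [1, 3, 5, 2, 4, 6]
print(deinterleave(packed))         # -> [[1, 2], [3, 4], [5, 6]]
```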


Indexing
  pixel colours can be represented by an index into a colour map or colour lookup table (CLUT)
Pixel aspect ratio
  ratio of pixel width to height
  square pixels are simple to process, but some displays and scanners work with rectangular pixels
  if the pixel aspect ratios of an image and a display differ, the image will appear stretched or squeezed
Compression
  a page-sized 24-bit colour image produced by a scanner at 300 dpi takes up about 25 Mbytes uncompressed (an 8.5 x 11 inch page is 2550 x 3300 pixels at 3 bytes each)
  many image formats compress pixel data, using run-length coding, LZW, predictive coding and transform coding
  of the many image formats, JPEG, GIF, TIFF and BMP are the most widely used


Image - Operations

These operations can operate directly on pixel data or on higher-level features such as edges, surfaces and volumes.

Operations on higher-level features fall into the domain of image analysis and understanding, and will not be considered here.

Editing
  changing individual pixels for image touch-up; forms the basis of airbrushing and texturing
  cutting, copying and pasting are supported for groups of pixels, from simple shape manipulation through to more complex foreground and background masking and blending
Point operations
  consist of applying a function to every pixel in an image
  the function uses only the pixel's current value; neighbouring pixels cannot be used
  Thresholding
    a pixel is set to 1 or 0 depending on whether it is above or below a threshold value
    creates binary images, which are often used as masks when compositing
  Colour correction
    modifying the image to increase or reduce contrast, brightness or gamma effects, or to strengthen or weaken particular colours
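A point operation is simply a function mapped over every pixel, which the following sketch makes explicit for thresholding and a simple brightness change. It assumes a grayscale image held as a list of rows of 8-bit values; the function names, and the choice that a pixel exactly equal to the threshold becomes 0, are assumptions of this example.

```python
def point_operation(image, func):
    """Apply func to every pixel independently; neighbours are never consulted."""
    return [[func(pixel) for pixel in row] for row in image]


def threshold(image, level):
    """Binary image: 1 where the pixel is above level, otherwise 0."""
    return point_operation(image, lambda p: 1 if p > level else 0)


def adjust_brightness(image, offset):
    """Add offset to every pixel, clamped to the 8-bit range 0..255."""
    return point_operation(image, lambda p: max(0, min(255, p + offset)))


image = [[10, 200],
         [128, 90]]
print(threshold(image, 127))          # -> [[0, 1], [1, 0]]
print(adjust_brightness(image, 100))  # -> [[110, 255], [228, 190]]
```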

Filtering
  like point operations, filters operate on every pixel in an image, but use the values of neighbouring pixels as well
  used to blur, sharpen or distort images, producing a variety of special effects
Compositing
  the combining of two or more images to produce a new image
  generally done by specifying mathematical relationships between the images
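One common "mathematical relationship" used for compositing is the alpha blend, in which a mask decides how much of each image survives at each pixel. The sketch below is one example of such a relationship, not the only one; it assumes grayscale images of equal size stored as lists of rows, with alpha values between 0 and 1.

```python
def composite_over(foreground, background, alpha):
    """Per-pixel blend: alpha * foreground + (1 - alpha) * background."""
    return [
        [a * f + (1.0 - a) * b for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(foreground, background, alpha)
    ]


fg = [[100, 100, 100]]
bg = [[0, 0, 200]]
mask = [[1.0, 0.0, 0.5]]   # opaque, fully transparent, half transparent
print(composite_over(fg, bg, mask))  # -> [[100.0, 0.0, 150.0]]
```

A binary mask produced by thresholding is the special case where every alpha value is exactly 0 or 1.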

Geometric transformations
  basic transformations involve displacing, rotating, mirroring or scaling an image
  more advanced transformations involve skewing and warping images
Conversions
  conversions between image formats are commonplace, and a number of public-domain, shareware and commercial tools exist to support them
  other forms of conversion include compression and decompression, changing colour models, and changing image depth and resolution


Graphics

Representation: geometric models, solid models, physically-based models, empirical models, drawing models, external formats for models
Operations: primitive editing, structural editing, shading, mapping, lighting, viewing, rendering

The central notion of graphics, as opposed to image data, is in the rendering of graphical data to produce an image. A graphics type or model is therefore the combination of a data type plus a rendering operation

Graphics - Representation

Please note: "object" in graphics modelling usually refers to an element of the scene being modelled, unless you are using object-oriented graphics programming.

Geometric models
  consist of 2D and/or 3D geometric primitives
  2D primitives include lines, rectangles and ellipses, plus more general polygons and curves
  3D primitives include the above plus surfaces of various forms; curves and curved surfaces are described by parameterised polynomials
  primitives are first described in local or object co-ordinates, then arranged in groups in a common world co-ordinate system by applying modelling transformations
  transformations include rotation, translation and scaling
  primitives can be used to build structural hierarchies, allowing each structure thus created to be broken down into lower-level structures and primitives (i.e. blueprinting)
  several standard device-independent graphics libraries are based on geometric modelling:
    GKS (Graphical Kernel System, ISO)
    PHIGS (Programmer's Hierarchical Interactive Graphics System, ISO) - see also PHIGS+ and PEX
    OpenGL - portable version of the Silicon Graphics library

Solid models
  Constructive Solid Geometry (CSG): solid objects are combined using the set operators union, intersection and difference
  surfaces of revolution: a solid is formed by rotating a 2D curve about an axis in 3D space (lathing)
  extrusion: a 2D outline is extended in 3D space along an arbitrary path
  using the above techniques will produce models much faster than building them up from geometric primitives, but rendering them will be expensive

Physically-based models
  realistic images can be produced by modelling the forces, stresses and strains on objects
  when one deformable object hits another, the resulting shape change can be numerically determined from their physical properties
Empirical models
  complex natural phenomena (clouds, waves, fire, etc.) are difficult to describe realistically using geometric or solid modelling
  while physically-based models are possible, they may be computationally expensive or intractable
  the alternative is to develop models based on observation rather than physical laws; such models do not embody the underlying physical processes that cause these phenomena, but they do produce realistic images
  fractals, probabilistic graph grammars (used for branching plant structures) and particle systems (used for fires and explosions) are examples of empirical models

Drawing models
  describe an object in terms of drawing or painting actions
  the description can be seen as a sequence of commands to an imaginary drawing device, e.g. PostScript, LOGO turtle graphics
External formats for models
  export/import formats are needed to move models between graphics packages
  CGM and CAD formats are suitable for interchange; PostScript and RIB are render-only


Graphics - Operations

Primitive editing
  specifying and modifying the parameters associated with the model primitives
  e.g. specifying the type of a primitive and its vertex coordinates and surface normals
Structural editing
  creating and modifying collections of primitives
  establishing spatial relationships between members of collections
Shading
  the modelling techniques described so far provide the means to specify the shape of objects; shading provides further information for the image by describing the interaction of light with the object
  this interaction is described in terms of the colour of an object, how it reflects light and whether it transmits light
  several general-purpose methods exist to describe shading; most start by describing the surface of the object using meshes of small, polygonal surface patches:
    flat shading - each patch is given a constant colour
    Gouraud shading - colour information is interpolated across a patch
    Phong shading - surface normal information is interpolated across a patch
    ray tracing & radiosity - physical models of light behaviour are used to calculate colour information for each patch, giving highly realistic results
  photorealistic images require extremely flexible shading; tools such as RenderMan provide programmable shaders which can be attached to objects, simulating different light effects and surface normals

Mapping
  techniques for enhancing the visual appearance of objects
  Texture mapping
    an image, the texture map, is applied to a surface
    requires a mapping from 3D surface coordinates to 2D image coordinates, so given a point on the surface the image is sampled and the resulting value used to colour the surface at that point
    shaders can also provide solid textures, where the texture is obtained from 3D rather than 2D space, and procedural textures, where the texture is calculated rather than sampled
  Bump mapping
    as texture mapping, but used to perturb the normal vector of the surface rather than the colour
    used to describe minor surface changes such as scratches or scrapes
  Displacement mapping
    local modifications to the position of a surface
    produces ridges or grooves
  Environment mapping
    also known as reflection mapping; used to handle limited forms of reflection
    a more primitive technique than ray tracing
  Shadow mapping
    similar to environment mapping in that it provides a primitive lighting effect without the expense of ray tracing
    produces shadows

Lighting
  within a model, in addition to the graphics objects, there are lights to illuminate the scene
  there are various forms of light source, each of which can be parametrically specified:
    ambient light - background lighting, comes from all directions with equal intensity
    point lights - come from specific points in space, intensity governed by the inverse square law
    directional lights - located at infinity in some direction, intensity is constant
    spot lights - illuminate a cone-shaped volume
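The difference between a point light and a directional light can be sketched numerically. The function names and the unit-free intensity values are assumptions for this illustration; a real renderer would also account for surface orientation and colour.

```python
def point_light_intensity(source_intensity, light_pos, surface_pos):
    """Inverse square law: intensity falls off with the squared distance to the light."""
    distance_squared = sum((l - s) ** 2 for l, s in zip(light_pos, surface_pos))
    if distance_squared == 0:
        raise ValueError("surface point coincides with the light source")
    return source_intensity / distance_squared


def directional_light_intensity(source_intensity):
    """A light at infinity delivers the same intensity everywhere in the scene."""
    return source_intensity


light = (0.0, 0.0, 0.0)
print(point_light_intensity(100.0, light, (2.0, 0.0, 0.0)))  # -> 25.0
print(point_light_intensity(100.0, light, (4.0, 0.0, 0.0)))  # -> 6.25
print(directional_light_intensity(100.0))                    # -> 100.0
```

Doubling the distance from a point light quarters the received intensity, while the directional light is unaffected by position.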

Viewing
  to produce an image of a 3D model we require a transformation which projects 3D world coordinates onto 2D image coordinates
  the transformation is applied to the viewing volume, that part of the model that appears in the image
  view specification consists of selecting the projection transformation (usually a parallel or perspective projection, although camera attributes can be specified in some renderers) and the view volume
Rendering
  rendering converts a model, including shading, lighting and viewing information, into an image
  software allows selection and fine-tuning of control parameters:
    output resolution - the width and height of the output image in pixels, and the pixel depth
    rendering time - quick and low-quality v. slow and high-quality


Digital Video

Representation: analog formats sampled, sampling rate, sample size and quantisation, data rate, frame rate, compression, support for interactivity, scalability
Operations: storage, retrieval, synchronisation, editing, mixing, conversion

Digital Video - Representation

Analog formats sampled
  digital video frames can be obtained in two ways:
    synthesis - usually by a computer program
    sampling - of an analog video signal; since analog video comes in various flavours (according to frame rate, scan rate, composite v. component), sampling rate and sample size vary
Sampling rate
  the sampling rate determines the storage requirement and the data transfer rate
  the lower limit for the frequency at which to sample in order to faithfully reproduce the signal, the Nyquist rate, is twice the highest frequency within the signal
  video processing is simplified if each frame and each scan line give rise to the same number of samples, which requires the sampling frequency to be an integer multiple of the scan rate
Sample size and quantisation
  sample size is the number of bits used to represent sample values
  quantisation refers to the mapping from the continuous range of the analog signal to discrete sample values
  choice of sample size is based on:
    signal-to-noise ratio of the sampled signal
    sensitivity of the medium used to display frames
    sensitivity of the human eye
  digital video commonly uses linear quantisation, where quantisation levels are evenly distributed over the analog range (as opposed to logarithmic quantisation)
Data rate
  high data rate formats can be reduced to lower data rates by a combination of:
    compression
    reducing horizontal and vertical resolution
    reducing the frame rate
  for example:
    start with broadcast quality digital video at 10 Mbytes/s
    divide the horizontal and vertical resolutions by 2, giving VHS quality resolution
    divide the frame rate by 2
    compress at a ratio of 10:1
    the data rate becomes 1 Mbit/s, suitable for use on LANs and on optical storage devices (e.g. CD-ROM)
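The arithmetic of the data rate example can be checked with a short sketch. The function name is an assumption for illustration, and 1 Mbyte is taken as 1,000,000 bytes.

```python
def reduced_data_rate(bytes_per_second, resolution_divisor, frame_rate_divisor,
                      compression_ratio):
    """Apply the three reductions in turn and return the rate in bytes per second."""
    # Halving both width and height quarters the number of pixels per frame.
    rate = bytes_per_second / (resolution_divisor ** 2)
    rate /= frame_rate_divisor
    rate /= compression_ratio
    return rate


rate = reduced_data_rate(10_000_000, resolution_divisor=2,
                         frame_rate_divisor=2, compression_ratio=10)
print(rate, "bytes/s =", rate * 8 / 1_000_000, "Mbit/s")  # 125000.0 bytes/s = 1.0 Mbit/s
```

The overall reduction is 4 x 2 x 10 = 80:1, of which only the 10:1 comes from compression proper.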


Frame rate
  25 or 30 fps equates to the analog frame rate, or full-motion video
  at 10-15 fps motion is less accurately depicted and the image flickers, but the data rate is much reduced
Compression
  we have already considered compression techniques; in digital video we can compare methods by three factors:
    lossy v. lossless
    real-time compression - trade-off between symmetric models and asymmetric models with real-time decompression
    interframe (relative) v. intraframe (absolute) compression (e.g. MPEG-1 v. Motion JPEG)

Support for interactivity
  random access to frames
  differential rate and reverse playback
  cut and paste capability
Scalability
  scalable video allows control over video quality; we can identify 2 forms:
    transmit scalability - the encoded data rate is chosen at compression time from a range of rates, governed by transmission and processing constraints and/or storage capacity; currently in use for low rate digital video
    receive scalability - the decoded data rate is chosen at decompression time to match playback requirements; an attractive concept, but not yet available in current video coding standards
  current approaches to low rate digital video include:
    DVI (Digital Video Interactive) - two forms, Production Level Video (PLV) and Real-Time Video (RTV). PLV is compressed off-line and is really only intended for playback; RTV produces poorer quality but is intended for real-time compression. Both use interframe compression to achieve rates of 1 Mbit/s, but require costly hardware
    MPEG-1 - 1 Mbit/s
    MPEG-2 - broadcast quality video at rates between 2 and 15 Mbit/s
    MPEG-4 - low data rate video
    MPEG-7 - metadata standard for video representation
    Motion JPEG
    px64 (CCITT H.261) - intended for video applications using ISDN (Integrated Services Digital Network). Known as px64 since it produces rates that are multiples of ISDN's 64 Kbit/s B channel rate. Uses similar techniques to MPEG but, since compression and decompression must be real-time, quality tends to be poorer
    H.263 - based on H.261, but offers 2.5 times greater compression; uses MPEG-1 and MPEG-2 techniques


Digital Video - Operations

Storage
  to record or play back digital video in real time, the storage system must be capable of sustaining data transfer at the video data rate
  4 main forms of storage for digital video are:
    magnetic tape - at present only magnetic tape can provide the very high capacity storage required for digital video at practical costs (1 hour of CCIR 601 4:2:2 uses 72 Gbytes, while 1 hour of digital HDTV requires nearly 1 Tbyte)
    special purpose magnetic storage systems - useful for short durations of high data rate digital video; can be connected directly to external equipment and are thus useful for capture and editing (see diagram)
    video memory boards - specialist boards with large amounts of semiconductor memory (several hundred Mbytes or more), capable of storing short durations of uncompressed digital video; useful for capture and editing
    general purpose magnetic and optical storage systems - most low data rate video representations (MPEG, etc.) were designed to support the use of conventional storage media for real-time video playback. The problem is the size of storage: even using MPEG-1, 13 minutes of video will fill a 100 Mbyte disk
Retrieval
  uses frame addressing, as in analog video, but there are some problems:
    low data rate formats result in variable-sized frames, so an index giving frame offsets needs to be maintained to support random access
    interframe compression techniques, e.g. MPEG, code only key frames independently; other frames are derived from these key frames. Random access therefore requires first finding the nearest key frame and then using it to decode the desired frame, again using the index but enhancing it with key frame locations
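The key frame lookup described for retrieval can be sketched as follows. This is a simplification under stated assumptions: every non-key frame is taken to depend only on the frames between it and the preceding key frame (real MPEG streams with bidirectional frames are more involved), and the function name and the sample index are made up for illustration.

```python
from bisect import bisect_right


def frames_to_decode(key_frames, target):
    """Return the frame numbers that must be decoded to display frame `target`.

    key_frames is a sorted list of the frame numbers coded independently.
    """
    pos = bisect_right(key_frames, target) - 1   # nearest key frame at or before target
    if pos < 0:
        raise ValueError("no key frame at or before the requested frame")
    return list(range(key_frames[pos], target + 1))


key_frame_index = [0, 12, 24, 36]     # hypothetical index: one key frame every 12 frames
print(frames_to_decode(key_frame_index, 30))  # -> [24, 25, 26, 27, 28, 29, 30]
print(frames_to_decode(key_frame_index, 12))  # -> [12]
```

In a full implementation the same index would also hold the byte offset of each frame, since the frames are of variable size.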


Synchronisation
  suffers the same problems as analog video, so uses the same techniques
  digital video also has some additional techniques not available in analog video, such as changing resolution to maintain frame rate
Editing
  2 types:
    tape-based - same procedures as with analog video, except that there is no generation loss and the players are on the same machine
    nonlinear - basically a clips library, using cut and paste techniques to build a video sequence
Mixing
  real-time effects, such as tumbles, wipes and fades, are calculated in the same way as for analog video; in fact, for the majority of such effects, the effects are produced digitally whether the original source is analog or digital
  non-real-time effects are only possible using digital video, and obviate the need for specialist equipment, being dependent only on the speed of the processor and the patience of the user; storage considerations can be overcome with the use of pointers and single-frame editing
Conversion
  the variety of formats demands conversion between formats
  real-time conversion requires specialist hardware
  compression/decompression within a single format also requires specialist software/hardware


Digital Audio

Representation: sampling frequency, sample size and quantisation, number of channels (tracks), interleaving, negative samples, encoding
Operations: storage, retrieval, editing, effects and filtering, conversion

Digital Audio - Representation

Digital audio representation
  2 main areas:
    telecommunications
    entertainment (audio CD)
  produced by sampling a continuous signal generated by a sound source: an analog-to-digital converter (ADC) takes as input an electrical signal corresponding to the sound and converts it into a digital data stream
  the reverse process, to generate the sound through an amplifier and speakers, involves a digital-to-analog converter (DAC)
Sampling frequency (rate)
  sampling theory shows that a signal can be reproduced without error from a set of samples, providing the sampling frequency is at least twice the highest frequency present in the original signal
  telephone networks allocate a 3.4 kHz bandwidth to voice-grade lines, thus a sampling rate of 8 kHz is used for digital telecommunications
  the human ear is sensitive to frequencies of up to about 20 kHz, so to digitise any perceivable sound a sampling rate of over 40 kHz is required
Sample size and quantisation
  during sampling, the continuously varying amplitude of the analog signal is approximated by digital values; this introduces a quantisation error, the difference between the actual amplitude and the digital approximation
  quantisation error becomes apparent as distortion, a loss in audio quality, when the signal is reconverted to analog form
  quantisation error can be reduced by increasing the sample size, as allowing more bits per sample improves the accuracy of the approximation
  quantisation refers to breaking the continuous range of the analog signal into a number of unique digital intervals, based on one of a number of schemes:
    linear quantisation - uses equally spaced intervals, so if the sample size is 3 bits and the maximum signal variation is 5.0 then the quantisation interval would be 0.625 units of signal amplitude
    nonlinear quantisation (especially logarithmic quantisation) - uses non-equally spaced intervals; lower amplitude intervals are more closely spaced than higher amplitude ones, giving greater sensitivity to lower amplitude sound, where the human ear is most sensitive
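The linear quantisation example (3 bits over a range of 5.0 giving an interval of 0.625) can be reproduced with a short sketch. The function names are assumptions for illustration, and the signal is taken to lie between 0 and the full range; reconstructing at the middle of each interval is one common choice, not the only one.

```python
def linear_quantise(value, sample_bits, signal_range):
    """Map a value in 0..signal_range to one of 2**sample_bits equal intervals."""
    levels = 2 ** sample_bits
    step = signal_range / levels
    index = min(int(value / step), levels - 1)   # clamp the top edge into the last interval
    return index, step


def reconstruct(index, step):
    """Approximate the original value by the midpoint of its interval."""
    return (index + 0.5) * step


index, step = linear_quantise(1.0, sample_bits=3, signal_range=5.0)
approx = reconstruct(index, step)
print(step)                 # -> 0.625, as in the example above
print(index, approx)        # -> 1 0.9375
print(abs(1.0 - approx))    # quantisation error, never more than step / 2
```

Each extra bit of sample size halves the interval and therefore halves the worst-case quantisation error.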

Number of channels (tracks)
  speech quality audio is mono (1 track)
  stereo audio requires 2 tracks
  some consumer audio equipment uses 4 tracks (quadraphonic)
  professional audio equipment uses 16, 32 or more
Interleaving
  a multi-channel audio value can be encoded by interleaving channel samples or by providing separate streams for each channel
  the advantage of interleaving is in synchronisation, and it also offers some benefits in storage and transmission
  the disadvantages of interleaving are that it can be wasteful of space or bandwidth if not all channels are needed, it freezes the synchronisation between channels, thus preventing temporal shifts, and it may not allow variation in the number of channels
Negative samples
  the voltages found in analog audio signals alternate between positive and negative values
  negative values can be encoded for processing in two's complement, ones' complement or sign-magnitude representation


Encoding
  encoding audio data reduces storage and transmission costs, and compressed audio also provides better quality when compared to uncompressed audio at the same data rate
  2 commonly-used methods:
    PCM (Pulse Code Modulation) - uses the fact that a digital signal can be formed from a series of pulses. PCM values are simply sequences of uncompressed samples, so they provide a reference format for comparison with more complex coding methods
    ADPCM (Adaptive Differential Pulse Code Modulation) - reduces the PCM data rate by encoding the differences between samples. ADPCM is widely used and is associated with some encoding standards, such as CCITT G.721
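The core idea behind ADPCM, encoding the differences between successive samples rather than the samples themselves, can be sketched as plain delta coding. Note the limits of this illustration: real ADPCM additionally predicts each sample and adapts a quantiser step size, and is lossy, whereas the sketch below is a lossless simplification with made-up function names and sample values.

```python
def delta_encode(samples):
    """Replace each PCM sample by its difference from the previous sample."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas


def delta_decode(deltas):
    """Rebuild the PCM samples by accumulating the differences."""
    samples = []
    value = 0
    for delta in deltas:
        value += delta
        samples.append(value)
    return samples


pcm = [1000, 1010, 1025, 1030, 1020]
deltas = delta_encode(pcm)
print(deltas)                 # -> [1000, 10, 15, 5, -10]
print(delta_decode(deltas))   # -> [1000, 1010, 1025, 1030, 1020]
```

After the first value the differences are small numbers, which is why they can be stored in fewer bits than the original samples.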


Digital Audio - Operations

Storage
  it is possible to record digital audio, even at the data rates of the high quality formats, on general purpose magnetic storage
  theoretically, a magnetic disk with a sustainable transfer rate of 5 Mbytes per second could play back 50 channels of CD-quality digital audio; in practice this would not be possible without a highly optimised layout, but one or two channels are easily within the reach of small computer systems
  since an hour of stereo digital audio, at the CD data rate, requires over half a Gigabyte of storage, tertiary storage in the form of DAT tapes, CD discs or optical disks is normally adopted, with the information being mounted onto the system manually or through a jukebox
Retrieval
  need to support random access and ensure a continuous flow of data to the DAC
  portions of audio sequences, called segments, are identified by their starting time and duration; these can be located by mapping the starting time to a segment address, which the file system then maps to a physical address on disk
  where there is no direct mapping to enable segment location by time code, an index of segments must be separately maintained
  a continuous flow of data is easy to maintain with a dedicated storage system, but requires careful control where storage is scheduled for a number of such tasks
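The storage figures quoted for CD-quality audio (around 50 channels from a 5 Mbytes per second disk, and over half a Gigabyte per hour of stereo) can be checked with a short calculation. The CD parameters used here, 44,100 samples per second at 16 bits per sample, are not stated in the text and are an assumption of this sketch, as are the function names; 1 Mbyte is taken as 1,000,000 bytes.

```python
SAMPLE_RATE_HZ = 44_100      # assumed CD sampling rate
BYTES_PER_SAMPLE = 2         # assumed 16-bit samples
BYTES_PER_SECOND_PER_CHANNEL = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE   # 88,200


def channels_supported(disk_bytes_per_second):
    """Whole number of channels a disk could sustain, ignoring seek overhead."""
    return disk_bytes_per_second // BYTES_PER_SECOND_PER_CHANNEL


def storage_bytes(seconds, channels):
    """Uncompressed storage needed for the given duration and channel count."""
    return seconds * channels * BYTES_PER_SECOND_PER_CHANNEL


print(channels_supported(5_000_000))   # -> 56, in line with the "50 channels" estimate
print(storage_bytes(3600, 2))          # -> 635040000 bytes for one hour of stereo
```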

Editing
  as with digital video, 2 types:
    tape-based
    disk-based
  to avoid audible clicks when inserting one segment into another, cross-fades are used, where the amplitudes of the original segment and the inserted segment are added and scaled about the insertion point
  digital audio also supports non-destructive editing, where the segments of data are accessed through a data structure known as a play-list, which essentially contains a set of pointers to the data and details on ordering and other forms of edit to be performed on the data when it is joined
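A cross-fade can be sketched as a weighted sum of the two overlapping segments, with the weight shifting from the original to the inserted material. The linear ramp used here is an assumption of this example (other fade curves are common), as are the function name and the sample values.

```python
def crossfade(outgoing, incoming):
    """Blend two equal-length overlapping segments with a linear ramp."""
    n = len(outgoing)
    if n != len(incoming):
        raise ValueError("segments must overlap by the same number of samples")
    mixed = []
    for i, (a, b) in enumerate(zip(outgoing, incoming)):
        weight = (i + 1) / (n + 1)       # share of the incoming segment, rising towards 1
        mixed.append((1.0 - weight) * a + weight * b)
    return mixed


# Fading from a constant level of 1.0 to silence over three samples.
print(crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))  # -> [0.75, 0.5, 0.25]
```

Because the amplitude changes gradually rather than jumping at the insertion point, the abrupt step that would be heard as a click is avoided.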

Effects and filtering
  digital filtering techniques permit a number of effects on audio:
    delay
    equalisation & normalisation
    noise reduction
    time compression and expansion
    pitch shifting
    stereoisation
    acoustic environments
Conversion
  one format to another (e.g. uncompressing ADPCM to PCM)
  altering encoding parameters (e.g. resampling at a lower frequency)


Music

Representation: operational v. symbolic, MIDI, SMDL
Operations: playback & synthesis, timing, editing & composition

The existence of powerful, low-cost digital signal processors means that many computers can now record, generate and process music.

Music is also widely used in multimedia applications, so we require a media type for music to focus on the computer's musical capabilities.

Music - Representation

Operational v. symbolic
  operational representations specify exact timings for music and physical descriptions of the sounds to be produced
  symbolic representations use descriptive symbolism to describe the form of the music and allow great freedom in the interpretation
  both types are described as structural representations, since instead of representing music by audio samples they hold information about the internal structure of the music
To illustrate the structural representations, we can consider two:
  MIDI - a widely used protocol allowing the connection of computers and musical equipment; an operational representation
  SMDL - a proposal for a standard structure for documents containing musical information, having both operational and symbolic aspects

MIDI
  the Musical Instrument Digital Interface was developed in the early 1980s by musical equipment makers
  Devices:
    electronic keyboards and synthesisers
    drum machines
    sequencers (to record and play back MIDI messages)
    music<->film and music<->video synchronisation equipment
  Connection ports:
    MIDI OUT - allows a device to send MIDI messages it has produced to other MIDI devices
    MIDI IN - receives MIDI messages from other MIDI devices
    MIDI THRU - repeats received messages, permitting daisy-chaining of MIDI devices
  MIDI devices process MIDI messages differently, according to their function or to the sound palette used by the device; hence different synthesisers can produce different sounds when supplied with the same MIDI messages
  MIDI concepts:
    Channel - a MIDI connection has 16 message channels; devices can be set to respond to all channels or only to specific channels
    Key number - notes are identified by key number; there are 128, compared with 88 keys on a standard keyboard
    Controller - 128 different controllers are available under the MIDI protocol, though not all are currently defined; changing the value of a controller typically alters sound production
    Patch/program - an audio palette is called a program or patch; a synthesiser capable of having a number of patches active at the same time is called multi-timbral
    Polyphony - the ability of a synthesiser to play many notes at a time
    Song - a recorded or preprogrammed MIDI sequence
    Timing clock - a MIDI sequencer timestamps messages using a timebase measured in parts per quarter note (PPQ). Typical timebase values are 24, 96 and 480 PPQ. To convert the timebase into actual time you use the tempo, measured in beats per minute (BPM), where we assume that one beat is equal to a quarter note. Thus at a tempo of 180 BPM a quarter note lasts 1/3 s, so one tick of a 96 PPQ timebase lasts 1/3 x 1/96 s = 3.47 ms
    MIDI synchronisation - MIDI devices can be set to internal synch or external synch; when set to internal synch a device is known as a master and produces a timing clock message on its MIDI OUT at 24 PPQ, which slave devices use for external synch
    MTC - MIDI Time Code is used to synchronise MIDI with film or video, e.g. to trigger sound effects or musical sequences
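The timing clock conversion from timebase and tempo to real time can be written as a one-line formula, sketched below. The function name is an assumption for illustration; as in the text, one beat is taken to be a quarter note.

```python
def tick_duration_ms(tempo_bpm, ppq):
    """Length of one timebase tick in milliseconds, with one beat = one quarter note."""
    ms_per_quarter_note = 60_000.0 / tempo_bpm
    return ms_per_quarter_note / ppq


print(round(tick_duration_ms(180, 96), 2))   # -> 3.47, matching the worked example
print(round(tick_duration_ms(120, 24), 2))   # -> 20.83
```

A finer timebase or a faster tempo both shorten the tick, so a sequencer's timing resolution in milliseconds depends on the tempo of the song.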


MIDI Protocol:
based on an 8-bit code for messages; each message consists of a single command byte and possibly one or more data bytes (see table)
Channel voice messages (8c-Ec) - determine the actual notes played, speed of hit and release and the values of controllers
Channel mode messages (Bc, with controllers 121-127) - select the mode of a synthesiser: responding to one channel or all channels, each channel separately voiced or all voices used for one channel

System messages (F0-FF) - general system functions, timing clock, MIDI time code messages, system reset, start device, stop device, etc.
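
A minimal Python sketch of how a receiver might sort command bytes into the three groups above. The function name `classify_status` is hypothetical. The sketch assumes that the `c` in `8c-Ec` is the low four bits of the command byte and carries the channel, and that a command byte always has its top bit set while data bytes do not.

```python
from typing import Optional, Tuple


def classify_status(status: int, data1: Optional[int] = None) -> Tuple[str, Optional[int]]:
    """Return (message group, channel) for a command byte.

    data1 is the first data byte, needed only to tell channel mode
    messages (Bc with controllers 121-127) from ordinary controller changes.
    """
    if not 0x80 <= status <= 0xFF:
        raise ValueError("not a command byte: top bit must be set")
    high_nibble = status >> 4
    channel = status & 0x0F
    if high_nibble == 0xF:
        return ("system", None)  # F0-FF carry no channel number
    if high_nibble == 0xB and data1 is not None and 121 <= data1 <= 127:
        return ("channel mode", channel)
    return ("channel voice", channel)


print(classify_status(0x90))       # ('channel voice', 0)
print(classify_status(0xB3, 123))  # ('channel mode', 3)
print(classify_status(0xF8))       # ('system', None)
```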

Limitations of MIDI:
operates at 31250 bps, which allows about 500 notes per second; this may not be enough for complex pieces
limited number of channels, lack of device addressing and other flaws make configuring large MIDI networks difficult
device dependence of MIDI data
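
One plausible way to arrive at the figure of about 500 notes per second is sketched below in Python. The notes do not give the derivation, so the constants are assumptions: each byte occupies 10 bits on the wire (8 data bits plus a start and a stop bit), each note needs two 3-byte messages (one to start it and one to release it), and no bytes are saved by omitting repeated command bytes.

```python
BITS_PER_SECOND = 31250
BITS_PER_BYTE_ON_WIRE = 10   # assumption: 8 data bits + start bit + stop bit
BYTES_PER_MESSAGE = 3        # assumption: command byte + 2 data bytes
MESSAGES_PER_NOTE = 2        # assumption: one message to start, one to release


def notes_per_second() -> float:
    """Upper bound on notes per second under the assumptions above."""
    bytes_per_second = BITS_PER_SECOND / BITS_PER_BYTE_ON_WIRE
    return bytes_per_second / (BYTES_PER_MESSAGE * MESSAGES_PER_NOTE)


print(round(notes_per_second()))  # 521
```

Controller changes and timing clock messages share the same link, so the practical limit is lower than this bound.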


SMDL
the Standard Music Description Language was developed by the MIPS committee of ANSI
SMDL encompasses the representation of music for electronic dissemination and production by software, the representation of scores and musical examples in printed documents, and the representation of musical annotation and attributes used for musical analysis or by music databases
SMDL is a DTD of SGML, based on a document type called musical works or works. Each work has 4 hierarchically structured sections:
core section - musical events, such as note sequences, which form the work
gestural section - performances of the core, which may differ in interpretation
visual section - displays of the core in printed form, including formatting and lyrics
analytical section - allows a number of theoretical analyses of the core, its score and performances to be included in the work


In considering music representation, we can recognise several advantages over audio:
music representation will be more compact than audio
it is portable and can be synthesised with the fidelity and complexity appropriate to the output devices used
while digital audio suffers from inherent noise, musical representations are noise free
many operations can be performed on music that would be infeasible or require extensive processing on audio

Music - Operations

Playback & Synthesis
during audio playback, the listener has limited influence over the musical aspects of the performance, beyond changing the volume or processing the audio in some way. If music is produced by synthesis from a structural representation the listener can independently change pitch and tempo, increase or decrease individual instruments' volumes or change the sounds they produce
musical representations offer greater potential for interactivity than audio
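
Changing pitch without touching tempo is a small operation on a structural representation. The Python sketch below is illustrative only: the note tuple layout `(start_tick, key_number, duration_ticks)` and the function name `transpose` are assumptions, and it relies on adjacent key numbers being one semitone apart.

```python
def transpose(notes, semitones):
    """Shift every key number by the given number of semitones.

    notes is a list of (start_tick, key_number, duration_ticks) tuples;
    start times and durations are left unchanged, so tempo is unaffected.
    """
    shifted = []
    for start_tick, key_number, duration_ticks in notes:
        new_key = key_number + semitones
        if not 0 <= new_key <= 127:  # only 128 key numbers exist
            raise ValueError("transposed note falls outside key numbers 0-127")
        shifted.append((start_tick, new_key, duration_ticks))
    return shifted


phrase = [(0, 60, 96), (96, 64, 96)]
print(transpose(phrase, 2))  # [(0, 62, 96), (96, 66, 96)]
```

The same change applied to recorded audio would need signal processing to shift pitch while preserving duration.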

Timing
structural representation makes timing of musical events explicit
the ability to modify tempo makes it possible to alter the timing of groups of musical events and adjust the synchronisation of those events with other events (film, video, etc.)
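
As a sketch of adjusting synchronisation through tempo, the Python function below computes the tempo at which a phrase of known length in ticks lasts exactly a target time, for example the length of a film cue. The name `tempo_to_fit` and its parameters are hypothetical; one beat is again taken to be one quarter note.

```python
def tempo_to_fit(length_ticks: int, ppq: int, target_seconds: float) -> float:
    """Tempo in BPM at which length_ticks of music lasts target_seconds."""
    quarter_notes = length_ticks / ppq
    return quarter_notes * 60.0 / target_seconds


# A phrase of 8 quarter notes (768 ticks at 96 PPQ) fitted to a 4 second cue.
print(tempo_to_fit(768, 96, 4.0))  # 120.0
```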

Editing & Composition
basic editing allows the user to modify primitive events and notes
more complex editing operations operate on musical aggregates (chords, bars, etc.) to permit phrase-repetition, melody replacement and other such functions
composition software simplifies the task of generating and combining or rearranging tracks, and prints the score


Animation

Representation: Cel models, Scene-based models, Event-based models, Key frames, Articulated objects & hierarchical models, Scripting & procedural models, Physically-based & empirical models
Operations: Graphics operations, Motion & parameter control, Rendering, Playback

Separating animation from video follows the same reasoning we used to separate graphics from images: the distinction is based on modelling.

Animation types provide models which are rendered to produce video.

Animation is distinct from graphics in that it is time-dependent but, as in the image<->video relationship, sampling an animation model at a particular time yields a graphics model, which can be rendered to produce an image

Animation - Representation

Cel models
early animators drew on transparent celluloid sheets or cels; different sheets contained different parts of the scene, which was assembled by overlaying the sheets
in animation, cels are digital images with a transparency channel
scenes are rendered by drawing the cels back to front, with movement being added by changing the position of cels from one frame to the next
a cel model is therefore a set of images, their back to front order, and their relative position and orientation in each frame
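
The back to front drawing of cels can be sketched in Python as follows. Everything here is a simplifying assumption made for illustration: a cel is a sparse set of coloured pixels (missing pixels are transparent), frames are rows of characters, and orientation is ignored so that only position changes between frames. The names `Cel` and `render_frame` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cel:
    pixels: dict  # (x, y) -> colour character; absent pixels are transparent
    depth: int    # larger depth means further back


def render_frame(cels, positions, width, height, background="."):
    """Draw cels back to front; positions[i] is the (dx, dy) offset of cels[i]."""
    frame = [[background] * width for _ in range(height)]
    back_to_front = sorted(range(len(cels)), key=lambda i: cels[i].depth, reverse=True)
    for i in back_to_front:
        dx, dy = positions[i]
        for (x, y), colour in cels[i].pixels.items():
            px, py = x + dx, y + dy
            if 0 <= px < width and 0 <= py < height:
                frame[py][px] = colour  # nearer cels overwrite further ones
    return ["".join(row) for row in frame]


back = Cel(pixels={(0, 0): "B", (1, 0): "B"}, depth=1)
front = Cel(pixels={(0, 0): "F"}, depth=0)
print(render_frame([back, front], [(0, 0), (0, 0)], 3, 1))  # ['FB.']
print(render_frame([back, front], [(0, 0), (1, 0)], 3, 1))  # ['BF.']
```

The second call moves only the front cel, which is how movement is added from one frame to the next without redrawing the background cel.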

Scene-based models
simply a sequence of graphics models, each representing a complete scene
highly redundant and do not support continuity of activities

Event-based models
express the difference between successive scenes as events that transform one scene to the next
still discrete rather than continuous, but permit the management of scenes by input devices (e.g. mouse, tablet) rather than each scene having to be entered manually
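
To contrast the two models, the Python sketch below stores one initial scene plus a list of events, and regenerates the full scene sequence on demand. The scene layout (a mapping from object name to position), the event tuples and the function names are all assumptions made for illustration.

```python
def apply_event(scene, event):
    """Return the scene that results from applying one event to a scene."""
    kind, name, position = event
    updated = dict(scene)  # leave the earlier scene untouched
    if kind in ("add", "move"):
        updated[name] = position
    elif kind == "remove":
        updated.pop(name, None)
    else:
        raise ValueError("unknown event kind: " + kind)
    return updated


def scenes_from_events(initial_scene, events):
    """Expand an event-based model into the equivalent scene-based sequence."""
    scenes = [dict(initial_scene)]
    for event in events:
        scenes.append(apply_event(scenes[-1], event))
    return scenes


events = [("move", "ball", (1, 0)), ("add", "box", (5, 5)), ("remove", "ball", None)]
for scene in scenes_from_events({"ball": (0, 0)}, events):
    print(scene)
```

Only the differences are stored, which avoids the redundancy of keeping every complete scene.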


Key frames
in essence, the animator models the beginning and end frames of a sequence and lets the computer calculate the others by interpolation
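
A minimal Python sketch of key-frame interpolation, assuming the simplest case: each key frame is a mapping from parameter name to a number, and the in-between frames are found by linear interpolation. Real systems typically offer smoother interpolation schemes; the function name `interpolate_frames` is hypothetical.

```python
def interpolate_frames(start, end, n_frames):
    """Return n_frames frames from start to end inclusive, linearly interpolated."""
    if n_frames < 2:
        raise ValueError("need at least the two key frames")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first key frame, 1.0 at the last
        frames.append({name: start[name] + t * (end[name] - start[name]) for name in start})
    return frames


frames = interpolate_frames({"x": 0.0, "y": 10.0}, {"x": 4.0, "y": 10.0}, 5)
print(frames[2])  # {'x': 2.0, 'y': 10.0}
```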

Articulated objects & hierarchical models
attempt to overcome the problems of key frames by developing articulated objects, jointed assemblies where the configuration and movement of sub-parts are constrained
ensure proper relative positioning and constraint maintenance during interpolation (will not allow solid objects to pass through other solid objects)

Scripting and procedural models
current state-of-the-art animation modelling systems have tools allowing the animator to specify key frames, preview sequences in real time and control the interpolation of model parameters
an additional feature of many such systems is a scripting language
scripting languages let the animator express sequences in concise form, which is particularly useful for repetitive and structured motion, and also provide high-level operations intended specifically for animation

Physically-based models & empirical models
this approach is used to produce sequences depicting evolving physical systems
a mathematical model of the system is derived from physical principles or empirical data and the model is then solved, numerically or through simulation, at a sequence of time points, each one resulting in a single frame for the sequence
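
As a small illustration of solving a model numerically at a sequence of time points, the Python sketch below drops an object under gravity and records one height per frame. The choice of system, the explicit Euler integration step, the default frame rate and the function name `simulate_fall` are all assumptions made for the example.

```python
def simulate_fall(height_m, n_frames, frame_rate=10.0, g=9.81):
    """Return the object's height in metres at each frame time.

    Uses explicit Euler steps of 1/frame_rate seconds; the object
    starts at rest and stops at ground level (height 0).
    """
    dt = 1.0 / frame_rate
    y, v = height_m, 0.0
    frames = []
    for _ in range(n_frames):
        frames.append(y)          # the model state at this time point is one frame
        y = max(0.0, y + v * dt)  # advance position using the current velocity
        v -= g * dt               # then update velocity from gravity
    return frames


for height in simulate_fall(10.0, 5):
    print(round(height, 3))
```

Each recorded state would then be passed to the renderer to produce the image for that frame.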


Animation - Operations

Graphics operations
since animation models are graphics models extended in time, all the graphics operations we have already covered are applicable here

Motion and parameter control
since the essential difference between graphics and animation operations is the addition of the temporal dimension, graphics objects become animations through the assignment of complex trajectories or behaviours over time
commercial 3D animation systems provide modelling tools and animation tools; the modelling tools produce 3D graphic models and the animation tools add temporal transformations to these objects

Rendering
2 basic forms:
real-time - the model is rendered as frames are displayed; 10+ frames per second are required to avoid jerkiness, so only appropriate for simple models or with special hardware
non-real-time - frames are pre-rendered, taking as long as necessary to do so; provides higher visual quality and consistency of frame rate

Playback
non-real-time rendering offers the same operational possibilities in playback as digital video, such as control over rate and direction
real-time rendering is much more interactive and modifiable: objects can be added and removed, lights turned on and off, the viewpoint changed, and so on
