sound can make multimedia presentations dynamic and interesting

MM Topic 3 - Audio Data 1

Sound can make multimedia presentations dynamic and interesting.


Sound is normally captured using a microphone. The microphone converts the analogue sound into an analogue electrical signal. However, a computer cannot work with an analogue signal directly.

The sound card on a computer contains an Analogue to Digital Converter (ADC). An ADC is required to turn that analogue signal into a digital signal that can then be stored and manipulated by a computer.

This is known as digitising or sampling the sound.

Most sound cards can also take digital input from a CD or DVD. In this case the data does not pass through the ADC.


Remember, the computer can only handle digital signals, so the analogue signal must be turned into 1s and 0s. This is done by taking measurements of the signal at regular intervals, and we refer to this as sampling.

An example of an electrical sound signal.


In the figure, 5 samples are taken in 1 second, so we say the signal has been sampled using a sampling frequency of 5 Hz (Hertz).

The number of times each second that the signal is sampled (measured) is called the sampling frequency.


Example: A sampling frequency of 44,100 Hz means that the sound has been sampled (measured) 44,100 times per second.

CD quality uses a sampling frequency of 44,100 Hz (44.1 kHz).
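
The idea of sampling can be sketched in a few lines of Python. This is only an illustration: the function name `sample_signal` and the 1 Hz test tone are assumptions made for the example, and a real ADC does this measurement in hardware.

```python
import math


def sample_signal(signal, sampling_frequency_hz, duration_s):
    """Measure `signal` (a function of time in seconds) at evenly spaced instants."""
    sample_count = int(sampling_frequency_hz * duration_s)
    return [signal(n / sampling_frequency_hz) for n in range(sample_count)]


def tone(t):
    """Hypothetical test signal: a 1 Hz sine wave standing in for the microphone signal."""
    return math.sin(2 * math.pi * 1.0 * t)


# 5 Hz sampling for 1 second gives 5 measurements, as in the figure.
samples = sample_signal(tone, sampling_frequency_hz=5, duration_s=1)
print(len(samples))
```

Calling the same function with a sampling frequency of 44100 would give 44,100 measurements for each second of sound.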


The number of bits used to record each measurement is called the sampling depth (sometimes called the sampling resolution or sampling size).

In this example each measurement is stored using 2 bits, giving us 4 possible levels to record the signal. The sampling depth determines the dynamic range of the recording.

Example: Audio CDs use a sampling depth of 16 bits.


The span from the lowest amplitude to the greatest amplitude is known as the dynamic range.

The dynamic range of digital sound is determined by the sampling depth. The bigger the sampling depth, the bigger the dynamic range. For example, a sampling depth of 16 bits gives 65,536 possible signal levels (2 to the power 16), while a sampling depth of 8 bits gives only 256 levels (2 to the power 8). The bigger the sampling depth, the more accurately each measurement is recorded and the better the quality of the sound.
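
The link between sampling depth and the number of levels can be checked with a short Python sketch. The helper names `quantisation_levels` and `quantise` are made up for this example, and the sketch assumes the signal has been scaled to lie between -1.0 and 1.0.

```python
def quantisation_levels(sampling_depth_bits):
    """Number of distinct levels that can be stored with the given number of bits."""
    return 2 ** sampling_depth_bits


def quantise(value, sampling_depth_bits):
    """Map a value in the range -1.0..1.0 onto one of the available levels (0 = lowest)."""
    levels = quantisation_levels(sampling_depth_bits)
    value = max(-1.0, min(1.0, value))            # keep the value inside the range
    level = int((value + 1.0) / 2.0 * levels)     # scale to 0..levels
    return min(level, levels - 1)                 # the top of the range maps to the top level


for bits in (2, 8, 16):
    print(bits, "bits ->", quantisation_levels(bits), "levels")
```

With 2 bits there are only 4 levels, so many different input values end up stored as the same level. With 16 bits there are 65,536 levels, so each measurement is recorded far more precisely.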


Remember the following

• Increasing the sampling depth will give a better sound but more memory is required.

• Recording in stereo increases sound quality but will double the file size.

• Increasing the sampling frequency will give a clearer sound but more memory will be used.


One method of converting analogue into digital sound is called PCM (Pulse Code Modulation).

The data can then be stored in a number of different ways:

1 RAW files

RAW files are files which store the sound as it was during sampling. The data has not been altered in any way and has not been compressed. The file’s contents consist solely of a string of numeric data with no special processing or header.

RAW files are so called because no compression has taken place. RAW files can be extremely large.


2 RIFF

RIFF is an example of a container file format. It can contain various types of multimedia data. The first part of a RIFF file is a header that identifies the type of data the file stores. If a RIFF file contains sound, the header identifies the type as WAVE. Wave files (on PCs with the suffix .wav) are RIFF files containing digitised sound.

WAV files can still be very large because the sound data is usually stored uncompressed.
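
The RIFF header can be seen by writing a tiny WAV file and looking at its first 12 bytes. The sketch below uses Python's standard `wave` module and writes to memory rather than to disk; the four sample values are arbitrary and chosen only for the example.

```python
import io
import struct
import wave

# Write a very small mono, 16-bit, 44100 Hz WAV file into memory.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(1)        # mono
    wav_file.setsampwidth(2)        # 2 bytes = 16 bit sampling depth
    wav_file.setframerate(44100)    # sampling frequency in Hz
    wav_file.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

header = buffer.getvalue()[:12]
container_id = header[0:4]                           # b"RIFF"
riff_size = struct.unpack("<I", header[4:8])[0]      # bytes that follow this field
data_type = header[8:12]                             # b"WAVE"

print(container_id, riff_size, data_type)
```

The first four bytes name the container (RIFF) and bytes 8 to 11 name the kind of data inside it (WAVE). The size field in between counts the bytes that follow it.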

3 ADPCM

ADPCM (Adaptive Differential Pulse Code Modulation, sometimes called Adaptive Delta Pulse Code Modulation) is a codec that is used to compress sound data. It takes sound data that has been encoded into its normal PCM values and compresses it so that it requires less disk space than its raw .WAV equivalent. It is a lossy method of compression. It works by storing only the difference between each sample and the one before it, rather than the absolute value of every sample. These differences use only 4 bits, compared to 16 bits for the absolute value. It therefore reduces the amount of disk space required to about one quarter.
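
The idea of storing differences can be sketched as plain delta encoding. Note that this is a simplification and not real ADPCM: a real ADPCM codec also squeezes each difference into 4 bits using a step size that adapts to the signal, and that is where the loss of quality comes from. The function names here are made up for the example.

```python
def delta_encode(samples):
    """Store each sample as the difference from the previous sample."""
    deltas = []
    previous = 0
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas


def delta_decode(deltas):
    """Rebuild the original samples by adding the differences back up."""
    samples = []
    current = 0
    for delta in deltas:
        current += delta
        samples.append(current)
    return samples


samples = [1000, 1003, 1007, 1006, 1002]
deltas = delta_encode(samples)
print(deltas)                  # [1000, 3, 4, -1, -4]
print(delta_decode(deltas))    # [1000, 1003, 1007, 1006, 1002]
```

After the first value the differences are small numbers, which is why they can be stored in fewer bits than the full sample values.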


4 MP3

The most popular standard for general use is MP3 (Moving Picture Experts Group Audio Layer 3). This is a lossy compression format. MP3 compresses in the following ways:

1 An algorithm filters out sound outside the range of frequencies that the human ear can hear.

2 Then a technique called Huffman encoding is used. This technique gives the values that occur most often the shortest codes, so repeated data takes up less space (the repeated data does not have to be continuous).

3 Next the sampling resolution is reduced.

4 Finally, background noise and sounds that are drowned out by louder sounds are not stored.
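
Huffman encoding (step 2 above) can be sketched as follows. This is a minimal illustration working on a string of characters, not the way an MP3 encoder is actually built; the function name `huffman_codes` is an assumption for the example.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Build a code table in which frequent symbols get shorter bit strings."""
    counts = Counter(data)
    if len(counts) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {symbol: "0" for symbol in counts}

    # Each heap entry is (frequency, tie_breaker, {symbol: code_so_far}).
    heap = [(freq, i, {symbol: ""}) for i, (symbol, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    # Repeatedly join the two least frequent groups until one group is left.
    while len(heap) > 1:
        freq1, _, codes1 = heapq.heappop(heap)
        freq2, _, codes2 = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in codes1.items()}
        merged.update({symbol: "1" + code for symbol, code in codes2.items()})
        heapq.heappush(heap, (freq1 + freq2, next_id, merged))
        next_id += 1

    return heap[0][2]


data = "aaaaaaabbbccd"
codes = huffman_codes(data)
encoded_bits = sum(len(codes[symbol]) for symbol in data)
print(codes)
print(encoded_bits, "bits instead of", len(data) * 8)
```

In this example the most common symbol gets a one-bit code, so the 13 characters need 22 bits instead of 104 bits at 8 bits per character.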

MP3 can reduce file sizes to about one twelfth of their original size. MP3 is now a widely used format, supported by most mobile music players. iPods can play MP3 files too, although Apple's preferred format is AAC (Advanced Audio Coding).


As sound is played, digital signals constantly have to be converted to analogue in order for us to hear it. The bit rate is the number of bits that are sent per second to transmit a sound file to an output device like a speaker or headphones.

If the sound quality is high, then there will be a greater number of bits, as there will be a greater number of samples each second to be converted back to analogue.

The bit rate for sounds can be calculated as follows:

Bit rate (bits per second) = sampling depth (bits) * sampling frequency (Hz)
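
As a quick check, the formula can be written as a small Python function. The `channels` parameter is an assumption added for this example: the formula above gives the bit rate for one channel, so a stereo sound needs twice as many bits per second.

```python
def bit_rate(sampling_depth_bits, sampling_frequency_hz, channels=1):
    """Bits per second needed to play back uncompressed sound."""
    return sampling_depth_bits * sampling_frequency_hz * channels


cd_mono = bit_rate(16, 44100)               # one channel at CD quality
cd_stereo = bit_rate(16, 44100, channels=2)

print(cd_mono)     # 705600
print(cd_stereo)   # 1411200
```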


A captured signal may not use the full dynamic range that is available. This means that the sound does not use the available range of volumes.

Normalisation means taking a file that wasn’t recorded at the full volume it could have been and making it as loud as possible without causing distortion.

Sound as recorded (full dynamic range not used)

After normalisation (full dynamic range used)
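
One simple way to normalise is to find the loudest sample and scale every sample by the same amount so that the loudest one just reaches the top of the range. The sketch below assumes signed integer samples (so 16 bits gives a top level of 32767); sound editing software may use other methods.

```python
def normalise(samples, sampling_depth_bits=16):
    """Scale samples so the loudest one uses the full available range."""
    if not samples:
        return []
    max_level = 2 ** (sampling_depth_bits - 1) - 1    # 32767 for 16 bits
    peak = max(abs(sample) for sample in samples)
    if peak == 0:
        return list(samples)                          # silence stays silent
    gain = max_level / peak
    return [round(sample * gain) for sample in samples]


quiet = [0, 1000, -8000, 2000]
loud = normalise(quiet)
print(loud)   # [0, 4096, -32767, 8192]
```

Every sample is multiplied by the same gain, so the shape of the waveform is unchanged. Only its size changes.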


We use the following formula to calculate the size of an uncompressed sound file:

File size (bits) = Sampling Frequency (Hz) x Sound time (s) x Sampling Depth (bits) x Channels (1 for mono, 2 for stereo)



Example: What is the file size, in megabytes, of a ten second stereo clip recorded at 44 kHz using 16 bit resolution?

= 44000(sampling frequency) x 10 (time) x 16 (resolution or depth) x 2 (stereo)

= 14080000 bits

Convert to bytes (/ 8)

= 1760000 bytes

Convert to kilobytes(/1024)

= 1718.75 kilobytes

Convert to megabytes(/1024)

= 1.68 megabytes



Example: A stereo song has to be recorded at CD quality. The song is 3 minutes and 12 seconds (192 seconds) long. How much disc space would be required to store the captured file?

Hint: Settings for CD quality are Sampling Frequency 44,100 Hz and Sampling Depth 16 bits.

File size (bits) = 44100 x 192 x 16 x 2

= 270950400 bits

Convert to bytes (/8)

= 33868800 bytes

Convert to kilobytes (/1024)

= 33075 kilobytes

Convert to megabytes (/1024)

= 32.3 megabytes
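
Both worked examples can be checked with a short Python sketch of the file size formula. The function names are made up for this example, and the conversion uses 1024 at each step, as in the working above.

```python
def file_size_bits(sampling_frequency_hz, time_s, sampling_depth_bits, channels):
    """Size of an uncompressed sound file in bits."""
    return sampling_frequency_hz * time_s * sampling_depth_bits * channels


def bits_to_megabytes(bits):
    """Bits to bytes (/8), then kilobytes (/1024), then megabytes (/1024)."""
    return bits / 8 / 1024 / 1024


clip_bits = file_size_bits(44000, 10, 16, 2)     # ten second stereo clip
song_bits = file_size_bits(44100, 192, 16, 2)    # 3 min 12 s stereo song at CD quality

print(clip_bits, round(bits_to_megabytes(clip_bits), 2))   # 14080000 1.68
print(song_bits, round(bits_to_megabytes(song_bits), 1))   # 270950400 32.3
```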


Clipping

If a sound file does not sound right, for example part of the sound seems unclear or missing, the most probable cause is clipping.

Clipping can occur when a sound is recorded at too high a level and part of the waveform is cut off. The sound goes outside the range of digital codes used to represent it, i.e. outside the dynamic range.
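
The effect of clipping on the stored values can be sketched as below. The sketch assumes signed 16 bit samples, so the available codes run from -32768 to 32767; the function name `clip` is made up for the example.

```python
def clip(sample, sampling_depth_bits=16):
    """Force a sample into the range of codes available at this sampling depth."""
    lowest = -(2 ** (sampling_depth_bits - 1))
    highest = 2 ** (sampling_depth_bits - 1) - 1
    return max(lowest, min(highest, sample))


too_loud = [1200, 30000, 40000, 45000, -40000]
stored = [clip(sample) for sample in too_loud]
print(stored)   # [1200, 30000, 32767, 32767, -32768]
```

The two loudest peaks are stored as the same value, so the top of the waveform is flattened and the original shape cannot be recovered.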


Stereo

A sound recorded in stereo has been recorded over two channels. This is done by using two microphones, one for the left channel and one for the right channel. It will be of higher quality than a sound recorded in mono. It will also be double the file size.

When editing a stereo sound in a waveform editor, the two channels are displayed as separate waveforms.

Editing a mono sound

Editing a stereo sound


Surround Sound

In cinemas or home entertainment systems a combination of speakers is used to create an effect called surround sound. Surround sound makes the sound more lifelike.

Surround sound is created by software in the sound card using special algorithms. The algorithms filter the original sound into several components based on how they would be heard from a number of different directions. The filtered sounds are then sent to the appropriate speakers.

• Dolby Surround sound: 4 speakers, two in front of the listener and two behind the listener.

• 5.1: 5 speakers (front left and right, rear left and right, and front centre). The .1 refers to an extra device called a subwoofer that plays low frequencies, e.g. making the room rumble when there is an explosion.


Fade

This means to gradually reduce the volume of the sound so that it dies away rather than coming to a sudden stop. The rate at which the sound fades and the duration of the fade can be changed by the user. Most sound editing software, like Audacity, comes with fade settings. Fade settings can be referred to as envelopes.

No fade Fade
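
A fade can be sketched as multiplying the last part of the sound by a gain that falls steadily from 1 to 0. This is a minimal linear fade written for illustration; it is not how Audacity or any other particular editor implements its fade settings, and the function name `fade_out` is made up.

```python
def fade_out(samples, fade_length):
    """Apply a linear fade to the last `fade_length` samples."""
    fade_length = min(fade_length, len(samples))
    faded = list(samples)
    start = len(samples) - fade_length
    for i in range(fade_length):
        gain = 1.0 - (i + 1) / fade_length    # falls steadily, reaching 0.0 on the last sample
        faded[start + i] = samples[start + i] * gain
    return faded


steady = [10, 10, 10, 10, 10, 10]
print(fade_out(steady, 4))   # [10, 10, 7.5, 5.0, 2.5, 0.0]
```

A longer `fade_length` gives a slower, gentler fade. A shorter one makes the sound die away more quickly.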


A sound card is needed to generate sounds that can be fed to speakers or headphones.

The DAC (Digital to Analogue Converter) takes the digital data and changes it to an analogue signal, which is fed out through the line out socket on the sound card (usually coloured green).

The DSP (Digital Signal Processor) is an integrated circuit designed for high speed data manipulation. Its main function is to compress and decompress sound files and to provide enhancements to sounds.


MIDI (Musical Instrument Digital Interface) is an interface which allows a PC to control an instrument or to communicate with it.

MIDI has also become a standard file type for musical files that contain data that enable musical instruments to recreate music.

MIDI files do NOT contain any recording of sound, so no digitised sound is stored. All they contain is the data required by an instrument to synthesise the sound.


A MIDI file can be a text file. The file can be edited in a text editor like Notepad. A MIDI text file is similar to an SVG graphic: both store a list of instructions describing the content rather than the content itself.

MIDI instructions contain a description of each note in terms of its pitch, timing, instrument type, duration, channel, tempo and volume.
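
The attributes listed above can be pictured as a simple record for each note. The sketch below is a hypothetical data structure for illustration only and is not the actual MIDI file format; the class name, field names and the choice of beats as the time unit are all assumptions. It does use two real MIDI conventions: note number 60 is middle C, and volume (velocity) runs from 0 to 127.

```python
from dataclasses import dataclass


@dataclass
class NoteEvent:
    """Hypothetical record holding the attributes of one note."""
    pitch: int              # MIDI note number, 60 = middle C
    start_beats: float      # timing: when the note starts
    duration_beats: float   # how long the note lasts
    instrument: int         # instrument type (program number)
    channel: int
    volume: int             # 0 (silent) to 127 (loudest)


def length_in_seconds(notes, tempo_bpm):
    """Length of the tune: the end of the last note, converted from beats using the tempo."""
    last_beat = max(note.start_beats + note.duration_beats for note in notes)
    return last_beat * 60 / tempo_bpm


# Four notes of one beat each (C, D, E, C), played one after the other.
melody = [
    NoteEvent(pitch=60, start_beats=0, duration_beats=1, instrument=0, channel=0, volume=100),
    NoteEvent(pitch=62, start_beats=1, duration_beats=1, instrument=0, channel=0, volume=100),
    NoteEvent(pitch=64, start_beats=2, duration_beats=1, instrument=0, channel=0, volume=100),
    NoteEvent(pitch=60, start_beats=3, duration_beats=1, instrument=0, channel=0, volume=100),
]

print(length_in_seconds(melody, tempo_bpm=120))   # 2.0
```

Because only these few numbers are stored for each note, changing the tempo or the instrument means changing one value rather than re-recording the sound.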

MIDI files are not as high quality as recordings and MIDI cannot create some sounds such as speech.


Advantages of MIDI

• Smaller file size

• All aspects of the music can be edited

• Effects can be applied to individual instruments

• There is no interference or background noise from the recording

Disadvantages of MIDI

• Dependent on sound card for quality of sound

• Cannot contain vocals

• Fewer effects can be applied
