MPEG-7 Audio Overview, Ichiro Fujinaga, MUMT 611, McGill University


Post on 29-Dec-2015

Page 1: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University

MPEG-7 Audio Overview

Ichiro Fujinaga

MUMT 611

McGill University

Page 2: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University

MUMT611 Fujinaga 2 / 18

Content

- MPEG-7 overview
  - Objectives and scope
  - Main elements and organization
- MPEG-7 audio
  - Low-level features
  - High-level features and tools

Page 3: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Introduction

- (formally) Multimedia Content Description Interface
- MPEG-1, 2, 4: Content coding and representation
- MPEG-7: Metadata (1998-2001)
  - standardized descriptions and description schemes of structures and content of multimedia
  - a language to specify such descriptions and description schemes
- Interoperable interface that defines syntax and semantics
  - Modalities: audio, visual, or multimedia
  - Aspects: media, meta, structural, or semantic
  - Applications: searching, filtering, navigation

Page 4: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Scope

- The goal is to provide interoperability among multimedia applications in
  - Generation
  - Management
  - Distribution
  - Consumption

Page 5: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Application domains

- Broadcast media selection (radio channel, TV channel)
- Digital libraries (film, video, audio and radio archives)
- E-Commerce (personalized advertising)
- Education (repositories of multimedia courses, multimedia search for support material)
- Home Entertainment (management of personal multimedia collections, including manipulation of content, e.g. karaoke)
- Journalism (searching speeches of a certain politician using his name, his voice or his face)
- Multimedia directory services (yellow pages)
- Surveillance and remote sensing

Page 6: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Components (XML)

- MPEG-7 Systems
- MPEG-7 Description Definition Language
- MPEG-7 Visual
- MPEG-7 Audio
- MPEG-7 Multimedia Description Schemes
- Reference Software: the eXperimentation Model (test)
- MPEG-7 Conformance (syntax checking)
- MPEG-7 Extraction and use of descriptions (technical report)

Page 7: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Other Standards

- SMPTE
- EBU
- TV-Anytime
- DIG-35
- Dublin Core
- OCLC/RLG

Page 8: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


MPEG-7 Objectives

- Information about the content
  - Form: e.g. the coding format used
  - Conditions for accessing the material: intellectual property rights / price
  - Classification: e.g. parental rating
  - Links to other relevant materials
  - Context: e.g. "Olympic Games 1996, final of 200 meter hurdles, men"
- Information present in the content:
  - Combination of low-level and high-level descriptors

Page 9: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Where do the descriptions come from?

- Preservation of existing descriptive data through the production/delivery
- Generated automatically by capture devices (e.g. time or GPS location in a camera)
- Extracted automatically and semi-automatically
- Manually produced (e.g. for legacy material such as existing film archives)

Page 10: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Main Elements of MPEG-7

- Description Tools (textual / binary):
  - Descriptors (D): define the syntax and the semantics of each feature (metadata element)
  - Description Schemes (DS): relationships between components
- Description Definition Language (DDL):
  - Define the syntax of the MPEG-7 Description Tools
  - Creation, extension, and modification of DSs
- System tools:
  - Storage and transmission, synchronization of descriptions with content, multiplexing of descriptions, etc.
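
To make the relationship between a descriptor (a single feature) and a description scheme (a structure that groups components) more concrete, here is a minimal Python sketch that builds a small textual XML description. The element names (`Description`, `Creation`, `Title`, `AudioSegment`, `Power`) and the function `build_description` are hypothetical and chosen for illustration only; they are not taken from the MPEG-7 schema, which is defined by the DDL.

```python
import xml.etree.ElementTree as ET


def build_description(title, power_values):
    """Build a toy XML description.

    All element names here are illustrative assumptions,
    not the names defined by the MPEG-7 standard.
    """
    # The root plays the role of a description scheme (DS):
    # it only relates the components it contains.
    root = ET.Element("Description")

    # Creation information about the content (e.g. a title).
    creation = ET.SubElement(root, "Creation")
    ET.SubElement(creation, "Title").text = title

    # A segment holding one descriptor (D): a series of power values.
    segment = ET.SubElement(root, "AudioSegment")
    power = ET.SubElement(segment, "Power")
    power.text = " ".join(f"{v:.3f}" for v in power_values)
    return root


desc = build_description("Example clip", [0.25, 0.5, 0.125])
xml_text = ET.tostring(desc, encoding="unicode")
print(xml_text)
```

A real description would be validated against the schema written in the DDL; this sketch only shows the nesting idea in the textual form.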

Page 11: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Main Elements of MPEG-7

Salembier and Avaro (2001)

Page 12: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Description Tools

- Creation and production processes: (director, title)
- Usage: (broadcast schedule)
- Storage features
- Structural information: (spatial-temporal components)
- Segmentations
- Low-level features: (sound timbres, melody description)
- Conceptual information: (objects and events, interactions)
- Navigation and access: (summaries, variations)
- Collections of objects
- User-content interactions: (user preferences, usage history)

Page 13: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


MPEG-7 Audio

- Audio provides structures, building upon some basic structures from the MDS (Multimedia Description Schemes), for describing audio content.
- Low-level features
  - audio features that cut across many applications
- High-level features and tools
  - more specific to a set of applications

Page 14: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Low-level Features

- Two low-level descriptor types (for sample and segment)
  - Scalar: (e.g. power or fundamental frequency)
  - Vector: (e.g. spectra)
- Hierarchical, consistent interface
  - Any descriptor inheriting from these types can be instantiated, describing a segment with a single summary value or a series of sampled values, as the application requires.
- Scalable series (hierarchical re-sampling)
  - Progressively down-sample the data contained in a series (application-oriented)
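
The idea of describing a segment either by a series of sampled values or by a single summary value, with progressive down-sampling in between, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`frame_power`, `downsample`, `scalable_levels`), the use of the mean as the summary operation, and the default ratio of 2 are choices made here, not details given on the slide or taken from the standard.

```python
def frame_power(samples, frame_size):
    """Scalar series: mean squared amplitude of each non-overlapping frame."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    return [sum(x * x for x in frame) / len(frame) for frame in frames]


def downsample(series, ratio=2):
    """Replace each group of `ratio` values by its mean.

    A shorter final group is averaged over the values it has.
    """
    groups = [series[i:i + ratio] for i in range(0, len(series), ratio)]
    return [sum(g) / len(g) for g in groups]


def scalable_levels(series, ratio=2):
    """Return successively coarser versions of a series.

    The last level holds a single summary value for the whole segment.
    """
    if ratio < 2:
        raise ValueError("ratio must be at least 2")
    levels = [list(series)]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1], ratio))
    return levels


samples = [1.0, -1.0, 2.0, -2.0, 0.0, 0.0, 3.0, -3.0]
power = frame_power(samples, frame_size=2)
levels = scalable_levels(power)
for level in levels:
    print(level)
```

An application that needs fine detail reads the first level; one that needs only a summary of the segment reads the last.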

Page 15: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Low-level Features

Salembier and Avaro (2001)

Page 16: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


High-level Features

- Exchange some generality for descriptive richness:
  - a smaller set of audio features (as compared to visual features) that may canonically represent a sound without domain-specific knowledge.
- Audio Signature (DS)
- Musical Instrument Timbre
- Melody
- General Sound Recognition and Indexing
- Spoken Content
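
As one possible illustration of a high-level melody description, the sketch below reduces a sequence of note frequencies to a coarse five-level contour (from -2 to +2) based on the interval between consecutive notes in cents. The slide names the Melody tool but gives no details, so the five levels, the thresholds of 50 and 250 cents, and the function names are assumptions made for this example and should be checked against the standard before being relied on.

```python
import math


def interval_cents(f_prev, f_next):
    """Interval between two frequencies in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(f_next / f_prev)


def contour_step(cents):
    """Quantize an interval to one of five contour values.

    Thresholds are illustrative assumptions:
    a large step is 250 cents or more, a small step is 50 cents or more.
    """
    if cents <= -250:
        return -2   # large descent
    if cents <= -50:
        return -1   # small descent
    if cents < 50:
        return 0    # effectively no change
    if cents < 250:
        return 1    # small ascent
    return 2        # large ascent


def melody_contour(freqs_hz):
    """Contour values for each pair of consecutive notes."""
    return [contour_step(interval_cents(a, b))
            for a, b in zip(freqs_hz, freqs_hz[1:])]


# A4, A4, B4, G4, C5 in equal temperament (frequencies rounded).
notes = [440.0, 440.0, 493.88, 392.0, 523.25]
contour = melody_contour(notes)
print(contour)
```

Such a contour trades precision for compactness, in line with the exchange of generality for descriptive richness noted above: two performances of the same tune in different keys yield the same contour.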

Page 17: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


Recent Development

- New audio description tools specified (MPEG-7 version 2):
  - Audio signal quality
  - Audio tempo
  - Chord pattern
  - Rhythm pattern
  - Multi-channel

Page 18: MPEG-7 Audio Overview Ichiro Fujinaga MUMT 611 McGill University


References

Chang, S., T. Sikora, and A. Puri. 2001. Overview of MPEG-7 Standard. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 688-95.

Martinez, J. 2004. MPEG-7 Overview. http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm

Quackenbush, S., and A. Lindsay. 2001. Overview of MPEG-7 audio. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 725-9.

Salembier, P., and O. Avaro. 2000. MPEG-7: Multimedia Content Description Interface. http://gps-tsc.upc.es/imatge/_Philippe/demo/MPEG21_MPEG7.pdf