the case for mxf-embedded ebucore metadata in archiving applications | dieter van rijsselbergen,...

WORLD CONFERENCE 2013
OCTOBER 25 - 28 2013
DUBAI, UAE

THE CASE FOR MXF-EMBEDDED EBUCORE METADATA IN ARCHIVING APPLICATIONS

Dieter VAN RIJSSELBERGEN *a, Jean-Pierre EVAIN b, Marco DOS SANTOS OLIVEIRA b and Maarten VERWAEST a
a Limecraft; b European Broadcasting Union

A solution to current descriptive metadata delivery problems is the use of metadata embedded in the audio-visual essence containers themselves. This way, metadata can no longer get lost and needs no separate out-of-band delivery mechanism. Embedding EBUCore metadata in essence files with a freely available reference SDK can ease the adoption of embedded metadata significantly and can help archive systems support such standards-compliant embedded descriptive metadata. In this paper we describe the proceedings and lessons learnt from a development project of EBU and Limecraft, in which we investigated the use of MXF-embedded EBUCore metadata as a way to support feeding metadata-enriched MXF files to a variety of media production and archiving systems.

Keywords: Embedded Metadata | EBUCore | Material Exchange Format

INTRODUCTION

Despite many years of digitization efforts and iterations of file-based storage, networking and enterprise services technologies, the relationship between audio-visual essence and its metadata remains a tense one. When exchanges and storage of audio-visual material require accompanying metadata to describe its semantics and origins, integration complications concerning interfaces, databases and indexes, metadata formats and delivery mechanisms (buses, SOAP interfaces, FTP servers, etc.) still manifest themselves regularly. A viable solution is to embed the descriptive metadata in the essence containers themselves such that it can no longer get lost and needs no separate out-of-band delivery mechanism; if the essence arrives, so does the metadata.

Upload: fiatifta

Post on 22-Jan-2015


Archiving particularly is a field that can benefit from the use of embedded metadata, as the correct association of essence with its metadata is crucial for future successful retrievals of the essence. Even after metadata is read from essence containers to be indexed for faster and easier search and retrieval operations, the metadata remains processable directly in the original media files as long as they can be read and interpreted.

In this paper, we discuss the following topics. First, we argue the case for embedded metadata and how it can simplify metadata delivery. In particular, we introduce the MXF file format and the EBUCore metadata standard as the main subjects of our embedding efforts. Next, we discuss how EBUCore metadata can be embedded in MXF files, and which lessons we learnt from this research. Then, we introduce the reference implementation SDK we have built to enable users to start embedding metadata quickly, and we provide a short description of a proof-of-concept demonstrator that shows how the SDK can be incorporated into custom applications. Finally, we provide our conclusions.

* Corresponding author: Limecraft, Sint-Salvatorstraat 18b/301 | Ghent 9000 | Belgium, e-mail: [email protected]

Copyright © of this paper is the property of the author(s). FIAT/IFTA is granted permission to reproduce copies of this work for purposes relevant to the above conference and future communication by FIAT/IFTA without limitation, provided that the author(s), source and copyright notice are included in each copy. For other uses, including extended quotation, please contact the author(s).

THE CASE FOR EMBEDDED METADATA

A strong case can be made for the use of essence-embedded descriptive metadata for archiving purposes because it can simplify the delivery and archiving processes in a number of ways. The delivery process of essence and metadata is made much simpler as the need for an out-of-band delivery of the metadata is averted; when the essence arrives, so does the metadata. There is no need for separate files to be delivered along with the essence, and no hassle with file naming conventions or with missing metadata in case transfers were aborted or not even performed. This makes the delivery considerably simpler for both archives and producers, as illustrated in Figure 1.

Out-of-band delivery of metadata (Figure 1a) requires three layers of processing components. One layer handles the processing of essence and its associated metadata, e.g., MXF parsers and metadata-specific XML processors that can handle specific instances of metadata documents for a given metadata specification. At the other end, a delivery technology is required for actually transferring files or bit streams to an archive. Examples of the latter include FTP and HTTP (or more secure variants that employ encryption for all bytes transferred). In between these two layers, a (preferably automated) delivery interpretation layer must be set up that relates the individual files delivered via the delivery technology to one another to form the input for the top layer, which interprets the actual metadata and essence. This interpretation layer can become quite complex as exotic use cases and error handling must be supported. For example, atomic delivery must be supported such that no processing is done on essence for which the metadata has not arrived yet. Also, in cases where multiple metadata files must be delivered, the correct file must be identified for each required set of metadata elements.
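
To make the cost of the delivery interpretation layer more tangible, the following is a minimal sketch of its simplest task: pairing essence with its side-channel metadata and holding back essence whose metadata has not arrived. The function name `pair_deliveries` and the convention that `file01.mxf` travels with `file01.xml` are assumptions for illustration; real delivery chains typically need many more rules.

```python
from pathlib import Path


def pair_deliveries(inbox: Path) -> tuple[list[tuple[Path, Path]], list[Path]]:
    """Pair each essence file with its metadata file by base name.

    Returns (complete, pending). Complete pairs are safe to process;
    pending essence files must wait until their metadata has arrived,
    which is the "atomic delivery" requirement in its simplest form.
    """
    complete: list[tuple[Path, Path]] = []
    pending: list[Path] = []
    for mxf in sorted(inbox.glob("*.mxf")):
        xml = mxf.with_suffix(".xml")  # assumed naming convention
        if xml.is_file():
            complete.append((mxf, xml))
        else:
            pending.append(mxf)
    return complete, pending
```

With embedded metadata this whole step disappears: a completed and validated transfer of a single file is all the parsing layer needs.
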

[Figure 1 layers: (a) Essence + Metadata (MXF + Metadata XML), Delivery Interpretation (file01.mxf + file01.xml), Delivery Technology (FTP); (b) Essence + Metadata (MXF), Delivery Technology (FTP).]
Figure 1. The required processing layers in (a) delivery with out-of-band metadata and (b) delivery with embedded metadata.

When considering the delivery of essence with embedded metadata, there is no longer the need for a delivery interpretation layer. When transfers are completed and validated at the transport level, control can be handed directly to the top essence and metadata parsing layer, as all information required to relate essence and metadata is contained in that one single file (cf. Figure 1b).

Such a setup can be beneficial for archiving environments. The delivery and ingesting process is significantly simplified and no complex delivery chains based on various, possibly ad-hoc, conventions need to be set up at either side of the content exchange. Additionally, when the essence container format in question is chosen well, many instances of various kinds of metadata can be embedded simultaneously (or appended one after the other in the same container), resolving the issues at hand for many ingesting organisations and archives in a single effort.

Naturally, if the use of embedded metadata is meant to replace the arbitrary parts of existing side-channel metadata delivery, it is crucial that the mechanism for embedding such metadata be properly defined, and based on best practices and standards used in the video production and archiving ecosystem. This is where the combination of MXF and EBUCore comes into the picture, as described in the next section.

THE CASE FOR MXF-EMBEDDED EBUCORE METADATA

Over the past decade, the Material Exchange Format (MXF) (SMPTE-377M, 2004) (SMPTE, 2009) has become the dominant container format for file-based audio-visual essence storage for professional video production.
Its design incorporates many essence encoding standards and extensive support for various types of metadata embedded into an MXF container. The MXF format and its use have been defined extensively in various SMPTE and EBU standards and recommendations and form a solid foundation for proposing a mechanism for embedded metadata exchanges.

Figure 2 shows the overall structure of an MXF file container. The file consists of a header, body and footer, spread over a number of partitions (one header partition, one footer partition and any number of body partitions). Metadata can be stored in various places in the MXF container, first and foremost in the file header, but it can also be repeated and updated in subsequent partitions, for reasons of redundancy or to support growing files in which finalized metadata (e.g., duration) can only be appended at the end of the file. In Figure 2, this metadata is contained within the blocks labelled Header Metadata.

Figure 2. Structure of an MXF file container. The file contains a number of partitions, of which the header and footer partition can contain header metadata sets.

Two kinds of metadata can be stored in an MXF file: structural and descriptive metadata. The structural metadata defines the structure of the essence and its timeline in the MXF file. It defines the various tracks of essence contained in the file, declares the encoding parameters defined for each such track and specifies how the tracks relate to one another in the overall timeline represented by the MXF file. As such, the structural metadata is required for the correct interpretation of the MXF file. Descriptive metadata can be hooked onto the structural metadata's elements (e.g., it can be used to describe a track) to describe the semantics of the essence involved, or to provide references to the specifics of the production process that produced the essence in question. Embedded MXF metadata is fully supported by the MXF specifications.
In fact, a number of targeted specifications have been ratified that define instances of descriptive metadata for MXF, of which the Descriptive Metadata Scheme-1 (DMS-1) (SMPTE-380M, 2004) is the most prominent. DMS-1 defines three frameworks in which production metadata can be associated with the contents of the MXF file. These frameworks can describe titles, publication events, rights information, contacts associated with the creation of the essence, etc. However, the use of DMS-1 has remained very limited, not due to the qualities of the standard, but rather because the standard has been defined only within the isolation of the MXF ecosystem. No official outside-of-MXF serialization of DMS-1 exists, which makes its adoption harder, because each inclusion or extraction of DMS-1 metadata always requires a conversion between DMS-1 and another metadata format. Such conversions are typically performed by software that is complex, as it deals with the equally complex MXF standards, and possibly hard to integrate, as it must process MXF files efficiently and is hence often written in low-level languages such as C/C++.

The use of MXF-embedded metadata becomes much more likely if we can incorporate existing metadata standards used outside of MXF containers and embed those into MXF containers in a standards-conforming fashion. Because the adoption threshold of such a standard is much lower, it is likely to have been used more extensively, and will have been revised and optimized more often because of actual use. This way, the existing user base and deployments and all tools already available for creating, updating and processing the metadata can be reused as-is. Embedding logic only has to deal with translating external metadata to a lossless MXF representation, while any conversion to and from other metadata standards is performed with other MXF-agnostic tools.
One such metadata standard is employed in the AS-11 application specification of MXF files for program contribution (AMWA, 2012). AS-11 defines a limited set of flat custom metadata fields for embedding into MXF files. At the same time, these same metadata fields can easily be written in simple text files (e.g., as is the case in the reference code implementation of the AS-11 specification), which eases its adoption significantly, e.g., by members of the UK's Digital Production Partnership (DPP).

Even though the AS-11 case shows an attractive approach to embedded metadata, a more interesting metadata standard to consider is EBU's EBUCore (EBU, 2013). EBUCore extends greatly upon the Dublin Core standard (DCMI, 2004) and provides the framework for descriptive and technical metadata for use in archiving, service-oriented architectures and also in ontologies for semantic web and linked data developments concerning audio-visual media. EBUCore has seen a number of revisions since its inception and is currently adopted by various broadcasters worldwide, supported by a toolset that has been increasing and maturing along with the specification.

In the remainder of this paper, we discuss the efforts and results of a development track executed by Limecraft and EBU to investigate how EBUCore metadata can be embedded into MXF file containers in a compliant and optimized fashion, as a means of facilitating easier metadata exchanges to and from production facilities and archives.

EMBEDDING EBUCORE IN MXF CONTAINERS

The regular representation of EBUCore is in the form of XML documents, for which the structure is defined by an XML Schema document (W3C, 2004). EBUCore documents are typically edited using tools and frameworks optimized for dealing with XML documents, e.g., the MINT tool for mapping between EBUCore and other metadata standards (cf. the related publication also in this session).
The challenge in our case has been to define a translation and serialization method for placing this XML metadata in MXF files in such a way that the philosophy and best practices in MXF container processing were adhered to. In particular, the requirements were as follows:

1. EBUCore metadata should be embedded into MXF files using best practices and in compliance with the MXF specification;
2. EBUCore metadata should be embedded into MXF files without loss of information;
3. MXF files with EBUCore embedded should remain compatible with EBUCore-agnostic MXF processors.

Referring back to the previous section, metadata to be included in MXF files should be contained in the MXF header metadata. The MXF file format specifies a flexible mechanism for serializing metadata elements into MXF file bytes, which we will briefly summarize in order to provide the necessary context for this paper.

An MXF file is actually a sequence of many so-called Key-Length-Value (KLV) packets. Each such KLV packet can contain many types of data, including metadata, essence, and index table entries. The key of the packet identifies the packet's purpose and the length field enables MXF parsers to tell when one packet ends and the next one begins, allowing traversal of the entire file from front to back. A subset of all KLV packets in a file, typically at the beginning of the header partition, forms the header metadata. These packets are formatted in such a way that their value part contains the information for a single metadata object, the format of which can be determined by the unique key assigned to the packet. For example, Figure 3 shows a part of the structure of an Identification metadata object class identified by the key 06.0e.2b.34.02.53.01.01.0d.01.01.01.01.01.30.00.
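
The front-to-back traversal described above can be sketched in a few lines. This is an illustrative reader, not the SDK's implementation: it assumes 16-byte keys and BER-encoded lengths, ignores any run-in before the first packet, and reads every value fully into memory, which a real parser would avoid for large essence packets. The function names are our own.

```python
import io
from typing import BinaryIO, Iterator


def read_ber_length(stream: BinaryIO) -> int:
    """Decode a BER-encoded length in short or long form."""
    first = stream.read(1)
    if not first:
        raise EOFError("truncated KLV length")
    if first[0] < 0x80:          # short form: the byte itself is the length
        return first[0]
    count = first[0] & 0x7F      # long form: number of length bytes that follow
    raw = stream.read(count)
    if len(raw) != count:
        raise EOFError("truncated KLV length")
    return int.from_bytes(raw, "big")


def iter_klv(stream: BinaryIO) -> Iterator[tuple[bytes, bytes]]:
    """Yield (key, value) for each KLV packet, from front to back."""
    while True:
        key = stream.read(16)
        if not key:
            return               # clean end of stream
        if len(key) != 16:
            raise EOFError("truncated KLV key")
        length = read_ber_length(stream)
        value = stream.read(length)
        if len(value) != length:
            raise EOFError("truncated KLV value")
        yield key, value
```

A parser that does not recognize a key can simply skip `length` bytes and continue, which is what keeps unknown metadata harmless to EBUCore-agnostic processors.
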
Figure 3 illustrates that the Identification class contains a variety of fields, each of which is strongly typed (including numeric types, character string types, dates, and arrays of each of these types). Additionally, thanks to one specific metadata field type, the reference, relations between KLV metadata packets can be established, which allows for the construction of complex metadata structures. As far as standard MXF metadata is concerned, these structures are normatively defined by SMPTE specifications and are typically conveyed in a structured fashion in software using dictionaries, of which Figure 3 depicts a small excerpt.

Figure 3. Excerpt of the MXF metadata dictionary. Displayed is the definition of the Identification metadata set, with its member fields, data types and representative Universal Labels.

Even though standard MXF metadata has been meticulously defined, it can also be easily extended, provided that the same mechanism is used to describe metadata extensions. Custom fields can be appended to pre-defined metadata classes and new classes can be defined such that a seamlessly integrated metadata extension of the original metadata is obtained.

SERIALIZATION STRATEGIES

Even given the requirements and MXF file structure listed in the previous section, a number of serialization strategies can be pursued to interpret the KLV structure of MXF files for metadata serialization. We list them next, and provide a summary of advantages and disadvantages of each strategy.

The first strategy of metadata serialization involves embedding a single KLV packet in which the EBUCore XML metadata document is written as-is, and we refer to this strategy as dark (cf. Figure 4a). The KLV packet is inserted as the last packet at the end of the regular header metadata and is identified by a specific EBUCore dark metadata key. No further modifications are done to the MXF file metadata.
This method is described as dark because, even if the KLV element's key is known, the metadata does not participate in the overall structure of metadata and remains a blob of abstract data with respect to the MXF file.

The second strategy encompasses the creation of an exhaustive representation of the metadata entity structure (e.g., object classes, fields and references) such that each logical entity (i.e., metadata object) becomes a single KLV packet (cf. Figure 4b). The structure of each metadata entity is described using a dictionary as described in the previous section. In this case, references between objects are implemented using special MXF-designated reference data types. Additionally, this mechanism can be coupled to existing MXF structural metadata using the same mechanism. We refer to this strategy as the full KLV serialization strategy.

The third strategy of serialization writes only a minimal set of KLV packets to the MXF file, just sufficient to place a reference in the MXF metadata to an external file in which the actual metadata is stored and that acts as a side-car to the MXF container (cf. Figure 4c). As with the KLV serialization strategy, this side-car metadata reference can also be linked to existing MXF structural metadata.

Figure 4. Embedded metadata serialization strategies for serializing metadata into the header metadata of an MXF file container: (a) dark, (b) KLV, and (c) side-car.

The dark serialization strategy is the simplest one, and involves writing the metadata file into the MXF file directly and in a single place.
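
As a rough illustration of how little the dark strategy involves, the sketch below wraps an XML document, byte for byte, in a single KLV packet. The all-zero key is a placeholder chosen for this example and is not the registered EBUCore dark metadata key; the fixed 4-byte BER long-form length is likewise just one valid choice. Inserting the packet at the right place in the header metadata is not shown.

```python
# Placeholder for illustration only; NOT the registered EBUCore dark metadata key.
HYPOTHETICAL_DARK_KEY = bytes(16)


def encode_ber_length(length: int) -> bytes:
    """Encode a length in BER long form using four length bytes."""
    if not 0 <= length < 2**32:
        raise ValueError("length does not fit in four bytes")
    return b"\x84" + length.to_bytes(4, "big")


def wrap_dark_metadata(xml_document: bytes,
                       key: bytes = HYPOTHETICAL_DARK_KEY) -> bytes:
    """Return one KLV packet whose value is the unmodified XML document."""
    if len(key) != 16:
        raise ValueError("KLV keys are 16 bytes long")
    return key + encode_ber_length(len(xml_document)) + xml_document
```

Because the value is the document itself, extraction is the reverse operation and returns the original bytes unchanged.
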
However, while this does conform to the MXF specification (namely, the packet is written as a conformant KLV packet and should be ignored by MXF parsers if unknown), it is not considered best practice, as the metadata inserted requires further processing once read from the file, and it is in no intrinsic way related to the MXF header metadata, except through custom interpretation not associated with any of the MXF specifications. For example, any relationship between the metadata and a limited part of the MXF timeline can only be determined by fully processing the metadata. On the other hand, this technique requires only little effort to implement and requires only simple modifications for embedding metadata into existing and legacy MXF files. Additionally, because the metadata is serialized exactly as is, bit for bit, we can be assured that the metadata is stored in a lossless way.

The KLV serialization strategy, on the other hand, follows the reverse approach, and models each and every metadata object of the original metadata scheme as an individual KLV packet, with each of its fields described exhaustively using MXF metadata data types, and fully integrated with the MXF structural metadata. Clearly, this technique requires the most effort, both in advance, to provide a mapping from the original representation of the metadata (e.g., XML described by a schema or other document type), and in implementation, as each object must be translated to its counterpart. However, this strategy provides the best approach in terms of best practices in the MXF ecosystem and seamlessly fuses existing structural metadata with newly embedded descriptive metadata, in the same way as is done for other normative MXF descriptive metadata standards such as DMS-1.

The side-car method can be used in scenarios where metadata updates are likely to occur frequently and the recurring modification of MXF files is not feasible.
The downside of this approach is that the metadata file must be transferred and kept together with the MXF file throughout the production and distribution process. Unfortunately, this would again require a delivery interpretation layer to be part of the ingesting process. Its complexity would be reduced, however, as the MXF file's metadata would refer unambiguously to the correct external metadata files.

In an archiving context, the side-car strategy is less convenient, because metadata is not likely to change often, if at all. Additionally, it requires a delivery interpretation layer at the ingest point of the archive, in addition to a parsing effort of the MXF file itself, and as such is not the best candidate for use in archival situations. Concerning the KLV approach or dark approach, the trade-off must be made between convenience of implementation and richness of metadata. In any case, in our research, we have investigated both approaches and have built tools to support all three strategies such that implementers can pick the one best suited for their workflows.

MAPPING EBUCORE METADATA TYPES

Whenever metadata is written using the KLV or side-car strategy, a translation must be made between the original metadata representation format and the metadata provisions defined in the MXF file format. In order to meet requirement #2, this translation must be done meticulously and using the native data types and structures available, as an answer to requirement #1. We constructed such a translation between the EBUCore XML schema and a KLV representation, in which the types of EBUCore were mapped to their equivalent counterparts in terms of native MXF data types and classes, each of which is identified by a SMPTE Universal Label (UL) as illustrated in Figure 5 (cf. the key and globalKey attributes).
The result of this translation effort has been submitted to SMPTE for inclusion in the Class 13 section of the SMPTE Metadata Dictionary and will be publicly registered such that it is available for interested parties to study and employ in individual implementations. As we tried to ensure that MXF-compliant data types and structures were used in the serialization of EBUCore metadata, an important goal was to ensure that information could be translated between both representations in a lossless way. Note, however, that the result of a serialization back and forth between XML and KLV will not be bit-wise identical, but will be semantically equivalent (e.g., spaces, redundant namespace declarations, etc. will not be needlessly included in the KLV serialization).

Figure 5. Excerpt of the EBUCore XML-to-KLV mapping defined for the full KLV serialization strategy. Displayed is the mapping for the CoreMetadata metadata set.

In translating between both representations of metadata we have learnt a number of lessons that we list here. As a general rule, types in EBUCore could be mapped to a KLV version mostly unchanged and with an identical purpose (e.g., a PublicationEventType is mapped to a similar KLV object type). Relationships between objects (including one-to-one, one-to-many, etc.) can be translated except for a number of differences in possible cardinalities. For example, while XML Schema defines a 1..n cardinality, there is no such equivalent for MXF. However, such issues are easily resolved by including simple work-arounds, e.g., by using a required field followed by an optional list of additional entries to deal with the previous case. Furthermore, while MXF supports many common numeric and character data types, a number of more advanced types have not been included. For example, there is a single timestamp type in MXF but this type does not support time zones.
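
The 1..n cardinality work-around mentioned above can be pictured as follows. This is only a sketch of the idea, with a hypothetical class name and plain strings standing in for metadata objects; it does not reproduce the registered EBUCore KLV classes.

```python
from dataclasses import dataclass, field


@dataclass
class OneOrMore:
    """A 1..n relation stored as one required entry plus an optional list."""
    first: str
    additional: list[str] = field(default_factory=list)

    @classmethod
    def from_items(cls, items: list[str]) -> "OneOrMore":
        """Split a non-empty XML-side list into the two KLV-side fields."""
        if not items:
            raise ValueError("a 1..n relation needs at least one entry")
        return cls(first=items[0], additional=list(items[1:]))

    def to_items(self) -> list[str]:
        """Rebuild the original ordered list when translating back to XML."""
        return [self.first, *self.additional]
```

Splitting and rejoining preserves both the order and the number of entries, so the round trip stays lossless for this construct.
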
Additionally, best practices in MXF dictate a preferred way of handling other types of time-related data using edit units (i.e., frames) whenever possible, while EBUCore supports a variety of other time measuring types (e.g., seconds, frame numbers). In these cases, the tools used for the translation and embedding of the metadata must ideally perform the required conversions such that native MXF data types and constructs can be employed. Mapping constructs in which opaque data types are stored as character strings and then reinterpreted when retrieved from the MXF file should be avoided whenever possible.

Right from the start of the mapping effort, special attention was given to backward compatibility and support for future versions of EBUCore. It is crucial, especially in archival contexts, that older versions of embedded EBUCore metadata remain readable by newer EBUCore-aware MXF parsers. In order to realize this, versioning of the KLV-translated EBUCore metadata was incorporated at each level and has been incorporated explicitly within the SMPTE registration of the KLV representation of EBUCore. In fact, the current SMPTE registration applies to version 1.4 of EBUCore, and every subsequent version will also be fully registered and standardized as such. This is done by including, in each unique element UL that identifies a KLV metadata class or field, a version byte that identifies the EBUCore standard it conforms to. Additionally, a unique label is attached to the beginning of the MXF metadata, identifying the version of EBUCore metadata stored in that container. This allows MXF parsers to load the correct version and vocabulary of an EBUCore processor early, before any EBUCore metadata elements are encountered.

UNION OF EBUCORE AND MXF METADATA

When a full KLV serialization strategy is employed, the added descriptive metadata can be set up to fully interact with the structural metadata.
In particular, EBUCore and standard MXF metadata are united by means of the MXF essence timeline. The MXF file container defines the essence contained in the file as a package of essence and data tracks, each of which forms part of the timeline. Some tracks are plain picture or sound tracks that represent actual video or audio streams, while others define custom time codes or ancillary data related to the imagery. Finally, tracks can be Descriptive Metadata tracks that reference a set of descriptive metadata elements also contained in the header metadata. EBUCore metadata is inserted into the MXF metadata in this way, such that it properly interacts with the timeline model (SMPTE-EG42, 2004), as illustrated in Figure 6. EBUCore metadata that describes the entire essence file is referenced from a Static Track that covers the entire length of the essence uniformly. In particular, an ebucoreMainFramework bridging object is used to link the static track to the actual EBUCore CoreMetadata object. On the other hand, EBUCore Parts that describe only a limited section of the essence are also properly modelled on the MXF timeline by using a Descriptive Metadata Event Track on which temporal segments are assigned for each of the Part objects in the EBUCore metadata.

Figure 6. Interaction between the embedded EBUCore metadata and the MXF timeline concept.

This kind of interaction enables a very powerful expression of the relationship between the descriptive metadata and the essence in the MXF container. Temporal segmentation and description of the essence can be done using native MXF data structures, and if needed, multiple instances of descriptive metadata (possibly originating from different metadata standards) can be combined on a single timeline. Note finally that the presented association of metadata is also relevant with respect to the side-car serialization strategy.
In this case, only the timeline elements (the Static Track and its segments) and the ebucoreMainFramework are written. However, the ebucoreMainFramework only contains a field into which the location of the side-car file is written and no other EBUCore metadata (i.e., CoreMetadata object) is present.

TOOLS FOR EMBEDDED METADATA: THE EBU MXF SDK REFERENCE IMPLEMENTATION

While we have argued the case for MXF-embedded EBUCore metadata and have highlighted its potential advantages, an important factor in the adoption of such a technology is the ready availability of processing tools. For this reason, we have developed a freely available reference software implementation such that interested users can get started right away integrating embedded metadata. The software is provided as an open source Software Development Kit (SDK) which offers both ease of use, through a number of pre-packaged tools, and flexibility, in the form of a software library (Van Rijsselbergen, 2013).

Figure 7 shows how the SDK can be used. A number of command line tools are available that use the functionalities of the SDK to provide end user functions such as embedding EBUCore metadata in an existing MXF file (ebu2mxf.exe), extracting EBUCore metadata from an MXF file (mxf2ebu.exe) and embedding EBUCore metadata in a newly created MXF file (raw2bmx.exe). Additionally, applications can use the SDK as a software library to incorporate its features by calling a variety of functions. For those cases, the SDK is provided with documentation for the public function API that the SDK exposes to external programs.

Figure 7. Uses of the EBU MXF SDK: using command-line tools or as a function library for custom tools.

HOW THE SDK WORKS

This section describes the EBUCore processing functionality of the SDK (illustrated in Figure 8). The SDK can read and write two representations of EBUCore: the XML variant is read from and written to XML documents that conform to the EBUCore XML schema, while the MXF variant is read from and written to KLV packets, the native encoding of information units in MXF files. For both XML and MXF representations, the EBUCore metadata is first read into (or written from) an in-memory representation (i.e., an instantiated object model) and then translated to the other representation through the bi-directional mapping discussed in the previous section of this paper.

Figure 8. Features of the EBU MXF SDK: writing and reading EBUCore metadata (1), processing audiovisual essence (2), reading and extending existing MXF files (3) and offering base functionality for the incorporation of other embedded metadata standards besides EBUCore (4).

In the second mode of operation, the SDK writes EBUCore metadata into an existing MXF file, a path also depicted in Figure 8. This mode requires more complex application logic, as the existing file must be modified as efficiently as possible, and the existing metadata must be modified in such a way as to remain fully compliant with the MXF file format specification. MXF files may carry multiple instances of the file's metadata (i.e., each new partition can contain an updated set of metadata). This way, streaming and growing file scenarios can be supported in which increasingly accurate metadata is continuously inserted as the file is being extended, resulting in an MXF file that contains the most complete metadata in its footer partition. Partitions marked as open and incomplete can instruct MXF interpreters to ignore early sets of metadata and only consider a final closed and complete metadata set as the definitive MXF file structure description.
Unless explicitly instructed otherwise, the SDK uses this mechanism to append the updated metadata in the footer partition of the MXF file. This involves a rewrite of only the footer partition, which requires only limited writing operations since footer partitions contain no essence. Most of the header and (bulky) body partitions remain unchanged, except for an update of the small partition header KLV pack to signal that its metadata set is now open and incomplete. Note that, when selecting the metadata to extend, the SDK also interprets partition flags to select only the finalized metadata for extension with EBUCore elements.

Considering the complexity of the MXF file format specification, it is not unlikely that certain implementations of MXF interpreters will lack support for selection of metadata beyond the header partition, and will expect this partition to contain only a single complete metadata set. To support these systems, the SDK can be explicitly instructed to write the EBUCore metadata to the header partition, at the expense of a byte shift operation across the remainder of the MXF file.

Finally, we wish to point out that the SDK has been constructed in such a way that the code used for embedding metadata using the various strategies discussed above is separated from the code that performs the mapping between representations of EBUCore, such that it can easily be reused for embedding non-EBUCore metadata. The serialization code can be reused as is; only a new mapping needs to be implemented for each additionally supported metadata standard, as illustrated in Figure 8.
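The partition handling described in this section can be made more concrete with a small sketch. The Python model below is not part of the SDK and does not parse real MXF files: the Partition class, its fields and both function names are assumptions introduced purely for illustration. It shows how a reader might prefer the last closed and complete metadata set, and gives a rough estimate of why appending to the footer is cheaper than writing into the header.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    HEADER = "header"
    BODY = "body"
    FOOTER = "footer"


@dataclass
class Partition:
    """Simplified stand-in for an MXF partition (hypothetical model, not SDK API)."""
    kind: Kind
    closed: bool        # False = "open": a later partition may supersede this metadata
    complete: bool      # False = "incomplete": some metadata values are not final yet
    has_metadata: bool  # True if the partition carries a copy of the header metadata
    size_bytes: int     # total partition size, essence included


def definitive_metadata(partitions: List[Partition]) -> Optional[Partition]:
    """Pick the partition whose metadata set should be treated as final.

    Preference goes to the last closed and complete set; if none exists,
    fall back to the last partition that carries any metadata at all.
    """
    with_metadata = [p for p in partitions if p.has_metadata]
    finalized = [p for p in with_metadata if p.closed and p.complete]
    if finalized:
        return finalized[-1]
    return with_metadata[-1] if with_metadata else None


def bytes_to_write(partitions: List[Partition], extra_bytes: int, into_header: bool) -> int:
    """Rough estimate of the write volume for the two embedding strategies.

    Footer strategy: only the footer partition is rewritten (the small update
    of the header partition pack is ignored here). Header strategy: the added
    metadata shifts everything that follows, so roughly the whole file is
    rewritten.
    """
    if into_header:
        return sum(p.size_bytes for p in partitions) + extra_bytes
    footer = partitions[-1]
    if footer.kind is not Kind.FOOTER:
        raise ValueError("expected the last partition to be the footer")
    return footer.size_bytes + extra_bytes


if __name__ == "__main__":
    # Illustrative sizes only; they are not measurements of real files.
    mxf = [
        Partition(Kind.HEADER, closed=False, complete=False, has_metadata=True, size_bytes=64 * 1024),
        Partition(Kind.BODY, closed=True, complete=True, has_metadata=False, size_bytes=500_000_000),
        Partition(Kind.FOOTER, closed=True, complete=True, has_metadata=True, size_bytes=64 * 1024),
    ]
    print(definitive_metadata(mxf).kind)
    print(bytes_to_write(mxf, extra_bytes=4096, into_header=False))
    print(bytes_to_write(mxf, extra_bytes=4096, into_header=True))
```

In this simplified model the footer strategy writes a few tens of kilobytes, while the header strategy touches roughly the entire file, which matches the trade-off described above for interpreters that only read the header partition.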
EMBEDDED EBUCORE METADATA: PROOF-OF-CONCEPT DEMONSTRATOR

To illustrate the use of the SDK and embedded EBUCore metadata, we have built a proof-of-concept demonstrator for the ingest of EBUCore metadata in Limecraft Flow, an on-line collaboration and media production environment built for the production and archiving of various media production formats such as drama and factual television programs (Limecraft, 2013). The SDK tools were integrated in such a way that they naturally extend the existing ingest process with a metadata extraction step that analyses incoming MXF files and reads the EBUCore metadata (if any). The extracted EBUCore metadata is then made available to users of the application as searchable metadata to aid them in retrieving media assets.

Figure 9. Functional overview of the proof-of-concept demonstrator in Limecraft Flow. Depicted is the ingest process that was extended with an incorporation of the EBU MXF SDK for extraction and indexing of EBUCore metadata. (Diagram labels: User, Search Application, Retrieval Application, Metadata Index, Media Probing, Feature Detection, mxf2ebu.exe, EBU MXF SDK, Media Repository.)

Figure 9 shows a breakdown of the components involved in the demonstrator. MXF files are ingested and delivered into a folder from which new files are analyzed (i.e., the type of file is determined, and a number of feature detection procedures, incl. shot cut detection, are executed). As an extension, we have added mxf2ebu.exe, one of the tools powered by the reference SDK, which reads the embedded EBUCore metadata and extracts it in the form of an XML document. This document is then added to a metadata index which can be queried by a Search Application accessible to end users. This way, all information present in the EBUCore metadata embedded in the MXF file can be searched for immediately and without the need for complicated software layers for delivery and interpretation.
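The indexing step of the demonstrator can be sketched in a few lines. The Python example below is an illustration under stated assumptions, not the code used in Limecraft Flow: it takes an XML document such as the one mxf2ebu.exe extracts, tokenizes every text value it finds, and stores the words in a toy in-memory inverted index. The sample document is a heavily simplified, non-schema-valid stand-in for EBUCore, its default namespace is a placeholder, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, Set

# Simplified stand-in for an extracted document. The default namespace is a
# placeholder, not the real EBUCore namespace; only the Dublin Core one is real.
SAMPLE_XML = """
<ebuCoreMain xmlns="urn:example:ebucore-placeholder"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
  <coreMetadata>
    <title><dc:title>Evening News</dc:title></title>
    <description><dc:description>Report on harbour traffic</dc:description></description>
  </coreMetadata>
</ebuCoreMain>
"""


def tokenize(text: str) -> list:
    """Split text into lower-case word tokens."""
    return re.findall(r"\w+", text.lower())


def index_document(index: Dict[str, Set[str]], asset_id: str, xml_text: str) -> None:
    """Add every word found in any element's text to the inverted index."""
    root = ET.fromstring(xml_text.strip())
    for element in root.iter():
        for word in tokenize(element.text or ""):
            index[word].add(asset_id)


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the assets whose metadata contains all words of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


if __name__ == "__main__":
    index: Dict[str, Set[str]] = defaultdict(set)
    index_document(index, "asset-001", SAMPLE_XML)
    print(search(index, "evening news"))
    print(search(index, "harbour"))
    print(search(index, "weather"))
```

A production system would use a real search engine and map specific EBUCore fields to index fields; walking all text nodes, as done here, simply keeps the sketch independent of the exact schema version.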
Based on queries and their search results, users can instruct the system to retrieve found assets, which in turn still contain the embedded metadata, so that systems downstream in the production chain can benefit from the presence of embedded metadata without requiring out-of-band delivery mechanisms. A possible further extension to this demonstrator could include the serialization of metadata in MXF files for which metadata is known, but not yet embedded in the essence files. Upon retrieval after a user's research, the Retrieval Application could invoke another tool in the SDK, ebu2mxf.exe, to embed the metadata as EBUCore when the asset files are retrieved from the media repository. Because the SDK has been optimized to perform serialization with a minimum of file processing operations, this could be performed on the fly when files are being delivered to the systems of end users.

CONCLUSIONS

In this paper, we have discussed the advantages of using embedded metadata in essence containers for the transportation of descriptive metadata. In particular, we have shown that EBUCore metadata, a standard published by EBU and endorsed by many adopters, embedded in MXF container files can be a powerful mechanism for metadata delivery in the ingest process of archival systems. We have illustrated various strategies that can be employed to embed the metadata such that the serialization is done in a standards-compliant and semantically lossless fashion. To aid the adoption of the techniques discussed, we have built a freely available open-source SDK that can be used to build applications that support embedded metadata, and finally, we have described a proof-of-concept example use of the SDK in a real-world scenario.

REFERENCES

SMPTE-377M, 2004. Standard for Television Material Exchange Format (MXF) File Format Specification. SMPTE 377M-2004.
SMPTE, 2009. Standard for Television Material Exchange Format (MXF) File Format Specification. SMPTE S377-1-2009.
SMPTE-380M, 2004. Standard for Television Material Exchange Format (MXF) Descriptive Metadata Scheme-1. SMPTE 380M-2004.
Advanced Media Workflow Association, AMWA, 2012. AMWA Application Specification AS-11 MXF Program Contribution. AS-11. Available from http://www.amwa.tv.
EBU, 2013. EBU CORE METADATA SET (EBUCore) Version 1.4. EBU Tech 3293.
Dublin Core Metadata Initiative, DCMI, 2004. Dublin Core Metadata Element Set, version 1.1: Reference Description.
W3C, 2004. World Wide Web Consortium XML Schema, Second Edition. Available from http://www.w3.org/standards/techs/xmlschema.
SMPTE-EG42, 2004. Engineering Guideline for Television Material Exchange Format (MXF) MXF Descriptive Metadata. SMPTE EG42-2004.
Van Rijsselbergen, D., Dos Santos Oliveira, M., Evain, J-P. (2013, 28 May). EBU MXF SDK: An SDK for MXF embedded EBUCore metadata processing and analysis. Retrieved 20 September, 2013, from https://github.com/Limecraft/ebu-mxfsdk/.
Limecraft (2013, 1 July). Limecraft Flow: Your online media production office. Retrieved 20 September, 2013, from http://www.limecraft.com.