

  • HAL Id: hal-01577463

    Submitted on 25 Aug 2017

    HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


    Live Orchestral Piano, a system for real-time orchestral music generation

    Léopold Crestel, Philippe Esling

    To cite this version: Léopold Crestel, Philippe Esling. Live Orchestral Piano, a system for real-time orchestral music generation. 14th Sound and Music Computing Conference 2017, Jul 2017, Espoo, Finland. pp.434, 2017, Proceedings of the 14th Sound and Music Computing Conference 2017, 978-952-60-3729-5.

  • Live Orchestral Piano, a system for real-time orchestral music generation

    Léopold Crestel, IRCAM - CNRS UMR 9912

    Philippe Esling, IRCAM - CNRS UMR 9912


    This paper introduces the first system performing automatic orchestration from a real-time piano input. We cast this problem as a case of projective orchestration, where the goal is to learn the underlying regularities existing between piano scores and their orchestrations by well-known composers, in order to later perform this task automatically on novel piano inputs. To that end, we investigate a class of statistical inference models based on the Restricted Boltzmann Machine (RBM). We introduce an evaluation framework specific to the projective orchestral generation task that provides a quantitative analysis of different models. We also show that the frame-level accuracy currently used by most music prediction and generation systems is highly biased towards models that simply repeat their last input. As prediction and creation are two widely different endeavors, we discuss other potential biases in evaluating temporal generative models through prediction tasks and their impact on a creative system. Finally, we provide an implementation of the proposed models called Live Orchestral Piano (LOP), which allows anyone to play the orchestra in real time by simply playing on a MIDI keyboard. To evaluate the quality of the system, orchestrations generated by the different models we investigated can be found on a companion website 1.


    Orchestration is the subtle art of writing musical pieces for the orchestra, by combining the properties of various instruments in order to achieve a particular sonic rendering [1, 2]. As it extensively relies on spectral characteristics, orchestration is often referred to as the art of manipulating instrumental timbres [3]. Timbre is defined as the property which allows listeners to distinguish two sounds produced at the same pitch and intensity. Hence, the sonic palette offered by the pitch range and intensities of each instrument is augmented by the wide range of expressive timbres produced through the use of different playing styles. Furthermore, it has been shown that some instrumental mixtures cannot be characterized by a simple summation of their spectral components, but can lead to a unique emerging timbre, with phenomena such as the orchestral blend [4]. Given the number of different instruments in a symphonic orchestra, their respective range of expressiveness (timbre, pitch and intensity), and the phenomenon of emerging timbre, one can foresee the extensive combinatorial complexity embedded in the process of orchestral composition. This complexity has been a major obstacle towards the construction of a scientific basis for the study of orchestration, and it remains an empirical discipline taught through the observation of existing examples [5].

    Copyright: © 2017 Léopold Crestel et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

    Among the different orchestral writing techniques, one consists of first laying out a harmonic and rhythmic structure in a piano score and then adding the orchestral timbre by spreading the different voices over the various instruments [5]. We refer to this operation of extending a piano draft to an orchestral score as projective orchestration [6]. The orchestral repertoire contains a large number of such projective orchestrations (the piano reductions of Beethoven symphonies by Liszt or the Pictures at an Exhibition, a piano piece by Moussorgsky orchestrated by Ravel and other well-known composers). By observing an example of projective orchestration (Figure 1), we can see that this process involves more than the mere allocation of notes from the piano score across the different instruments. Rather, it implies harmonic enhancements and timbre manipulations to underline the already existing harmonic and rhythmic structure [3]. However, the visible correlations between a piano score and its orchestrations appear as a fertile framework for laying the foundations of a computational exploration of orchestration.

    Statistical inference offers a framework aimed at automatically extracting a structure from observations. These approaches hypothesize that a particular type of data is structured by an underlying probability distribution. The objective is to learn the properties of this distribution by observing a set of those data. If the structure of the data is efficiently extracted and organized, it then becomes possible to generate novel examples following the learned distribution. A wide range of statistical inference models has been devised, among which deep learning appears as a promising field [7, 8]. Deep learning techniques have been successfully applied to several musical applications, and neural networks are now the state of the art in most music information retrieval [9-11] and speech recognition [12, 13] tasks. Several generative systems working with symbolic information (musical scores) have also been successfully applied to automatic music composition [14-18] and automatic harmonization [19].


    [Figure: a piano score and its orchestral parts; surviving labels: Piano score, French horns]

    Figure 1. Projective orchestration. A piano score is projected on an orchestra. Even though a wide range of orchestrations exist for a given piano score, all of them will share strong relations with the original piano score. One given orchestration implicitly embeds the knowledge of the composer about timbre and orchestration.

    Automatic orchestration can actually cover a wide range of different applications. In [20, 21], the objective is to find the optimal combination of orchestral sounds in order to recreate any sonic target. The input of the system is a sound target, and the algorithm explores the different combinations using a database of recorded acoustic instruments. The work presented in [22] consists of modifying the style of an existing score. For instance, it can generate a bossa nova version of a Beethoven symphony. To the best of our knowledge, the automatic projective orchestration task has only been investigated in [23], using a rule-based approach to perform an analysis of the piano score. Note that the analysis is done automatically, but not the allocation of the extracted structures to the different instruments. Our approach is based on the hypothesis that the statistical regularities existing between a corpus of piano scores and their corresponding orchestrations could be uncovered through statistical inference. Hence, in our context, the data is defined as the scores, formed by a series of pitches and intensities for each instrument. The observations are a set of projective orchestrations performed by famous composers, and the probability distribution would model the set of notes played by each instrument conditionally on the corresponding piano score.
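    To make the data definition above concrete, the following is a minimal sketch of one way a score could be encoded as a series of pitches and intensities per instrument, with each piano frame paired with the orchestral frame it conditions. The piano-roll layout, the note-event tuple format, the 128-pitch MIDI range, and the names `piano_roll` and `aligned_pairs` are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical note event: (onset_frame, duration_frames, midi_pitch, intensity)
Note = Tuple[int, int, int, int]

N_PITCHES = 128  # assumed MIDI pitch range, one column per pitch


def piano_roll(notes: List[Note], n_frames: int) -> List[List[int]]:
    """Return an n_frames x 128 matrix holding the intensity of each sounding pitch."""
    roll = [[0] * N_PITCHES for _ in range(n_frames)]
    for onset, duration, pitch, intensity in notes:
        # Mark every frame in which the note is sounding, clipped to the score length.
        for t in range(onset, min(onset + duration, n_frames)):
            roll[t][pitch] = intensity
    return roll


def aligned_pairs(
    piano: List[Note], orchestra: Dict[str, List[Note]], n_frames: int
) -> List[Tuple[List[int], List[int]]]:
    """Pair each piano frame with the concatenated frames of all instruments."""
    piano_frames = piano_roll(piano, n_frames)
    # Sort instrument names so the concatenation order is deterministic.
    rolls = [piano_roll(orchestra[name], n_frames) for name in sorted(orchestra)]
    pairs = []
    for t in range(n_frames):
        orchestra_frame = [value for roll in rolls for value in roll[t]]
        pairs.append((piano_frames[t], orchestra_frame))
    return pairs


# Toy example: two piano notes and a two-instrument orchestration.
piano = [(0, 2, 60, 80), (2, 2, 64, 80)]
orchestra = {
    "french_horn": [(0, 4, 48, 70)],
    "violin": [(0, 2, 72, 90), (2, 2, 76, 90)],
}
pairs = aligned_pairs(piano, orchestra, n_frames=4)
```

    Under this encoding, each element of `pairs` is one observation: the second vector is what a conditional model would be asked to produce given the first.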

    It might be surprising at first to rely solely on the symbolic information (scores), whereas orchestration is mostly defined by the spectral properties of instruments, which are typically not represented in the musical notation but rather conveyed in the signal information (audio recording). However, we make the assumption that the orchestral projection performed by well-known composers effectively took into account the subtleties of timbre effects. Hence, spectrally consistent orchestrations could be generated by uncovering the composers' knowledge about timbre embedded in these scores.

    Thus, we introduce the projective orchestration task, which is aimed at learning models able to generate orchestrations from unseen piano scores. We investigate a class of models called Conditional RBM (cRBM) [24]. Conditional models implement a dependency mechanism that seems adapted to model the influence of the piano score over the orchestral score. In order to rank the different models, we establish a novel objective and quantitative evaluation framework, which is a major difficulty for creative systems. In the polyphonic music generation field, a predictive task with frame-level accuracy is commonly used by most systems [15, 17, 25]. However, we show that this frame-level accuracy is highly biased and maximized by models that simply repeat their last input. Hence, we introduce a