Modifiable drone thermal imaging analysis framework for mob detection during open-air events

Brecht Verhoeve

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck
Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons

Master's dissertation submitted in order to obtain the academic degree of Master of Science in Computer Science Engineering

Department of Information Technology
Chair: Prof. dr. ir. Bart Dhoedt
Faculty of Engineering and Architecture
Academic year 2017-2018



    Permission for usage

"The author gives permission to make this master dissertation available for consultation and to copy parts of this master dissertation for personal use. In the case of any other use, the copyright terms have to be respected, in particular with regard to the obligation to state expressly the source when quoting results from this master dissertation."

    Brecht Verhoeve

    Ghent, June 2018


    Preface

This master dissertation is submitted as a completion of the academic degree of Master of Science in Computer Science Engineering at Ghent University. The dissertation investigates the upcoming combination of drones and thermal cameras, their use cases and supporting technologies. The dissertation led me through various fields such as software architecture, microservices, software containerization, GPUs and neural networks. I wrote the dissertation focusing on the business and technological aspects that could lead to increasing industry adoption of these technologies.

I would like to thank my supervisors and counsellors for their continuous support this year. You were always there for a quick meeting, during which the atmosphere was always positive and jokes were always around the corner, but with a focus on results. Prof. Volckaert, for the quickest email replies I have witnessed to this day and for guiding me through the complex journey of this dissertation. Jerico Moeyersons, for the office hop-ins and the help during that annoying CUDA installation. Pieter-Jan Maenhaut, for his questions and reviews during meetings, which provided me with new insights and things to write about. Nils Tijtgat, for the support in the early days of the thesis; I've read your tutorial on YOLO more than I would like to admit. And finally Prof. De Turck, for the opportunity to work on this topic.

I am grateful for the company I had this year when working on the dissertation. Ozan Catal, Joran Claeys, Stefan Wauters, Dries Bosman, Pieter De Cleer, Igor Lima de Paula, Laura Van Messem, Lars De Brabandere, Stijn Cuyvers, Stijn Poelman, thank you for the fun times, spontaneous beers and support this year!

Special thanks go out to the people of the VTK and FK student associations. You provided me with unforgettable experiences, friendships, teachings and memories. With a special mention to Stéphanie, Anna and Nick from Career & Development, everyone from Delta, and finally Stijn Adams and Sander De Wilde for their continuous support throughout the years.

Finally, I want to thank my parents and Marjolein Hondekyn for their advice and massive support. Without you, I wouldn't have been able to wrestle myself through the tough periods and finish the dissertation!

    Brecht Verhoeve

    Ghent, June 2018


Modifiable drone thermal imaging analysis framework for mob detection during open-air events

    Brecht Verhoeve

    Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck

    Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons

    Master’s dissertation submitted in order to obtain the academic degree of

    Master of Science in Computer Science Engineering

    Department of Information Technology

    Chair: Prof. dr. ir. Bart Dhoedt

    Faculty of Engineering and Architecture

    Ghent University

    Academic year 2017-2018

    Abstract

Drones and thermal cameras are used in combination for many applications such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation which is tested against the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied to detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs in new thermal images in real time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons

Abstract— Drones and thermal cameras are used in combination for many applications such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation which is tested against the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied to detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs in new thermal images in real time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords— Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

    I. INTRODUCTION

Throughout history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure the thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.

Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [1, 9–11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch products without incurring substantial costs, a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Because current solutions are not modifiable enough, applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework to aid with this modifiability/interoperability issue, helping to develop end-to-end solutions connecting thermal cameras to various analysis/detection modules.

This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify whether it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. By monitoring and detecting these mobs, such situations can be avoided before they become problematic [14, 15].

The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should be directed.

    II. RELATED WORK

The Irish start-up DroneSAR [16] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from the vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

    III. REQUIREMENTS AND SOFTWARE ARCHITECTURE

    A. Functional requirements

Three general actors are identified for the framework: an end-user that wants to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules. He should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and gives them a wider selection of hardware and algorithms.

    B. Non-functional requirements

Interoperability, modifiability, and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules. Applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should therefore be able to deploy in a distributed fashion, to allow more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

    C. Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme. It also allows the framework to be deployed in a distributed fashion [19–21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1: Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module that manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components that distribute this software so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, which are implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.

    C.1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together by setting the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources since they produce media. The states are used to stop and start the media processing of the plugins in the stream.

Fig. 2: Schematic overview of a plugin.

The REST paradigm is selected to build this API, with /state, /sources and /listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands. A minimal sketch of such a plugin API is given after Figure 3.

    Fig. 3: State transition diagram of a plugin.
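The plugin API can be made concrete with a minimal sketch in Flask, the framework later used for the prototype (Section IV). Only the three resources and the state names come from the architecture; the JSON payloads, port and in-memory bookkeeping are illustrative assumptions.

# Minimal sketch of the plugin control API described above, using Flask.
# Only /state, /sources, /listeners and the state names (INACTIVE, STOP,
# PAUSE, PLAY) come from the architecture; payload shapes are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

# INACTIVE is only visible to the framework: a running plugin boots into STOP.
plugin = {"state": "STOP", "sources": [], "listeners": []}
VISIBLE_STATES = {"STOP", "PAUSE", "PLAY"}

@app.route("/state", methods=["GET", "PUT"])
def state():
    if request.method == "PUT":
        new_state = request.get_json(force=True).get("state")
        if new_state not in VISIBLE_STATES:
            return jsonify(error="unknown state"), 400
        plugin["state"] = new_state  # start/pause/stop the media process here
    return jsonify(state=plugin["state"])

@app.route("/sources", methods=["GET", "PUT"])
def sources():
    if request.method == "PUT":
        # A Producer plugin would reject this, as it has no sources.
        plugin["sources"] = request.get_json(force=True).get("sources", [])
    return jsonify(sources=plugin["sources"])

@app.route("/listeners", methods=["GET", "PUT"])
def listeners():
    if request.method == "PUT":
        plugin["listeners"] = request.get_json(force=True).get("listeners", [])
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Under this contract, the Stream module links two plugins by PUTting the address of the downstream plugin into the /listeners of the upstream one (and the upstream address into /sources of the downstream one), and starts the stream by setting every /state to PLAY.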

    C.2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending commands the HTTP/TCP protocol is used, a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, ensuring low-latency video transfer to enable real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes and the other frames as B-frames that encode differences from the keyframe [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each frame and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.

Fig. 4: Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
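To illustrate this media path, the sketch below sets up an MJPEG-over-RTP/UDP sender and receiver with GStreamer's Python bindings. The test video source, host and port are placeholder assumptions; the prototype's actual pipelines may differ.

# Sketch of an MJPEG-over-RTP/UDP link between two plugins using GStreamer.
# videotestsrc, host and port are placeholders; a real Producer plugin would
# read from its camera instead. Payload type 26 is the static RTP type for JPEG.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Sender side (Producer plugin): every frame is an independently decodable JPEG.
sender = Gst.parse_launch(
    "videotestsrc is-live=true ! jpegenc ! rtpjpegpay "
    "! udpsink host=127.0.0.1 port=5000"
)

# Receiver side (Consumer plugin): depayload, decode, display.
receiver = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp,media=video,'
    'encoding-name=JPEG,payload=26" ! rtpjpegdepay ! jpegdec ! autovideosink'
)

sender.set_state(Gst.State.PLAYING)
receiver.set_state(Gst.State.PLAYING)

# Run until interrupted; real plugins would instead react to /state transitions.
loop = GLib.MainLoop()
try:
    loop.run()
except KeyboardInterrupt:
    sender.set_state(Gst.State.NULL)
    receiver.set_state(Gst.State.NULL)

Because every frame is a self-contained JPEG, a Consumer that joins mid-stream can start analysing from the very next complete frame.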

    IV. PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology the core modules and plugins can be deployed in a local and distributed fashion and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C.2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
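The container management described above can be sketched with the Docker SDK for Python; the image name, container name and network are illustrative assumptions, not the prototype's actual values.

# Sketch of how a Producer/Consumer microservice could manage plugin
# containers through the mounted Docker socket, via the Docker SDK for Python.
# Image and network names are illustrative assumptions.
import docker

client = docker.from_env()  # talks to the socket mounted at /var/run/docker.sock

def start_plugin(image: str, name: str):
    """INACTIVE -> STOP: spin up the plugin container; the plugin then waits
    for commands on its REST API."""
    return client.containers.run(
        image,
        name=name,
        detach=True,
        network="framework-net",  # shared bridge network of the stream
    )

def deactivate_plugin(name: str):
    """STOP -> INACTIVE: stopping and removing the container is the slow
    operation measured in Section VI."""
    container = client.containers.get(name)
    container.stop()
    container.remove()

if __name__ == "__main__":
    start_plugin("framework/filecam", "filecam-1")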

    V. MOB DETECTION

    A. Dataset

Several publicly available datasets for thermal images exist [31–34]. None of these include large crowds of people, so a new dataset called the Last Post dataset was created. It consists of thermal video captured at the Last Post ceremony in Ypres,

Fig. 5: Last Post dataset main scenes. (a) Thermal view of the square. (b) Visual view of the square. (c) Thermal view of the bridge. (d) Visual view of the bridge.

Belgium [35]. The videos were captured using the FLIR One Pro thermal camera for Android [36] using the Iron color scheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, because the two sets of images were made on separate days. The images used for the experiment were manually annotated, outliers were removed, and the dataset was randomly split into a training and a validation set (a minimal sketch of such a split follows below).
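A minimal sketch of such a random split, assuming every annotated frame is stored as a JPEG with a matching annotation file and assuming an 80/20 ratio:

# Sketch of randomly splitting annotated frames into training and validation
# sets. The directory layout and the 80/20 ratio are illustrative assumptions.
import random
from pathlib import Path

def split_dataset(image_dir: str, train_fraction: float = 0.8, seed: int = 42):
    frames = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(frames)
    cut = int(len(frames) * train_fraction)
    return frames[:cut], frames[cut:]

train, val = split_dataset("last_post/frames")
print(f"{len(train)} training frames, {len(val)} validation frames")

Note that a random split of frames extracted from video leaves the two sets temporally correlated, a limitation that Section VI returns to.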

    B. Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38–40], deep learning two-stage networks [41–46] and deep learning dense networks [47–49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on a GPU) compared to the dense networks (order of milliseconds on a GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state-of-the-art prediction performance, can make real-time predictions, and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
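For reference, the IoU between a predicted and a ground-truth box, and the checkpoint selection by mAP, can be sketched as follows; the corner-coordinate box format and the evaluate_map helper are assumptions for illustration.

# Sketch of the Intersection over Union (IoU) metric used to score predicted
# boxes against ground truth. Boxes are (x1, y1, x2, y2) corner coordinates,
# an assumption about the annotation format.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - inter)
    return inter / union if union > 0 else 0.0

# Weight selection: score every saved checkpoint on the validation set and
# keep the one with the highest mAP (evaluate_map is a hypothetical helper).
# best_weights = max(checkpoints, key=evaluate_map)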

    VI. RESULTS

    A. Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations such as manipulating and building a stream have an average execution time of 0.84 seconds with a standard deviation of 0.37 seconds. Less common operations such as deactivating a plugin, starting up the framework and shutting down the framework have average execution times of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 seconds respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested directly, because the GStreamer framework has no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between the streamed video and a video played with a native media player, making it plausible that the framework streams in real time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and the plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges (a sketch of such a test is given below). The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at the plugin level. Different deployment schemes were not tested for the prototype.
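A sketch of such an exchange test, assuming a mock plugin that serves the API from Section III-C.1 at a placeholder address:

# Sketch of the interoperability test: repeatedly exchange a command with a
# mock plugin and count the correct responses. The URL and iteration count
# are illustrative placeholders.
import requests

MOCK_PLUGIN = "http://localhost:5000"

def successful_exchange_ratio(n: int = 10000) -> float:
    ok = 0
    for _ in range(n):
        try:
            r = requests.put(f"{MOCK_PLUGIN}/state", json={"state": "PLAY"})
            if r.status_code == 200 and r.json().get("state") == "PLAY":
                ok += 1
        except requests.RequestException:
            pass  # a dropped or malformed exchange counts as a failure
    return ok / n

if __name__ == "__main__":
    print(f"successful exchange ratio: {successful_exchange_ratio():.3%}")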

    B. Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, the performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

    Fig. 6: Model predictions on validation set.

    VII. CONCLUSION AND FUTURE WORK

In this dissertation a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation which is tested on the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are: deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245–262, 2014.
[2] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358–364, 2013.
[4] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384–386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778–93, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.
[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.
[9] DJI, "Zenmuse H3 - 2D."
[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."
[11] Therm-App, "Therm-App - Android-apps op Google Play," 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013.
[13] J. Divya, "Drone Technology and Usage: Current Uses and Future Drone Technology," 2017.
[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902–1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012.
[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.
[17] Amazon Web Services Inc., "What Is Amazon Kinesis Video Streams?," 2018.
[18] T. Goedeme, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, 1st ed., 2015.
[21] C. Richardson, "Microservice Architecture pattern," 2017.
[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, "Communication in a microservice architecture," 2017.
[23] On-Net Surveillance Systems Inc., "MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., "Docker - Build, Ship, and Run Any App, Anywhere," 2018.
[26] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014.
[27] A. Ronacher, "Welcome to Flask — Flask Documentation (0.12)," 2017.
[28] Lvh, "Don't expose the Docker socket (not even to a container)," 2015.
[29] R. Yasrab, "Mitigating Docker Security Issues," tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, "GStreamer: open source multimedia framework," 2018.
[31] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007.
[35] Last Post Association, "Mission," 2018.
[36] FLIR, "FLIR One Pro."
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005.
[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003.
[40] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1–14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014.
[43] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448, 2015.
[44] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018.
[46] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018.
[50] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018.
[51] J. Redmon, "Darknet: Open Source Neural Networks in C." http://pjreddie.com/darknet/, 2013–2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014.
[54] A. Ouaknine, "Review of Deep Learning Algorithms for Object Detection," 2018.


    Contents

1 Introduction
  1.1 Drones
  1.2 Concepts
    1.2.1 Thermal Cameras
    1.2.2 Aerial thermal imaging
  1.3 Problem statement
    1.3.1 Industry adoption
    1.3.2 Crowd monitoring
    1.3.3 Goal
    1.3.4 Related work
  1.4 Outline

2 System Design
  2.1 Requirements analysis
    2.1.1 Functional requirements
    2.1.2 Non-functional requirements
  2.2 Patterns and tactics
    2.2.1 Layers
    2.2.2 Event-driven architecture
    2.2.3 Microkernel
    2.2.4 Microservices
    2.2.5 Comparison of patterns
  2.3 Software architecture
    2.3.1 Static view
    2.3.2 Dynamic views
    2.3.3 Deployment views

3 State of the art and technology choice
  3.1 Thermal camera options
    3.1.1 Parameters
    3.1.2 Comparative analysis
  3.2 Microservices frameworks
    3.2.1 Flask
    3.2.2 Falcon
    3.2.3 Nameko
    3.2.4 Vert.x
    3.2.5 Spring Boot
  3.3 Deployment framework
    3.3.1 Containers
    3.3.2 LXC
    3.3.3 Docker
    3.3.4 rkt
  3.4 Object detection algorithms and frameworks
    3.4.1 Traditional approaches
    3.4.2 Deep learning
    3.4.3 Frameworks
  3.5 Technology choice
    3.5.1 Thermal camera
    3.5.2 Microservices framework
    3.5.3 Deployment framework
    3.5.4 Object detection

4 Proof of Concept implementation
  4.1 Goals and scope of prototype
  4.2 Overview of prototype
    4.2.1 General overview
    4.2.2 Client interface
    4.2.3 Stream
    4.2.4 Producer and Consumer
    4.2.5 Implemented plugins
  4.3 Limitations and issues
    4.3.1 Single client
    4.3.2 Timeouts
    4.3.3 Exception handling and testing
    4.3.4 Docker security issues
    4.3.5 Docker bridge network
    4.3.6 Single stream
    4.3.7 Number of containers per plugin

5 Mob detection experiment
  5.1 Last Post thermal dataset
    5.1.1 Last Post ceremony
    5.1.2 Dataset description
  5.2 Object detection experiment
    5.2.1 Preprocessing
    5.2.2 Training

6 Results and evaluation
  6.1 Framework results
    6.1.1 Performance evaluation
    6.1.2 Interoperability evaluation
    6.1.3 Modifiability evaluation
  6.2 Mob detection experiment results
    6.2.1 Training results
    6.2.2 Metrics
    6.2.3 Validation results

7 Conclusion and future work
  7.1 Conclusion
  7.2 Future work
    7.2.1 Security
    7.2.2 Implementing a detection plugin
    7.2.3 Different deployment configurations
    7.2.4 Multiple streams with different layouts
    7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)
    7.2.6 Using high performance microservices backbone frameworks
    7.2.7 New object detection models and datasets specifically for thermal images

A Firefighting department email conversations
  A.1 General email sent to Firefighting departments
  A.2 Conversation with Firefighting department of Antwerp, Belgium
  A.3 Conversation with Firefighting department of Ostend, Belgium
  A.4 Conversation with Firefighting department of Courtrai, Belgium
  A.5 Conversation with Firefighting department of Ghent, Belgium

B Thermal camera specifications

C Last Post thermal dataset summary
  C.1 24th of March 2018
  C.2 2nd of April 2018
  C.3 3rd of April 2018
  C.4 4th of April 2018
  C.5 5th of April 2018
  C.6 9th of April 2018
  C.7 10th of April 2018
  C.8 11th of April 2018
  C.9 12th of April 2018


    List of Figures

2.1 Use case diagram
2.2 Overview of the framework software architecture
2.3 Framework network topology
2.4 Client Interface detailed view
2.5 Stream detailed view
2.6 Stream model
2.7 Plugin model
2.8 Plugin state transition diagram
2.9 Component-connector diagrams of the Producer and Consumer module
2.10 Producer and Consumer Distribution component-connector diagrams
2.11 Add plugin sequence diagram
2.12 Link plugins sequence diagram
2.13 Deployment diagrams
3.1 Thermal image and MSX image of a dog
3.3 Rethink IT: Most used tools and frameworks for microservices results [54]
3.4 Containers compared to virtual machines [66]
4.1 filecam GStreamer pipeline
4.2 local plugin GStreamer pipeline
5.1 Last Post ceremony panorama
5.2 Last Post filming locations
5.3 Main scenes in the Last Post dataset
5.4 Outliers
6.1 Average training loss per epoch
6.2 Validation metrics per epoch
6.3 Predictions of the model on images in the validation set
7.1 GStreamer pipeline for a plugin with a detection model


    List of Tables

2.1 Performance utility tree
2.2 Interoperability utility tree
2.3 Modifiability utility tree
2.4 Usability utility tree
2.5 Security utility tree
2.6 Availability utility tree
2.7 Architecture pattern comparison
6.1 Acceptance tests results summary
6.2 Performance test statistics summary, measured in seconds
6.3 Resource usage of the framework in several conditions
6.4 Total size of framework components
6.5 Interoperability tests results (S.: Source, L.: Listener)
B.1 Compared cameras, their producing companies and their average retail price
B.2 Physical specifications
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View)
B.4 Thermal precision
B.5 Interfaces
B.6 Energy consumption
B.7 Help and support
B.8 Auxiliary features


    List of Listings

1 Minimal Flask application
2 Vert.x example
3 Spring Boot example
4 docker-compose.yml snippet of the prototype
5 Mounting the Docker socket on the container
6 Starting a plugin container
7 Dynamic linking of the decodebin and jpegenc


    List of Abbreviations

    ACF Aggregated Channel Features

    AMQP Advanced Message Queuing Protocol

    API Application Programming Interface

    AS Availability Scenario

    ASR Architecturally Significant Requirement

    CLI Command Line Interface

    CNN Convolutional Neural Networks

    CRUD Create Read Update Destroy

    DNS Domain Name System

    FR Functional Requirement

    GPU Graphical Processing Unit

    H High

    HTTP Hyper Text Transfer Protocol

    ICF Integral Channel Features

IoU Intersection over Union

    IS Interoperability Scenario

    IT Interoperability Tactic

    JVM Java Virtual Machine

    L Low


    LXC Linux Containers

    M Medium

    mAP mean Average Precision

MJPEG Motion-JPEG

    MS Modifiability Scenario

    MSX Multi Spectral Dynamic Imaging

MT Modifiability Tactic

    NFR Non-Functional Requirement

    ONNX Open Neural Network Exchange Format

    OS Operating System

    PS Performance Scenario

    PT Performance Tactic

    QAR Quality Attribute Requirement

    REST Representational State Transfer

    RNN Recurrent Neural Network

    RPN Region Proposal Network

    RTP Real-time Transport Protocol

    SS Security Scenario

    SSE Sum of Squared Errors

    SVM Support Vector Machine

    TCP Transmission Control Protocol

    UDP User Datagram Protocol

    UI User Interface

    US Usability Scenario

    YOLO You Only Look Once


    Chapter 1

    Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view on the world below. With digital video cameras offering superb quality at steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which prevents drones from operating in all circumstances, such as night flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

    1.1 Drones

Drones are flying robots that can fly remotely or autonomously and do not carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m), and unmanned aerial vehicles (2 m and larger). Depending largely on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones use different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, such as search and rescue missions, environmental protection, delivery, reconnaissance, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

    Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packages [5] and thermal imaging platforms [6].

    1.2 Concepts

    1.2.1 Thermal Cameras

    Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero. In contrast to visible light cameras, thermal cameras do not depend on an external energy source for the visibility and colors of objects or scenes. This makes captured images independent of illumination and color, and images can even be captured in the absence of visible light [7]. Thermal camera technology was originally developed for military night vision purposes, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This gave a broader public access to the technology, which is now used in a wide range of applications, such as building inspection, gas detection, industrial appliances, medical science, agriculture, fire detection and surveillance [7]. Thermal cameras are now being mounted on drones to provide an aerial thermal overview.

    1.2.2 Aerial thermal imaging

    Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage

    of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are: geography

    [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response

    [16], equipment and building maintenance [17–20], etc. In the past few years, several industry players have developed thermal

    cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

    1.3 Problem statement

    1.3.1 Industry adoption

    The implementation of thermal cameras on drone platforms faces some issues hindering wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, but these often implement different image formats, color schemes and interfaces (e.g. [23–25]). This leads to issues when users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. The resulting problem is called vendor lock-in: customers become dependent on a certain vendor, as they cannot switch products without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].

    Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected, while other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

    Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear that they face various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Their equipment also wears down more quickly due to usage in harsh environments, such as in close proximity to fires. To deliver value for them, a drone thermal imaging application must allow functionality and hardware to be exchanged easily and must meet strict performance constraints. The email conversations can be read in Appendix A.

    Other drone thermal imaging applications are mostly used only in the niche domain for which they were developed, because they are not designed for flexibility [27]. These proprietary applications have several disadvantages: development and support potentially carry a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise and customization is difficult [28, 29]. Applications could benefit from a backbone framework that addresses this modifiability and interoperability issue and aids in developing end-to-end solutions connecting thermal cameras to various analysis and detection modules for different use cases.

    1.3.2 Crowd monitoring

    Festivals and other open-air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocked escape routes, etc. Therefore, the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from comparable past events or by real-time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on mob formation, and safety regulations can be improved to help plan future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, cannot see in certain conditions (for example at night), and it is difficult to infer information from the raw footage [31].

    Thermal cameras could help with crowd monitoring because they can operate in any condition. Precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not sufficient; the objects contained within the images must also be localized. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low compared to visible light images, there is a lack of color and texture information, temperature measurements are relative, etc. This makes extracting discriminative information from these images difficult [33]. Most efforts towards object detection in thermal images have gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33–35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].

  • 1.4 Outline 4

    1.3.3 Goal

    The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof-of-concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob detection use case is investigated.

    1.3.4 Related work

    The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual or thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images can be used for real-time detection of persons using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

    1.4 Outline

    The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state-of-the-art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


    Chapter 2

    System Design

    Finding out what users actually expect from a software system and what makes it valuable for them is of key importance for the

    success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well-known architectural patterns enable certain software requirements very well and can be used to build the software architecture of the framework. The framework's software architecture combines some of these patterns and is presented in several documents.

    2.1 Requirements analysis

    Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system

    will work in its intended environment. They are those aspects of the framework that will provide value to the users.

    2.1.1 Functional requirements

    Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often

    captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories.

    Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user, who uses the framework to build an image processing application for a specific use case such as the ones described in Section 1.2.2; a camera developer, who creates support software for a specific thermal camera so that the end-user can buy and use their product; and an analysis software developer, who creates analysis software for a specific use case (tracking objects, detecting objects, etc.) so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

    The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application,

    ¹To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.


    e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in

    video. The user looks for a plugin for the framework that can read video from his thermal camera and for a plugin that does

    the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he

    is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example,

    the thermal camera plugin is connected to the hot-spot detection plugin so that video coming from the thermal camera is

    transmitted to the detection plugin to find the fires in the landscape. The plugins in the application and the specific order in

    which they are connected is defined as a stream. This stream should be easily modifiable if additional or other functionalities

    are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this as it

    can only operate on low quality images. The end-user searches for a plugin that scales the high quality video down to an

    accepted quality for the detector. This plugin is placed in between the thermal camera and the detector, and the application

    can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making it usable for more aerial thermal imaging use cases.
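
    To make this scenario concrete, the following minimal sketch models plugins and streams in Python. All class names, the Stream interface and the frame format are hypothetical illustrations of the intended behaviour, not the framework's actual API.

    # Minimal sketch of the stream-building scenario; names are illustrative.
    class Plugin:
        """Building block that transforms video frames."""
        def process(self, frame):
            return frame

    class ThermalCameraPlugin(Plugin):          # Producer plugin
        def process(self, frame):
            return {"pixels": frame, "width": 1280}

    class DownscalePlugin(Plugin):              # adapts video quality
        def __init__(self, max_width):
            self.max_width = max_width
        def process(self, frame):
            frame["width"] = min(frame["width"], self.max_width)
            return frame

    class HotSpotDetectorPlugin(Plugin):        # Consumer plugin, low-res only
        def process(self, frame):
            assert frame["width"] <= 320, "detector only handles low-res frames"
            return frame

    class Stream:
        """Ordered chain of plugins through which frames flow."""
        def __init__(self, *plugins):
            self.plugins = list(plugins)
        def insert(self, index, plugin):
            self.plugins.insert(index, plugin)
        def push(self, frame):
            for plugin in self.plugins:
                frame = plugin.process(frame)
            return frame

    # Camera -> detector fails on high-quality video, so a scaler is
    # inserted in between without touching the other plugins.
    stream = Stream(ThermalCameraPlugin(), HotSpotDetectorPlugin())
    stream.insert(1, DownscalePlugin(max_width=320))
    stream.push("raw-sensor-data")

    The point of the sketch is the insert call: adapting the application requires one local change to the stream, not a rebuild of the existing plugins.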

    Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in

    an ad hoc fashion. Because of this, the development time for such applications can be reduced and users can switch hardware

    and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities such as launching and shutting down the framework are omitted. The red use cases extend the functionality of the framework, the blue use cases are for building streams, and the white use cases modify the media processing of a stream. Some use cases depend on others. The blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be part of the framework before a user can use it. The "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

    2.1.2 Non-functional requirements

    A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its

    functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between

    quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how

    fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business

    requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility

    tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system

    is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and each scenario is evaluated against its business value and architectural impact [40]. A QAR is rated High (H), Medium (M) or Low (L) for business value and architectural impact respectively. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium a requirement which is important but whose absence would not lead to project failure, and Low a nice-to-have QAR that is not worth much effort. Architectural impact defines how much the architecture must be designed

    towards the QAR to enable it: High means that meeting the QAR will profoundly affect the architecture, Medium that it will somewhat affect the architecture, and Low that it will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

    Figure 2.1: Use case diagram.

    Performance

    Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a four-second latency rule is often used as a rule of thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4-second bound. As stated in Chapter 1, some use cases, such as firefighting, require real-time video streaming. The notion of low-latency real-time video loosely states that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a matter of human time perception, and for visual inputs this bound is as low as 13 milliseconds; anything above 13 milliseconds becomes noticeable, and anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focuses on the use of thermal cameras, most of which do not produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds; this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal imaging applications are operated by only one or a few users. The assumption is that a maximum of five users use the framework at the same time. All of these requirements are quantified as relatively 'good' values; these bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.
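
    As an illustration, the bounds above (2 s mean and 1 s deviation for commands; 40 ms mean and 20 ms deviation for streaming, since 1000 ms / 25 fps = 40 ms) could be checked against recorded measurements with a small helper. The sample values below are hypothetical, not measured results.

    from statistics import mean, stdev

    def within_bounds(samples_ms, mean_bound_ms, stdev_bound_ms):
        """Check a set of latency samples against a mean and a jitter bound."""
        return mean(samples_ms) <= mean_bound_ms and stdev(samples_ms) <= stdev_bound_ms

    # Hypothetical measurements in milliseconds.
    command_latencies = [1450, 1720, 1980, 1630]      # framework commands
    frame_intervals = [39.1, 40.3, 38.7, 41.0, 39.9]  # time between frames

    print(within_bounds(command_latencies, 2000, 1000))  # PS-1 / PS-3
    print(within_bounds(frame_intervals, 40, 20))        # PS-2 / PS-4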

    Attribute refinement  Id    Quality attribute scenario
    Latency               PS-1  The average execution time of all framework commands does not exceed 2 seconds. (H, M)
                          PS-2  A playing stream should have an upper limit of 40 ms streaming latency. (H, H)
    Jitter                PS-3  The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation. (H, M)
                          PS-4  The average standard deviation in streaming latency should not exceed 20 ms under normal operation. (H, H)
    Scalability           PS-5  The system should be usable by five users at the same time. (M, M)

    Table 2.1: Performance utility tree

    Interoperability

    Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth the camera and analysis plugins will be referred to as Producer and Consumer plugins: a Producer plugin represents a camera that produces video, and a Consumer plugin represents a module that processes, or consumes, video. The framework interacts with the Producer and Consumer plugins, exchanging requests to link them together, control their media processing, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively one argues that the framework must achieve perfect interoperability, with an exchange success rate of 100%. Reality, however, tends not to agree with perfection, and it can never be guaranteed that all exchanges will be correct. It is therefore better to aim for a good interoperability measure and prepare for failed exchanges instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be correct up to the first mistake, after which the plugin is considered faulty and the fault needs to be identified and prevented from occurring again. An exchange success rate of 99.99% means that if 10000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework uptime, the mean time between failures is then 10000 exchanges, which is suspected to be very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

    Attribute refinement        Id    Quality attribute scenario
    Syntactic interoperability  IS-1  The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)
                                IS-2  The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)

    Table 2.2: Interoperability utility tree
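
    A minimal sketch of the fallback behaviour required by IS-1 and IS-2 is given below: every exchange is counted, and a failed request is undone and logged. The ExchangeTracker class and the plugin handle/undo interface are assumptions for illustration, not the framework's actual mechanism.

    import logging

    class ExchangeTracker:
        """Tracks the ratio of successful framework-plugin exchanges."""

        def __init__(self):
            self.total = 0
            self.successful = 0

        def exchange(self, plugin, request, undo):
            """Send a request to a plugin; undo and log if it fails."""
            self.total += 1
            try:
                response = plugin.handle(request)
                self.successful += 1
                return response
            except Exception as err:
                undo()  # roll back the framework state changed by this request
                logging.error("Exchange with %r failed: %s", plugin, err)
                return None

        @property
        def success_ratio(self):
            # The framework aims for a ratio of at least 0.9999 (99.99%).
            return self.successful / self.total if self.total else 1.0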

    Modifiability

    Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extensible with new functionalities by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, defined as the periods during which the system is up and running, and downtime, defined as the periods during which the system is not active. The utility tree is presented in Table 2.3.

    To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect the framework versions installed by users. When a user adds a plugin from the distribution service to his version of the framework, the framework should reload at most once before making the plugin usable, for user comfort. Deployability is defined as the different device configurations in which the framework can be deployed. If the framework can be deployed in different fashions, this can increase its value for the end-user. Suppose a firefighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some media-processing plugins to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.
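
    The reload-once installation behaviour captured in MS-5 and MS-6 below could look like the following minimal sketch, in which downloaded plugins are staged while the framework keeps running and a single reload activates all of them. All names are hypothetical illustrations.

    class LocalFramework:
        def __init__(self):
            self.active = {}   # plugins usable in streams
            self.staged = {}   # downloaded but not yet loaded

        def install(self, name, plugin):
            # Called at runtime; nothing restarts here.
            self.staged[name] = plugin

        def reload(self):
            # The one allowed reload: activate every staged plugin at once.
            self.active.update(self.staged)
            self.staged.clear()

    framework = LocalFramework()
    framework.install("flir-producer", object())
    framework.install("hotspot-consumer", object())
    framework.reload()   # both plugins become usable after a single reload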

    Attribute refinement     Id    Quality attribute scenario
    Run time modifiability   MS-1  Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
                             MS-2  Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
                             MS-3  End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer plugins. (H, H)
                             MS-4  End-users should be able to modify the plugins used to build their stream. (H, H)
    Down time modifiability  MS-5  New Producer plugins can be installed to the local framework at runtime; the framework may only reload once before the plugin is usable. (H, H)
                             MS-6  New Consumer plugins can be installed to the local framework at runtime; the framework may only reload once before the plugin is usable. (H, H)
    Deployability            MS-7  The system should be deployable on a combination of a smartphone and a cloud/remote server environment. (H, H)
                             MS-8  The system should be deployable on a personal computer or laptop. (H, H)
                             MS-9  The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

    Table 2.3: Modifiability utility tree

    Usability

    Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors

    a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

    Security

    Security is a measure of the system’s ability to protect data and information from unauthorized access while still providing

    access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack.

    Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized

    access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property

    of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties of an interaction, checking whether they are truly who they claim to be, and grants or revokes access [40]. Security is important for the framework if it is

    deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

    Availability

    Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime

    is a measure of the time that the system is unavailable to carry out its functions. The utility tree is presented in Table 2.6.

    Availability is specified for the part of the framework that distributes the plugins.


    Attribute refinement  Id    Quality attribute scenario
    Learnability          US-1  A user should be able to learn how to build an image processing application in at most one hour. (H, L)
                          US-2  An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
                          US-3  An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)
    Errors                US-4  A user should not make more than 3 errors to build an image processing application. (H, L)

    Table 2.4: Usability utility tree

    Attribute refinement  Id    Quality attribute scenario
    Confidentiality       SS-1  Streams created by a user can only be accessed by that user and not by any other entity. (H, L)
    Integrity             SS-2  Streams can't be manipulated without authorization by the user that made the streams. (H, L)
    Availability          SS-3  During an attack, the core functionality is still available to the user. (H, M)
    Authentication        SS-4  Users should authenticate with the system to perform functions. (H, L)
                          SS-5  Developers should authenticate their plugins before adding them to the framework. (H, L)

    Table 2.5: Security utility tree

    Architecturally significant requirements

    Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to business

    value and have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios,

    the QARs modifiability, interoperability and performance are identified as ASRs.

    2.2 Patterns and tactics

    An architectural pattern is a package of design decisions that is found repeatedly in practice, has known properties that permit reuse, and describes a class of architectures. Architectural tactics are simpler than patterns, typically using just a single structure or computational mechanism, and are meant to address a single architectural force. Tactics are the "building blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics used for the base pattern selection are listed in Table 2.7. The explored patterns are: layers, event-driven architecture, microkernel and microservices.

    2.2.1 Layers

    The layered pattern divides the software into units called layers that each perform a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces do not change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated to individual layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers also incur a performance penalty due to the "architecture sinkhole phenomenon", in which requests simply propagate through layers for the sake of the layering [45].

    2.2.2 Event-driven architecture

    This pattern consists of several event publishers that create events and event subscribers that process these events. The publishers and subscribers are decoupled through an event channel, to which the publishers publish events and which forwards those events to the subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and completely decoupled from other components via the event channel, changes are isolated to one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event ch