A performance evaluation approach openModeller: A Framework for species distribution Modelling

Upload: mervyn-poole

Post on 13-Jan-2016


Page 1: A performance evaluation approach openModeller: A Framework for species distribution Modelling

A performance evaluation approach

openModeller: A Framework for Species Distribution Modelling

Page 2

Agenda

1. openModeller Framework
2. The Performance Evaluation Process
3. Infrastructure Implemented
4. Architecture Proposed
5. Simulation Model
6. Conclusions

Page 3

openModeller is a fundamental ecological niche modelling framework. A number of fundamental niche modelling algorithms are provided as plug-ins, including GARP, Minimum Distance, Climate Space Model, Bioclimatic Envelopes, and others.

The software includes facilities for reading species occurrence and environmental data, selection of environmental layers on which the model should be based, creating a fundamental niche model and projecting the model into an environmental scenario.

openModeller Framework: an overview

Page 4

The process:

1. obtain a set of occurrence points for a species;
2. obtain a set of environmental layers (rainfall, temperature, etc.);
3. choose an algorithm to construct the niche model, and select appropriate parameters for the algorithm;
4. generate the ecological niche model;
5. use the generated model to calculate a probability-of-occurrence surface by projecting the model onto a set of environmental layers for a given region.
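The five steps can be illustrated with a toy envelope model. This is only a sketch under stated assumptions: it is not openModeller code and not its implementation of any algorithm. The function names, the min/max envelope rule, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    """Per-layer lower and upper bounds observed at the occurrence points."""
    lower: list
    upper: list


def fit_envelope(samples):
    """Steps 1-4: build a model from environmental values at occurrence points.

    `samples` holds one tuple per occurrence point, one value per layer.
    """
    columns = list(zip(*samples))
    return Envelope([min(c) for c in columns], [max(c) for c in columns])


def project(envelope, cells):
    """Step 5: score each cell 1.0 if every layer is inside the envelope, else 0.0."""
    return [
        1.0 if all(lo <= value <= hi
                   for value, lo, hi in zip(cell, envelope.lower, envelope.upper))
        else 0.0
        for cell in cells
    ]


# Hypothetical data: (rainfall in mm, temperature in degrees C).
occurrences = [(1200.0, 22.0), (1500.0, 24.5), (1350.0, 23.0)]
region = [(1300.0, 23.5), (800.0, 23.0), (1400.0, 30.0)]

model = fit_envelope(occurrences)
suitability = project(model, region)
print(suitability)
```

Only the first cell of the hypothetical region falls inside the envelope on both layers, so it is the only one scored as suitable.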

openModeller Framework: an overview

Page 5

The process:

openModeller Framework: an overview

OM Console

Page 6

To produce the model and generate the projected probability surface of species occurrences, the system uses computational resources in an intensive way. The modelling process is complex and demands a lot of processing and time.

Prior to starting the performance evaluation process, a detailed study of the openModeller execution flow was carried out. The scientists and researchers that actually use the system were interviewed.

The first step in conducting a performance evaluation was to divide the openModeller Framework into separate components. These components were then each installed and configured within a COM+ environment host.

The Performance evaluation Process

Page 7

Thus a methodology was devised to enable analysis of the performance of component-based applications and to collect data from components. To implement this methodology it is necessary to acquire performance parameters for the openModeller framework.

The Performance evaluation Process

Page 8

Aspect-oriented programming (AOP) techniques were used because they introduced no severe performance penalty in the openModeller application. With AOP it is possible, for instance, to intercept a method call and measure how long the method takes to execute.

The AOP approach, combined with object-oriented programming techniques and an asynchronous message-processing architecture, proved to be an efficient way to collect performance data for each component and method. These techniques offer control over the granularity of instrumentation and reduce the overhead it introduces.
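The idea of intercepting a method call to time it can be sketched with a Python decorator. This is an illustration only: the work described here relied on COM+ interception in a .NET environment, and the names `timed`, `timings`, and `NicheModel` are hypothetical.

```python
import functools
import time

# Collected measurements: (qualified method name, elapsed seconds).
timings = []


def timed(func):
    """Wrap `func` so every call is timed without changing its body."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if the call raises, so failures are measured too.
            timings.append((func.__qualname__, time.perf_counter() - start))
    return wrapper


class NicheModel:  # hypothetical component, not an openModeller class
    @timed
    def create_model(self, points):
        return sum(points) / len(points)


result = NicheModel().create_model([1.0, 2.0, 3.0])
print(result, timings)
```

Granularity is controlled by choosing which methods receive the decorator, which mirrors the control over instrumentation granularity mentioned above.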

The Performance evaluation Process

Page 9

After running the application and recording performance metrics, the collected evidence can be analyzed without having to understand the workflow of the entire execution. Additionally, evidence can be analyzed on a per-module basis, removing the need to process the enormous amount of performance information collected during the application's execution. Other metrics, such as processor and memory utilization, were obtained through the WMI API available from the .NET Framework.
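Per-module analysis amounts to grouping the recorded measurements by component before inspecting them. The sketch below assumes a hypothetical record layout of (component, method, elapsed seconds); the component names and values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical measurements: (component, method, elapsed seconds).
records = [
    ("Reader", "load_layers", 0.5),
    ("Algorithm", "create_model", 2.0),
    ("Reader", "load_points", 0.25),
    ("Projector", "project", 1.25),
    ("Algorithm", "create_model", 1.0),
]


def summarize(rows):
    """Return call count and total time per component."""
    totals = defaultdict(lambda: {"calls": 0, "total": 0.0})
    for component, _method, elapsed in rows:
        totals[component]["calls"] += 1
        totals[component]["total"] += elapsed
    return dict(totals)


summary = summarize(records)
# The component with the largest total time is the first bottleneck candidate.
slowest = max(summary, key=lambda name: summary[name]["total"])
print(slowest, summary[slowest])
```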

A back office visualization tool was developed to allow the consolidation of the collected results. The visualization tool also enables the localization of the most important performance problems that occurred during the execution of the application. After locating the candidate bottlenecks, the next step is to understand the causes of those performance problems.

The Performance evaluation Process

Page 10

The results consolidation tool

The Performance evaluation Process

Page 11

Analysis of the results made it possible to identify the methods with high call times. Based on the evidence collected, four categories were created in order to discover and classify the bottlenecks in the program workflow. The next graph shows the result:

The Performance evaluation Process

Furcata Boliviana – 1Layer – BioClim Algorithm

Page 12

To minimize the performance penalty introduced by instrumentation, an asynchronous message-processing approach was implemented.

The user starts the process and, as the component methods are called, the AOP class implementations run in a separate thread. A message is built containing the time consumed by the method. The WMI API is then called to collect the resource metrics, and the message is placed in a Microsoft Message Queue. In parallel with this thread, the main thread, which is executing the openModeller algorithm, continues processing normally.
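The pattern can be sketched with Python's standard `queue` and `threading` modules. This is a simplified stand-in, not the implementation described here: an in-process `queue.Queue` plays the role of the Microsoft Message Queue, the WMI call is omitted, and the names `instrumented`, `consumer`, and `stored` are hypothetical.

```python
import queue
import threading
import time

message_queue = queue.Queue()  # in-process stand-in for a message queue
stored = []                    # where the background consumer keeps messages


def consumer():
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        message = message_queue.get()
        if message is None:
            break
        stored.append(message)


def instrumented(name, func, *args):
    """Time one call and enqueue the measurement without waiting for storage."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    message_queue.put({"method": name, "seconds": elapsed})
    return result


worker = threading.Thread(target=consumer)
worker.start()

total = instrumented("sum_layers", sum, [1, 2, 3])  # main work continues at once

message_queue.put(None)  # signal shutdown
worker.join()
print(total, stored)
```

The main thread only pays for building and enqueueing a small message; persisting or forwarding it happens in the background thread.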

Infra-structure implemented

Page 13

The next illustration shows the workflow of the architecture and infrastructure implemented to gather evidence for the performance evaluation process.

Infra-structure implemented

Page 14

The second layer is the core of the openModeller request processing subsystem and can be composed of several servers hosting different modeling algorithms. This layer represents the main module of the architecture.

The following diagram represents a logical division of the openModeller system in a distributed way:

Architecture proposed

Page 15

The main idea of this model is to allow the system to be simulated and different component configurations to be analyzed on the high-performance infrastructure. The model is a decision-support tool for determining the best distribution of the openModeller components and for identifying possible bottlenecks.

Based on this architecture, this research presents a simulation model of the software components. The model is a closed queueing network in which each queue represents a software component of the system.
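One standard way to analyze a closed queueing network is exact Mean Value Analysis (MVA). The sketch below is an assumption about how such a model could be evaluated, not the method used in this research; it covers a single customer class with fixed-rate stations, and the service demands are hypothetical.

```python
def mva(demands, customers, think_time=0.0):
    """Exact single-class MVA for a closed network of queueing stations.

    demands    -- service demand per station (seconds per request)
    customers  -- number of requests circulating in the network
    think_time -- delay outside the stations between requests
    Returns (throughput, response_time, mean_queue_lengths).
    """
    queue_len = [0.0] * len(demands)
    throughput = 0.0
    response = 0.0
    for n in range(1, customers + 1):
        # An arriving request sees the queue lengths of a network with n - 1 requests.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue_len)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue_len = [throughput * r for r in residence]
    return throughput, response, queue_len


# Hypothetical demands for three software components.
demands = [0.2, 1.5, 0.4]
for n in (1, 5, 20):
    x, r, _ = mva(demands, n)
    print(f"requests={n:2d} throughput={x:.3f}/s response={r:.2f}s")
```

As the number of circulating requests grows, throughput approaches the reciprocal of the largest demand, which points to the component most likely to be the bottleneck.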

Simulation Model

Page 16

This research helped to provide the openModeller application with a highly scalable and available architecture and infrastructure that, through parallel and distributed processing, meet the needs of biological researchers and scientists.

Using AOP to instrument the code proved to be an efficient way to collect performance data, because the interception mechanism provided by COM+ allowed control over the granularity of the data collected. The asynchronous message processing used in the AOP implementation reduced the interference of the instrumentation with the final results. The back-office application helped analysts quickly find bottlenecks in the system and propose ways to make the code more efficient.

Conclusions

Page 17

The next stage of this research will be to simulate the model of the component architecture. This will make it possible to optimize the architecture through better component distribution, in order to reduce the response time seen by the user and the computational resources used by the openModeller system.

Acknowledgment

The authors are grateful to FAPESP, the São Paulo State Research Foundation, Brazil, for its support of the openModeller project.

Conclusions