Asean-Japan Workshop on Information Science and Technology 2014, FTSM, UKM

Table of Contents

Logic of Agent Communication (Satoshi Tojo)
raSAT, SMT for Nonlinear Constraints Over Reals (Mizuhito Ogawa)
A Semantic Approach for Malware Detection on Android (Tuan Nguyen)
Game Theory – Paradigm Shift from Winning Strategy to Designing Strategy (Hiroyuki Iida)
Behavior Modeling in Physical and Adaptive Intelligent Services (Kunihiko Hiraishi)
Ontology Based Information Retrieval (Shahrul Azman Mohd Noah, Nazlia Omar, Mohd Juzaiddin A. Aziz, Masnizah Mohd, Juhana Salim, Saidah Saad, Sabrina Tiun, Shereena Mohd Arif, Lailatulqadri Zakaria, Akmal Aris and Maryati Mohd Yusof)
Human-Computer Interaction and Human-Robot Interaction in Collaborative Environment: Features and Performances to Support Team-based Approaches (Nabil Elmarzouqi)
BE-PUM: A Tool of Binary Emulation for PUshdown Model Generation (Quan Thanh Tho)
Microwave Techniques as Diagnostic Tools for Medical Applications (Amin Abbosh)
A Compact and Robust WBN Applicable for Real-Time Febris Monitoring (Intan Sari Areni, Elyas Palantei, Irfan Efendi, Khaerunnisa, Santi Samsul, Sri Wahyuni, Amil Ahmad Ilham, Merna Baharuddin, and Novy Nurrahmillah Ayu Mokobombang)
Logico-Semantic Framework Search Engine (Mohamad Fauzan Noordin)
Water Flow Like Algorithm for Graph Based Problem Solution (Zulaiha Ali Othman)
Ubiquitous Wireless Computing: Current Research Progress, Challenges, and Future Directions (Elyas Palantei)
Industrial Informatics: Research Issues and the Path Forward (Riza Sulaiman)
A Framework for Cyber Security Strategy for Developing Country (Khosraw Salamzada, Zarina Shukur and Marini Abu Bakar)
A Hybrid Model of Genetic Algorithm and Neural Network for Predicting Dengue Outbreak (Nor Azura Binti Husin)
Stacking of Texture Based Filters for Visual Place Categorization (Azizi Abdullah)
Collaborative Decision Making on Multi-Display Interactive Visualisation Environment Using Haptic Horizontal Surface (Mi-VHTabletop™) (Halimah Badioze Zaman)
Customer Process Reference Model for Designing Innovative Service Encounters (Muriati Mukhtar)
A Hybrid Approach for Semantic Similarity Measurement (Amirah Ismail)
Semantic Harmonisation for Collective Connected Wisdom in Islamic Finance and Banking (Roslina Othman)
A Model of Software Testing As a Service (STaaS) in Cloud Computing: A Case of Knowledge Management Perspective (Rusli Abdullah)
Records and Information Management vs Knowledge Sharing: Is There a Possibility of Harmonizing the Conflicting Interest? (Zawiyah M. Yusof)
Prediction Model Using GA-NN for Medium Density Fiberboard Testing (Nordin Abu Bakar and Faridah Sh Ismail)
An Enhanced Resampling Technique for Imbalanced Datasets (Maisarah Zorkeflee, Aniza Mohamed Din and Ku Ruhana Ku-Mahamud)
Machine Learning for Big Data Problems (Siti Mariyam Shamsuddin and Shafaatunnur Hasan)
Inductive Proof of Invariant Properties for State Machines with Search (Kazuhiro Ogata)

Speakers

Prof. Dato' Dr. Halimah Badioze Zaman Universiti Kebangsaan Malaysia, MALAYSIA

Professor Dr. Satoshi Tojo Japan Advanced Institute of Science & Technology, JAPAN

Professor Dr. Mizuhito Ogawa Japan Advanced Institute of Science & Technology, JAPAN

Professor Dr. Kazuhiro Ogata Japan Advanced Institute of Science & Technology, JAPAN

Professor Dr. Kunihiko Hiraishi Japan Advanced Institute of Science & Technology, JAPAN

Professor Dr. Hiroyuki Iida Japan Advanced Institute of Science & Technology, JAPAN

Professor Dr. Shahrul Azman Mohd Noah Universiti Kebangsaan Malaysia, MALAYSIA

Professor Dr. Zarina Shukur Universiti Kebangsaan Malaysia, MALAYSIA

Professor Dr. Zawiyah Mohd Yusuf Universiti Kebangsaan Malaysia, MALAYSIA

Professor Dr. Nabil Elmarzouqi University of Cadi Ayyad, MOROCCO

Professor Dr. Siti Mariyam Shamsuddin Universiti Teknologi Malaysia, MALAYSIA

Professor Ir. Dr. Riza Sulaiman Universiti Kebangsaan Malaysia, MALAYSIA

Professor Ir. Dr. Elyas Palantei Universitas Hasanuddin, Sulawesi Selatan, INDONESIA

Associate Professor Dr. Intan Sari Areni Universitas Hasanuddin, Sulawesi Selatan, INDONESIA

Associate Professor Dr. Rusli Abdullah Universiti Putra Malaysia, MALAYSIA


Associate Professor Dr. Mohamad Fauzan Noordin International Islamic University Malaysia, MALAYSIA

Associate Professor Dr. Roslina Othman International Islamic University Malaysia, MALAYSIA

Associate Professor Dr. Nordin Abu Bakar Universiti Teknologi Mara, MALAYSIA

Associate Professor Dr. Amin Abbosh The University of Queensland, AUSTRALIA

Associate Professor Dr. Muriati Mukhtar Universiti Kebangsaan Malaysia, MALAYSIA

Associate Professor Dr. Zulaiha Ali Othman Universiti Kebangsaan Malaysia, MALAYSIA

Associate Professor Dr. Quan Thanh Tho Ho Chi Minh City University of Technology, VIETNAM

Dr. Tuan Nguyen University of Information Technology, VIETNAM

Dr. Nor Azura Husin Universiti Putra Malaysia, MALAYSIA

Dr. Amirah Ismail Universiti Kebangsaan Malaysia, MALAYSIA

Dr. Azizi Abdullah Universiti Kebangsaan Malaysia, MALAYSIA

Dr. Aniza Mohamed Din Universiti Utara Malaysia, MALAYSIA


Logic of Agent Communication

By

Satoshi Tojo School of Information Science

Japan Advanced Institute of Science & Technology, JAPAN Email: [email protected]

Abstract

We first present a logical formalism for belief change of agents via public announcement, illustrated with several logic puzzles. We then realize belief change by restricting the accessibility relation over possible worlds. We show that the declaration `I don't know' is itself informative to other agents, and that such public announcements allow these logical teasers to be solved.
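As an illustration of the mechanism sketched above, the following toy code (not from the talk; a minimal sketch of standard public-announcement semantics) updates a small set of possible worlds after an announcement and re-checks what an agent knows.

```python
# Toy public-announcement update on a set of possible worlds (illustrative sketch only).
# Each world is a dict of facts; an agent's uncertainty is the set of worlds it still
# considers possible.

def announce(worlds, statement):
    """Public announcement: every agent discards worlds where the statement is false."""
    return [w for w in worlds if statement(w)]

def knows(worlds, fact):
    """An agent knows a fact iff it holds in every world still considered possible."""
    return all(fact(w) for w in worlds)

# Agent A holds card 2 and is unsure whether B holds card 1 or card 3.
worlds = [{"A": 2, "B": 1}, {"A": 2, "B": 3}]

# A public announcement that B does not hold card 3 (for instance, derived from
# B declaring "I don't know" in a fuller puzzle) eliminates the second world.
worlds = announce(worlds, lambda w: w["B"] != 3)

print(knows(worlds, lambda w: w["B"] == 1))   # True: A now knows B's card
```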


raSAT, SMT for Nonlinear Constraints Over Reals

By

Mizuhito Ogawa School of Information Science

Japan Advanced Institute of Science and Technology, JAPAN Email: [email protected]

Abstract

SMT solvers have recently become popular tools, e.g., for symbolic execution, invariant generation, and automatic termination proving. Typical backend theories include linear arithmetic (over both reals and integers), uninterpreted function symbols, and arrays, whose implementation designs are by now well established. Compared to these, nonlinear constraints (over reals and integers) are still under investigation. We proposed the raSAT (refinement of abstraction for satisfiability) loop, in which over- and under-approximations refine each other, and implemented it in the raSAT solver. This talk overviews the techniques and the current status.
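The general idea of refining over- and under-approximations against each other can be pictured with plain interval arithmetic; the sketch below is only an illustration of that loop for a single constraint, not the raSAT implementation.

```python
# Minimal interval-refinement sketch for one nonlinear constraint f(x, y) > 0.
# Over-approximation: interval bounds of f on a box.  Under-approximation: test a
# sample point.  Undecided boxes are bisected (a crude stand-in for a raSAT-style loop).

def interval_mul(a, b):
    products = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(products), max(products))

def interval_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def f_interval(x, y):                     # bounds of x*y + x*x on the box x × y
    return interval_add(interval_mul(x, y), interval_mul(x, x))

def f_point(x, y):
    return x*y + x*x

def solve(box, depth=0):
    (x, y) = box
    lo, hi = f_interval(x, y)
    if hi <= 0:                           # over-approximation refutes the whole box
        return None
    mx, my = (x[0] + x[1]) / 2, (y[0] + y[1]) / 2
    if f_point(mx, my) > 0:               # under-approximation finds a model
        return (mx, my)
    if depth >= 20:                       # give up: box stays "unknown"
        return None
    # refine: split the wider dimension and recurse on both halves
    if x[1] - x[0] >= y[1] - y[0]:
        halves = [((x[0], mx), y), ((mx, x[1]), y)]
    else:
        halves = [(x, (y[0], my)), (x, (my, y[1]))]
    for h in halves:
        model = solve(h, depth + 1)
        if model:
            return model
    return None

print(solve(((-1.0, 1.0), (-1.0, 1.0))))  # a satisfying point for x*y + x**2 > 0, if found
```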


A Semantic Approach for Malware Detection on Android

By

Tuan Nguyen Faculty of Computer network & Communication University of Information Technology, VIETNAM

Email: [email protected]

Abstract

Every application on the Android platform must have an AndroidManifest.xml file in its root directory. One of the functions of the manifest file is to declare which permissions the application must hold in order to access protected parts of the API and to interact with other applications. We have found that many applications request specific permissions that they do not actually need in order to run properly. In this talk, we introduce a semantic approach for detecting malware based on the permissions requested in the manifest file.
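The permission-based signal the abstract refers to can be sketched roughly as follows; the risky-permission list is an invented example, and the code assumes a decoded, plain-XML manifest (e.g., as produced by apktool), not the binary manifest inside an APK.

```python
# Sketch: list the permissions an app requests in AndroidManifest.xml and flag ones
# that are commonly abused.  Illustrative only; the RISKY set is an assumption.
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
RISKY = {
    "android.permission.SEND_SMS",
    "android.permission.READ_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.RECEIVE_BOOT_COMPLETED",
}

def requested_permissions(manifest_path):
    root = ET.parse(manifest_path).getroot()
    return [elem.get(ANDROID_NS + "name")
            for elem in root.iter("uses-permission")]

def flag_risky(manifest_path):
    return [p for p in requested_permissions(manifest_path) if p in RISKY]

if __name__ == "__main__":
    print(flag_risky("AndroidManifest.xml"))   # hypothetical decoded manifest file
```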


Game Theory – Paradigm Shift from Winning Strategy to Designing Strategy

By

Hiroyuki Iida

School of Information Science Japan Advanced Institute of Science & Technology, JAPAN

Email: [email protected]

Abstract

In this talk, we present the basic idea of game refinement theory and its development. Game refinement theory was proposed in previous research to quantify the sophistication of games. Much work has been done in domains such as board games, Mah-jong, and sports. Although many types of games remain to be covered, the theory has already made substantial contributions and generalized its fundamental concepts. Using its sophistication measure, many facts have been revealed about how the attractiveness of games has changed over the decades. There are still challenging research questions, especially in applying game refinement theory to other types of games or even to social activities. Game theory originated in the idea of the existence of mixed-strategy equilibria in two-person zero-sum games and has been widely recognized as a powerful tool in many fields such as economics, political science, psychology, logic and biology. It is expected that game refinement theory will likewise be applied in many domains to increase the quality of games. This presentation gives an overview of the new game theory from the game creator's point of view.
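For readers unfamiliar with the sophistication measure mentioned above, the game refinement literature commonly reports, for board games, the value R = sqrt(B) / D, where B is the average branching factor and D the average game length; the formula and the approximate chess figures below are quoted from that literature as an illustration and are not taken from this abstract.

```python
# Illustrative computation of the game refinement value R = sqrt(B) / D for board
# games (B = average branching factor, D = average game length in plies).  The
# formula and the rough chess figures come from the game refinement literature,
# not from this abstract; treat the numbers as ballpark values only.
import math

def game_refinement(avg_branching, avg_length):
    return math.sqrt(avg_branching) / avg_length

print(round(game_refinement(35, 80), 3))   # chess: roughly 0.07, in the often-cited "sophisticated game" zone
```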


Behavior Modeling in Physical and Adaptive Intelligent Services

By

Kunihiko Hiraishi

School of Information Science Japan Advanced Institute of Science & Technology, JAPAN

Email: [email protected]

Abstract

Services that require human physical actions and intelligent decision making exist in various real-world settings, such as nursing in hospitals and care giving in nursing homes. The authors' group calls such services "physical and adaptive intelligent services" and is developing an IT-based system that aims to assist cooperation and knowledge sharing among workers. In this paper, we propose a new method for analysing changes in the behaviour of workers before and after introducing the system. The method is based on probabilistic modeling of workers' behaviour. Using event logs recorded by the system, behaviour models are learned in the form of N-gram models. The method is also used to detect unusual behaviour in the event logs.
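A bigram instance of the kind of model described above might look like the following sketch; the event logs, smoothing and model order are invented for illustration and are not the authors' settings.

```python
# Sketch: learn a bigram model of worker activity sequences from event logs and
# flag sequences whose likelihood is unusually low.  Data and smoothing are made up.
from collections import defaultdict
import math

def train_bigram(sequences, alpha=1.0):
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, cur in zip(["<s>"] + seq, seq + ["</s>"]):
            counts[prev][cur] += 1
    vocab = {e for seq in sequences for e in seq} | {"</s>"}
    def prob(prev, cur):                   # add-alpha smoothed conditional probability
        total = sum(counts[prev].values())
        return (counts[prev][cur] + alpha) / (total + alpha * len(vocab))
    return prob

def log_likelihood(prob, seq):
    return sum(math.log(prob(p, c)) for p, c in zip(["<s>"] + seq, seq + ["</s>"]))

logs = [["enter_room", "check_vitals", "record", "leave"],
        ["enter_room", "check_vitals", "medicate", "record", "leave"]]
prob = train_bigram(logs)

usual = ["enter_room", "check_vitals", "record", "leave"]
odd   = ["enter_room", "leave", "record", "check_vitals"]
print(log_likelihood(prob, usual) > log_likelihood(prob, odd))   # True: the odd sequence scores lower
```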


Ontology Based Information Retrieval

By

Shahrul Azman Mohd Noah, Nazlia Omar, Mohd Juzaiddin A. Aziz, Masnizah Mohd, Juhana Salim, Saidah Saad, Sabrina Tiun, Shereena Mohd Arif, Lailatulqadri Zakaria, Akmal Aris and

Maryati Mohd Yusof Faculty of Information Science and Technology

Universiti Kebangsaan Malaysia, MALAYSIA Email: [email protected], [email protected]

Abstract

Semantic search seeks to improve search accuracy through an understanding of searcher intent and the contextual meaning of terms as they appear in the searchable data space. In this sense, semantic search goes beyond the conventional 'bag of words' model, which provides only a simple representation of text for retrieval and classification. Various approaches have been proposed to support semantic search, such as feature-based models, term dependence models and entity-based models. In this talk, we present an ontology-based information retrieval approach to support semantic search. An ontology is defined as "a conceptual representation of a domain" and has shown potential in the area of query expansion for conventional information retrieval systems. However, the use of ontologies to represent the semantic index of a document collection is still widely open for further research. We will show in this talk some of the research that we have carried out, or are currently carrying out, on ontology-based information retrieval, namely semantic digital libraries, crime news retrieval and multimodal ontology retrieval. We then conclude the talk with a proposed requirement and architecture for ontology-based information retrieval as well as future research directions in this area.
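One small ingredient of such an approach, expanding a user query with ontology neighbours before ranking documents, could be sketched as below; the miniature crime-news "ontology" and the overlap score are invented for illustration.

```python
# Sketch: ontology-assisted query expansion for retrieval.  The tiny "ontology"
# (concept -> related concepts) is invented; a real system would load OWL/RDF data.
ONTOLOGY = {
    "robbery": {"theft", "armed robbery", "burglary"},
    "suspect": {"offender", "perpetrator"},
    "firearm": {"gun", "pistol"},
}

def expand_query(terms, ontology=ONTOLOGY):
    expanded = set(terms)
    for t in terms:
        expanded |= ontology.get(t, set())
    return expanded

def score(document_tokens, query_terms):
    # crude overlap score between an expanded query and a tokenised document
    return len(set(document_tokens) & query_terms)

docs = {
    "d1": "police arrest offender after armed robbery with a pistol".split(),
    "d2": "council approves new park budget".split(),
}
q = expand_query(["robbery", "suspect", "firearm"])
print(sorted(docs, key=lambda d: score(docs[d], q), reverse=True))   # d1 ranked first
```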


Human-Computer Interaction and Human-Robot Interaction in Collaborative Environment: Features and Performances

to Support Team-based Approaches

By

Nabil Elmarzouqi National School of Applied Sciences of Marrakech

Cadi Ayyad University, MOROCCO Email: [email protected], [email protected]

Abstract

Nowadays, humans are continually expected to experience new ways of interacting with their surroundings which involve computers and robots. With the advent of new technologies for computers and robots, distributed systems have progressed immensely through different systems and services that support distributed interactivity. During the past decades, Human-Computer Interaction (HCI) has emerged as a focal area for computer science research and has made great strides toward understanding and improving interactions with computer-based technologies. Currently, advances in computer technology are driving breakthroughs in robotic technology that carry significant implications for the field of Human-Robot Interaction (HRI). HRI and HCI researchers are striving to develop systems that allow multiple robots and multiple humans to interact within a collaborative environment. The goal of this presentation is to introduce the key features and performance requirements needed to support improved interactivity in collaborative systems. Furthermore, the presentation describes such interactivity from multiple perspectives, with an eye toward identifying properties and challenges that cut across collaborative environments when HCI and HRI mechanisms are considered both independently and jointly.


BE-PUM: A Tool of Binary Emulation for PUshdown Model Generation

By

Quan Thanh Tho

Ho Chi Minh City University of Technology, VIETNAM Email: [email protected]

Abstract

In this talk, we present the tool BE-PUM (Binary Emulation for PUshdown Model generation) for binary analysis. As its name suggests, BE-PUM generates a pushdown model, which can be regarded as a control flow graph (CFG) combined with a memory execution model. BE-PUM also introduces a concolic approach to handle indirect jumps in binaries, in order to generate the CFG more precisely. With this approach, we have been able to experimentally produce models for around 1700 samples of real malware. Compared to JakStab and IDA Pro, two state-of-the-art tools in this field, BE-PUM shows better tracing ability, sometimes with significant differences. Moreover, to the best of our knowledge, it is the only tool supporting pushdown model generation for binaries.
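To make the concolic handling of indirect jumps concrete, here is a deliberately toy sketch on an invented three-instruction mini-language (not x86 and not BE-PUM): the CFG builder uses the concrete register value obtained by emulation to resolve where an indirect jump actually goes.

```python
# Toy sketch of resolving an indirect jump with concrete emulation while building a CFG.
# The instruction set is invented for illustration; BE-PUM works on real x86 binaries.
program = {
    0: ("mov", "r0", 3),        # r0 := 3   (jump target computed at run time)
    1: ("jmp_indirect", "r0"),  # jump to the address held in r0
    2: ("halt",),
    3: ("halt",),
}

def build_cfg(program, entry=0):
    regs, edges, pc = {}, set(), entry
    while True:
        instr = program[pc]
        if instr[0] == "mov":
            regs[instr[1]] = instr[2]
            edges.add((pc, pc + 1)); pc += 1
        elif instr[0] == "jmp_indirect":
            target = regs[instr[1]]          # concrete value resolves the indirect jump
            edges.add((pc, target)); pc = target
        elif instr[0] == "halt":
            return edges

print(sorted(build_cfg(program)))   # [(0, 1), (1, 3)] -- edge 1 -> 3 found via emulation
```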


Microwave Techniques as Diagnostic Tools for Medical Applications

By

Amin Abbosh School of ITEE

The University of Queensland, AUSTRALIA Email : [email protected]

Abstract

In recent years, microwave techniques have attracted considerable interest as new diagnostic tools for many medical applications, such as the detection of breast cancer, brain stroke, and congestive heart failure. The basis for using microwaves as a medical diagnostic tool is the contrast between the electrical properties of normal and abnormal human tissues in the microwave frequency region. The motivation for building a new generation of microwave-based diagnostic systems is the need for portable, low-cost, non-ionizing, and real-time diagnosis and monitoring tools to complement the traditional bulky and expensive tools that can only be used for screening in hospital or major-clinic environments. In this talk, Prof. Abbosh will describe the microwave imaging systems that have been built by the Microwave Imaging research group of the School of Information Technology and Electrical Engineering at The University of Queensland. The systems to be covered include UWB radar for breast cancer detection and localization using different techniques, such as hybrid and/or differential imaging, head imaging for brain stroke detection, and torso imaging for heart failure detection. The talk will give details of the antenna elements and arrays, artificial phantoms, data acquisition, and image formation algorithms employed. Recent results in these research areas and future challenges will also be discussed.


A Compact and Robust WBN Applicable for Real-Time Febris Monitoring

By

Intan Sari Areni, Elyas Palantei, Irfan Efendi, Khaerunnisa, Santi Samsul, Sri Wahyuni, Amil Ahmad Ilham, Merna Baharuddin, and Novy Nurrahmillah Ayu Mokobombang

Faculty of Engineering Hasanuddin University (UNHAS), INDONESIA

E-mail: [email protected], [email protected], and [email protected]

Abstract

The wireless body sensor network previously constructed for real-time febris monitoring, intended for use in hospitals, homes, and health service centres, required several optimizations. The currently developed febris monitoring system consists of a WBN based on an Arduino module connected to two electronic medical sensors (i.e. a temperature sensor and a pulse sensor) soldered to one end of a separate tiny cable, a smartphone, internet access and the end-user terminal. In practice, two kinds of electronic boards (an Arduino Uno board and a 2.4 GHz Seeed Bluetooth shield board) were stacked to assemble a typical WBN system, as depicted in Fig. 1. The WBN operates and connects to the internet infrastructure via a registered mobile phone using the Bluetooth data transfer mechanism. Optimization of this technology allows data transfer at up to 460,800 bps. In the study, the whole WBN design is to be integrated into an intelligent E-Health network. The current development of the E-Health network allows the patient and the physician or doctor to communicate interactively at any time, from anywhere, at a remote location. Its medical database unit is supported by a fuzzy algorithm that performs predictive data processing regarding the patient's ongoing medical status.

Figure 1: A Compact and Robust WBN Construction (temperature sensor, pulse sensor, and WBN-based Arduino module)
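On the receiving side (phone or gateway), the monitoring loop could look roughly like the sketch below; the serial device name, the message format and the 38 °C febris threshold are assumptions made for illustration, not details taken from the paper.

```python
# Sketch of a gateway-side reader for the WBN: read "temperature,pulse" lines arriving
# over a Bluetooth serial link and raise a febris alert.  Device name, message format,
# and threshold are assumptions; the actual system forwards data to an e-health server.
import serial   # pip install pyserial

PORT = "/dev/rfcomm0"        # hypothetical Bluetooth serial device
FEVER_THRESHOLD_C = 38.0

def monitor(port=PORT, baudrate=9600):
    with serial.Serial(port, baudrate, timeout=5) as link:
        while True:
            line = link.readline().decode("ascii", errors="ignore").strip()
            if not line:
                continue
            try:
                temperature, pulse = (float(x) for x in line.split(","))
            except ValueError:
                continue                      # skip malformed frames
            if temperature >= FEVER_THRESHOLD_C:
                print(f"ALERT: febris suspected ({temperature:.1f} C, pulse {pulse:.0f} bpm)")

if __name__ == "__main__":
    monitor()
```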


Logico-Semantic Framework Search Engine

By

Mohamad Fauzan Noordin Kulliyyah of Information and Communication Technology

International Islamic University Malaysia, MALAYSIA Email: [email protected]

Abstract

Web 2.0 has changed the strategy of the world, and the virtual world has a large impact on society. There is an enormous amount of data on the web, but the knowledge behind that data has barely been utilized in comparison to its size. Web 3.0 aims at knowledge extraction from data, so there is a need to develop means and ways to extract the knowledge behind the data. In this area of research, Muslim researchers have directed their work towards the availability of digital resources for the Al-Quran and the books of Hadith, since these form the foundations of Islam. However, the research done so far has not gone deep into knowledge representation of the Al-Quran and Hadith. The current work looks into the development of a knowledge representation formalism for the Al-Quran on a logical base, as it is expressive in nature and has previously proven successful even in complex situations. The logical base also needs indexing for efficient retrieval, so the current work looks into enhancing logic-based indexing for better retrieval. It also aims at the development of a query and question-answering system based on the logical formalism and the enhanced indexing developed. The current work has large significance, as it will ease the process of information access for the Muslim community. Moreover, the work will make it easier for non-Muslims to learn about the Al-Quran and thus gain more information about Islam.


Water Flow Like Algorithm for Graph Based Problem Solution

By

Zulaiha Ali Othman

Faculty of Information Science and Technology Universiti Kebangsaan Malaysia, MALAYSIA

Email: [email protected], [email protected]

Abstract

Many real-world problems can be solved by representing them as graph-based problems, such as network flow, routing, decomposition, sub-flows, and so on. In graph theory, linked nodes are usually represented using complete graphs with n vertices, denoted Kn. Such a graph is simple, with an edge between every pair of distinct vertices. Many meta-heuristic algorithms have been used successfully for graph-based solutions, applied either as single-solution or population-based methods, such as simulated annealing and genetic algorithms respectively. The water flow-like algorithm (WFA) is a relatively new meta-heuristic that is suitable for graph-based solutions. The algorithm is inspired by water flowing from higher to lower altitudes. A flow can split into sub-flows when it traverses rugged terrain, and these sub-flows merge when they meet at the same location. Flows stagnate in lower-altitude locations if their momentum cannot expel water from the current location. A flow represents a solution agent, the flow altitude represents the objective function, and the solution space of a problem is represented by a geographical terrain. This study presents four examples of graph-based problems: the Traveling Salesman Problem, web services selection, the Cellular Manufacturing and Layout Problem, and the Capacitated Vehicle Routing Problem. For each graph-based problem (GBP) we show the representation of the problem and the target solution. The paper then presents in depth how the water flow-like algorithm is used for the Traveling Salesman Problem as a representative routing problem. Lastly, the paper presents the performance of WFA when applied to the three GBPs. The results show that WFA is competitive with previous meta-heuristic algorithms in terms of solution quality. Moreover, WFA is able to reach a solution faster than previous meta-heuristics such as the Ant Colony System (by up to 70%). This reduction arises because the water flow-like algorithm produces a dynamic solution by identifying the number of solutions (n) according to the problem; the splitting and merging processes are decided during the solution process. This study also presents improvements of the water flow-like algorithm using various neighbourhood moves such as 2-opt, 3-opt and 4-opt to find the best neighbourhood solution for all sub-flows, and hybridization with other algorithms in the move and split processes, such as simulated annealing and genetic algorithms. These variants improve solution quality at the cost of a slightly longer time to reach a solution. WFA reaches a solution faster than previous meta-heuristics such as the Ant Colony System, whose time grows with the number of cities n, while WFA grows more slowly. Given these results, WFA has the potential for further improvement, as the algorithm is less time-sensitive due to its dynamic behaviour.
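A heavily simplified rendering of the flow metaphor for the Traveling Salesman Problem is sketched below; the splitting and merging rules, the parameters and the random instance are invented and are far simpler than the authors' WFA.

```python
# Toy water-flow-like search for the TSP (illustration only, far simpler than WFA).
# Flows are tours; "altitude" is tour length; flows move downhill via 2-opt moves,
# may split into a perturbed sub-flow when they improve, and merge when identical.
import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt_move(tour):
    i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def wfa_tsp(dist, n_iter=2000, max_flows=8):
    cities = list(range(len(dist)))
    flows = [random.sample(cities, len(cities)) for _ in range(3)]
    for _ in range(n_iter):
        new_flows = []
        for tour in flows:
            candidate = two_opt_move(tour)
            if tour_length(candidate, dist) < tour_length(tour, dist):
                new_flows.append(candidate)                    # flow moves downhill
                if len(new_flows) < max_flows:
                    new_flows.append(two_opt_move(candidate))  # split: spawn a sub-flow
            else:
                new_flows.append(tour)                         # stagnant flow stays put
        # merge identical flows so the population does not fill with duplicates
        flows = [list(t) for t in {tuple(t) for t in new_flows}]
    return min(flows, key=lambda t: tour_length(t, dist))

random.seed(0)
points = [(random.random(), random.random()) for _ in range(8)]
dist = [[((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5 for b in points] for a in points]
best = wfa_tsp(dist)
print(round(tour_length(best, dist), 3))
```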


Ubiquitous Wireless Computing: Current Research Progress, Challenges, and Future Directions

By

Elyas Palantei

Faculty of Engineering Hasanuddin University (UNHAS), INDONESIA

E-mail: [email protected], [email protected]

Abstract

The aggressive research activities and numerous studies on ubiquitous mobile computing carried out over the last two decades have produced tremendous outcomes that are applicable in broad areas of modern society. In the near future, computing technology is highly likely to emerge as the dominant method of connecting objects to the global ICT infrastructure, the internet. This talk mainly discusses several R&D achievements of the last five years by our group of researchers at the Department of Electrical Engineering, Faculty of Engineering, Hasanuddin University, Indonesia. There are a number of attractive studies in which mobile computing concepts have been widely applied. These include wireless environmental monitoring exploiting the powerful performance of sensor networks, object tracking, smart buildings, smart parking, intelligent transportation systems (ITS), sub-marine environmental monitoring, underwater mobile object control, and biomedical engineering applications. Advanced studies concerning wireless computing innovation, such as the green and smart laboratory, wireless power transmission, high-altitude communication systems and a wireless digestive sensor network, have also recently been initiated. In general, most of the wireless computing apparatus constructed makes use of the limited communication channels available in the ISM frequency bands, spread across the 433 MHz band, the 875 – 925 MHz band, and the 2.4 – 2.5 GHz band. More challenging studies will later utilize higher frequency bands, including the 5 GHz and 10 GHz bands. Technical and non-technical experience gained in adopting traditional methods for constructing mobile computing devices will be analytically compared with modern design concepts supported by advanced mobile operating systems (such as the Android, Windows Mobile, Tizen and iOS platforms). The significant impact of the rapid development of mobile computing technologies on the improvement of the telecommunication and informatics engineering curriculum will also be presented.


Industrial Informatics: Research Issues and the Path Forward

By

Riza Sulaiman

Industrial Computing Research Group Universiti Kebangsaan Malaysia, MALAYSIA

Email: [email protected]

Abstract

Information Technology (IT) is a combination of computer technology and communication technology. The concept of IT was first introduced in the 1980s and grew rapidly in the 1990s. Malaysia has not been left behind by this progress, and the government has made efforts to promote the development of information technology. In the era of globalization and liberalization, the country's industrial manufacturing faces significant challenges from other countries, both developed and developing. To meet these challenges, several strategies have been identified. One strategy is to encourage the widespread use of IT and high-tech machinery in manufacturing and production. To achieve these goals, a new field of study known as "Industrial Computing" has been introduced at several universities around the world. In Malaysia, Universiti Kebangsaan Malaysia (UKM) created the Department of Industrial Computing in the Faculty of Information Science and Technology (FTSM), now known as the Industrial Informatics Research group. The main focus of this research team is to contribute towards the manufacturing and production industries. However, with the development of other industries such as the biotechnology, medical and agro-based industries, the research also focuses on producing graduates who are suitable for a wide range of industries in Malaysia. The issues and challenges to be solved concern manipulating large data in visual or graphical form. Visualization can also represent data with more meaningful value. Hence, visualization is used as a medium of instruction or training to represent data in the form of Visual Informatics, and this is the trend and the path forward.


A Framework for Cyber Security Strategy for Developing Country

By

Khosraw Salamzada, Zarina Shukur and Marini Abu Bakar

Faculty of Information Science and Technology Universiti Kebangsaan Malaysia, MALAYSIA

Email: [email protected]

Abstract

Given the importance of cyberspace for a country's development, many countries invest heavily in cyberspace applications. Based on official documents, Afghanistan is in the process of integrating ICT into its critical information infrastructure. In doing so, the country may face various challenges, including cyber security. Due to the many potential threats and risks to Afghanistan's cyber security, a comprehensive cyber security strategy is necessary. Accordingly, Afghanistan has introduced an ICT security law. However, the internet has now become an integral part of both the government and non-government sectors, so the country must introduce a comprehensive and appropriate cyber security strategy to tackle all of the issues and risks in this arena. The aim of this study is to propose a general cyber security strategy based on the experience of developed countries in cyber security, specifically Malaysia. This is because Malaysia and Afghanistan are both Islamic countries with comparable cultural and religious values, and Malaysia is considered a developed country in terms of cyber security. Therefore, in this study, the current status of ICT and of cyber threats in the Afghanistan context was identified based on a literature review and official documents. In addition, to evaluate the current status of cyber security strategy in Afghanistan and Malaysia, experts were interviewed, and based on their suggestions and the Malaysian experience in cyber security, a cyber-security strategy framework was proposed for Afghanistan. The proposed strategy framework was evaluated by experts in the field.


A Hybrid Model of Genetic Algorithm and Neural Network for Predicting Dengue Outbreak

By

Nor Azura Binti Husin

Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, MALAYSIA

Email: [email protected]

Abstract

Prediction of diseases, especially dengue fever, has become crucial for the government, specifically the health department. It enables them to plan and arrange early intervention programs, including campaigns for susceptible communities, before an outbreak occurs. Previous studies show that hybrid neural network models can solve the problem of finding suitable parameters and deliver better performance than a standalone neural network or other standalone models. However, no such model has been developed to predict dengue outbreaks. The aim of this paper is to conduct experiments to find which architectures and models are best, comparing a hybrid of a genetic algorithm and a neural network with standalone models, namely a neural network and a nonlinear regression model, in order to propose a more promising model for predicting the spread of dengue outbreaks. Several model architectures were designed, with the parameters appropriately adjusted to achieve optimal predictive performance. Historical sample data were collected from the Selangor State Health Department and the Malaysian Meteorological Service to address the time-series prediction problem. The results showed that dengue case data together with the neighbouring locations of dengue cases were very effective in predicting dengue outbreaks (architecture III), and it was shown that the hybrid model is capable of producing better prediction results than the neural network and nonlinear regression models.
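The hybrid idea of letting a genetic algorithm pick neural-network settings that minimise prediction error might be sketched as below; the synthetic data, the two tuned parameters and all GA settings are illustrative assumptions rather than the authors' configuration.

```python
# Sketch of a GA searching neural-network hyperparameters (hidden units, learning rate)
# for a regression task.  Synthetic data and GA settings are assumptions.
import random
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 4))                       # stand-in for weather/case features
y = 10 * X[:, 0] + 5 * np.sin(6 * X[:, 1]) + rng.normal(scale=0.5, size=300)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

def fitness(genome):                                  # lower validation MSE = fitter
    hidden, lr = genome
    model = MLPRegressor(hidden_layer_sizes=(hidden,), learning_rate_init=lr,
                         max_iter=300, random_state=0)
    model.fit(X_tr, y_tr)
    return -np.mean((model.predict(X_va) - y_va) ** 2)

def mutate(genome):
    hidden, lr = genome
    return (max(2, hidden + random.choice([-4, 0, 4])),
            min(0.5, max(1e-4, lr * random.choice([0.5, 1.0, 2.0]))))

random.seed(0)
population = [(random.randint(4, 32), 10 ** random.uniform(-3, -1)) for _ in range(6)]
for generation in range(4):
    population.sort(key=fitness, reverse=True)
    parents = population[:3]                          # simple truncation selection
    population = parents + [mutate(random.choice(parents)) for _ in range(3)]

print(max(population, key=fitness))                   # best (hidden units, learning rate) found
```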


Stacking of Texture Based Filters for Visual Place Categorization

By

Azizi Abdullah

Faculty of Information Science and Technology Universiti Kebangsaan Malaysia, MALAYSIA

Email: [email protected]

Abstract

Recent research in computer vision has shown that combining multiple features is an effective way to improve classification performance. Furthermore, convolving images with multiple filters yields richer image descriptions: the more distinctive the filter responses, the better they distinguish one group's characteristics from another's. This paper therefore describes a combination method that combines multiple classifier outputs over several filter responses to enhance an automatic visual place categorization system. One of the goals of this study is to explore performance differences between single-filter classifiers and dedicated combinations of filter-response classifiers. One possible problem with combining multiple filter responses to describe images is that the input vector becomes very high-dimensional, which can increase overfitting and hinder generalization performance. Therefore, stacking of support vector machines is used: a first-layer support vector machine is trained to predict the output class from each single filter-response descriptor, and a second-layer support vector machine then combines the class-probability outputs of all trained first-layer models to learn the final output class. We have performed experiments on five different categories of visual places from the KTH-IDOL2 dataset, with a single descriptor per filter using 25 different Laws filter responses. Results showed that the 2-layer stacking algorithm outperforms both the single approach, which uses a single filter-response input vector, and the naive approach, which combines all filter-response outputs directly into one very large input vector.
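The two-layer stacking described above can be pictured with the following scikit-learn sketch; the synthetic "filter responses" and the number of filters are placeholders for the 25 Laws-filter descriptors used in the paper.

```python
# Sketch of 2-layer SVM stacking: one first-layer SVM per filter-response descriptor,
# a second-layer SVM over their class-probability outputs.  The data is synthetic;
# the paper uses 25 Laws-filter descriptors of KTH-IDOL2 images.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_filters, dim, n_classes = 300, 5, 20, 5
labels = rng.integers(0, n_classes, size=n_samples)
# one descriptor per filter response, weakly correlated with the class label
descriptors = [rng.normal(size=(n_samples, dim)) + labels[:, None] * 0.3
               for _ in range(n_filters)]

idx_tr, idx_te = train_test_split(np.arange(n_samples), random_state=0)

# Layer 1: a probabilistic SVM per filter response
first_layer = [SVC(probability=True, random_state=0).fit(desc[idx_tr], labels[idx_tr])
               for desc in descriptors]

def stack_features(index):
    # concatenate the class-probability outputs of every first-layer SVM
    return np.hstack([clf.predict_proba(desc[index])
                      for clf, desc in zip(first_layer, descriptors)])

# Layer 2: an SVM that learns the final class from the stacked probabilities
second_layer = SVC(random_state=0).fit(stack_features(idx_tr), labels[idx_tr])
accuracy = (second_layer.predict(stack_features(idx_te)) == labels[idx_te]).mean()
print(round(float(accuracy), 2))
```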


Collaborative Decision Making on Multi-Display Interactive Visualisation Environment using Haptic Horizontal Surface

(Mi-VHTabletop™)

By

Halimah Badioze Zaman Institut Informatik Visual (IVI)

Universiti Kebangsaan Malaysia, MALAYSIA Email : [email protected]

Abstract

Much current research on computer multi-touch technologies involves work environments such as collaborative meeting and decision-making interfaces that enable visualization, shared displays, document sharing, organization of content and group decision making. Current studies in this area do not embed visual cognition principles in the multi-touch environment for educational applications, which means that teachers find it difficult to determine whether learners are truly undergoing effective digital collaborative learning. Multi-touch tabletops with embedded, efficient user-identification capabilities are also still relatively new tools that have not been fully investigated. Therefore, this research will incorporate findings from previous research on cognition and multimedia instructional design for learning into the development of a prototype called MiTabletop™, to create a novel approach for displaying text, multimedia and augmented reality inputs on a horizontal interactive surface, using a combination of strategies and devices to resolve the inherent difficulty of using conventional monitors and keyboards for digital collaborative learning. The idea is that an interactive multi-touch tabletop provides a test bed for exploring the very foundations of education and communication in problem solving during collaborative learning. This multi-touch tabletop (Mi-VHTabletop™) provides fluid interaction and a meaningful multi-display of content so as to become the users' true cognitive prosthesis (a substitute for eyes, etc.), based on visual cognition principles. Such a visual cognition environment is suitable for collaborative learning applications, where knowledge can be extracted easily through the multi-touch environment for presentation of information, brainstorming and the group decision making essential in the collaborative learning process. This paper highlights the use of the visualization tool embedded in the multi-touch tabletop to mine the data and discover meaningful patterns of interaction through sequential pattern mining. An evaluation of individual students in a small-group collaborative learning experience was conducted based on the learning material called 'Digital Mysteries', which was specifically developed to investigate the use of multi-touch tabletops in a face-to-face collaborative environment. Digital Mysteries was used on the multi-touch tabletop prototype with a Tangible User Interface (TUI), namely a multi-pen that looks and feels like a normal pen. Qualitative and quantitative analyses of student trial videos were conducted based on their work logs. Criteria that could be visualized to reflect students' 'cognitive' behaviour were tracked by the tool as well as by teachers (with an educational psychology background) to see whether the findings matched at the end of the trials; this would indicate the effectiveness of the visualization tool. The latest work applies Mi-VHTabletop™ to decision making for top military officers.


Customer Process Reference Model for Designing Innovative Service Encounters

By

Muriati Mukhtar, Norleyza Jailani and Yazrina Yahya

Service Science Research Unit Faculty of Information Science and Technology

Universiti Kebangsaan Malaysia, MALAYSIA Email:[email protected], [email protected],

[email protected]

Abstract

As the economy shifts from a goods-manufacturing paradigm to a service paradigm, service innovation is fast becoming an important competitive factor for any organization. The pressure to embark on service innovation is also felt by manufacturing firms, which are now forced to look into the integration of products and services. The advent of the Service Science paradigm has also pushed the area of services to the fore by proposing that there is no need for a demarcation between products (goods) and services, since products can be viewed as vehicles of service delivery. Service Science as a field has introduced a new way of thinking about organizations and their core competences. In this proposed paradigm, customers become a new source of competence for organizations. Based on the concept that all systems or organizations are in effect service organizations, a new logic, termed the service-dominant logic, is proposed. Under this logic, customers are viewed as active players in the co-creation of value of the products and services they acquire. Hence, under this proposition, the roles of organizations are changing: it is no longer enough to embed value in services and products; the role of organizations is also to ensure that the customer can extract value and hence experience the maximum benefit from using the product or service.


This means that organizations will have to craft innovative service encounters. Here, it is asserted that innovative service encounters are interactions or processes (between firms and customers) that facilitate customers' value extraction, in the customers' own realm, thus realizing value-in-use. However, it is not easy to realize these innovative service encounters, because of latent customer needs that are unknown to the developer of the product or service. In the effort to understand customer needs, values, behaviour and processes, many customer co-creation techniques and value co-creation models have populated the literature. Methods such as participatory methods, empathic design and personas are employed. These methods use customers as a resource for ideas and/or as partners in co-developing or co-producing products. The focus of these co-creation techniques is mostly on realizing customers' preferences for product attributes; in this context the customers are resources for co-creating the products. Besides these co-creation techniques, there are also co-creation models that take into consideration customers' use processes and their relationship to the customers' goals or objectives. An important example of a model in this category is the DART model, which articulates the importance of dialogue, accessibility, risks and transparency in the value co-creation process. Beyond the DART model, the importance of understanding customer goals is elaborated in Payne's value co-creation model. In addition, various in-situ methods have explored customers' in-use processes while they use particular products or services; these methods involve videos and journals to capture customers' feelings, problems and experiences in using a product or service. The literature thus shows the importance of including customers in designing products (via customer co-creation techniques), of understanding customers' use situations while using the product (in-situ methods), and of facilitating customers in extracting the value of the product or service (value co-creation models). Hence, in the quest to design new innovative service encounters, the customer must be the focal part of the value co-creation process in any service encounter. This implies the importance of understanding the customer's process while using a product or service. Thus, in this paper, it is proposed that a customer process reference model is needed. The proposed customer process reference model articulates the importance of several factors:

(i) Identification of customer goals, (ii) Identification of user context, (iii) Identification of customer’s use situation and (iv) Identification of operand and operant resources.

In operationalising the customer reference model, it is important to understand that customers' processes are responsible for extracting the required value from the products and/or services by integrating all available resources; customers are the active participants in value co-creation. The role of the firm is to facilitate the extraction or realization of value for each customer by judiciously mapping all of this information (gleaned from factors (i) to (iv)) into the design of innovative service encounters. The use of the customer reference model is demonstrated via several case examples. The first is the development of myXTVT, a prototype learning management system. Here, the relevant service encounters that facilitate value co-creation and enhance value-in-use must enable dialogue between all identified stakeholders (student–student, student–instructor, instructor–policy maker) in order for all stakeholders to realize their goals. Identifying and understanding users' context and use situation led to the design of encounters such as contextual notifications and alerts, chatting facilities, e-mails, forums and social network links. Access to relevant operant and/or operand resources is provided by enabling the learning management system to allow access to other academic sources on the Internet. Employing a widget-style interface, the learning management system enables customization and personalization for each individual user, thus making the teaching and learning experience unique to each individual student or other stakeholder. The second case involves the development of a new e-bidding system that complies with Islamic Sharia principles. Here, the identification of customers' goals formed the basis for the development. Realizing that the goals of Muslim bidders are markedly different from those of conventional bidders led to the development of an innovative agent-based e-bidding system that embeds altruistic features and social network features. The prototype, which is still in the development stage, utilizes an improved utility function and allows co-bidding of Halal products and services. These two cases demonstrate that the customer process reference model can assist in designing appropriate encounter processes as a basis for engaging customers and subsequently providing innovation for organizations. It is hoped that it will be a useful contribution to research and industry.


A Hybrid Approach for Semantic Similarity Measurement

By

Amirah Ismail Faculty of Information Science and Technology

Universiti Kebangsaan Malaysia, MALAYSIA Email: [email protected]

Abstract

The increasing volume of data available in the knowledge domain has contributed to the significant role of Information Retrieval Systems among ICT and other technologies. Researchers and software developers have been focusing on Information Retrieval Systems to keep pace with the information revolution and to make searching and retrieving information more efficient in meeting users' requirements. The collaboration between semantic systems and proper data structures has the potential to make semantic technology more accurate than traditional methods that rely on matching keywords. Semantic similarity measurement between concepts has become a significant component of most Information Retrieval (IR) systems. Measuring semantic similarity between vocabularies has become an important task in the development of Information Retrieval Systems focused on intelligent knowledge management applications, especially in the field of Information Extraction (IE) and following the advent of the Semantic Web. By formally defining semantic similarity, useful information can be obtained about concept similarity and compatibility. Measuring similarity among concepts is considered a quantitative measure of information; the computation of similarity relies on the relations and properties linking the concepts in an ontology. This research aims at three main objectives, namely: applying a new hybrid approach that combines the Depth Relative Measure method and the Information Content-based Measure method to compute semantic similarity between concepts; developing a knowledge-based system implementing the improved approach; and evaluating the knowledge-based system using the correlation between semantic similarity approaches and human assessment as an evaluation measure. The proposed matching approach also considers semantic distance relationships and properties shared between concepts when measuring semantic similarity. Semantic measurement between concepts is performed by the Depth Relative Measure and the Information Content-based Measure. The correlation between semantic similarity approaches and human assessment has been used as an evaluation method to test the performance of existing Information Retrieval Systems. The term semantic similarity refers to computing the degree of similarity among concepts, which need not be a lexical similarity but may be a conceptual similarity. Semantic similarity measurements determine similar concepts in a given ontology. Usually, similarity is calculated by mapping the target terms to the ontology and testing their relations within it. Detecting semantic similarity relations among concepts or entities is possible if these concepts are semantically linked or share some common attributes in the ontology. The main objective of measuring similarity among concepts is to provide strong approaches for standardizing content and delivering information over information and communication technology. Semantic similarity matching functions define the methods for comparing concepts and relating them within a given ontology. The information content measure depends on the amount of properties shared between two concepts. A hybrid method basically combines the shortest path and information content to improve the similarity measurement; the last approach we reviewed is the feature measure, which considers the common features as well as the specific differing features between concepts. The basic idea of a distance-based measure lies in choosing the shortest path among all possible paths between two concepts; this kind of measure assumes that the shorter the distance separating two concepts, the more similar they are.

The Depth Relative method finds the semantic distance between two concept nodes by shortest path length, but it also considers the depth of the edges connecting them in the overall structure of the ontology to quantify similarity. It calculates the depth (shortest path length) from the root of the taxonomy (ontology) to the target concept. The similarity degree is determined on the basis of this path and in general corresponds inversely with the path length. The Depth-Relative approach relies entirely on the is-a hierarchy. The computational measures of semantic similarity in this study employ the WordNet lexical database for the depth-relative method. WordNet differs from traditional lexicons in that the information is organized according to word meanings rather than word forms. As a result of this shift of emphasis towards word meanings, the core unit in WordNet is the synset. Synsets are sets of words that have the same meaning, so-called synonyms; one synset represents one concept, which may be expressed by different word forms. To compute similarity using Information Content (IC) approaches, we chose the Wikicorpus, which includes large parts of Wikipedia automatically enriched with linguistic information, as the database for measuring semantic similarity between concepts. Information Content-based (IC) approaches are also referred to as information-theoretic or corpus-based approaches. The knowledge discovered by corpus analysis is used to augment the information already existing in the taxonomies or ontologies. Two measures that include corpus statistics as an additional and qualitatively different knowledge source are presented. The notion of Information Content (IC) used by these approaches can be considered a measure quantifying the amount of information a concept expresses. Corpus-based approaches generally calculate the needed IC values by associating a probability with each concept in the taxonomy; these probabilities are based on word occurrences in a given corpus. The information content values of the concepts in the taxonomy range from 1 to 0.

Leaf-level concepts of the taxonomy have an information content value of 1, as they are maximally expressed and cannot be further differentiated, while the information content of the root concept, the most abstract concept, is 0. In our hybrid method we take advantage of the methods mentioned above in order to improve the quality of the similarity result, using WordNet as a computational resource and the Wikicorpus to compute Information Content values. We propose a combination of four semantic distance features and weights. We assume that the maximum depth of the WordNet taxonomy is 10 and that the Information Content value lies between 0 and 1. When the similarity result is near 1, the concepts share high similarity and many properties and the distance between them in the hierarchy is small; when the result approaches 0, the concepts share little similarity and the path between them is long enough to yield this low similarity. The evaluation results show significant semantic correlation values between the existing human assessment dataset and the newly developed hybrid approach for measuring semantic similarity between concepts.
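To make the combination concrete, here is a self-contained toy rendering of a depth-based measure blended with an information-content measure; the miniature taxonomy, the corpus counts and the equal weights are invented, whereas the actual approach uses WordNet and the Wikicorpus as described above.

```python
# Toy hybrid similarity: a Wu-Palmer-style depth measure combined with a Lin-style
# information-content measure over a miniature taxonomy.  Taxonomy, corpus counts
# and weights are invented; the paper uses WordNet and the Wikicorpus.
import math

PARENT = {"dog": "canine", "wolf": "canine", "canine": "animal",
          "cat": "feline", "feline": "animal", "animal": "entity", "entity": None}
CORPUS_COUNT = {"dog": 40, "wolf": 5, "cat": 35, "canine": 45, "feline": 35,
                "animal": 115, "entity": 120}
TOTAL = CORPUS_COUNT["entity"]

def ancestors(c):
    chain = [c]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain

def depth(c):
    return len(ancestors(c)) - 1              # root "entity" has depth 0

def lcs(a, b):                                # least common subsumer in the is-a hierarchy
    anc_b = set(ancestors(b))
    return next(c for c in ancestors(a) if c in anc_b)

def depth_similarity(a, b):                   # Wu-Palmer-style: 2*depth(lcs) / (depth(a)+depth(b))
    total = depth(a) + depth(b)
    return 2 * depth(lcs(a, b)) / total if total else 1.0

def ic(c):                                    # information content from corpus frequency
    return -math.log(CORPUS_COUNT[c] / TOTAL)

def ic_similarity(a, b):                      # Lin-style: 2*IC(lcs) / (IC(a)+IC(b))
    denom = ic(a) + ic(b)
    return 2 * ic(lcs(a, b)) / denom if denom else 1.0

def hybrid_similarity(a, b, w_depth=0.5, w_ic=0.5):
    return w_depth * depth_similarity(a, b) + w_ic * ic_similarity(a, b)

print(round(hybrid_similarity("dog", "wolf"), 2))   # close concepts score higher...
print(round(hybrid_similarity("dog", "cat"), 2))    # ...than more distant ones
```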


Semantic Harmonisation for Collective Connected Wisdom in Islamic Finance and Banking

By

Roslina Othman

Kulliyyah of Information and Communication Technology International Islamic University Malaysia, MALAYSIA

Email: [email protected]

Abstract

Semantic harmonisation calls for common conceptual model within a particular domain, with the inclusion of merging and alignments of sense-level dictionaries and thematic ontologies. Cultural and linguistic influences must be captured in both the dictionary and ontology. Concurrent views present in ontologies enable the conduct of sentiment analysis. Semantic harmonisation regulates the interconnections of standards between knowledge domain and languages. Islamic Finance and Banking is recognised under EPP7 Building Islamic Finance and Business Education discipline cluster, and EPP10 Becoming the Indisputable Global Hub for Islamic Finance, the Malaysian NKEA Education. Bank Negara Malaysia and International Islamic University Malaysia are the Champions. In 2007, Bank Negara Malaysia has compiled the shariah resolutions. There have been major programs, however, there were issues pertaining to the semantic of Islamic Banking and Finance discipline. The domain is still growing. Experts and learners of the Islamic Banking and Finance expressed their concern on the semantic issues particularly when it is introduced at the International arena, and thus posed the need to look into the semantic harmonisation with the conventional model. Thus

Collective connected wisdom is thus a prerequisite. Islamic Banking and Finance is a discipline championed by local intellectuals yet recognised for economic problem solutions in the international arena, and in a sharing environment semantic harmonisation is vital. The knowledge representation of Islamic Banking and Finance has been developed in the form of concepts that are in fact still annotations, and thus insufficient for reasoning in the Semantic Web (Konig et al., 2011). Other issues include terminology alignment between Islamic Banking and Finance and conventional Banking and Finance, further complicated when the two are implemented in parallel in the public community. There has been a problem with the coherence of Islamic Banking and Finance, as it has taken a multicultural approach (Fadyat, 2011), affecting semantic interoperability when domain communities of different cultures convene. A thorough semantic and epistemological investigation is required to show, inter alia, what usury (ribà) really means (Assaif, 2012). In addition, Islamic Banking and Finance is a discipline that evolves around issues, problems and solutions for a country and the public community, and yet must sustain itself as a discipline. Given the current demand for Islamic Banking and Finance educational programmes, training and consultancy, its growth enhances the sustainability of education in Malaysia. This research aims to produce a semantic harmonisation framework for collective connected wisdom, towards gaining a reputable regional centre for education in Islamic Banking and Finance. The framework is important in ensuring an appropriate flow of knowledge provision and a conducive environment for wisdom engagement. The main components of semantic harmonisation are semantic interoperability, trusted engagement, intellectual engagement, innovation engagement and knowledge technologies.

Semantic interoperability. Despite the remarkable growth and development of Islamic banks and financial institutions, their expansion in developed countries such as Australia is slow though steady, largely because of unfamiliarity with the Islamic financial system. While demand for Islamic financial services exceeds supply, many say that Islamic banks are not really different from conventional banks. Concerns now exist about the differences between the terms used in the Islamic financial system and the conventional one, and about the alignment of terms among the Islamic financial systems adopted in other countries; Muslims and non-Muslims alike do not understand what Islamic finance exactly is. This project looks at semantic interoperability for the collective connected wisdom of Islamic Banking and Finance as online content.

Trusted engagement. There does not appear to be a unified definition of an Islamic financial product. The key concern is that Islamic financial institutions define what is and what is not an Islamic financial product and interpret transactions differently, leading to ambiguity. This ambiguity prevents standardisation and causes difficulty for adopters and regulators who would like to know what they are authorising; it also calls for authenticity as the most important element of trust.

Intellectual engagement. Islamic Banking and Finance is more real-sector driven than conventional banking. However, Muslims and non-Muslims alike do not understand what Islamic finance exactly is. This ambiguity among the adopters and regulators responsible for authorising an Islamic financial product calls for education in the form of intellectual discourse and exchange, and for wisdom in providing solutions in the real sector. The issues lie not only in training students, but also in providing adequate knowledge resources that reach the industry in the international arena.

Knowledge technologies. Islamic Banking and Finance knowledge resources are widely available as academic and executive programmes; trainings and conferences; intellectual discourses and press conferences; news and announcements; and authorisations as shari’ah-compatible. However, these knowledge resources need to be captured and coded for use in distance education and e-learning, as well as for semantic interoperability and trusted engagement purposes.

Innovation engagement. Currently, Malaysia is lagging as a provider of Islamic Finance and Business education. The lack of national consensus on what constitutes a standard curriculum for Islamic Finance and Banking education has hindered the development of an internationally recognised professional certification for major markets such as the Middle East.

Research questions: What is semantic harmonisation? What is the appropriate methodological framework for assessing semantic harmonisation in Islamic Finance and Banking? The research objectives are to discover the semantic harmonisation issues relevant to Islamic Finance and Banking; to construct the methodological framework for assessing semantic harmonisation; and to assess its implementation. Research methods include ontology evaluation, content analysis and survey. Findings at this initial stage reveal that ontology merging and alignment require more detail on concepts and their implementation in the financial market, that the sense-level dictionary needs a comprehensive scope, and that culturally based opinions and decisions widen the gap within the domain itself; both the ontology and the dictionary continue to expand. Semantic harmonisation issues also include the foundation and basis of the domain. The methodological framework applied to assess the five main components works as needed at the broader level, but remains limited in its level of detail due to issues caused by alignment.

Semantic harmonisation in Islamic Finance and Banking faces issues related to scarcity of materials and human experts.

A Model of Software Testing as a Service (STaaS) in Cloud Computing: A Case of Knowledge Management Perspective

By

Rusli Abdullah

Faculty of Computer Science and Information Technology Universiti Putra Malaysia, MALAYSIA

E-mail: [email protected]

Abstract

With the emergence of cloud computing (CC), which offers everything as a service, the community of practice (CoP) of software testing (ST), including software testers, software developers, programmers, software users and many others, can take advantage of CC to work together and promote knowledge sharing collaboratively. This kind of service in the context of ST can be called Software Testing as a Service (STaaS). In this CC and ST environment, however, there is still a lack of models of STaaS for supporting the CoP in working collaboratively over the CC effectively and efficiently. Such efforts will contribute towards avoiding the repeated mistakes or errors that may cause software failure, since ST knowledge and experience are gained during the ST process within the software development process, as shown in Figure 1.

Figure 1: Knowledge of ST in software development process

In the CC environment, STaaS plays an important role in ensuring that the product delivered by the cloud service provider (CSP) meets the specification required by the cloud service recipient (CSR) for the particular software product. Hence, there is a need for the CoP to have a model of STaaS and to translate or adopt it into a tool, called a Collaborative Knowledge Management System (CKMS), for managing ST knowledge, especially best practices and lessons learnt, in the CC environment. The model of STaaS can be viewed as shown in Figure 2.

Figure 2: Collaborative KM as a System in Supporting Knowledge Sharing in a STaaS Environment

Furthermore, the discussion will emphasise how the model of STaaS can be developed and implemented in the CC environment to help a CoP offer its services in the ST process through the knowledge life cycle, or KM processes. The KM processes include knowledge acquisition, which involves gathering knowledge; knowledge storing, which involves knowledge ontology and codification; knowledge dissemination, which involves knowledge alerts and notification; and knowledge application, which involves knowledge searching and utilisation in order to promote knowledge sharing.

These processes also help in overcoming shortcomings or failures of the software, especially those found during ST on the CC platform. Therefore, by applying the KMS model to manage STaaS knowledge, the CoP can maximise the utilisation of that knowledge and thus enhance the delivery of services effectively and efficiently for the benefit of the CoP, whilst adhering to the required quality of service (QoS).
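As a purely illustrative sketch (not part of the authors' model), the KM life cycle described above could be organised as a simple service interface; the class name, method names and sample data below are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class CollaborativeKMS:
    """Illustrative Collaborative KMS for STaaS: acquire, store, disseminate, apply."""
    repository: Dict[str, dict] = field(default_factory=dict)   # knowledge storing
    subscribers: List[str] = field(default_factory=list)        # CoP members to notify

    def acquire(self, key: str, lesson: str, tags: List[str]) -> None:
        # Knowledge acquisition: capture a best practice or lesson learnt from testing.
        self.repository[key] = {"lesson": lesson, "tags": tags}

    def disseminate(self, key: str) -> List[str]:
        # Knowledge dissemination: alert and notify the community of practice.
        return [f"notify {member}: new ST knowledge '{key}'" for member in self.subscribers]

    def apply(self, tag: str) -> List[str]:
        # Knowledge application: search stored knowledge for reuse in a new test cycle.
        return [item["lesson"] for item in self.repository.values() if tag in item["tags"]]

kms = CollaborativeKMS(subscribers=["tester-a", "developer-b"])
kms.acquire("timeout-bug", "Re-run flaky cloud tests with exponential backoff", ["cloud", "flaky"])
print(kms.disseminate("timeout-bug"))
print(kms.apply("flaky"))
```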

Records and Information Management vs Knowledge Sharing: Is There a Possibility of Harmonizing the Conflicting Interest?

By

Zawiyah M. Yusof

Faculty of Information Science and Technology Universiti Kebangsaan Malaysia, MALAYSIA

Email: [email protected]

Abstract

An organization needs to be well informed about the information it creates and possesses; this helps it meet the challenges it faces, and it is where information management can help the organization fulfil its strategy. Becoming strategically aware enables an organization to consider the strategic value of its information, to show how valuable information is and how important it is for information to be managed strategically, and to see how information can make a difference and help meet the organization's strategic goals. Information management plays a significant role in creating a more effective and efficient organization by recognizing how information resources can be positioned to help fulfil strategic objectives, since information management strategy and corporate strategy are increasingly interdependent; a corporate strategy developed without consideration of the opportunities presented by the strategic use of information could be seriously deficient. What an organization needs most is strategic information that is not in the possession of its rivals or competitors. This information is not creative in nature (creative information is meant for sharing) but rather information not meant for disclosure, with restricted circulation and authorized access. Such information is contained in records.

A record, in this sense, is domesticated information: information captured on an external storage device within the context of a particular institution and instantiating a particular type. It is internally generated information, unique in nature since it is possessed only by the organization that creates it, and it is created as a by-product of the transactions carried out by the organization. Records are a strategic resource that gives many untold advantages. Some are vital and should not be made available for access, particularly to unauthorized parties. Because some information embedded in records is valuable and strictly protected, competitors resort to various techniques and tactics to gain access to it, such as social engineering, reverse engineering and other forms of intelligence gathering, which indicates that strategic information (the records) is not meant for sharing. A record is recorded information, regardless of medium or characteristics, created or received by an organization in pursuance of legal obligations or in the transaction of business. During the first decade of the twenty-first century, organizations aspiring to manage records and information assets across the enterprise embraced the concept of information governance, with accountability as its major element: accountability to the laws, regulations and standards governing records and information. The fundamental records management principles have been called upon to serve as the foundation for sound information governance, since information governance is an integrated strategic approach to managing, processing, controlling, archiving and retrieving information as evidence of all transactions of the organization. Thus, what records and information are needed to support business processes, what steps must be taken to comply with governing laws and regulations, and what records and information should be destroyed and when, are central to information governance. Regulatory compliance is also required to safeguard records (both electronic and physical), shield the organization from unnecessary risks, and help control costs. Thus, this information should not be unnecessarily shared or disclosed to unauthorized parties.

It is strategic information that attracts the interest of information intelligence and espionage. Such information calls for strategic and efficient management, restricted access, security measures, policies, procedures and standards, and legal compliance. Knowledge sharing, on the other hand, promotes openness. It is an activity through which knowledge (information, skills or expertise) is exchanged among people, communities and organizations; a deliberate act that makes knowledge reusable by other people through knowledge transfer. In a broader context, knowledge sharing refers to the communication of all types of knowledge, including explicit knowledge or information, know-how and know-who, and tacit knowledge in the form of skills and competency. This implies that the span of the activity is without limit, and indicates that the information to be shared normally revolves around non-strategic information, which is harmless to the originating organization if accessed by an unauthorized party. Defined as a collection of data, ideas, thoughts or memories, information is a fact provided or learned about something; in the computer environment, information is normally conveyed or presented in a particular arrangement or sequence (e.g. data as processed, stored or transmitted by a computer). Knowledge, on the contrary, constitutes a valuable intangible asset for creating and sustaining competitive advantage within organizations. This implies that knowledge sharing should be restricted to within the organization; if that were the accepted common practice, then the knowledge sharing initiative, as defined earlier, would never materialize. Furthermore, employees generally believe that hoarding knowledge benefits them more than sharing it. An organization should first identify what information can be shared. If it is creative information, which is collected, organized and made available for access by anybody who needs it, then knowledge sharing is not a 'big' concept; the value of creative information is not as vital as that of privileged information. Providing access to creative information is not at all burdensome compared with privileged information such as financial information and trade secrets, whose disclosure is closely associated with risk.

Such disclosure can result in litigation. Creative information is not subject to any rule of compliance; it can be destroyed at any time without consulting the laws and regulations relating to it. By contrast, an organization must document and follow its policies and procedures if it wishes to demonstrate that it has disposed of information, especially potentially discoverable information, in the normal course of good-faith business operations. Thus, if knowledge sharing concerns privileged information, it is likely to fail. For knowledge to be shared, it must first be captured, and captured knowledge or information has been the concern of records and information management; tacit knowledge is disqualified here. The concept of the knowledge sharing initiative was not coined within the context of information governance: the initiative is promoted merely to encourage the sharing of tacit knowledge and expertise with the aim of improving the organization's performance. This shows that knowledge sharing is possible within an organization. However, despite tremendous efforts, the initiative remains unpopular and has not proven fruitful, since knowledge sharing is seen as a way of losing status and power. Moreover, if any category of information and knowledge is to be shared, then this is a sign of disaster. Knowledge sharing is not a cure for an organization's problems relating to information insufficiency; its materialization is only utopian in character.

Prediction Model Using GA-NN for Medium Density Fiberboard Testing

By

Nordin Abu Bakar and Faridah Sh Ismail

Faculty of Computer and Mathematical Sciences Universiti Teknologi MARA, MALAYSIA

Email: [email protected], [email protected]

Abstract

Medium Density Fiberboard (MDF) is a panel board made from fiber that has been used in the furniture industry for the past three decades. It is an alternative to solid wood, which is known as a strong and sturdy material; even so, MDF is just as popular for its stability during weather changes. The success of MDF is also due to its smoother surface, freedom from knots and lower cost compared with solid wood. MDF fiber resources come mainly from wood leftovers, which makes it environmentally friendly. Malaysia is one of the world's top MDF exporters, with an annual total production of more than two million cubic meters; among the main sources of fiber used in Malaysia are forest wood, rubber wood and oil palm biomass. Being a non-solid engineered wood panel, MDF needs to be tested against a set standard to ensure board strength. This is a very important step in determining the level of quality based on the board's properties. According to the Malaysian Palm Oil Board pilot plant, there are four test procedures that produce eight properties, and these tests must conform to standards set by the British Standard European Norm (BS EN): two test procedures each for the mechanical and physical aspects of the board. The mechanical testing procedures consist of the Internal Bonding test and the Bending Strength test.

The physical tests are the Thickness Swelling (TS) test and the Moisture Content (MC) test. Mechanical tests produce properties related to tensile and flexural capabilities, while physical tests focus on water resistance and moisture features that can be detected through height and weight changes. Testing for accurate properties is important in determining MDF quality. The time spent on physical tests is longer than on mechanical tests: the TS test takes up to 24 hours, while the MC test needs 48 hours. Supervised learning is a machine learning technique that learns from previous trends to produce classifications or predictions. Artificial Intelligence (AI) offers approaches for such automation problems; among others, Neural Networks (NN) and Genetic Algorithms (GA) are AI techniques well accepted in supervised learning settings. This research produces a prediction model, using a hybrid GA-NN, which aims to reduce the time spent on the testing procedures. A neural network imitates the human brain to learn and make decisions. NN is a classical and preferred tool for many predictive data mining applications because of its power, flexibility and ease of use, and is a popular choice for prediction in areas such as weather forecasting, stock market forecasting and medical diagnosis, including agriculture-based areas; the NN approach has also been applied in the engineered wood industry, including MDF prediction. The initial step is to understand the data and identify potential independent variables for the model. Correlation analysis provides initial knowledge of the relationship between independent and dependent data; a higher coefficient of correlation indicates a stronger relationship between the two. The independent data come from the mechanical properties, which require much simpler and quicker tests. The physical properties are the dependent data, also known as the targets. Since physical test procedures require a longer time, they will be replaced by model simulation.

Various NN topologies could be considered for the research problem, such as multi-input with single output and multi-input with multi-output. Problems arise when separate models are built for each target, because different targets require different training criteria and rescaling methods, and a different activation function would have to be chosen for each layer in each single-output network, which is inappropriate and messy. A more suitable solution is a multi-output model. A multilayer perceptron NN has at least three layers, an Input Layer, a Hidden Layer and an Output Layer, connected by weight vectors. Being a supervised learning machine, the network learns from previous knowledge: the independent data are fed through the Input Layer, and the Hidden Layer processes the input to match the targets in the Output Layer. One hidden layer is sufficient and has proven able to produce excellent results, even though more are allowed. This research uses a three-output network and therefore needs methods that suit all targets. The sigmoid activation function was chosen for its suitability to the research data. The learning rate was set to the reciprocal of the epoch, as suggested by other research, which produced a lower error than a fixed rate. The epoch cycle continues for 1000 cycles, even when no further improvement in error reduction is observed, to allow for any remaining chance of RMSE reduction. A traditional and popular weight optimizer for the multilayer perceptron NN is the back propagation algorithm, in which the error is propagated back through the network to find a suitable reduction rate as the network learns in each epoch cycle. However, back propagation often gets trapped in a local optimum, which causes the improvement to stop at one point, known as early convergence. To overcome the local optimum problem, GA is added as the weight optimizer in place of back propagation.

GA is known as an excellent optimizer through its crossover and mutation operators, which carry out reproduction at fixed rates. Crossover was set at 0.6 and mutation at 0.01; the rates were selected by comparing various possible rates in search of the best. The generation size was 50, each generation containing a population of 100. RMSE is used as the fitness value stored with each member of the population. The fitness values were sorted so that the fittest chromosomes appear first and the worst last; the two best sets of chromosomes in the current generation are selected as parents, easily identified as the first and second chromosomes. Reproduction by these parents produces offspring, and the new offspring replace the two weakest chromosomes. These operators ensure that the offspring carry better sets of chromosomes, giving a better population in consecutive generations. Results show that the hybrid model produced a very low RMSE, converging at epoch 250 with an RMSE of 8.0E-16. A multilayer perceptron NN is a suitable supervised learning technique for an MDF testing prediction model, and GA is able to replace back propagation as the weight optimizer of the NN. The hybrid model also improves efficiency: the results show that the model is capable of reducing testing time while maintaining the quality standard set, and prediction using GA-NN produces significantly low error, which suggests accuracy and reliability. Further improvement of the GA using an adaptive mechanism is recommended.
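The sketch below shows one way a GA can act as the weight optimizer of a multilayer perceptron, using the crossover rate (0.6), mutation rate (0.01), population size (100) and number of generations (50) quoted above. It is not the authors' implementation: the network size, the toy data standing in for the MDF properties, and the uniform-crossover and Gaussian-mutation scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(weights, X, n_in, n_hid, n_out):
    """Single-hidden-layer MLP whose weights come from one flat GA chromosome."""
    w1 = weights[:n_in * n_hid].reshape(n_in, n_hid)
    w2 = weights[n_in * n_hid:].reshape(n_hid, n_out)
    return sigmoid(sigmoid(X @ w1) @ w2)

def rmse(weights, X, Y, dims):
    pred = forward(weights, X, *dims)
    return np.sqrt(np.mean((pred - Y) ** 2))

def ga_train(X, Y, n_hid=10, pop_size=100, generations=50,
             crossover_rate=0.6, mutation_rate=0.01):
    n_in, n_out = X.shape[1], Y.shape[1]
    dims = (n_in, n_hid, n_out)
    n_genes = n_in * n_hid + n_hid * n_out
    pop = rng.normal(size=(pop_size, n_genes))          # random initial weight chromosomes
    for _ in range(generations):
        fitness = np.array([rmse(ind, X, Y, dims) for ind in pop])
        pop = pop[np.argsort(fitness)]                   # lower RMSE = fitter, placed first
        p1, p2 = pop[0], pop[1]                          # two best chromosomes as parents
        mask = rng.random(n_genes) < crossover_rate      # uniform crossover
        child1 = np.where(mask, p1, p2)
        child2 = np.where(mask, p2, p1)
        for child in (child1, child2):                   # mutation at a small fixed rate
            m = rng.random(n_genes) < mutation_rate
            child[m] += rng.normal(size=m.sum())
        pop[-2], pop[-1] = child1, child2                # offspring replace the two weakest
    return pop[0], rmse(pop[0], X, Y, dims)

# Toy data standing in for the mechanical (inputs) and physical (targets) properties
X = rng.random((40, 5)); Y = rng.random((40, 3))
best_weights, best_rmse = ga_train(X, Y)
print("best RMSE:", best_rmse)
```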

An Enhanced Resampling Technique for Imbalanced Datasets

By

Maisarah Zorkeflee, Aniza Mohamed Din and Ku Ruhana Ku-Mahamud

Universiti Utara Malaysia, MALAYSIA

Email: [email protected], [email protected], [email protected]

Abstract

Imbalanced data sets occur when the number of samples in one class is very low compared with the other class. In binary classification, the class that contains more instances is known as the majority class and the other as the minority class. The issue commonly associated with imbalanced data is poor classification performance, due to the tendency of classifiers to ignore samples that belong to the minority class (Lin & Chen, 2012). To overcome the problem, several methods have been proposed at the data-based level and at the algorithm-based level. The aim of the data-based level approach is to modify the ratio of imbalanced data before the data are trained (Chairi, Alaoui, & Lyhyaoui, 2012); its advantage is its independence from the classifier (Lopez, Fernandez, Garcia, Palade and Herrera, 2013). In the algorithm-based level approach, either a new algorithm is created or an existing classification algorithm is improved so that it can recognise the minority class; this approach is complicated to handle and performs effectively only under certain circumstances (Sahare & Gupta, 2012). In contrast, the data-based level approach is easier to handle because the data sets are modified to produce balanced data sets before the classifier is trained. Therefore, this study focuses on techniques grouped under the data-based level approach.

Resampling techniques fall under the data-based level approach and are divided into undersampling and oversampling. Undersampling removes samples from the majority class, while oversampling adds samples to the minority class; however, these two techniques may lead to the loss of potentially useful data and to overfitting. A number of studies have found that the loss of potential data during undersampling arises from ambiguity and bias, so this study introduces fuzzy logic for its ability to address these issues. To improve classification accuracy, the enhanced undersampling is then combined with oversampling; according to several studies, combining resampling techniques produces better classification accuracy than a standalone technique (Li, Zou, Wang, & Xia, 2013). Four phases are needed to achieve the objectives of this study. The first phase is data cleaning, in which outliers are removed from the imbalanced data. In the second phase, the undersampling technique is enhanced by introducing fuzzy logic into the algorithm: a membership function is computed to determine which samples in the majority class should be removed, which is expected to reduce the loss of useful data. After the enhancement phase, the enhanced undersampling technique is combined with oversampling: the imbalanced data sets undergo undersampling to remove selected samples from the majority class, and new instances are then created in the minority class. This combination of two techniques is expected to produce better classification accuracy than a standalone technique. Finally, the enhanced undersampling technique will be compared with existing standalone techniques, namely the Synthetic Minority Over-sampling Technique (SMOTE) and the Distance-based Undersampling (DUS) technique, and the new combined technique will be compared with the combination of SMOTE and DUS. The comparison is based on classification accuracy, and classification performance is evaluated using the ROC curve, AUC and G-mean.

In this study, the undersampling technique is enhanced to overcome the removal of potentially useful data from the majority class, which leads to inaccurate classification. Fuzzy logic is proposed for the undersampling algorithm to address ambiguity and bias, and is expected to reduce the loss of useful data. A combined resampling technique is developed by integrating the enhanced undersampling with oversampling, and is expected to produce better classification accuracy than standalone resampling. All the proposed techniques will be evaluated by comparing their classification accuracy with existing resampling techniques using several performance metrics, such as the ROC curve, AUC and G-mean.
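To make the combined undersampling/oversampling idea concrete, the following sketch pairs a fuzzy-membership-based removal of majority samples with a simple interpolation-based oversampling step. The membership function, the keep threshold and the interpolation step are illustrative placeholders and not the authors' proposed algorithm or SMOTE/DUS themselves.

```python
import numpy as np

rng = np.random.default_rng(1)

def fuzzy_undersample(X_maj, keep_threshold=0.4):
    """Remove majority samples with a low (illustrative) fuzzy membership.

    Membership here is based on distance to the majority-class centroid:
    samples far from the centroid (potentially ambiguous or noisy) receive low
    membership and are removed. The actual method may define membership differently.
    """
    centroid = X_maj.mean(axis=0)
    dist = np.linalg.norm(X_maj - centroid, axis=1)
    membership = 1.0 - (dist - dist.min()) / (dist.max() - dist.min() + 1e-12)
    return X_maj[membership >= keep_threshold]

def interpolate_oversample(X_min, n_new):
    """SMOTE-like step: create synthetic minority samples by interpolating pairs."""
    idx_a = rng.integers(0, len(X_min), n_new)
    idx_b = rng.integers(0, len(X_min), n_new)
    lam = rng.random((n_new, 1))
    return X_min[idx_a] + lam * (X_min[idx_b] - X_min[idx_a])

# Toy imbalanced data: 100 majority samples vs 10 minority samples
X_maj = rng.normal(0.0, 1.0, (100, 4))
X_min = rng.normal(2.0, 1.0, (10, 4))

X_maj_reduced = fuzzy_undersample(X_maj)
X_min_augmented = np.vstack([X_min, interpolate_oversample(X_min, len(X_maj_reduced) - len(X_min))])
print(len(X_maj_reduced), len(X_min_augmented))  # roughly balanced classes
```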

Machine Learning for Big Data Problems

By

Siti Mariyam Shamsuddin and Shafaatunnur Hasan

UTM Big Data Centre, Universiti Teknologi Malaysia, MALAYSIA

Email: [email protected] and [email protected]

Abstract

The volume of data being produced is increasing at an exponential rate due to our unprecedented capacity to generate, capture and share vast amounts of data. Machine Learning (ML) algorithms can be used to extract information from these large volumes of data; however, they are computationally expensive, with costs usually proportional to the amount of data being processed, and therefore often demand prohibitive computational resources when facing large volumes of data. As problems become increasingly challenging and demanding (in some cases intractable for traditional CPU architectures), the toolkits supporting ML software development often fail to meet expectations in terms of computational performance. The scientific breakthroughs of the future will therefore undoubtedly be powered by advanced computing capabilities that allow researchers to manipulate and explore massive datasets [1]. GPUMLib is an open source machine learning library developed by Lopes and Ribeiro [2] on the CUDA architecture; its aim is to provide building blocks for the development of efficient GPU ML software. GPUMLib offers several advantages, such as easing the adoption of soft computing methods, particularly neural network algorithms, and fast detection of errors. In this paper, we describe our experience in developing one machine learning algorithm, the Self-Organizing Map (SOM), to handle data with large numbers of dimensions and features on the GPUMLib platform.

The remainder of this paper is organized as follows: Section 2 discusses the GPU machine learning library (GPUMLib) for the SOM network; Section 3 provides the experimental protocol of the GPUMLib implementation of the SOM network; Section 4 presents the experimental results and discussion, followed by the conclusion of the study in Section 5. In this study, we develop the SOM algorithm with parallel implementation of the distance computation and the BMU search using GPUMLib on the GPU (host and device) and on the CPU (host only). For better representation, the implementation is depicted in Fig. 1. Basically, the input data and the weights are initialized randomly on the host side, while the Best Matching Unit (BMU) search is implemented on the device side. In this process, memory is allocated on both sides (host and device) and data are transferred from host to device and back. For instance, the weights and input data variables are defined in a HostMatrix (host side) and a DeviceMatrix (device side), followed by the launch of the ComputeDistances kernel; this kernel calculates the sum of squared distances between the input data and the weights, i.e. the Euclidean distance. The reduction framework, the MinIndex kernel, is then launched: the reduction synchronizes the threads in order to find the minimum value, the BMU (x, y), and the result of each block is written to global memory. The minimum values are copied back to the host for updating the weights, and the loop continues until the termination criterion is satisfied, after which the result is displayed. In contrast, in the CPU (host-only) implementation all processes, from reading the input data to displaying the output, are executed on the host; the distance and BMU are computed in the BestMatchingUnit() function without transfer to the device.
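For readers unfamiliar with the computation that the ComputeDistances and MinIndex kernels parallelise, the NumPy sketch below shows a host-only reference version of one SOM training pass (distance computation, BMU search and weight update). The map size, learning-rate schedule and neighbourhood radius are illustrative values, not the settings used in the experiments, and the GPU version distributes the distance computation and the min-reduction across the map nodes instead of looping on the CPU.

```python
import numpy as np

rng = np.random.default_rng(0)

def som_train(X, map_rows=10, map_cols=15, epochs=100, lr0=0.5, sigma0=3.0):
    """Reference (host-only) SOM training loop: distances, BMU search, update."""
    n_features = X.shape[1]
    weights = rng.random((map_rows, map_cols, n_features))      # random initial codebook
    grid = np.stack(np.meshgrid(np.arange(map_rows), np.arange(map_cols),
                                indexing="ij"), axis=-1)         # node coordinates
    for epoch in range(epochs):
        lr = lr0 * np.exp(-epoch / epochs)                       # decaying learning rate
        sigma = sigma0 * np.exp(-epoch / epochs)                 # shrinking neighbourhood
        for x in X:
            # Sum of squared (Euclidean) distances: the ComputeDistances step
            dist = np.sum((weights - x) ** 2, axis=2)
            # BMU search: the MinIndex reduction step
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Neighbourhood-weighted update of the codebook vectors
            grid_dist = np.sum((grid - np.array(bmu)) ** 2, axis=2)
            h = np.exp(-grid_dist / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
    return weights

# Toy data standing in for a normalized biomedical dataset (samples x features)
X = rng.random((72, 50))
codebook = som_train(X, epochs=10)
print(codebook.shape)   # (10, 15, 50)
```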

Figure 1: SOM with GPUMLib implementation for training on the host (CPU) and device (GPU)

High-dimensional biomedical datasets related to classification, including gene expression data, protein profiling data and genomic sequence data, are used in this study, as shown in Table 1. The Leukemia training dataset consists of 38 bone marrow samples, categorized as 27 Acute Lymphoblastic Leukemia (ALL) and 11 Acute Myeloid Leukemia (AML), over 7129 probes from 6817 human genes; 34 testing samples are also provided, with 20 ALL and 14 AML [24]. The prostate cancer training set contains 52 prostate tumor samples and 50 non-tumor (normal) samples with 12600 genes, while the testing set consists of 25 tumor and 9 normal samples [25]. The proteomic patterns for ovarian cancer were generated by mass spectroscopy and comprise 91 normal samples and 162 ovarian cancers; the raw spectral data of each of the 253 samples contains 15154 identities [3]. All datasets are normalized within the range of 0 to 1.

Various SOM mapping structures are used for each dataset, and the number of weight vectors to be updated is given in Table 1. With a 10 x 15 mapping structure, the number of weight vectors to be updated is 76,993,200 for the leukemia dataset, 257,040,000 for prostate cancer and 575,094,300 for ovarian cancer. For node updating on the leukemia dataset, for example, 76,993,200 weight vectors must be handled (10 x 15 = 150 nodes x 72 instances x 7129 attributes), and each node requires a feature vector computation of 513,288 (72 x 7129).

Table 1: Kent Ridge Biomedical Dataset Repository (source: http://datam.i2r.a-star.edu.sg/datasets/krbd/)

No  Dataset                         No. of Samples    No. of Features   Class Name
1   Leukemia (76,993,200)           72 (513,288)      7129              ALL, AML
2   Prostate Cancer (257,040,000)   136 (1,713,600)   12600             Tumor, Normal
3   Ovarian Cancer (575,094,300)    253 (3,833,962)   15154             Tumor, Relapse, Normal, Non-Relapse

(Figures in parentheses after each dataset are the weight vectors to update for the 10 x 15 map; those after the sample counts are samples x features per node.)

In this study, the SOM algorithm is executed on NVIDIA Tesla C2075 graphics hardware and on an Intel Xeon high performance computer, and is tested on the high dimensional biomedical datasets (Leukemia, Prostate Cancer and Ovarian Cancer). The SOM algorithm is set up for 1000 iterations with three different mapping sizes. The prostate cancer, ovarian cancer and leukemia datasets are treated as having large, medium and small feature dimensions respectively, while the SOM mapping sizes (5x10, 10x10 and 10x15) are labelled small, medium and large.


In this experiment, the SOM mapping dimensions are categorized as mapping 1 = 5x10, mapping 2 = 10x10 and mapping 3 = 10x15, while the number of hidden nodes is set to 100 for the first hidden layer and 15 for the second hidden layer (see Table 2). Larger mapping dimensions, numbers of hidden nodes, iteration counts and feature dimensions all lead to slower computation times for both implementations. Since the computational time depends on these parameters, we evaluate the SOM algorithm with the same datasets, number of nodes and number of iterations on both platforms. The SOM on the GPU runs approximately three times faster for all datasets, as depicted in Fig. 2.

Table 2: SOM Speed Performance (Max Epoch = 1000)

Dataset           Measure    Mapping 1 (5x10)   Mapping 2 (10x10)   Mapping 3 (10x15)
Leukemia          CPU time   356.436 s          533.726 s           974.418 s
                  GPU time   115.441 s          207.574 s           301.79 s
                  Speedup    3.087603 x         2.57126 x           3.228795 x
Prostate Cancer   CPU time   1621.65 s          2618.41 s           4081.941 s
                  GPU time   660.474 s          1118.06 s           1565.038 s
                  Speedup    2.455275 x         2.34192 x           2.608206 x
Ovarian Cancer    CPU time   3455.925 s         6354.214 s          9116.06 s
                  GPU time   1086.895 s         2061.42 s           3166.077 s
                  Speedup    3.179631 x         3.082445 x          2.879292 x

Figure 2: SOM Speed Analysis

In this study, we found that the results are proportional to the mapping size of the SOM architecture and the feature dimensions of the dataset: the larger the mapping size and feature dimensions, the slower the computation time on the CPU, but much less so on the GPU. This is because the SOM's cost depends on the size of the mapping (number of nodes), the feature dimensions of the dataset, the number of input samples, and the termination criterion (number of iterations or convergence rate). The GPU-based SOM algorithm performs about three times (3x) faster than the CPU. However, the SOM's speed could be improved further by also parallelising the weight update, and several maps could be created for different data and later merged into a bigger map.

In the future, SOM-GPUMLib will be shared publicly to complement the existing GPUMLib tool.

Inductive Proof of Invariant Properties for State Machines With Search

By

Kazuhiro Ogata

School of Information Science Japan Advanced Institute of Science and Technology, JAPAN

Email: [email protected]

Abstract

A combination of bounded model checking and theorem proving can find a counterexample located at a deep position from a given initial state that cannot be found by bounded model checking alone. In general, however, we need to write two different types of system specification for model checking and theorem proving. One possible way to alleviate the situation is to translate one type of system specification into the other, but subtle errors might be introduced by the translation (or the translator). It would therefore be preferable if one type of system specification could be used for both model checking and theorem proving. Rewrite theory specifications, which are often used for model checking, are one promising type of system specification that could also be used for theorem proving. The talk uses an authentication protocol as an example to describe theorem proving based on rewrite theory specifications.
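As a purely illustrative sketch of the underlying idea (not the rewrite-theory machinery used in the talk), the following code checks an invariant of a toy state machine in two ways: a bounded search from the initial state that looks for counterexamples, and a one-step induction over all states satisfying the invariant. The state machine, invariant and bound are made up for the example.

```python
from collections import deque

# Toy state machine: a counter modulo 12 that always steps by 2 or 4.
INIT = 0

def successors(state):
    return [(state + 2) % 12, (state + 4) % 12]

def invariant(state):
    return state % 2 == 0          # "the counter is always even"

def bounded_search(depth):
    """Bounded model checking: search up to 'depth' steps for a violation."""
    frontier, seen = deque([(INIT, 0)]), {INIT}
    while frontier:
        state, d = frontier.popleft()
        if not invariant(state):
            return state            # counterexample found
        if d < depth:
            for nxt in successors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, d + 1))
    return None                     # no violation within the bound

def inductive_proof(state_space):
    """Induction: base case on the initial state, one step from every invariant state."""
    assert invariant(INIT)                                        # base case
    for state in state_space:
        if invariant(state):
            assert all(invariant(n) for n in successors(state))   # inductive step
    return True

print(bounded_search(depth=20))      # None: no counterexample within the bound
print(inductive_proof(range(12)))    # True: the invariant holds inductively
```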
