
An Introduction to Intelligent Operating System KZ2

Xie Li, Du Xing, Chen Jun, Zheng Yuhua, Sun Zhongxiu Computer Science Department, Nanjing University

Nanjing, 210093, P.R. China

Abstract

This paper proposes an intelligent operating system, KZ2, a new-generation OS that manages the resources of massively parallel computing systems, provides a friendly human-computer interface, and controls the execution of programs based on knowledge processing and parallel processing. A knowledge-based demonstrative example is also described to show how KZ2 improves the functions and performance of existing operating systems.

CR Categories and Subject Descriptors: D.4.7 distributed systems, D.4.1 scheduling, C.2.1 network communication, I.2.1 natural language interface, D.1.6 logic programming, I.2.11 artificial intelligence, distributed. General Term: knowledge processing.

1. Introduction

Historically, operating systems (OS) have passed through three generations: batch, time-sharing, and distributed OSs. With the rapid decrease of hardware costs, it has become possible to develop massively parallel computing systems (MPCS) for high performance computing. The characteristics of an MPCS are a distributed architecture with a massive number of processors, massive parallel capability for computing and knowledge processing, and intelligent man-machine interaction devices. However, existing OSs cannot meet this new challenge. Their human-computer interfaces are not as friendly as expected. Their schedulers lose efficiency when managing a massive number of processors. In addition, they provide no capability for knowledge processing and parallel processing.

Fleisch [7] and Blair [1] have recognized the problems and have proposed "intelligent OS" and "knowledge-based OS" respectively to solve them. Many experimental systems have also been developed along these directions, such as UC [14] for improving OS interfaces, [10][12] for managing distributed systems, intelligent OS [5] for image understanding, and the PIMOS [4] for Parallel Inference Machine. Unfortunately, all of them are designed to attack special problems for specific applications.

This paper proposes an intelligent operating system, which is a new generation OS to manage the resources of MPCS, to provide a friendly human-computer interface, and to control the execution of programs based on knowledge processing and parallel processing. Based on these concepts, an experimental intelligent OS called KZ2 has been implemented. At present, the main functions of KZ2 include intelligent task scheduling, an intelligent man-machine interface, and a mechanism supporting parallel/distributed processing and knowledge processing. It has the following characteristics:

• multi-computer based computing. KZ2 supports distributed computing and parallel processing based on multi-computer systems in which a massive number of computers are connected, loosely coupled or tightly coupled, via networks to co-operatively support the execution of applications.

• knowledge based processing. KZ2 utilises knowledge to manage the resources of the system and to control the man-machine communication. In addition, a mechanism for efficient knowledge processing is also available for system and application use.

• intelligence. KZ2 uses artificial intelligence techniques to solve problems, particularly the indeterminacy encountered in the resource management of MPCS, so that the system behaves more intelligently. The learning capability of the system may also allow it to improve its own functions and performance.

• novice user-oriented. KZ2 is novice user-oriented. Parallel/distributed processing, which is to some extent difficult for the novice user to master, can be made transparent to the user. In addition, a natural language interface is provided so that the user need not be familiar with the command languages of OSs.

The remainder of the paper is organised as follows. In Section 2, the architecture of KZ2 is introduced. Sections 3, 4, and 5 describe in detail the originality and design of the three main components of KZ2. The experimental environment and implementation details are outlined in Section 6. Section 7 gives a glimpse of the performance of KZ2. A demonstrative example is presented in Section 8 as a case study to illustrate how the components of KZ2 are integrated to support the execution of applications and improve their performance. Section 9 concludes with a summary and future work.

2. Architecture

2.1 Model of MPCS

A client-server based model of MPCS is proposed in this section, in which a massive number of intelligent workstations (IW) and intelligent servers (IS) are interconnected via networks (Fig.1). The IW is a basic processing unit and the IS is a parallel execution mechanism for a special purpose such as inference, computing, or image recognition. An MPCS based on the model may be composed of thousands of IWs and many ISs. The model is a shared-nothing, heterogeneous parallel/distributed architecture which provides two levels of parallelism. On the first level, multiple tasks are distributed to different IWs so that they are executed in parallel. On the second level, some frequently used special services of the system are implemented in parallel on multi-processor ISs. Consequently, an application can use them without considering whether they are executed sequentially or in parallel, and their underlying parallel execution may speed up the application. Moreover, this model is scalable because IWs and ISs can be added and removed arbitrarily.

Fig. 1 (ISs are multiple-processor machines and IWs are uni-processor machines, all connected by the network)


2.2 Structure of KZ2

In order to break the barrier between MPCS hardware and the user/application, KZ2 adopts a two-perspective strategy: kernel out and shell in. At present, in its initial stage, KZ2 includes mainly a scheduler for efficient use of processor resources, and a natural language user interface for easy manipulation of the system as well as a parallel inference mechanism. Both the scheduler and the interface are based on knowledge processing and inference. To increase their performance, a parallel inference mechanism was also developed which automatically exploits the parallelism in knowledge processing and inference and supports their parallel execution. It supports parallel processing not only for operating systems but also for user applications. A network system was also included for the provision of uniform communication between distinct computers/processors.

A so-called "kernel-service" structure is adopted by KZ2 (Fig.2). The operating system is composed of a kernel with multiple services. Both the kernel and the services are based on knowledge processing. Currently, KZ2 provides the kernel with the following functions:

• knowledge base management system (KBMS);
• intelligent distributed task scheduling (KZ2/ZFJ);

and several services:

• general natural language interface service (KZ2/TZJ);
• parallel inference service (KZ2/BTJ);
• heterogeneous network service (KZ2/YWJ).

The services may be used by the operating system and by user applications. For example, the parallel inference service KZ2/BTJ can be used either by system components such as KZ2/ZFJ and KZ2/TZJ or by the user's applications.

Fig.2


These OS services and the application components may be threaded together in any way the application designer likes, which results in a tight integration of OS services and applications. One advantage is that the designer has more freedom to make full and efficient use of the capability offered by the OS. The designer may easily customise the OS into a particular OS for an application in order to increase that application's performance. In addition, the services are extensible. New services, for example a multimedia processing service, may be added when required. Knowledge is regarded as the basis of KZ2 and is shown as the core. Currently, a simple knowledge base management system (KBMS) is available for the manipulation of knowledge. Although KZ2/YWJ is not currently based on knowledge processing, it is planned to be replaced by an intelligent network management service which utilizes knowledge to accomplish its functions.

In the subsequent three sections, the general natural language interface, the intelligent task scheduling, and parallel inference are discussed in detail.

3. General natural language interface KZ2/TZJ

Natural language (NL) interfaces are one way to increase the friendliness of man-machine communication. KZ2/TZJ offers the user the ability to communicate with KZ2 in natural language (Chinese). KZ2/TZJ adopts a hierarchical approach to constructing the NL interface. This makes KZ2/TZJ a general NL interface which can serve various applications and become their common interface.

Three kinds of knowledge are included in KZ2/TZJ: linguistic knowledge, domain knowledge, and application command knowledge. Linguistic knowledge consists of vocabulary knowledge and sentence pattern knowledge. Vocabulary knowledge includes the formation, the usage, and the semantic meaning of words, and is represented in frames. Sentence pattern knowledge describes how words and phrases are combined into sentences and what the sentences mean; Backus-Naur Form (BNF) is used to describe this kind of knowledge. Domain knowledge is essential for processing NL requests. Every domain has its own concepts, and the same concept may have different meanings from one domain to another. For example, the concept POSITION may be interpreted as the location of a file in the naming space when the domain is file systems, whereas the same concept becomes the co-ordinate of the cursor on the screen if the domain is graphical applications. KZ2/TZJ represents such domain knowledge explicitly and, when processing a request, chooses the appropriate knowledge to interpret the concepts occurring in it. The application command knowledge (also called application interface knowledge) describes the forms and semantic meanings of all commands provided by the application. For example, if the application is Unix, this knowledge usually includes all the Unix commands: their forms and semantic meanings. It is used to transform the internal form of a request into a form acceptable to the application.

Current research on NL interfaces pays little attention to the declarative representation of domain knowledge and application command knowledge, which prevents NL interfaces from adapting easily to another application. KZ2/TZJ addresses this by expressing both kinds of knowledge declaratively and separately, and by providing a method to extend them.

KZ2/TZJ adopts a hierarchical approach to processing NL requests, which includes three phases: natural language understanding (NLU), domain compiling (DCO), and inference engine (IEN) (Fig.3). Firstly, based on the linguistic knowledge, the NLU translates the request into an internal form which represents the initial understanding. Secondly, the DCO refines the internal form into a precise one using the domain knowledge. Thirdly, according to the application interface knowledge, the IEN converts it into the application command sequence. Finally, the command sequence is executed by the application. In addition, another component, the Learning module, is included in KZ2/TZJ to ease the task of interactive knowledge acquisition.


Fig.3 (legend: LKB: linguistic KB; DKB: domain KB; AKB: application KB; the diagram also shows the user, data and control flows, and the linguistic expert, domain expert, and application designer)

The NLU realises the conversion from NL requests to their internal forms. This conversion is application independent and needs no application domain knowledge. The method used by the NLU is based on syntactic analysis. In addition, KZ2/TZJ takes a domain-independent approach to processing NL requests: it completely separates the application domain-specific part of NL understanding from the NLU and introduces a special phase called domain compiling to complete the application-dependent understanding. Consider an example request "move the file from abc to cba". Firstly, according to the pattern knowledge available in the linguistic knowledge base, the NLU knows that the request is syntactically correct and that the verb "move" is the predicate. After that, the NLU utilises the knowledge about the word "move" to obtain the rough meaning of the request:

change something from one position to another

and represents it in an internal form. If the request is within the domain of some graphical application, and is "move the rectangle from point (2,4) to (10,30)", then after being processed by the NLU it has the same result as the first request. However, for both requests, what "something", "one position", and "another" refer to is not resolved by the NLU, because these are application domain dependent; they are analysed in the next phase, domain compiling.

Application domain knowledge is the basis on which the DCO works. Each domain has its own knowledge base (KB), within which as many concepts associated with the domain as possible are stored and managed. When the user indicates that a request is in a particular domain, all the concepts in the internal form of the request are interpreted according to that domain concept KB. For example, for the first request, if the user indicates that the domain is an operating system, e.g. Unix, the DCO knows from the domain knowledge that a file is movable, so "something" is refined to be the file. Meanwhile, from the definition of the concept POSITION, abc and cba are verified to be valid values of POSITION, so "one position" and "another" are abc and cba respectively. Consequently, the DCO obtains the precise understanding of the request. Similarly, for the second request, the DCO refines "something", "one position", and "another" to be the rectangle, (2,4), and (10,30) respectively, because the graphical application domain KB states that the rectangle is movable and that a POSITION is a pair of non-negative integers denoting a co-ordinate on the screen, rather than a location in the naming space as in operating systems. Clearly, if the two domain KBs are exchanged, the DCO will regard both requests as semantically wrong. As a result, this procedure is domain dependent.
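The refinement performed by the DCO can be illustrated with a minimal sketch. It is not the KZ2/TZJ implementation: the dictionary layout, the names `DOMAIN_KBS` and `domain_compile`, and the slot names of the internal form are all assumptions made for illustration. Only the two example requests and the two meanings of POSITION are taken from the text.

```python
# Hypothetical domain knowledge bases; the entries are illustrative only.
DOMAIN_KBS = {
    "unix": {
        "movable": {"file"},
        # POSITION is a name in the file naming space.
        "is_position": lambda v: isinstance(v, str) and v != "",
    },
    "graphics": {
        "movable": {"rectangle"},
        # POSITION is a pair of non-negative integer screen co-ordinates.
        "is_position": lambda v: (
            isinstance(v, tuple)
            and len(v) == 2
            and all(isinstance(c, int) and c >= 0 for c in v)
        ),
    },
}


def domain_compile(internal_form, domain):
    """Refine the NLU's rough internal form against one domain KB.

    Returns the refined form, or None when the request is semantically
    wrong in the chosen domain.
    """
    kb = DOMAIN_KBS[domain]
    thing = internal_form["something"]
    src = internal_form["one_position"]
    dst = internal_form["another"]
    if thing not in kb["movable"]:
        return None
    if not (kb["is_position"](src) and kb["is_position"](dst)):
        return None
    return {"action": internal_form["action"], "object": thing,
            "from": src, "to": dst}


# The two example requests after the NLU phase (assumed slot names).
file_request = {"action": "move", "something": "file",
                "one_position": "abc", "another": "cba"}
shape_request = {"action": "move", "something": "rectangle",
                 "one_position": (2, 4), "another": (10, 30)}
```

Calling `domain_compile(file_request, "unix")` yields a refined form, while `domain_compile(file_request, "graphics")` returns `None`, mirroring the remark that exchanging the two domain KBs makes both requests semantically wrong.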

The IEN achieves the final conversion from the precise meaning of the request to the application command sequence. A request often cannot be satisfied by one command, so a sequence of commands is required. The IEN produces it through two kinds of reasoning: concept compound reasoning and command sequence reasoning.

Usually, there are concepts whose meanings can be simply and mechanically derived from some other concepts (here we call them components). For example, a room may consist of a roof, walls, windows, and a door. So the definition of the concept ROOM should include the definitions of its components: ROOF, WALL, WINDOW, and DOOR. In addition, the relation among those components, that is, how they are used to form a room, must also be specified: the door and windows are on the walls, and the roof is on top of the walls. If there are ways to move each component, and the room is requested to be moved, the relationship tells the system how to move the room by reasoning: first move the roof, then move the door and windows, and finally move the walls. Note that the ordering is important and is derived directly from the relationship. Concept compound reasoning is the mechanism that performs this kind of reasoning. If the relation between components and compound concepts, and the operations that can be applied to components, are known, those operations can be extended through the relationship so that they apply to the compound concepts as well.
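One way to read the ordering step of concept compound reasoning is as a topological sort over the component relation. The sketch below is an illustration under that assumption only; the `RESTS_ON` table and the function `move_order` are hypothetical names and are not taken from KZ2/TZJ.

```python
from graphlib import TopologicalSorter

# Hypothetical encoding of the ROOM relation: component -> what it rests on.
RESTS_ON = {
    "roof": {"wall"},
    "door": {"wall"},
    "window": {"wall"},
    "wall": set(),
}


def move_order(rests_on):
    """Order components so that each is moved before whatever supports it."""
    # TopologicalSorter expects node -> predecessors, so invert the relation:
    # a supporting component must come after every component resting on it.
    must_follow = {part: set() for part in rests_on}
    for part, supports in rests_on.items():
        for support in supports:
            must_follow[support].add(part)
    return list(TopologicalSorter(must_follow).static_order())
```

With this encoding the walls always come last. The relation alone does not fix the relative order of the roof, door, and windows, so any extra preference (such as moving the roof first) would have to be stated as additional knowledge.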

Command sequence reasoning is the other kind of reasoning supported by the IEN, and is based on backward reasoning. For example, if the user needs to print all the filenames in the current directory using Unix commands, the IEN first finds that one way to print is to print from a file. However, the filenames are not in a file, so in order to print them, they need to be put into a file. Fortunately, there exists a way to do that. Consequently, the IEN obtains the sequence: ls > temp; lpr temp to meet the demand. Although this inference method is not very complex, it works well for command sequence reasoning. For the first example request, the IEN converts it to the command "mv abc cba". What the commands are for the second request depends on what the graphical application is.
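The backward reasoning in the printing example can be sketched as a small goal-directed planner. This is a simplified illustration, not the IEN itself: the `COMMANDS` table, the fact names, and the function `plan` are assumptions, and the sketch assumes the command knowledge contains no cyclic dependencies.

```python
# Hypothetical command knowledge: each command needs some facts and gives one.
COMMANDS = [
    {"text": "lpr temp", "needs": ["names_in_file"], "gives": "names_printed"},
    {"text": "ls > temp", "needs": [], "gives": "names_in_file"},
]


def plan(goal, known=frozenset()):
    """Backward reasoning: return a command list achieving `goal`, or None."""
    if goal in known:
        return []
    for cmd in COMMANDS:
        if cmd["gives"] != goal:
            continue
        steps = []
        for need in cmd["needs"]:
            sub = plan(need, known)
            if sub is None:
                break  # this command cannot be enabled; try another one
            steps.extend(sub)
        else:
            # Every precondition was satisfied: run them first, then cmd.
            return steps + [cmd["text"]]
    return None
```

Here `plan("names_printed")` works backwards from the print goal to the missing precondition and returns the two commands in execution order.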

4. Intelligent distributed task scheduling KZ2/ZFJ

Traditional distributed task scheduling schemes are inadequate in MPCS because of the following specific problems [2]: information about the system status is not timely; the scheduler may not have all the information related to making decisions; and the performance of scheduling algorithms relies on the relation between the communication structure of tasks and the configuration of hardware, so that the decision is less effective than the algorithms expect.

KZ2/ZFJ is an attempt to solve these problems. It can deal with indeterminacy such as uncertain global system states and incomplete task knowledge, and obtains a more rational schedule by synthesizing and reasoning. The efficiency of the decision made by KZ2/ZFJ is at least as high as that of the schedules generated by the traditional scheduling algorithms which are contained in the knowledge base.

Two kinds of knowledge are available in KZ2/ZFJ: object knowledge and schedule rules. Each component of MPCS, such as a processor or a printer, is regarded as an object. The object knowledge includes the basic facts about its corresponding component. In detail, it includes innate and postnatal knowledge. The former covers, for example, the speed and memory size of a particular processor if the object is a processor, and is usually provided by the creator when the object is created. The latter, which normally covers such information as how busy a processor is, is acquired dynamically during the lifetime of the object. In addition, a belief network is given to reflect quantitatively the relationship between such knowledge. In a belief network, the nodes are facts, and the weighted arcs denote the relationships. If there is an arc from fact A to fact B with weight α, it means that if fact A is true, the possibility that fact B is true is α. Fig.4 is part of an example belief network.

Fig.4 (legend: SF: send failure; SS: send success; NM: no message; MBO: mailbox overflow; RM: receive message; RPCTO: RPC timeout)

The schedule rules are defined on the basis of the experience of operating system designers, and cover the existing achievements in distributed task scheduling. The form of a rule is C → A {B}, where C contains conditions, A contains actions or conclusions, and B is the maximum weight of the relationship between proposition A and evidence C, i.e. B is the maximum support for conclusion A when C holds (Belief(C)=1). For example, the experience knowledge "IF no message comes from processor P for a long time, THEN processor P may be down, weight is 0.6" can be expressed as:

    T - T(P) > LongTime → Down(P) {0.6}

where T is the current time, T(P) represents the last time of receiving any message from processor P, LongTime is a constant, and Down(P) represents the fact that the processor P is down.
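A rule of the form C → A {B} can be sketched as a small data structure. The sketch is illustrative only: the class `Rule`, the value of `LONG_TIME`, and the fact names are hypothetical, and scaling the support linearly by the belief in the condition is an assumption, since the text only states that B is reached when Belief(C)=1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

LONG_TIME = 30.0  # hypothetical value for the constant LongTime


@dataclass
class Rule:
    condition: Callable[[Dict[str, float]], bool]  # C
    conclusion: str                                # A
    max_weight: float                              # B

    def fire(self, facts: Dict[str, float],
             belief_in_condition: float = 1.0) -> Optional[Tuple[str, float]]:
        """Return (conclusion, support), or None if the condition fails."""
        if not self.condition(facts):
            return None
        # Assumption: support grows linearly up to B as Belief(C) reaches 1.
        return self.conclusion, self.max_weight * belief_in_condition


# T - T(P) > LongTime -> Down(P) {0.6}
down_rule = Rule(
    condition=lambda f: f["now"] - f["last_message_from_P"] > LONG_TIME,
    conclusion="Down(P)",
    max_weight=0.6,
)
```

For instance, `down_rule.fire({"now": 100.0, "last_message_from_P": 50.0})` concludes `Down(P)` with the full weight, whereas a recent message makes the rule return `None`.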

All knowledge is represented in a rule-based language based on the Dempster-Shafer theory of evidence [13], and can be acquired through the following methods:

• inherited from other objects;
• provided by the user;
• deduced by the system or taken from the system default value.

KZ2/ZFJ consists of three basic components: the Global state manager, the Decision manager, and the Knowledge base, which reside in each IW and IS (Fig.5). The resources of the system are managed in a distributed centralized way. Details may be found in [3].

Fig.5 (components shown: network, Global state manager, Decision manager)

The Global state manager observes the behavior of the system, forms hypotheses about the system, and refines them into knowledge. Three modules are available in the Global state manager: the I/O interpreter, the Learner, and the Belief manager. The I/O interpreter is responsible for the collection of system information. After it has collected a certain amount by observing the communication within the system (rather than obtaining it explicitly, which is too costly in MPCS), it generates hypotheses about the system state that try to explain the received communication information. Each hypothesis is assigned an initial belief based on the evidence supporting it. The belief of a hypothesis is updated frequently by the Belief manager according to the refreshment and attenuation rules. In detail, this is performed through the propagation of beliefs in the belief network. Consider a belief network such as that of Fig.4 ("comm" is short for communication). If the I/O interpreter does not observe any message being sent or received by a processor P (NM) for a period of time, this is evidence for two hypotheses: comm idle, and abnormal. The Belief manager increases the beliefs of these two hypotheses according to the weights 0.8 and 0.7 respectively. In addition, this evidence is propagated to the hypothesis idle, and finally to normal, with different weights. When the belief of a hypothesis increases to a threshold, which means there is enough evidence to support it, the hypothesis becomes a fact with a belief and is put in the knowledge base as part of the postnatal knowledge of an object. In contrast, as time passes, if no evidence is collected to support a hypothesis or a fact, its belief decreases gradually according to the attenuation rule. If the belief of a fact in the knowledge base falls below a threshold, the fact is removed from the knowledge base. When conflicting facts are generated, the Learner is invoked to resolve the conflict. For example, if the I/O interpreter observes that no message has come from the processor P for a long time, it may form two conflicting facts:

(1) P busy (too busy to send/receive messages)
(2) P free (no messages are required to send/receive).

The Learner solves it by sending an inquiry message to P to acquire the exact information.
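The life cycle of a hypothesis (refreshment, promotion to a fact, attenuation, and removal) can be sketched as follows. The text does not give the actual refreshment and attenuation rules, so everything numeric here is an assumption: the sketch combines old belief and new evidence as 1 - (1 - b)(1 - w), which is how two simple supports for the same proposition combine under Dempster's rule, attenuates by a constant factor, and uses hypothetical thresholds. Propagation along the belief network is left out.

```python
PROMOTE_AT = 0.9   # hypothetical threshold: hypothesis becomes a fact
DROP_BELOW = 0.2   # hypothetical threshold: fact is removed
DECAY = 0.8        # hypothetical attenuation factor per time step


def refresh(belief, weight):
    """Assumed refreshment rule: combine old belief with new evidence."""
    return 1.0 - (1.0 - belief) * (1.0 - weight)


class BeliefStore:
    def __init__(self):
        self.hypotheses = {}  # name -> belief, not yet accepted
        self.facts = {}       # name -> belief, part of postnatal knowledge

    def observe(self, name, weight):
        """Record one piece of evidence supporting `name`."""
        current = self.facts.get(name, self.hypotheses.get(name, 0.0))
        updated = refresh(current, weight)
        if name in self.facts or updated >= PROMOTE_AT:
            self.hypotheses.pop(name, None)
            self.facts[name] = updated
        else:
            self.hypotheses[name] = updated

    def tick(self):
        """One time step without evidence: attenuate, then drop weak facts."""
        for table in (self.hypotheses, self.facts):
            for name in list(table):
                table[name] *= DECAY
        for name in [n for n, b in self.facts.items() if b < DROP_BELOW]:
            del self.facts[name]
```

With the weight 0.8 from the text, a first observation of "comm idle" leaves it a hypothesis, a second one promotes it to a fact, and a long enough run of `tick()` calls without evidence removes it again.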

Applying the state knowledge formed by the Global state manager together with the schedule rules, the Decision manager generates, evaluates, and optimizes candidate solutions, and finally obtains an acceptable solution to the scheduling problem, which meets not only the user's response time demand but also the system's load balance and performance requirements. If the solution is inconsistent with solutions generated by other local schedulers, the schedulers negotiate with each other. The Decision manager consists of four submodules: the Decision maker, the Evaluator, the Optimizer, and the Collaborator.

The Decision maker is responsible for reasoning and obtaining all possible solutions according to the scheduling rules and the facts in the knowledge base. The Evaluator evaluates these solutions according to a set of criteria (here we call them the cost functions) to generate a set of approximate schedules whose cost is acceptable. If there exists an acceptable schedule, the Decision maker accepts it. Otherwise, the Optimizer tries to find one which is either acceptable or as close to acceptable as possible, based on the optimization theorems and rules.

In KZ2/ZFJ, a novel optimization approach was proposed. Usually, finding an optimal solution is an NP-complete problem. The traditional mountain-climbing algorithm is a well-known approach. According to it, the optimal solution is the summit of a mountain, so finding the optimal solution means climbing the mountain to reach the summit. If there is more than one summit, each can be regarded as a local optimal solution and the highest as the global optimal one; the mountain-climbing algorithm then fails because it is trapped on one summit which may be only a local optimal solution. Taking a shortcut by cable car from one summit to another is a more efficient way to approach the global optimal solution. In KZ2/ZFJ, a cable-car-mountain-climbing algorithm was proposed. It makes full use of the set of approximate solutions, most of which are very close to their local optimal summits. The new algorithm starts climbing from one approximate solution. When it reaches the summit, it decides whether it is the highest (the global optimal one). If it is, the algorithm terminates; otherwise, it continues to climb from another approximate solution (just like taking a cable car to move from one place to another) rather than continuing to climb at the original place. In the worst case, when no approximate solution exists, the algorithm degenerates to the traditional mountain-climbing algorithm.
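The cable-car idea can be sketched as hill climbing restarted from each approximate solution. This is a minimal reading of the description, not the KZ2/ZFJ algorithm: the function names, the `is_global` predicate (the text does not say how the algorithm decides that a summit is the global one), and the toy landscape are all assumptions.

```python
def climb(start, score, neighbours):
    """Plain mountain climbing: move to the best neighbour until none is higher."""
    current = start
    while True:
        best = max(neighbours(current), key=score, default=current)
        if score(best) <= score(current):
            return current
        current = best


def cable_car_climb(approximate, score, neighbours, is_global, fallback_start):
    """Climb from each approximate solution in turn, keeping the best summit.

    With no approximate solutions this degenerates to one plain climb
    from `fallback_start`.
    """
    starts = list(approximate) or [fallback_start]
    best = None
    for start in starts:
        summit = climb(start, score, neighbours)
        if best is None or score(summit) > score(best):
            best = summit
        if is_global(best):
            break  # no need to take the cable car any further
    return best


# Toy landscape (hypothetical): positions 0..7 with three summits.
HEIGHTS = [1, 3, 2, 5, 9, 4, 6, 2]


def height(i):
    return HEIGHTS[i]


def adjacent(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(HEIGHTS)]
```

On this landscape a plain climb from position 0 stops on the local summit at position 1, while `cable_car_climb([0, 5, 7], height, adjacent, lambda i: height(i) >= 9, 0)` moves on to the second approximate solution and reaches the highest summit at position 4.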

Finally, the scheduling scheme is sent back to the I/O interpreter to dispatch the tasks. However, due to the incorrectness of the state knowledge and the autonomy of each local system, scheduling conflicts are inevitable. For example, a local system may dispatch a task to the processor P according to its state knowledge that P is not busy, while in fact P may not be as free as the scheduler expects. Also, two local systems may both assign their tasks to P simultaneously. Although their state knowledge is correct, this overloads P, which then cannot accomplish the tasks in time. The Collaborator resolves such conflicts by negotiation and arbitration: when one or more tasks are dispatched to a processor P, the local system of P decides whether or not to accept them. If P is overloaded and a task may not be finished in time, it sends a suggestion back to the node the task came from and asks that node's local system to reconsider its decision. When more than one task arrives simultaneously, it arbitrates by selecting one task and refusing the others, and lets the schedulers that dispatched the refused tasks reconsider their scheduling.
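The arbitration performed on the receiving processor can be sketched as below. The text does not say how the single accepted task is selected, so the earliest-deadline choice, the load model, and all names (`arbitrate`, `cost`, `deadline`, `sender`) are assumptions made for illustration.

```python
def arbitrate(load, capacity, requests):
    """Accept at most one of the simultaneously arriving tasks.

    `requests` is a list of dicts with keys "sender", "task", "cost" and
    "deadline". Returns (accepted_request_or_None, refused_senders); the
    refused senders are the schedulers asked to reconsider their decision.
    """
    accepted = None
    refused = []
    # Assumed selection policy: prefer the task with the earliest deadline.
    for req in sorted(requests, key=lambda r: r["deadline"]):
        if accepted is None and load + req["cost"] <= capacity:
            accepted = req
        else:
            refused.append(req["sender"])
    return accepted, refused


incoming = [
    {"sender": "IW-1", "task": "t1", "cost": 3, "deadline": 20},
    {"sender": "IW-2", "task": "t2", "cost": 2, "deadline": 10},
]
```

With a current load of 5 and a capacity of 8, `arbitrate(5, 8, incoming)` accepts `t2` and asks `IW-1` to reconsider; if the processor is already full, every sender is refused.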


5. Parallel inference mechanism KZ2/BTJ for knowledge processing

Conventional research on exploiting the parallelism of logic programs usually focuses on OR parallelism [9], independent AND parallelism [6][8], and their combination. The exploitation of dependent AND parallelism is considerably more difficult, because when two dependent subgoals, which have shared variables, are executed in parallel, a binding conflict appears if both of them attempt to bind the same shared variable to different values, and a runtime consistency check is required. Little literature about it is available. KZ2/BTJ is a novel parallel inference mechanism based on a new full Or/And parallel execution model (FORDAP) for logic programs. FORDAP can automatically exploit not only the OR parallelism and independent AND parallelism but also the dependent AND parallelism of the user's logic programs, and supports their parallel execution. It adopts Execution Graph Expressions (EGE) and Parallel Execution Trees (PET) in its decomposition control scheme, and proposes a producer-consumer synchronization mechanism in its execution control scheme to control the binding of shared variables and prevent inconsistent results [15].

In FORDAP, a logic program is first translated into EGEs at compile time. Secondly, the PETs are formed at runtime according to the inquiry and the EGEs. The PETs depend heavily on the compiled EGEs, the runtime check of some variables, and the granularity computation. Thirdly, the inquiry is decomposed into multiple parallel goals according to the PETs formed previously. Finally, all the solutions of those parallel goals are collected and the final solution is constructed based on the PETs.

In the control of OR-parallelism, FORDAP utilizes the method of parallel executing all the clauses that match a goal predicate. For example, in the following program

    f(X,Y) :- g1(X,R).
    g1(a1,r1).
    g1(a2,r2).

both g1(a1,r1) and g1(a2,r2) match g1(X,R), so they are executed in parallel if they are large enough. For the AND-parallelism, if the subgoals of a goal are independent, FORDAP supports their parallel execution. In the following program

    f(X,Y) :- g1(X), g2(Y).

g1(X) and g2(Y) may be executed in parallel. This is a typical example of independent AND-parallelism.

However, a clause might consist of several dependent AND subgoals. The right ones need to wait for the completion of the left ones if they share variables with them. In other words, if there are several AND parallel subgoals which share one common variable, the leftmost one can execute immediately, while the others wait. If the shared variable is fully bound by that leftmost subgoal, then all the right ones can be executed afterwards in parallel; otherwise, if it is only partially bound, the second leftmost one inherits that partial binding and begins running. The clause is executed in this way until all the subgoals finish execution successfully. FORDAP can exploit the underlying parallelism in dependent subgoals. Consider the following example:

    f([L1|L2],X,Y) :- g1(L1,X), g2(L2,X,Y).

(upon the entry of f, L1 and L2 are bound initially). In the conventional AND-parallelism model, g1(L1,X) and g2(L2,X,Y) are required to run sequentially because they depend on each other through the shared variable X. Suppose g2 is defined as

    g2(L,X,Y) :- g3(L,Y), g4(X,Y).

Note that g3 is independent of g1. So although g2 is dependent on g1 at the level of the clause f, when we descend to the level of g2 we can still split off a subgoal of g2 which is independent of g1. FORDAP supports this kind of dependent AND-parallelism by a synchronization mechanism on each shared variable. In this example, we denote X in g1 to be a producer, and X in g2 to be a consumer of that product. Only after X is bound by g1 can the consumer tag of X in g2 be removed. When f is unified with g2 and g2 is decomposed further (suppose the grain of g2 is large enough for further decomposition), the consumer tag of X is passed to g4 after unification. Similarly, Y is denoted as a producer in g3, and is regarded as a consumer in g4. Suppose g1, g3, and g4 have reached an appropriate grain and need not be decomposed further. Among all the decomposed AND subgoals, only g1 and g3 have no variables tagged as consumers, so they can be executed in parallel initially; after their completion, consumer tags may be removed, so that other subgoals may be executed.
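The effect of the producer and consumer tags on the example can be sketched by computing which subgoals may run together. This is a simplification rather than the FORDAP mechanism: it assumes every producer binds its variable fully (partial bindings are ignored), it does not track L1 and L2 because they are bound on entry, and the name `execution_waves` and the tuple encoding are hypothetical.

```python
def execution_waves(subgoals):
    """Group subgoals into waves that can run in parallel.

    `subgoals` maps a subgoal name to (produces, consumes), two sets of
    shared-variable names. A subgoal is ready once every variable it
    consumes has been bound by a completed producer.
    """
    bound = set()
    pending = dict(subgoals)
    waves = []
    while pending:
        ready = sorted(name for name, (_, consumes) in pending.items()
                       if consumes <= bound)
        if not ready:
            raise ValueError("no subgoal is ready: some consumer has no producer")
        waves.append(ready)
        for name in ready:
            produces, _ = pending.pop(name)
            bound |= produces  # completing the subgoal removes consumer tags
    return waves


# f([L1|L2],X,Y) after decomposing g2 into g3 and g4.
EXAMPLE = {
    "g1": ({"X"}, set()),        # g1(L1,X) produces X
    "g3": ({"Y"}, set()),        # g3(L2,Y) produces Y
    "g4": (set(), {"X", "Y"}),   # g4(X,Y) consumes X and Y
}
```

For `EXAMPLE` the first wave contains g1 and g3 and the second contains g4, which matches the execution order described above.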

KZ2/BTJ consists of five components: the Precompiler, the Granularity Controller, the Decomposer, the Solution Collector, and the Scheduler (Fig.6). The Precompiler statically analyzes the granularity and the data dependency in the logic programs, and transforms them into execution graph expressions (EGE). The Decomposer is responsible for decomposing a goal into multiple subgoals of appropriate grain and for forming a parallel execution tree (PET) for that goal to trace the decomposition. The Granularity Controller is invoked by the Decomposer to calculate the grain of each goal. If the grain is large enough, it prompts the Decomposer to partition the goal further; otherwise, it advises the Decomposer to stop decomposing that goal. The Scheduler receives all the executable subgoals and dispatches them to processors for execution. All solutions from the processors are returned to the Solution Collector, which joins the subsolutions properly according to the PET and returns the complete solution to the outer world.
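The interplay between the Decomposer and the Granularity Controller can be sketched as a recursive procedure. The following Python fragment is a minimal illustration under stated assumptions, not the KZ2/BTJ code: the names `Goal` and `decompose`, the integer grain values, and the threshold are all hypothetical, and the grain is assumed to be given rather than computed from the program.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    grain: int  # estimated cost, standing in for the Granularity Controller's answer
    body: list = field(default_factory=list)  # subgoals produced by one decomposition step


def decompose(goal, threshold):
    """Return (pet, leaves): a nested tree tracing the decomposition, and the subgoals to schedule."""
    # Stop when the grain is too small to be worth splitting, or nothing is left to split.
    if goal.grain < threshold or not goal.body:
        return (goal.name, []), [goal.name]
    children, leaves = [], []
    for sub in goal.body:
        tree, sub_leaves = decompose(sub, threshold)
        children.append(tree)
        leaves.extend(sub_leaves)
    return (goal.name, children), leaves


# Hypothetical grains for the clause f discussed earlier.
g2 = Goal("g2", grain=40, body=[Goal("g3", 15), Goal("g4", 12)])
f = Goal("f", grain=60, body=[Goal("g1", 18), g2])
pet, leaves = decompose(f, threshold=20)
print(pet)     # ('f', [('g1', []), ('g2', [('g3', []), ('g4', [])])])
print(leaves)  # ['g1', 'g3', 'g4']
```

The returned tree plays the role of the PET, which a collector could later walk to join subsolutions, while the list of leaves is what would be handed to a scheduler.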

As is known, most existing parallel execution models [11] apply only to pure logic programs, because the parallel execution of extra predicates with side effects might alter the execution results of logic programs. In KZ2/BTJ, a parallel restriction strategy for the extra predicates is proposed to ensure the correctness of the logic programs: the result of executing in parallel a logic program containing these extra predicates is the same as if it were executed sequentially.

Fig.6


6. Experimental environment and implementation

As shown in Fig.7, an IW in the experimental platform consists of three parts: a processing unit (a Sun workstation), a pre-processing unit (a PC, i.e. personal computer), and an input/output unit. The PC is used as the terminal to the Sun so as to make full use of the advanced I/O devices available on PCs, such as speech input/output and handwriting recognizers. One PC is used as an IS, in which nine transputers reside to provide the parallel processing service. Three networks (Sun net, Ethernet, and Transputer net) are interconnected to form a heterogeneous network that supports the communication between the IWs and the ISs.

KZ2/TZJ is a natural language interface through which the user issues applications or requests in Chinese. Requests in Chinese are understood by KZ2/TZJ and converted into Unix or DOS commands to be executed. Applications are passed by KZ2/TZJ to the scheduler KZ2/ZFJ, which dispatches their tasks to proper IWs. The knowledge processing part of an application, written in Prolog, may be sent by KZ2/ZFJ to the IS where KZ2/BTJ resides, to be executed automatically in parallel. KZ2/TZJ was programmed in Prolog, so it can itself be run on the IS with the support of KZ2/BTJ. KZ2/ZFJ also invokes KZ2/BTJ to expedite its inference. The network server KZ2/YWJ supports the interconnection of Ethernet, Sun net and Transputer net, and provides a uniform point-to-point communication mechanism and the sharing of network file systems.

Fig.7

There are two approaches to developing KZ2. One is to build a new complete OS from scratch. The other is to build it on top of existing OSs. Obviously, a trade-off has to be made. In order to focus the research on the concepts rather than the development details, KZ2 adopts the latter approach. The software environment is illustrated in Fig.8. SunOS 3.5, 4.01, and 4.11 (dialects of Unix) are the OSs currently running on the Sun workstations. On PCs, MS-DOS 3.X is used. However, MS-DOS 3.X is a single-task OS. MTDOS was developed to extend MS-DOS 3.X into a multi-task OS so that PCs can act as servers. No OS is resident on the transputers. Only a programming language, Parallel C, and an input/output manager, filter, are available. KZ2 is designed on this platform to supplement the original OSs and provide more efficient and convenient services for the user. Through it, the user may communicate with the MPCS, on which applications may also be run.

users       users/applications
KZ2         KZ2
OSs         SunOS 3.5, 4.03, 4.11 (on Suns); MTDOS over MS-DOS 3.X (on PCs); filter and Parallel C (on Transputers)
hardware    Suns; PCs; Transputers

Fig.8

7. Performance Glimpse

So far, there are no widely applicable benchmarks for testing and evaluating the total performance of an operating system. In this section, we give only a brief glimpse of the performance of KZ2 by examining each of its three main components.

From the interface perspective, KZ2/TZJ is a general natural language interface which is characterised by the ease of expanding its application fields. It was originally designed for Unix, and was then easily adapted to cover VAX/VMS on VAX machines and MS-DOS on personal computers (PCs). In order to give a fair evaluation of the approach, one quite different application, the text editor Vi, was also added. Without much difficulty, KZ2/TZJ can understand most natural language requests for text editing, and converts them into about 40 Vi commands or command sequences. Table 1 shows the average cost of adding one Chinese phrase and its related Vi commands to KZ2/TZJ. From the table, it can be seen that some effort was spent on expanding linguistic knowledge such as phrase semantics and pattern knowledge. This is because, even though linguistic knowledge is independent of applications, we have so far put only a very small amount of it in the corresponding knowledge base, just enough for processing a small subset of Chinese sentences. When the subset is large enough, the expense of expanding linguistic knowledge will decrease greatly. The low cost indicates that KZ2/TZJ can easily expand its natural language understanding scope and the domain it serves, and thus can adapt easily to various applications.


Table 1 Cost of enhancing one phrase

Knowledge: phrase semantics; pattern knowledge; concept knowledge; command knowledge

Amount of Physical Work: 25 lines of programs; 5 lines of programs; 13 lines of programs; 28 lines of programs

Amount of Logical Work: 1 phrase; 3 concepts; 5 patterns; 4 commands

In order to evaluate the performance of the scheduler KZ2/ZFJ, a simulation model was built in which 5 Sun workstations are used to simulate 10 to 100 processors. They are assumed to be connected via a virtual bus. A random task generator supplies tasks to the scheduler at random. The scheduling strategy of KZ2/ZFJ is compared with the BID algorithm, which obtains the processors' states by bidding. It is also assumed that each processor must give a bid. Fig.9 illustrates the relation between the total time required by the system to finish a set of tasks and the number of processors. With KZ2/ZFJ, the time decreases as the number of processors increases. With the BID algorithm, however, the time increases once the number of processors reaches 50, because the communication costs grow so much that they offset the contribution of the extra processors. In addition, we also employed 9 transputers to simulate 100 to 500 processors. The results are similar.

[Figure: time (vertical axis) against no. of processors (horizontal axis), with one curve for KZ2/ZFJ and one for BID]

Fig.9 Relation between system total finishing time and the number of processors

The speedup of the execution of logic programs achieved by the parallel inference mechanism KZ2/BTJ is the main concern. Two typical logic programs were employed to evaluate the performance of KZ2/BTJ. One is the eight-queen problem, with 92 solutions altogether. The other is the Hanoi-tower application, with one solution only. The running platform is a multiprocessor system with eight processors. Fig. 10 illustrates the time required for executing the two programs on one to eight processors respectively. The experimental results indicate the following. For the queen problem, when the number of processors increases to 5, the speedup of KZ2/BTJ reaches 4.82. Beyond that point, the time required remains almost unchanged because the number of processors exceeds the amount of parallelism KZ2/BTJ exploits. For Hanoi-tower, however, the speedup rises from 3.89 when four processors are used to 7.36 when eight processors are engaged.
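The reported speedups can be put on a common scale by dividing each by the number of processors used, which gives the usual parallel efficiency. The short Python fragment below does this arithmetic for the three figures quoted above; the function name `efficiency` is ours, and no measurement beyond those three speedups is assumed.

```python
def efficiency(speedup, processors):
    """Parallel efficiency: speedup divided by the number of processors used."""
    return speedup / processors


# Speedups quoted in the text: (program, processors, speedup).
reported = [("queen", 5, 4.82), ("hanoi", 4, 3.89), ("hanoi", 8, 7.36)]
for program, p, s in reported:
    print(f"{program}: {p} processors, efficiency {efficiency(s, p):.2f}")
```

The efficiencies come out at roughly 0.96, 0.97 and 0.92, so on both programs more than nine tenths of each processor is doing useful work at the reported points.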


[Figure: time (vertical axis) against no. of processors (horizontal axis), with one curve for Queen and one for Hanoi]

Fig. 10 Queen and Hanoi Performance

8. Demonstrative example -- a polygon recognition application

To assess the usability and functionality of KZ2, a polygon recognition application was developed to run under the support of KZ2. It shows how KZ2 extends the functionality of the application and supports its execution.

8.1 Polygon Recognition Application (PRA)

PRA is a typical knowledge-based distributed application in which several tasks running in parallel recognize the polygons from images. After recognition, knowledge about polygon classification, such as what may be defined as a parallelogram, rhombus, or rectangle, is used by one task to determine the shape of the recognized polygons. In detail, four functions are included:

• recognize polygons from images;

• redraw the recognized polygons;

• manipulate (move, rotate, etc.) the polygons;

• classify polygons (parallelogram, rhombus, rectangle, square ...).
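To make the last function concrete, the kind of classification knowledge involved can be sketched for four-sided polygons. The fragment below is a hypothetical Python stand-in, not the Prolog task of the PRA: the function name `classify_quadrilateral` is ours, vertices are assumed to be given in drawing order with integer coordinates, and degenerate inputs such as repeated points are not handled.

```python
def classify_quadrilateral(points):
    """Classify four vertices given in drawing order (integer coordinates assumed)."""
    if len(points) != 4:
        raise ValueError("expected exactly four vertices")
    # Edge vectors for the closed path p0 -> p1 -> p2 -> p3 -> p0.
    edges = []
    for i in range(4):
        (x0, y0), (x1, y1) = points[i], points[(i + 1) % 4]
        edges.append((x1 - x0, y1 - y0))
    e0, e1, e2, _ = edges
    # Parallelogram test: the first and third edges are opposite vectors.
    # (The path is closed, so the second and fourth edges are then opposite too.)
    if (e0[0] + e2[0], e0[1] + e2[1]) != (0, 0):
        return "quadrilateral"
    right_angle = e0[0] * e1[0] + e0[1] * e1[1] == 0  # adjacent edges perpendicular
    equal_sides = e0[0] ** 2 + e0[1] ** 2 == e1[0] ** 2 + e1[1] ** 2
    if right_angle and equal_sides:
        return "square"
    if right_angle:
        return "rectangle"
    if equal_sides:
        return "rhombus"
    return "parallelogram"


print(classify_quadrilateral([(0, 0), (5, 0), (8, 4), (3, 4)]))  # rhombus
```

The tests are ordered from the most specific class to the most general one, so a square is reported as a square rather than as a rectangle or rhombus.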

The images can be obtained either by scanning a sheet of paper or by drawing with the mouse on the screen. The scanner and the mouse are connected to the pre-processing unit PC. To make full use of the high-resolution screens of Sun workstations, the redrawn polygons are displayed on one dedicated Sun's screen. In addition, PRA provides a command-like interface for the user to manipulate the polygons. More information about a polygon, such as its classification, can also be obtained from the PRA. The PRA consists of:

• five tasks in C (reco_main, reco_0, reco_1, reco_2, reco_3), with the communication mechanism provided by KZ2/YWJ, running in parallel to recognize polygons co-operatively from the images;

• one task in C (draw) to redraw polygons;

• one task in Prolog (classi) to classify polygons;

• one task in C (manipu) to manipulate polygons.

8.2 Running with KZ2

Initially, the above tasks and the task force table (TFT) are issued through KZ2/TZJ. The TFT basically contains information such as the task descriptions, the execution cost of each task on any computer, the communication costs between any pair of tasks, and the deadline for the job. It gives KZ2/ZFJ some basic information about the job and its requirements so that it can make better decisions. The TFT need not be given, but clearly, the more information is given, the smarter KZ2/ZFJ is.

Following is part of the TFT of the PRA:

task_force_name : POLYGON_RECOGNITION
path : /usr/ios/demo
task_name : 0 :reco_main;
            1 :reco_0;
            2 :reco_1;
            3 :reco_2;
            4 :reco_3;
            5 :classi;
            6 :draw;
            7 :manipu;
execution_cost : 0 :WS15 25,
                 5 :WS15 big, WS14 big, WS3 big, TP1 75,
                 6 :WS15 70,
                 7 :WS15 45,
communication_cost : 0 : 1 2, 2 2, 3 2, 4 2.
deadline : 100
extra_request : 6 :#1,
                7 :#1.
local : 0.
end

The task_name field lists all the tasks of the job. The execution_cost field gives the cost of running each task on those computers known to the program designer. If a task has to be run on one dedicated computer, an infinite cost (∞, written big in the excerpt above) may be assigned to all the other computers so that they will not be chosen to execute the task. The communication_cost field gives a brief description of the communication between any pair of tasks. The deadline indicates that the job must be finished within the designated time period. Other requests are issued in extra_request. For example, for the PRA, tasks 6 and 7 are required to be on the same computer.
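How such a table steers placement can be sketched with a small Python fragment. It is only an illustration under stated assumptions, not the strategy of KZ2/ZFJ: the dictionary layout and the names `BIG`, `execution_cost`, `same_computer` and `cheapest_computer` are hypothetical, the cost 75 for TP1 is our reading of the excerpt, and the choice here is purely greedy on execution cost, ignoring communication costs, the deadline, and the current load of each computer.

```python
import math

BIG = math.inf  # stands in for the "big" entries of the TFT

# Execution costs taken from the TFT excerpt; tasks 1 to 4 are not listed there.
execution_cost = {
    0: {"WS15": 25},
    5: {"WS15": BIG, "WS14": BIG, "WS3": BIG, "TP1": 75},
    6: {"WS15": 70},
    7: {"WS15": 45},
}
same_computer = [(6, 7)]  # extra_request: tasks 6 and 7 both carry the mark #1


def cheapest_computer(task):
    """Pick the computer with the lowest finite execution cost for one task."""
    computer, cost = min(execution_cost[task].items(), key=lambda item: item[1])
    if cost == BIG:
        raise ValueError(f"task {task} has no admissible computer")
    return computer


placement = {task: cheapest_computer(task) for task in execution_cost}
print(placement)  # {0: 'WS15', 5: 'TP1', 6: 'WS15', 7: 'WS15'}
print(all(placement[a] == placement[b] for a, b in same_computer))  # True
```

Because every workstation entry of task 5 is infinite, the only admissible choice for it is TP1, which is exactly the effect the infinite cost is meant to have.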

Based on the TFT and knowledge about the hardware system, KZ2/ZFJ dispatches the tasks to appropriate processors and starts their execution. Reco_main is assigned to the processor where the job is issued; this is derived from the local part of the TFT. Reco_0, reco_1, reco_2, reco_3, draw, and manipu are assigned to those processors that KZ2/ZFJ considers optimal. Classi is assigned to the IS where the transputers reside, and is executed in parallel with the support of KZ2/BTJ. KZ2/TZJ is a general natural language interface. In addition to being the interface for Unix, once the PRA's domain knowledge is added it lets the user manipulate the polygons through Chinese requests rather than commands, thus increasing the capability of the PRA. How the components of KZ2 and the PRA are threaded together is illustrated in Fig. 11.


Fig.11

8.3 Benefits from KZ2

Compared with other OSs which may support the execution of the PRA, under KZ2 the PRA gains speedup from fair distribution and parallel knowledge processing, and gains friendliness from the natural language interface:

• fair distribution. In an MPCS, the scheduler may not possess all the system information. Using AI techniques, KZ2/ZFJ can dispatch the tasks fairly according to the overhead of each IW and the deadline of the PRA, and the performance of the scheduling is at least as high as that of conventional task scheduling algorithms, for example the bidding algorithm.

• parallel processing. The knowledge processing part of the PRA is written in sequential Prolog, and would conventionally be executed sequentially. By invoking the parallel inference server KZ2/BTJ, which automatically exploits the parallelism in sequential Prolog programs and supports their parallel execution, the PRA speeds up its performance.

• natural language interface. The PRA itself provides a command interface for the user to manipulate the polygons. By adding its domain knowledge into the knowledge base of KZ2/TZJ, the PRA may utilize the natural language service of KZ2/TZJ to let the user use natural language to issue his requests. For example, the user may ask:

move the rectangle to the left 20 points; rotate the triangle anti-clockwise through 30 degrees;

rather than using the commands, so that he need not know them.
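The step from such a request to a command can be sketched as pattern matching. The Python fragment below is a deliberately small illustration, not the mechanism of KZ2/TZJ, which works on Chinese and draws on phrase, pattern, concept and command knowledge: the two regular expressions, the function name `to_command`, and the command syntax `move <shape> <dx> <dy>` and `rotate <shape> <degrees>` are all hypothetical, since the actual commands of the PRA are not given here.

```python
import re

# Each entry pairs a request pattern with a builder for a hypothetical PRA command.
PATTERNS = [
    (
        re.compile(r"move the (\w+) to the (left|right) (\d+) points"),
        lambda m: f"move {m.group(1)} {'-' if m.group(2) == 'left' else ''}{m.group(3)} 0",
    ),
    (
        re.compile(r"rotate the (\w+) (anti-clockwise|clockwise) through (\d+) degrees"),
        lambda m: f"rotate {m.group(1)} {'' if m.group(2) == 'anti-clockwise' else '-'}{m.group(3)}",
    ),
]


def to_command(request):
    """Translate one request into a command string, or None if no pattern matches."""
    text = request.strip().rstrip(";.").lower()
    for pattern, build in PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return build(match)
    return None


print(to_command("move the rectangle to the left 20 points;"))  # move rectangle -20 0
print(to_command("rotate the triangle anti-clockwise through 30 degrees;"))  # rotate triangle 30
```

Adding a new request form in this sketch means adding one pattern and one builder, which loosely mirrors the cost of adding one phrase and its related commands.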

9. Conclusions and future work

This paper proposed an intelligent operating system, KZ2, for MPCS, which are characterised by a massive number of processors, a highly distributed architecture, effective process communication, heterogeneous tightly- and loosely-coupled interconnection, and multimedia and knowledge processing. The objective of an intelligent OS is to accommodate intelligence in resource management, user interfaces, and job control in order to adapt to the new requirements posed by MPCS. KZ2 attacks the problems through intelligent man-machine communication, intelligent task scheduling, and parallel inference and distributed computing. A polygon recognition application was run on KZ2 to demonstrate how KZ2 improves the performance of applications.


Currently, an intelligent memory management facility based on distributed shared memory is being developed. It will be included in the kernel of KZ2. An intelligent real-time distributed task scheduler and a set of development tools are also under development. A server with the capability to exploit automatically the large- and medium-grain parallelism in programs written in C++, an object-oriented extension to C, is being designed and implemented. A multimedia processing service is another one that will be covered by KZ2.

References

[1] Blair,G.S. A Knowledge-based Operating System. Computer Journal. 30, 3 (1987). pp.193-200.

[2] Casavant,T.L. A Taxonomy of Scheduling in General-Purpose Distributed Computing Systems. IEEE Transactions on Software Engineering. SE-14, 2 (1988). pp.141-154.

[3] Chen,J., Xie,L., and Sun,Z. A Model for Intelligent Task Scheduling in a Large Distributed System. ACM Operating System Review. (Oct. 1990). pp.26-33.

[4] Chikayama,T. Overview of the Parallel Inference Machine Operating System (PIMOS). In Proceedings of the International Conference on Fifth Generation Computer Systems. (1988, Tokyo, Japan). pp.230-251.

[5] Chu,C., Delp,E., Jamieson,L., Siegel,H., and Weil,F. A Model for An Intelligent Operating System for Executing Image Understanding Tasks on a Reconfigurable Parallel Architecture. Journal of Parallel and Distributed Computing. 6, 3 (1989). pp.548-622.

[6] DeGroot,D. Restricted AND-Parallelism. In Proceedings of the International Conference on Fifth Generation Computer Systems. (1984, Tokyo, Japan, Nov.). pp.471-478.

[7] Fleisch,B.D. Operating systems: A Perspective on Future Trends. ACM Operating System Review. 17, 2 (April, 1983). pp.14-17.

[8] Hermenegildo,M.V. and Greene,K.J. &-Prolog and its Performance: Exploiting Independent And-Parallelism. In Proceedings of the 7th International Conference on Logic Programming. (1990). pp.253-268.

[9] Khayri,A., Ali,M., and Karlsson,R. The Muse OR-Parallel Prolog Model and its Performance. In Proceedings of the North American Conference on Logic Programming. MIT Press. (Oct. 1990). pp.757-776.

[10] Korner,K. Intelligent Caching for Remote File Service. In Proceedings of the 10th International Conference on Distributed Computing Systems. (1990, Paris, France, May 28 - June 1). pp.220-226.

[11] Lin,Y.J. Performance of AND-parallel Execution of Logic Programs on A Shared-Memory Multi-processor. In Proceedings of the International Conference on Fifth Generation Computer Systems. (1988, Tokyo, Japan, Nov.28 - Dec.2). pp.851-859.

[12] Pasquale,J. Using Expert Systems to Manage Distributed Computer Systems. IEEE Network. (Sept, 1988).

[13] Shafer,G. A Mathematical Theory of Evidence. Princeton University Press. 1976.

[14] Wilensky,R., Arens,Y., and Chin,D. Talking to Unix in English: An Overview of UC. Communications of ACM. 27, 6 (June, 1984). pp.574-593.

[15] Zheng,Y., Tu,H., and Xie,L. And/Or Parallel Execution of Logic Programs: Exploiting Dependent And-Parallelism. ACM SIGPLAN Notices. 28, 5 (May, 1993). pp.19-28.
