efficient java exception handling in just-in-time compilation

SOFTWARE—PRACTICE AND EXPERIENCE
Softw. Pract. Exper. 2004; 34:1463–1480
Published online 18 October 2004 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/spe.622

Efficient Java exception handling in just-in-time compilation

SeungIl Lee, Byung-Sun Yang and Soo-Mook Moon∗,†

School of Electrical Engineering, Seoul National University, Seoul 151-742, South Korea

SUMMARY

Java uses exceptions to provide elegant error handling capabilities during program execution. However, the presence of exception handlers complicates the job of the just-in-time (JIT) compiler, while exceptions are rarely used in most programs. This paper describes two techniques for reducing such complications. First, we delay the translation of an exception handler until the exception really occurs. This on-demand translation of exception handlers allows more optimizations when translating the main flow, without being hindered by constraints caused by the exception flows. Secondly, for those exceptions that are actually thrown during program execution we insert exception-type check code and a direct branch to the translated exception handlers. This exception handler prediction is motivated by an observation that frequently thrown exceptions are likely to be handled by the same exception handlers, so this will eliminate the exception processing overhead of the Java virtual machine. Our experiments indicate that the code quality of the main flow is no longer affected by the presence of exception handlers. Also, frequently thrown exceptions can be efficiently handled by the exception handler prediction. Copyright © 2004 John Wiley & Sons, Ltd.

KEY WORDS: Java virtual machine; exception handling; just-in-time compiler; on-demand translation; exception handler prediction

1. INTRODUCTION

The Java programming language provides an exception handling mechanism for elegant error handling [1]. When an error occurs during execution of code in a try block, the error is caught and handled by an exception handler in one of the subsequent catch blocks associated with it. An optional finally block includes ‘cleanup’ code which is always executed regardless of whether or not an error occurs in the try block. If no catch block can handle the error, the method is terminated abnormally and the Java virtual machine (JVM) searches backward through the call stack to find an exception handler that can handle the error.

∗Correspondence to: Soo-Mook Moon, School of Electrical Engineering, Seoul National University, Seoul 151-742, South Korea.

†E-mail: [email protected]

Contract/grant sponsor: IBM

Received 25 January 2003

Revised 26 April 2004

Accepted 26 April 2004

While exception handling is one of Java’s most useful features, there are some performance disadvantages associated with it, especially when we employ a just-in-time (JIT) compiler to improve performance. The Java JIT compiler translates the bytecode of a method into the native machine code prior to the first execution of the method [2]. In order to generate efficient code, the JIT compiler needs to perform fast but efficient register allocation and optimizations. The presence of exception handlers in a method, however, often prevents JIT compilers from generating efficient code for the main flow of the method since the code generation constraints of exception flows should also be considered [3–5]. Since an exception would be an ‘exceptional’ event, it would be desirable to make the code of the main flow as efficient as possible.

This paper proposes on-demand translation of exception handlers, which delays the translation of exception handlers until the exception actually occurs. This allows more optimizations when translating the main flow, without being hindered by the constraints of the exception flows. We also propose the prediction of exception handlers, motivated by an observation that frequently thrown exceptions are almost always handled by the same exception handlers. Our techniques are currently operational on the LaTTe JVM, an open-source JVM and JIT compiler for SPARC [6] (http://latte.snu.ac.kr). We also describe other optimization techniques related to exception handling in the LaTTe JVM.

The rest of this paper is organized as follows. Section 2 shows exception characteristics of some Java programs including SPECjvm98 benchmarks. Section 3 describes on-demand translation of exception handlers and Section 4 describes exception handler prediction. A comparison with related work is given in Section 5. Section 6 presents our experimental results. A summary follows in Section 7.

2. JAVA EXCEPTION BEHAVIOR IN SOME JAVA BENCHMARKS

We investigated the exception behavior for SPECjvm98 benchmarks [7] and three exception-throwing, public-domain Java programs, which include two programs from Java Grande benchmarks [8] (Exception and EulerSizeA) and one program in UCSD benchmarks [9] (denoted by UCSD). Both Exception and UCSD are for measuring the performance of Java exceptions, but they include a simplified form of exceptions that can typically be found in ordinary Java programs.

Table I shows how many exceptions are thrown during program execution. It also shows how many catch blocks each benchmark has and how many of them are actually used. Five benchmarks in SPECjvm98 do not generate any exceptions. For the remaining three SPECjvm98 benchmarks and the other three Java programs that throw exceptions, there are many unused catch blocks. This indicates that Java exceptions are indeed exceptional events.

We also checked, for each exception-thrown point of those six exception-throwing benchmarks, how many types of exceptions are actually thrown. Interestingly, we found that every such point throws a single type of exception. This is because, in those programs, exceptions are often used for controlling execution flow (e.g. exiting nested loops or making a long jump across call chains, similar to setjmp/longjmp in C [10]) rather than for error handling.
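As a source-level illustration of this control-flow usage, the following minimal sketch leaves two nested loops by throwing an exception. The class NestedLoopExit and the exception type Found are hypothetical names invented for this example and do not come from any of the benchmarks; the point is only that the single throw statement always throws the same exception type.

```java
// Hypothetical example: an exception used as a control-flow signal, not as an error.
final class NestedLoopExit {
    /** Hypothetical exception type carrying the search result out of the loops. */
    static final class Found extends RuntimeException {
        final int row;
        final int col;

        Found(int row, int col) {
            super(null, null, false, false); // no stack trace needed for control flow
            this.row = row;
            this.col = col;
        }
    }

    /** Returns {row, col} of the first occurrence of target, or null if it is absent. */
    static int[] find(int[][] grid, int target) {
        try {
            for (int r = 0; r < grid.length; r++) {
                for (int c = 0; c < grid[r].length; c++) {
                    if (grid[r][c] == target) {
                        throw new Found(r, c); // the only throw point, always the same type
                    }
                }
            }
        } catch (Found f) {
            return new int[] { f.row, f.col };
        }
        return null;
    }
}
```

Calling NestedLoopExit.find on a grid that contains the target returns its position; every exit through the throw statement is caught by the same catch block.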


Table I. Usage of exception handlers in the Java benchmarks.

Benchmarks                    Number of thrown    Number of used    Total number of
                              exceptions          catch blocks      catch blocks

_200_check                    25                  24                79
_201_compress                 0                   0                 45
_202_jess                     0                   0                 53
_209_db                       0                   0                 50
_213_javac                    22 372              3                 132
_222_mpegaudio                0                   0                 44
_227_mtrt                     0                   0                 54
_228_jack                     241 876             84                169

Exception (JGF section 1)     4 880 000           24                44
EulerSizeA (JGF section 3)    41 198              1                 27
UCSD                          1 000 000           1                 36

3. ON-DEMAND TRANSLATION OF EXCEPTION HANDLERS

The previous section indicated that many exception handlers are never used during program execution, which promotes the idea of delayed translation of exception handlers until the exception really occurs. The benefit of this on-demand translation is twofold. First, we can reduce the JIT compilation time for exception handlers and the space to save their translated code. Second, we can generate better code for the main flow of a method since we exclude the exception flows during the JIT compilation of the method, which simplifies the control flow graph.

We now describe in detail our exception handling mechanism with on-demand translation. We address several key issues and other related ones.

3.1. Exception management

The first issue is how to streamline the transfer of control from a native instruction that raises an exception in a translated try block, to an appropriate catch block after translating the catch block on demand. The exception manager in the LaTTe JVM is responsible for locating the exception handler based on the address of the excepting native instruction and the exception object, triggering the translation if it is not yet translated, and jumping into the translated handler.

A native instruction that may throw a Java exception is called a potentially excepting instruction (PEI) [11]. There are two types of exceptions and PEIs in the LaTTe exception handling mechanism. The first type is for those exceptions thrown within a Java program via the athrow bytecode, or for those exceptions thrown by the JVM such as OutOfMemoryError initiated by the new bytecode. The exception manager is invoked via a chain of calls in the translated code and the first call instruction is regarded as the PEI in this case (denoted by the call-PEI exception).

The other type is for those runtime exceptions that are supposed to be checked by the JVM during the execution of a bytecode instruction. LaTTe inserts explicit check code based on a trap as part of the translated code. When the trap is taken, the exception manager is invoked by the corresponding signal handler of the trap. An instruction that raises a trap (including a trap itself) is regarded as the PEI in this case (denoted by the trap-PEI exception).

Only five exceptions belong to the trap-PEI type: NullPointerException, ArithmeticException, ArrayStoreException, NegativeArraySizeException, and ArrayIndexOutOfBoundsException. The NullPointerException is supposed to be thrown when an object reference is NULL. Since there are too many instructions in a Java program that reference an object, it would be too expensive to insert null-dereference check code for each of those instructions. Fortunately, all bytecode instructions that require such a check perform at least one load/store based on the object reference in the translated code, such that if the reference is NULL, the load/store causes the OS to raise a SIGSEGV or SIGBUS signal. In this way, we can avoid explicit check code. One exception is invokespecial, for which we must insert the check code. The ArithmeticException, which is raised on division by zero in the div instruction, can be handled similarly by the SIGFPE signal without check code. For all array accesses, the JVM is supposed to check array bounds and raise ArrayIndexOutOfBoundsException for out-of-bound accesses. LaTTe inserts the bound check code based on a trap (as opposed to branches around calls to error routines) in order to simplify control flows, and the signal handler takes care of throwing the exception. Redundant array bound check code is eliminated by a redundancy elimination technique based on value numbering.
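For readers less familiar with these runtime checks, the following sketch shows one Java operation that triggers each of the five exceptions. It illustrates the language-level behavior only and says nothing about how LaTTe implements the checks; the class TrapPeiExamples and its helper thrownBy are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: one source-level operation per runtime-checked exception.
final class TrapPeiExamples {
    /** Runs the action and returns the simple class name of the exception it throws. */
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    static List<String> demo() {
        int[] none = null;
        int zero = 0;
        Object[] strings = new String[1];
        int negative = -1;
        int[] one = new int[1];

        List<String> names = new ArrayList<>();
        names.add(thrownBy(() -> { int n = none.length; }));            // null dereference
        names.add(thrownBy(() -> { int q = 1 / zero; }));               // integer division by zero
        names.add(thrownBy(() -> { strings[0] = Integer.valueOf(1); })); // wrong element type
        names.add(thrownBy(() -> { int[] a = new int[negative]; }));    // negative array size
        names.add(thrownBy(() -> { one[1] = 0; }));                     // index out of bounds
        return names;
    }
}
```

TrapPeiExamples.demo() returns the five exception names in the order listed in the paragraph above.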

For both call-PEI and trap-PEI exceptions, the control is transferred to the exception manager when an exception is thrown. The manager should first find an appropriate exception handler based on the exception table loaded from the class file. The exception table is an array of entries of (from_pc, to_pc, handler_pc, catch_type), which means that an exception of catch_type raised between from_pc and to_pc in the program is handled by a handler whose start address is handler_pc [12].

One problem is that all of these pc addresses are bytecode addresses, not native code addresses of the translated code. To handle this, LaTTe builds another table called a PEI table which includes entries of (native_pc, bytecode_pc, local_variable_map); for all PEIs in a translated method, the table records the corresponding bytecode addresses along with their native addresses. Examples of PEIs are traps, loads, stores, divisions, and calls in the translated code. Actually, the PEI table is part of a structure called method instance which is created after a method is translated. Besides the PEI table, a method instance includes a table of translated_code_ptr and entry_variable_map for each exception handler in the exception table. The translated_code_ptr is the address of the translated catch block and the entry_variable_map is the initial variable map used for translation of a catch block (which will be explained shortly).

When an exception is thrown, the exception manager searches the PEI table using the native address of the excepting instruction as a key, for the corresponding bytecode address. Using this bytecode address and the exception object, the exception table is then searched for the start bytecode address of the exception handler. If the translated_code_ptr is not NULL, the control transfers to the translated exception handler; otherwise, the translation is triggered.
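The two-step lookup described above can be sketched as follows. This is a simplified model and not LaTTe's actual data structures: the class HandlerLookup, its method names, and the use of a hash map for the PEI table are assumptions made for illustration, the local_variable_map field is omitted, and to_pc is assumed to be exclusive, following the class-file convention.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the two-step lookup; class and method names are hypothetical.
final class HandlerLookup {
    /** One exception-table row; all pcs are bytecode addresses. */
    static final class ExceptionEntry {
        final int fromPc;
        final int toPc;
        final int handlerPc;
        final Class<? extends Throwable> catchType;

        ExceptionEntry(int fromPc, int toPc, int handlerPc,
                       Class<? extends Throwable> catchType) {
            this.fromPc = fromPc;
            this.toPc = toPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // PEI table reduced to native pc -> bytecode pc (the variable map is omitted here).
    private final Map<Long, Integer> peiTable = new HashMap<>();
    private final ExceptionEntry[] exceptionTable;

    HandlerLookup(ExceptionEntry... exceptionTable) {
        this.exceptionTable = exceptionTable;
    }

    void recordPei(long nativePc, int bytecodePc) {
        peiTable.put(nativePc, bytecodePc);
    }

    /** Returns the handler's bytecode pc, or -1 if this method has no matching handler. */
    int findHandlerPc(long nativePc, Throwable exception) {
        Integer bytecodePc = peiTable.get(nativePc);
        if (bytecodePc == null) {
            return -1; // the address is not a recorded PEI
        }
        for (ExceptionEntry entry : exceptionTable) {
            // Assumption: to_pc is exclusive, as in the class-file format.
            boolean inRange = entry.fromPc <= bytecodePc && bytecodePc < entry.toPc;
            if (inRange && entry.catchType.isInstance(exception)) {
                return entry.handlerPc; // first matching row wins
            }
        }
        return -1;
    }
}
```

Because the range check is done on bytecode addresses obtained from the PEI table, the native code may be laid out in any order without affecting the search.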

One advantage of using the PEI table is that we do not have to keep the original control structure of the bytecode during the JIT translation. Some JIT compilers replace the bytecode addresses in the exception table by corresponding native addresses in the translated code, so the original basic block order should be preserved in order to make the range check possible [4,5]. LaTTe allows more flexible optimizations including code motion of PEIs across basic blocks during the JIT compilation due to the PEI table.

(a) Java source code

    for(int i = 0; i < 1000000000; i++) {
        try { j++;
        } catch(Exception e) { j++; }
    }

(b) Bytecode

        ...
     4: goto 23
     7: iinc 0 1                         /* j++ in try block */
    10: goto 20
    13: pop
    14: iinc 0 1                         /* j++ in catch block */
    17: goto 20
    20: iinc 1 1                         /* i++ */
    23: iload_1
    24: ldc #2 <Integer 1000000000>      /* i < 1000000000 */
    26: if_icmplt 7
    28: ...                              /* loop exit */

    Exception table
    from   to   target   type
    7      10   13       <Class java.lang.Exception>

(c) Translated and optimized native code (bytecode pc : assembly code)

    24: sethi 0xee6b2, l2
        or    l2, 0x200, l2
     7: add   l0, 0x1, l0
    20: add   l1, 0x1, l1
    26: subcc l1, l2, g0
        bl
        loop exit

Figure 1. Basic block ordering example.

Figure 1 shows an example of aggressive JIT translation in the presence of exception handlers. Figure 1(a) shows Java source code with try and catch blocks and Figure 1(b) shows its bytecode in a control structure form of basic blocks. The bytecode 20 is a kind of a join point between the main flow and the exception handler. Figure 1(c) shows the translated code in a control structure form. Since we exclude the exception handler during the translation, the basic block with bytecode 7/10 and the basic block with bytecode 20 could be merged into one without goto in the translated code. Some loop invariant instructions are also moved out of the loop.

If an appropriate exception handler cannot be found in the current method’s exception table, the exception manager unwinds the call stack to access the caller method’s exception table, PEI table, and the native address of the call instruction, which are used to find the exception handler in the caller method. This process is repeated throughout the call stack until an appropriate exception handler is found.
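The frame-by-frame search can be sketched with a toy model of the call stack. The names Unwinder and Frame are hypothetical, and each frame's table lookup is abstracted into a predicate; a real implementation would instead consult the frame's PEI table and exception table using the native address of the call instruction.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy model of stack unwinding; Unwinder and Frame are hypothetical names.
final class Unwinder {
    /** A stack frame: the method name plus an abstract "can this method catch it?" test. */
    static final class Frame {
        final String method;
        final Predicate<Throwable> canCatch;

        Frame(String method, Predicate<Throwable> canCatch) {
            this.method = method;
            this.canCatch = canCatch;
        }
    }

    /**
     * Walks from the innermost frame (index 0) towards the callers and returns the
     * name of the first method that can catch the exception, or null if it is uncaught.
     */
    static String findCatcher(List<Frame> callStack, Throwable exception) {
        for (Frame frame : callStack) {
            if (frame.canCatch.test(exception)) {
                return frame.method;
            }
        }
        return null; // no frame left: the exception is uncaught
    }
}
```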

3.2. Consistency of register allocation

Another issue is how to maintain the consistency of register allocation between try blocks and catch blocks, since it is possible for local variables defined in a try block to be used in a catch block [12]. Due to its on-demand translation of catch blocks, LaTTe does not consider register consistency with catch blocks when it translates try blocks. Instead, LaTTe keeps a map of (local variable, real register/spill location) pairs in the PEI table (in the local_variable_map field) for each PEI; the map indicates where each local variable is saved at that point. When an exception is actually thrown at that point, the chosen exception handler is translated based on that map so that local variables in the exception handler are mapped to the same registers/spill locations at the exception-thrown point.

Obviously, register values and the stack pointer value cannot be the same between the exception-thrown point and the point just before entering the translated exception handler, because several functions including the exception manager are called in-between, so they will be overwritten. In order to recover the previous local variable values before entering the translated exception handler, we resort to SPARC’s trap and restore.

For the trap-PEI type of runtime exceptions, the OS automatically saves the current local and in registers at the register-window save area in the call stack when an exception (hence the trap) is raised. For the call-PEI type of user-thrown exceptions, LaTTe ensures an artifact trap (ta 3) is executed before the exception manager is called, which will also flush all local and in registers in all previous register window frames (including the frame of when the call PEI is executed) in the call stack‡. We also save the stack pointer value at the time when the exception was thrown since it is required for restoration of local variables spilled in the stack as well as for unwinding the call stack to find the exception handler.

Just before entering the translated exception handler, the exception manager reinstates local variables mapped to registers as follows. It first sets SPARC’s frame pointer register to the stack pointer value saved at the exception-thrown point, in order to make it appear that we are returning to the exception-thrown point. Then, we execute SPARC’s ‘restore’ instruction, which will restore the old local and in register values from the register-window save area in the call stack.

There is a problem when the same exception handler needs to catch exceptions that are thrown at different program points since their local variable maps may be different. In order to handle this problem, we save the map used for the translation of the exception handler during the first invocation at the entry_variable_map in the method instance, and compare it with the map of subsequent invocations; if they are the same we can just jump to the same translated code, otherwise the exception manager reconciles the maps by relocating local variables according to the entry_variable_map before entering the translated code (e.g. if local variable 1 is allocated to register 1 in the entry_variable_map while it is allocated at the spill area in the local_variable_map of an exception-thrown point, we need to move it from the spill area to register 1).
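The relocation step can be sketched on a toy model in which registers and spill slots are named by strings and their contents are held in a map. The class MapReconciler and the location names are hypothetical. Reading every source value from a snapshot is one simple way to keep moves from overwriting a value that another move still needs; it is used here as an illustrative assumption about how such a relocation can be made safe.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of map reconciliation; class and location names are hypothetical.
final class MapReconciler {
    /**
     * Moves every local variable from its location at the exception-thrown point
     * (localVariableMap) to the location expected by the translated handler
     * (entryVariableMap). Values are read from a snapshot of the storage, so
     * swaps such as r1 <-> r2 cannot clobber a value before it is read.
     */
    static void reconcile(Map<String, Integer> storage,
                          Map<Integer, String> localVariableMap,
                          Map<Integer, String> entryVariableMap) {
        Map<String, Integer> backup = new HashMap<>(storage);
        for (Map.Entry<Integer, String> entry : entryVariableMap.entrySet()) {
            String from = localVariableMap.get(entry.getKey());
            String to = entry.getValue();
            if (from != null && !from.equals(to)) {
                storage.put(to, backup.get(from));
            }
        }
    }
}
```

With storage {spill0=7, r1=0}, a throw-point map {1 -> spill0} and an entry map {1 -> r1}, the call leaves the value 7 in r1, which corresponds to the spill-to-register move in the example above.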

Even with this relocation of local variables, there is still a problem due to copy coalescing of local variables done by the JIT compiler. LaTTe performs aggressive coalescing during its JIT register allocation to remove copies corresponding to pushes and pops in the bytecode. This can result in two local variables being allocated to the same register, which is then used to translate the exception handler. The problem is that if these two local variables are allocated to separate registers (hence can have different values) at a different exception-thrown point, there is no way to reconcile the maps and relocate those two variables (e.g. if both local variables 1 and 2 are allocated to the register 1 in the entry_variable_map, while they are allocated to registers 1 and 2 in local_variable_map, respectively, we cannot reconcile the location of local variable 2).

‡For SPARC’s out registers, if the exception-raised instruction is a call instruction, they cannot be mapped to local variables because out registers are used for arguments (i.e. if there were an out register that is mapped to a local variable, it would be copied into a local/in register or spilled to memory before the call). For the case of trap-PEI exceptions, a local variable may be mapped to an out register, so out registers are also saved by the signal handler.

In order to handle this problem, the exception manager splits all coalesced local variables and allocates them to different registers before it translates the exception handler (local variables 1 and 2 are forced to be allocated separately to two different registers in the above example). This defines a different map from the original map at the first exception-thrown point (this is the only case when these two maps differ), but makes the map reconcilable with other maps.
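The splitting of coalesced variables can be sketched on a variable map from local variable index to location name. The class CoalescingSplit is hypothetical, and so is the naming scheme for fresh locations ("split0", "split1", ...); a real register allocator would pick free registers or spill slots instead.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: give every coalesced local variable a location of its own.
final class CoalescingSplit {
    /** True if at least two variables share one location in the given map. */
    static boolean hasCoalesced(Map<Integer, String> variableMap) {
        Set<String> seen = new HashSet<>();
        for (String location : variableMap.values()) {
            if (!seen.add(location)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Returns a copy of the map in which no two variables share a location.
     * The variable with the lowest index keeps the shared location; the others
     * receive fresh, made-up location names ("split0", "split1", ...).
     */
    static Map<Integer, String> split(Map<Integer, String> variableMap) {
        Map<Integer, String> result = new TreeMap<>();
        Set<String> used = new HashSet<>();
        int fresh = 0;
        for (Map.Entry<Integer, String> entry : new TreeMap<>(variableMap).entrySet()) {
            String location = entry.getValue();
            while (!used.add(location)) {
                location = "split" + fresh++; // try the next fresh name
            }
            result.put(entry.getKey(), location);
        }
        return result;
    }
}
```

After the split, every variable has a distinct target location, so a map from any other exception-thrown point can be reconciled with it by plain moves.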

One might want to simplify the consistency issue of register allocation by first checking which local variables are used in exception handlers and then assigning the same registers to those local variables throughout the whole method. Unfortunately, this fixed preallocation for local variables may result in inefficient code for the main flow since the JIT compiler is constrained in coalescing copies between local variables (through xload-xstore sequence) or in allocating different registers to different live ranges of a local variable when needed [6], which is useful in LaTTe for generating efficient code.

3.3. Pseudo code of on-demand translation of exception handlers

The pseudo code for the on-demand translation is depicted in Figure 2. For the call-PEI type of exceptions, throwExternalException() is called first. It saves the current register window by executing ta 3 and retrieves, via inline assembly, the stack pointer value at the time the exception is thrown. With the stack pointer and the exception object, the exception manager is called.

For the trap-PEI type of exceptions, the trap itself already saves the current register window, so there is no need for ta 3. However, the exception object is not yet created, so the corresponding signal handler must generate an exception object first. It also retrieves the stack pointer value at the PEI from the context argument of the signal handler. Then the exception manager is called.

The exception manager searches for a method that includes an exception handler that can catch the exception. Using the saved stack pointer, it unwinds the stack frame and retrieves the native address of the excepting instruction. The corresponding bytecode address is obtained from the PEI table and the exception table is looked up. If the method cannot catch the exception, the stack pointer is updated to that of the previous stack frame and the same process is repeated.

If the current method can catch the exception, the manager calls invoke_exception_handler(). This routine first checks if the chosen exception handler is already translated. If not, the handler is translated using the variable map at the exception-thrown point saved at local_variable_map of the PEI table. It then saves this map and the start address of the translated code at entry_variable_map and translated_code_ptr in the method instance, respectively.
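The translate-on-first-use check can be sketched as a small lazily initialized object. The class LazyHandler is a hypothetical stand-in: the Supplier plays the role of the JIT translating a catch block, the Runnable field plays the role of translated_code_ptr, and the saving of entry_variable_map is left out.

```java
import java.util.function.Supplier;

// Hypothetical model of on-demand translation of one exception handler.
final class LazyHandler {
    private final Supplier<Runnable> translator; // stands in for the JIT translating the catch block
    private Runnable translatedCode;             // plays the role of translated_code_ptr
    private int translations;                    // how many times the translator has run

    LazyHandler(Supplier<Runnable> translator) {
        this.translator = translator;
    }

    /** Called only when an exception actually reaches this handler. */
    void invoke() {
        if (translatedCode == null) { // not yet translated
            translatedCode = translator.get();
            translations++;
        }
        translatedCode.run();
    }

    int translations() {
        return translations;
    }
}
```

A handler that is never invoked is never translated, and a handler that is invoked repeatedly is translated exactly once.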

If the exception handler is already translated, the local_variable_map at the exception-thrown point and the entry_variable_map in the method instance are compared. If they are not the same, we need to reconcile them by moving some variables from their locations in local_variable_map into the locations at entry_variable_map. Since we might need to swap two or more variables, we should be careful not to destroy their old values.

In order to simplify the reconciliation, we perform it before we execute ‘restore’, when all registers of the exception-thrown point are still located at the register-window save area in the call stack. We first copy the whole register-window save area and the spill area into a separate backup area. We then move each local variable from the backup area (its location can be obtained from the local_variable_map) back into the register-window save area or the spill area according to the map in entry_variable_map.

    throwExternalException(exception_object) {
        flush register windows with a trap instruction (ta 3)
        obtain the stack pointer value of when the exception is thrown and save it in stack_pointer
        call exception_manager( exception_object, stack_pointer )
    }

    signal_handler() {
        create a corresponding exception_object
        obtain the stack pointer value of when the trap is raised and save it in stack_pointer
        call exception_manager( exception_object, stack_pointer )
    }

    exception_manager( exception_object, stack_pointer ) {
        // throwing_stack_pointer and catching_stack_pointer are variables that will hold stack pointer
        // values at exception-throwing method and exception-catching method in the stack, respectively.
        throwing_stack_pointer = stack_pointer
        do {
            get native pc from stack_pointer
            get bytecode pc of native pc from PEI table
            check exception table in method
            if (this method can catch the thrown exception) {
                catching_stack_pointer = stack_pointer
                catching_exception_handler = matched exception handler
                invoke_exception_handler(exception_object, throwing_stack_pointer,
                                         catching_stack_pointer, catching_exception_handler);
            }
            unwind one stack frame, update stack_pointer by that of the previous stack frame
        } while (there is more stack frame);
        // no exception handler found
        process uncaught exception procedure;
    }

    invoke_exception_handler(exception_object, throwing_stack_pointer,
                             catching_stack_pointer, catching_exception_handler) {
        if ( translated_code_ptr of catching_exception_handler == null ) { // not yet translated
            create entry_variable_map from local_variable_map of the excepting PEI
            translate exception handler using entry_variable_map
            save translated_code_ptr and entry_variable_map for catching_exception_handler
        }
        get entry_variable_map from catching_exception_handler
        if ( entry_variable_map != local_variable_map of the excepting PEI ) { // thrown at different PEI
            copy whole register-window save area & spill area into separate backup area
            move local variables from locations in local_variable_map to locations in entry_variable_map
        }
        if (throwing_stack_pointer == catching_stack_pointer // if throwing method equals to catching method
            && we do exception handler prediction) {
            create check code and copy instructions for exception prediction
            modify "call to throwExternalException" by "jump to created check code"
            exit
        }
        update frame pointer register by catching_stack_pointer
        execute restore instruction
        jump to the translated_code_ptr
    }

Figure 2. Exception generation and management process.

[Figure 3 diagram. Panel (a): a main flow (istore_1, jsr, iload_1) and an exception handling flow (astore_1, jsr, aload_1) sharing a single subroutine. Panel (b): the same two flows, each with its own inlined copy of the subroutine.]

Figure 3. jsr problem solved with inlining subroutine: (a) problematic subroutine; (b) problem solved with inlining.

If we use exception prediction (described in Section 4), we insert check code and some copies. Finally, we execute the ‘restore’ after updating the frame pointer register by the stack pointer and jump to the exception handler by following the translated_code_ptr.

3.4. Efficient handling of subroutines

While the Java programming language is type safe, meaning that each local variable is guaranteed to have a unique type at any point of the program, there exists one exception related to finally blocks. A finally block, also known as a subroutine at the bytecode level, is defined with the finally keyword and is handled with the jsr bytecode. If a JVM encounters a jsr bytecode during execution, it pushes the return address (the bytecode pc of the instruction following the jsr) onto the operand stack and jumps to the target bytecode pc. It is also possible to return from the subroutine using the stored bytecode pc.

The JVM specification permits a local variable to have different types in a subroutine depending on the path from which the subroutine is called [12]. In Figure 3(a), for example, the type of the local variable 1 is ambiguous in the subroutine; the variable is an integer if the subroutine is called from the main flow, and a reference if it is called from the exception handling flow.

This type of ambiguity may cause a complication in JIT compilation or garbage collection. For example, the JIT compiler should know the type of a local variable to map it onto an appropriate register. Also, the garbage collector should know if a local variable is a reference or not, in order to compute live objects [13].



    try {
        . . .
        throw new MyException();
        . . .
    } catch(MyException e) {
        . . .
    }

    MyException eobj = new MyException();
    . . .
    try {
        . . .
        throw eobj;
        . . .
    } catch(MyException e) {
        . . .
    }

Figure 4. Typical usage of exceptions.

One solution to remove this type of ambiguity is splitting conflicting local variables via bytecode rewriting [14]. However, this requires data flow analysis and rewriting bytecode, which might be expensive.

In LaTTe, we chose to simply inline a subroutine into every jsr bytecode. If a subroutine is inlined to a path, the type of a local variable is fixed for that path, so that the ambiguity issue is obviated. One concern is the code expansion caused by duplicating the subroutine for every jsr. Fortunately, there exists only one jsr in the main flow which is inlined during the translation of the main flow, while jsrs in the exception handlers are rarely inlined since exception handlers are seldom translated in our on-demand JIT translation. Consequently, a subroutine is often inlined only once.

Figure 3(b) illustrates how the jsr problem is solved by inlining. The ambiguous type of the local variable 1 is now fixed as an integer on the main flow and as a reference on the exception flow via inlining, if the exception is indeed raised (which is again unlikely in general).

4. EXCEPTION HANDLER PREDICTION

Although many exception handlers are not used at all, exceptions are still frequent in some programs. Fortunately, the observation in Section 2 indicates that exception-thrown points are likely to throw a single type of exception, which we can exploit to reduce the exception handling overhead of the JVM. This section introduces such a technique based on exception handler prediction.

4.1. The prediction mechanism

We found from our benchmarks that many of the exceptions that are actually thrown are thrown with a throw statement in the Java program. The throw statement requires an exception object, which is either created on the spot with new or is saved in a variable, as shown in Figure 4. All exceptions raised in SPECjvm98 follow these patterns, thus having a single type.

We can exploit this behavior by predicting the exception handler. Since the same exception handler is likely to be used repetitively for a given exception-thrown point, we can simply replace the call to the JVM exception manager by a direct jump to the exception handler, once the exception handler is located and translated on-demand in the first throw. With this conversion, we can bypass the exception manager in the next throws, obviating the exception manager intervention.



    // fetch handler type constant into r2 register
    ld [%exception_object%], r1      // get type info from exception object
    cmp r1, r2                       // compare types of the object and catch block
    be _matched_case
    call throwExternalException      // non-matched case normal handling

    _matched_case:
    ... resolving codes ...          // some copy instructions
    jump translated_catch_code       // matched case fast handling

Figure 5. Pseudo code of type checking in front of exception handler.

In order to implement exception handler prediction, we need to address two issues. The first one is that it is not guaranteed that the same type of exception will be generated at the same throwing point unless the exception usage follows the usage in Figure 4. So, it is always required to check the exception type before the direct jump. A data flow analysis might be used for completely removing this check, although it is not implemented in our current work. If the exception type is not matched, a normal exception handling process should be followed.

The other issue is that even with prediction, local variables defined in the normal flow can still be used in the exception handler, so the local variable values should be passed into the exception handler without the exception manager’s intervention. For a given translated exception handler, if we predict at the PEI where the first invocation of the exception handler was made, we do not need any reconciliation before we jump since their maps should be identical, except for the splitting of coalesced variables, for which we insert some copies in addition to the check code. When we predict the exception handler at different PEIs, we need to add reconciliation code that performs the same moves the exception manager made for reconciliation.

Figure 5 presents the resulting code generated between the throwing point and the predicted exception handler. Only a few machine instructions are required.
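The control logic of the prediction can also be sketched at the Java level. This is a model of the idea and not the generated SPARC code: the class PredictedThrowSite and its fields are hypothetical, the slow path is abstracted as a function that locates a handler, and the type check is an exact class comparison, mirroring the cmp/be pair in Figure 5.

```java
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical model of one exception-thrown point with handler prediction.
final class PredictedThrowSite {
    private final Function<Throwable, Consumer<Throwable>> locateHandler; // slow path (exception manager)
    private Class<?> predictedType;               // exception class seen at the first throw
    private Consumer<Throwable> predictedHandler; // handler located at the first throw
    private int slowLookups;                      // how often the slow path was taken

    PredictedThrowSite(Function<Throwable, Consumer<Throwable>> locateHandler) {
        this.locateHandler = locateHandler;
    }

    void raise(Throwable exception) {
        if (exception.getClass() == predictedType) { // type check before the direct jump
            predictedHandler.accept(exception);      // matched case: bypass the manager
            return;
        }
        slowLookups++;                               // non-matched case: normal handling
        Consumer<Throwable> handler = locateHandler.apply(exception);
        if (predictedHandler == null) {              // install the prediction after the first throw
            predictedType = exception.getClass();
            predictedHandler = handler;
        }
        handler.accept(exception);
    }

    int slowLookups() {
        return slowLookups;
    }
}
```

Repeated throws of the same exception class take the slow path only once; an exception of any other class fails the check and falls back to the normal lookup.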

4.2. Extension of exception handler prediction with method inlining

We allow exception handler prediction only when the exception-thrown point and the chosen exception handler are located in the same method. If they are located in different methods, the prediction would be too complicated because the runtime call chain may vary. However, exception handling is often used to make a long jump across call chains, which limits the applicability of exception handler prediction.

We can partially overcome this limitation via method inlining done by the JIT compiler. When the throwing method and the catching method are different, they can be merged into one method, and exception handler prediction can then be applied easily.

Figure 6(a) shows a part of a call graph extracted from _228_jack in SPECjvm98. The scan_token() method throws an exception, and it is caught either by _Jack2_44 or by _Jack2_43 depending on the call chain. After method inlining in Figure 6(b), each exception-thrown point and its corresponding exception handler are now located in the same method, and we can use exception handler prediction.



[Figure 6 graphic omitted. Panel (a): call graph over _Jack3_44, _Jack3_43, _Jack2_44, _Jack2_43 and a single shared scan_token(). Panel (b): the same graph with scan_token() duplicated for each call chain, each copy labelled "directly invoke EH".]

Figure 6. Direct connection with method inlining: (a) original call graph; (b) inlined call graph.
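The effect of inlining on the throw site can be illustrated at the source level. The sketch below is hypothetical: the class InliningSketch and its methods are invented for illustration, and a real JIT compiler performs this merge on its intermediate code rather than on Java source.

```java
/** Hypothetical before/after view of inlining a throwing callee into its catcher. */
final class InliningSketch {
    /** Callee that contains the exception-thrown point. */
    static void scanToken(boolean fail) throws Exception {
        if (fail) {
            throw new Exception("bad token");
        }
    }

    /** Before inlining: the throw site is in scanToken(), the handler is in this method. */
    static String beforeInlining(boolean fail) {
        try {
            scanToken(fail);
            return "ok";
        } catch (Exception e) {
            return "handled";
        }
    }

    /** After inlining: the throw site and its handler are located in the same method. */
    static String afterInlining(boolean fail) {
        try {
            if (fail) {
                throw new Exception("bad token"); // inlined body of scanToken()
            }
            return "ok";
        } catch (Exception e) {
            return "handled";
        }
    }
}
```

Both versions behave identically; only the second gives the compiler a throw site whose handler is visible within the same method, which is the precondition for prediction stated above.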

5. COMPARISON TO RELATED WORK

The JIT compiler in the CACAO JVM [4] or the Kaffe JVM [5] keeps the original order of basic blocks and gives up aggressive optimizations if the method includes exception handlers. This is because the search process of the exception handler is based on an address range check on the exception table. In contrast, our technique can alter basic block orders because the PEI table is built during JIT compilation, which allows more aggressive optimizations.
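To see why a range-based search ties the compiler to the original block order, consider a generic lookup modelled on the class-file exception table, where each entry covers the half-open range from its start pc up to, but not including, its end pc. This is a sketch only: HandlerEntry and ExceptionTable are hypothetical names, and the internals of CACAO and Kaffe are not shown in this paper.

```java
import java.util.List;

/** Hypothetical exception table entry: the handler covers pcs in [startPc, endPc). */
final class HandlerEntry {
    final int startPc;
    final int endPc;
    final int handlerPc;
    final Class<? extends Throwable> catchType;

    HandlerEntry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
        this.startPc = startPc;
        this.endPc = endPc;
        this.handlerPc = handlerPc;
        this.catchType = catchType;
    }
}

final class ExceptionTable {
    /** Returns the handler pc of the first entry that matches, or -1 if none does. */
    static int findHandler(List<HandlerEntry> table, int throwPc, Throwable t) {
        for (HandlerEntry e : table) {
            boolean inRange = e.startPc <= throwPc && throwPc < e.endPc;
            if (inRange && e.catchType.isInstance(t)) {
                return e.handlerPc;
            }
        }
        return -1;
    }
}
```

Because the match depends on where the throwing instruction lies relative to contiguous ranges, moving a basic block out of its range would change which handler is found.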

The idea of delaying translation of exception handlers until they are really used is similar to the uncommon trap in the SELF compiler, which delays compilation of infrequent paths [15]. The uncommon trap in the SELF compiler inserts a call to the compilation process at the beginning of an infrequent path, which is later replaced by a jump to the compiled code after the path is actually taken and is compiled. While this is similar to our on-demand translation of exception handlers and exception handler prediction, our problem is a little more complicated since we cannot determine which exception handler to translate until it is really used and we need to insert the check code before the jump.
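The stub-then-patch idea behind delayed compilation can be modelled in a few lines. The following is a hypothetical sketch, not the SELF or LaTTe implementation: LazyPath and its translator callback are invented names, and a null check stands in for the call that a real compiler would later overwrite with a jump.

```java
import java.util.function.Supplier;

/** Hypothetical lazily translated path: the first entry "compiles", later entries go direct. */
final class LazyPath {
    private final Supplier<Runnable> translator; // stands in for the JIT translating the path
    private Runnable target;                      // null until the path is first taken
    int translations = 0;

    LazyPath(Supplier<Runnable> translator) {
        this.translator = translator;
    }

    void enter() {
        if (target == null) {          // stub: the path has not been translated yet
            target = translator.get(); // translate on demand
            translations++;
        }
        target.run();                  // direct transfer to the translated code
    }
}
```

A path that is never entered is never translated, which is the property that on-demand translation of exception handlers relies on.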

Exception directed optimization (EDO) [16] is a technique that is similar to our proposal of inlining-based exception handler prediction (it was developed independently and published later than our work [17], however). The main difference is that EDO relies on profiling for deciding which methods to inline in the context of its profile-based JIT compilation, while our technique does not. Also, EDO analyzes the inlined code to determine the type of exception that will be thrown at the excepting point, whereas our technique predicts whatever is thrown at the excepting point. If excepting points are likely to throw a single type of exception (as we have observed in our benchmarks), our approach would be simpler and more efficient.

6. EXPERIMENTAL RESULTS

In order to evaluate on-demand translation of exception handlers, we compared LaTTe (denoted by LaTTe) with Sun JDK 1.1.7 in JIT mode (SUN117) and a version of LaTTe whose JIT compiler is replaced by Kaffe’s JIT compiler (Kaffe). We also tested three variants of LaTTe to evaluate exception handler prediction: prediction with no inlining, inlining with no prediction, and no prediction with no inlining. All experiments were performed on a SUN UltraSPARC IIi 270 MHz machine running Solaris 2.6 with 256 MB of memory, and the minimum execution time was taken among five runs.

    // Base: plain loop with no exception handling structure.
    public class Base {
        public static void main(String[] argv) {
            int j = 0;
            for (int i = 0; i < 1000000000; i++) {
                j++;
                j++;
            }
        }
    }

    // NoException: the same work inside try/catch/finally; no exception is ever thrown.
    public class NoException {
        public static void main(String[] argv) {
            int j = 0;
            for (int i = 0; i < 1000000000; i++) {
                try {
                    j++;
                } catch (Exception e) {
                    j++;
                } finally {
                    j++;
                }
            }
        }
    }

Figure 7. Test programs in which no exception occurs.
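The measurement protocol described above (the minimum over five runs) can be sketched as a small harness. This is an illustrative assumption about how such timing could be done in Java, not the harness used for the reported numbers; MinOfRuns is a hypothetical name.

```java
/** Hypothetical harness: run a workload several times and keep the fastest run. */
final class MinOfRuns {
    /** Returns the smallest elapsed time, in nanoseconds, over the given number of runs. */
    static long minNanos(Runnable workload, int runs) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            workload.run();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
        }
        return best;
    }
}
```

Taking the minimum rather than the mean discards runs disturbed by other activity on the machine, at the cost of hiding run-to-run variance.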

6.1. Evaluation of on-demand translation

We first evaluated how the presence of exception handling blocks affects the behavior of a JIT compiler, and hence the performance. We made two test programs as shown in Figure 7. The first program is a normal program with no exception handling structure, while the other one is a similar program with try, catch, and finally blocks. Neither program generates any exceptions, so the two behave identically.

Table II shows the running time of the two programs on LaTTe, SUN117, and Kaffe. The running time on LaTTe is the same for both programs (10.3 s), while NoException is more than twice as slow as Base on the other two JVMs, meaning that the presence of exception handling structures keeps their JIT compilers from performing full optimizations for the main flow, even when no exceptions are actually thrown§.

We measured how many methods actually have exception handling structures in our benchmarks. Table III shows the total number of executed methods and the number of methods that have exception handling structures among them. An average of 12% of executed methods have exception handling structures, so they would benefit from our on-demand translation due to fully optimized main flows.

§We also tested the latest Sun JVMs (JDK 1.3.1 and JDK 1.4.2) and NoException was still significantly slower than Base (80% and 45%, respectively).

Table II. Impact of exception handling structures on JVM performance.

                   Execution time (s)
Benchmark        LaTTe    SUN117    Kaffe
Base             10.3     22.6      71.4
NoException      10.3     49.4      175.9
Overhead         0        26.8      104.5

Table III. Methods with exception handlers in the Java benchmarks.

Benchmark                      Total number of     Number of methods           Ratio (%)
                               executed methods    with exception handlers
_200_check                     265                 49                          18.49
_201_compress                  209                 32                          15.31
_202_jess                      576                 37                          6.42
_209_db                        334                 37                          16.59
_213_javac                     905                 84                          9.28
_222_mpegaudio                 316                 31                          9.81
_227_mtrt                      286                 39                          13.64
_228_jack                      430                 99                          23.02
Exception (JGF section 1)      121                 12                          9.92
EulerSizeA (JGF section 3)     174                 16                          9.20
UCSD                           189                 19                          10.05
GEOMEAN                                                                        12.06

On-demand translation also reduces the JIT translation overhead. Table IV shows the amount of bytecode actually translated by the LaTTe JIT compiler and the amount of bytecode in all executed methods. The difference is around 2.6% and is due to the untranslated bytecode in untaken exception handlers. Since the JIT compilation time is approximately proportional to the bytecode size [6], we can expect a 2.6% reduction in the JIT compilation time.
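The 2.6% figure can be recomputed from the Ratio column of Table IV. The snippet below is a small worked check, with the hypothetical class name TranslationRatio; it takes the geometric mean of the eleven per-benchmark ratios and subtracts it from 100.

```java
/** Recomputes the summary row of Table IV from the per-benchmark ratios. */
final class TranslationRatio {
    // Ratio column of Table IV: translated bytecode / total bytecode, in percent.
    static final double[] RATIOS = {
        97.22, 97.24, 98.15, 96.99, 97.09, 98.27, 96.77, 97.93, // SPECjvm98
        96.59, 98.43, 96.86                                     // JGF section 1, JGF section 3, UCSD
    };

    /** Geometric mean computed in log space to avoid overflow of the running product. */
    static double geomean(double[] xs) {
        double logSum = 0.0;
        for (double x : xs) {
            logSum += Math.log(x);
        }
        return Math.exp(logSum / xs.length);
    }

    /** Share of bytecode left untranslated, in percent. */
    static double untranslatedPercent() {
        return 100.0 - geomean(RATIOS);
    }
}
```

The geometric mean should land close to the 97.41 reported in the table, which leaves roughly 2.6% of the bytecode untranslated.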

These results support our on-demand translation of exception handlers.

6.2. Evaluation of exception handling prediction

In order to evaluate exception handler prediction, we made two test programs¶ as shown in Figure 8, where the same exception is generated repeatedly inside a loop. The only difference between these two programs is that the exception-thrown point and the exception handler are located in the same method in the first program (NoCallChain) while they are located in different methods in the second program (CallChain), hence requiring inlining for exception handler prediction.

¶Similar style programs are included in Java Grande benchmarks for measuring exception handling speed [8].



Table IV. Actual translation of bytecode in the Java benchmarks.

Benchmark                      Translated bytecode    Total bytecode    Ratio (%)
_200_check                     26 689                 27 452            97.22
_201_compress                  23 643                 24 315            97.24
_202_jess                      44 394                 45 230            98.15
_209_db                        25 619                 26 414            96.99
_213_javac                     89 287                 91 963            97.09
_222_mpegaudio                 38 099                 38 768            98.27
_227_mtrt                      32 901                 33 998            96.77
_228_jack                      50 147                 51 208            97.93
Exception (JGF section 1)      7431                   7693              96.59
EulerSizeA (JGF section 3)     22 259                 22 613            98.43
UCSD                           9604                   9915              96.86
GEOMEAN                                                                 97.41

    // NoCallChain: the throw site and its handler are in the same method.
    class NoCallChain {
        public static void main(String[] argv) {
            int j = 0;
            for (int i = 0; i < 1000000; i++) {
                try {
                    j++;
                    throw new Exception();
                } catch (Exception e) {
                    j++;
                }
            }
        }
    }

    // CallChain: the throw site is in a callee, the handler is in main().
    class CallChain {
        public static void throw_method() throws Exception {
            throw new Exception();
        }

        public static void main(String[] argv) {
            int j = 0;
            for (int i = 0; i < 1000000000; i++) {
                try {
                    j++;
                    throw_method();
                } catch (Exception e) {
                    j++;
                }
            }
        }
    }

Figure 8. Test programs in which exceptions occur frequently.



Table V. Comparison of exception generating programs.

Execution time (s)

                 LaTTe
Benchmark        neither    inlining only    prediction only    both    SUN117    Kaffe
NoCallChain      24.0       17.8             14.1               8.9     22.3      14.0
CallChain        28.4       18.9             27.9               9.3     23.4      16.1

We implemented four variations of exception handler prediction on top of LaTTe. The first one enables both prediction and inlining (both). The second one enables prediction but no inlining (prediction only). The third one enables inlining but no prediction (inlining only). The last one enables neither (neither). Table V shows the running time of the two test programs on the four variations of LaTTe.

For NoCallChain, prediction only shows a faster running time than neither due to a direct jump to the exception handler. inlining only also shows a better running time, which has nothing to do with prediction but is due to the inlining of the constructor call Exception(); this makes the loop run faster. The impact of prediction can be seen by comparing inlining only and both (inlined version) or by comparing neither and prediction only (non-inlined version).

For CallChain, there is little difference between neither and prediction only since the method call blocks any prediction. For the inlined version, however, prediction is possible and shortens the running time, as one can see by comparing inlining only and both. Both NoCallChain and CallChain results indicate that exception handler prediction can effectively reduce the exception handling time.

Although exception handler prediction can, in theory, work alone as in NoCallChain, we found that, in practice, it should work with inlining. That is, we rarely see a hot spot where an exception-thrown point and its exception handler are located in the same method in _213_javac or _228_jack, which throw many exceptions (this is due to the usage of exception handling for long jumps).

On the other hand, finding an appropriate inlining heuristic is a different issue. In _228_jack, for example, the normal inlining heuristic used by LaTTe does not offer any tangible opportunities for exception handler prediction. If we employ a more aggressive inlining heuristic, however, we can see the benefit of prediction but we also see a highly increased translation overhead, which offsets the benefit, as shown in Table VI.

Table VI shows the running time of _228_jack on the original LaTTe (original), LaTTe with aggressive inlining only (inlining only), and LaTTe with both aggressive inlining and prediction (both). The total running time (Total) is the sum of the translation overhead (TR) and the pure running time of the program (Pure). When we move from original to inlining only, TR increases significantly while Pure improves little. When we move from inlining only to both, Pure decreases due to prediction, but not enough to offset the increased TR overhead. In fact, the number of attempts to find the exception handlers by the exception manager is reduced by 73% in both, as shown in the lower part of Table VI.



Table VI. Handler prediction effect in _228_jack benchmark.

Elapsed time (s)

          original    inlining only    both     Ratio
Total     50.68       60.54            57.13    1.13
TR        8.09        19.13            19.12    2.36
Pure      42.59       41.40            38.01    0.89

Number of attempts to find exception handlers

original    both      Ratio
241 876     67 238    0.27

Table VI does show the opportunity of exception handler prediction with inlining, while it is not clear at this point what kind of inlining heuristics would be appropriate. Since inlining involves many issues other than exception prediction (e.g. inlining would be more effective in adaptive JIT compilation), it is left as a future goal to find a good heuristic that can work well with the exception prediction technique.

7. SUMMARY

In this paper, we have described two exception handling mechanisms used in the LaTTe JIT compiler. On-demand translation of exception handlers allows the main flow to be fully optimized without being disturbed by the constraints of the exception handlers. Exception handler prediction reduces the JVM overhead for frequently thrown exceptions by jumping to the translated exception handler directly.

Although exception handling has not been popular in Java programs so far, it is a clean and useful feature for easy programming. Therefore, we expect that exception handling will be used more often in the future and that an efficient exception handling mechanism such as ours will be useful.

ACKNOWLEDGEMENTS

This research was supported by IBM via a sponsored research agreement. We are grateful to all members of the LaTTe project, especially Kemal Ebcioglu and Suhyun Kim. This is a revised and expanded version of a paper published in the Proceedings of the ACM 2000 Java Grande Conference, San Francisco, California, June 2000 [17]. The difference between this paper and [17] is that we expanded the exception management process and the register consistency issues of the on-demand translation with more details and with pseudo code for the exception manager. We also clarified what is done by the exception manager and what is done by the code generated by the JIT compiler, respectively. The same issues were clarified for exception handler prediction by providing more details and the type-checking pseudo code. Finally, we added three new benchmarks for measurement of the Java exception behaviors.



REFERENCES

1. Gosling J, Joy B, Steele G. The Java Language Specification (The Java Series). Addison-Wesley: Reading, MA, 1997.

2. Moon S-M, Ebcioglu K. A just-in-time compiler. Computer 2000; 33(3):41.

3. Gupta M, Choi J-D, Hind M. Optimizing Java programs in the presence of exceptions. Proceedings of the 14th European Conference on Object-Oriented Programming (ECOOP’00). Springer: Berlin, 2000.

4. Krall A, Probst M. Monitors and exceptions: How to implement Java efficiently. Proceedings of the 1998 ACM Workshop on Java for High-Performance Network Computing. ACM Press: New York, 1998; 15–24.

5. Wilkinson T. Kaffe: A JIT and interpreting virtual machine to run Java code, 1998. http://www.transvirtual.com/ [July 1999].

6. Yang B-S, Moon S-M, Park S, Lee J, Lee S, Park J, Chung YC, Kim S, Ebcioglu K, Altman E. LaTTe: A Java VM just-in-time compiler with fast and efficient register allocation. Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques (PACT ’99), Newport Beach, CA, 12–16 October 1999. IEEE Computer Society Press: Los Alamitos, CA, 1999; 128–138.

7. SPEC JVM98 benchmarks, 1998. http://www.spec.org/osg/jvm98/ [July 1999].

8. Java grande benchmarks. http://www.epcc.ed.ac.uk/javagrande/ [July 1999].

9. Bill and Paul’s Excellent UCSD Benchmarks for Java. http://www-cse.ucsd.edu/users/wgg/JavaProf/javaprof.html [July 1999].

10. Arnold K, Gosling J, Holmes D. The Java Programming Language (3rd edn). Addison-Wesley: Reading, MA, 2000.

11. Mahlke SA, Chen WY, Bringmann RA, Hank RE, Hwu W-MW, Rau BR, Schlansker MS. Sentinel scheduling: A model for compiler-controlled speculative execution. ACM Transactions of Computer Systems 1993; 11(4):376–408.

12. Lindholm T, Yellin F. The Java Virtual Machine Specification. Addison-Wesley: Reading, MA, 1997.

13. Stichnoth JM, Lueh G-Y, Cierniak M. Support for garbage collection at every instruction in a Java compiler. Proceedings of the ACM SIGPLAN ’99 Conference on Programming Language Design and Implementation. ACM Press: New York, 1999; 118–127.

14. Agesen O, Detlefs D, Moss JE. Garbage collection and local variable type-precision and liveness in Java virtual machines. Proceedings of the ACM SIGPLAN ’98 Conference on Programming Language Design and Implementation. ACM Press: New York, 1998; 269–279.

15. Hoelzle U. Adaptive optimization for self: Reconciling high performance with exploratory programming. PhD Thesis, Computer Science Department, Stanford University, August 1994.

16. Ogasawara T, Komatsu H, Nakatani T. A study of exception handling and its dynamic optimization in Java. Proceedings of the 2001 ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA’01), November 2001. ACM Press: New York, 2001.

17. Lee S, Yang B-S, Kim S, Park S, Moon S-M, Ebcioglu K, Altman E. Efficient Java exception handling in just-in-time compilation. Proceedings of ACM 2000 Java Grande Conference. ACM Press: New York, 2000.

