aug newsletter 2011

www.yglworld.com

Aug 2011

Newsletter

Customer Visit

Ygl team members pay a visit to Penang Swimming Club in early August

to understanding customer needs, problems and preferences. Our GM

Mr. Leong Vai Long, Sales Director Michael Chin and Senior Consultant

Ms Yong Fee Min were chatting with the PSC employees at the Penang

Swimming Club office.

Hari Merdeka (Independence Day) is a national day of Malay-

sia commemorating the independence of the Federation of

TECHNOLOGY: Support

Troubleshooting Internal Errors By Tamzin Oscroft

A guide to assessing and resolving ORA-600 and ORA-7445 errors

If you’re an Oracle DBA, you’re likely to have come across an error message in your Oracle Database

alert.log files prefixed by either ORA-600 or ORA-7445, such as

Thu Jan 20 13:35:52 2011

Errors in file /DATA/oracle/admin/

prod/udump/prod_ora_2131.trc:

ORA-00600: internal error code,

arguments: [ktfbtgex-7], [1015817],

[1024], [1015816], [], [], [], []

Because these internal error messages include no attached explanation in the way that external error messages do (for example, “ORA-

00942: table or view does not exist”), it is difficult to assess the seriousness of the error and whether it is cause for concern.

This column explains what you can do to assess some ORA-600 or ORA-7445 errors and identify solutions.

As Published In

September/October 2011

ORA-600 or ORA-7445: What Is the Difference?

ORA-600 is a catchall message that indicates an error internal to the database code. The key point to note about an ORA-600 error is that it is

signaled when a code check fails within the database. At points throughout the code, Oracle Database performs checks to confirm that the

information being used in internal processing is healthy, that the variables being used are within a valid range, that changes are being made

to a consistent structure, and that a change won’t put a structure into an unstable state. If a check fails, Oracle Database signals an ORA-600

error and, if necessary, terminates the operation to protect the health of the database.

The first argument to the ORA-600 error message indicates the location in the code where the check is performed; in the example above, that

is ktfbtgex-7 (which indicates that the error occurred at a particular point during tablespace handling). The subsequent arguments have differ-

ent meanings, depending on the particular check.

An ORA-7445 error, on the other hand, traps a notification the operating system has sent to a process and returns that notification to the user.

Unlike the ORA-600 error, the ORA-7445 error is an unexpected failure rather than a handled failure.

The Oracle function in which that notification signal is received is usually, from Oracle Database 10g onward, contained in the ORA-7445 error

message itself. For example, in the error message

ORA-07445: exception encountered:

core dump [kocgor()+96] [SIGSEGV]

[ADDR:0xF000000104] [PC:0x861B7EC]

[Address not mapped to object] []

the failing function is kocgor, which is associated with the handling of user-defined objects. The trapped signal is SIGSEGV (signal 11, seg-

mentation violation), which is an attempt to write to an illegal area of memory. Another common signal is SIGBUS (signal 10, bus error), and

there are other signals that occur less frequently, with causes that range from invalid pointers to insufficient OS resources.

http://www.oracle.com/technetwork/issue-archive/2011/11-sep/ssLINK/444627

http://www.oracle.com/technetwork/issue-archive/index-091870.html

Both ORA-600 and ORA-7445 errors will

Write the error message to the alert.log, along with details about the location of a trace containing further information

Write detailed information to a trace file in the location indicated in the alert.log

In Oracle Database 11g Release 1 onward, create an incident and place the relevant files in the incident directory in the location defined

by the diagnostic_dest initialization file parameter

Write the error message to the user interface if the server process is not terminated or signal ORA-3113 if it is

Often you will see multiple errors reported within the space of a few minutes, typically starting with an ORA-600. It is usually, but not

always, the case that the first is the significant error and the others are side effects.

Can I Resolve These Errors Myself?

For some ORA-600 and ORA-7445 errors, you can either identify the cause and resolve the error on your own or find ways to avoid the

error. The information provided in this section will help you resolve or work around some of the more common errors.

ORA-600 [729]. The first argument to this ORA-600 error message, 729, indicates a memory-handling issue. The error message text will

always include the words space leak, but the number after 729 will vary:

ORA-00600: internal error code,

arguments: [729], [800],

[space leak], [], [],

A space leak occurs when some code does not completely release the memory it used while executing. In this example, when that proc-

ess disconnected from the database, it discovered that some memory was not cleaned up at some point during its life and reported ORA

-600 [729]. The number in the second set of brackets (800) is the number of bytes of memory discovered.

You cannot determine the cause of the space leak by checking your application code, because the error is internal to Oracle Database.

You can, however, safely avoid the error by setting an event in the initialization file for your database in this form:

event="10262 trace name

context forever, level xxxx"

or by executing the following:

SQL>alter system set events

'10262 trace name context forever,

level xxxx' scope=spfile;

You will also need to shut the database down and restart it to enable the event to take effect.

Replace xxxx with a number greater than the value in the second set of brackets in the ORA-600 [729] error message. In the example

above, you could set the number to 1000, in which case the event instructs the database to ignore all user space leaks that are smaller

than 1000 bytes.

ORA-600 [kddummy_blkchk]. The kddummy_blkchk argument indicates that checks on the physical structure of a block have failed. This

error is reported with three additional arguments: the file number, the block number, and an internal code indicating the type of issue with

that block. The following is an alert.log excerpt for an ORA-600 [kddummy_blkchk] error:

Errors in file/u01/oracle/admin/

PROD/bdump/prod_ora_11345.trc:

ORA-600: internal error code, arguments:

[kddummy_blkchk], [2], [21940], [6110],

[], [], [], []

The ORA-600 [kddummy_blkchk] error message usually reports a corruption, so to identify the object involved, first use the file number

(&afn) and the block number (&bn) reported in the error message in the SQL query:

select * from dba_extents

where file_id=&afn

and &bn between block_id

and block_id + blocks -1;

&afn is the first additional argument (2 in this example); &bn is the second additional argument (21940 in this example).

If the query returns a table, confirm the corruption by executing

SQL>analyze table <tablename>

validate structure;

If the query returns an index, confirm the corruption by executing

SQL>analyze index <indexname>

validate structure;

If there is definitely a corruption in the object, the error returned will be

ORA 1498 "block check failure -

see trace file"

The best way to resolve the corruption is to restore a copy of the affected datafile from before the error and to recover the database so

these changes are applied to the restored datafile and brought forward to the current time. This action is possible only if the archivelog fea-

ture for your database is enabled. (Enabling this feature ensures that all changes to the database are saved in archived redo logs.)

ORA-600 [6033]. This error is reported with no additional arguments, as shown in the following alert.log file excerpt:

Errors in file/u01/oracle/admin/PROD/

bdump/prod_ora_2367574.trc:

ORA-600: internal error code, arguments:

[6033], [], [], [], [], [], [], []

The ORA-600 [6033] error often indicates an index corruption. To identify the affected index, you’ll need to look at the trace

file whose name is provided in the alert.log file, just above the error message. In this alert.log excerpt, the trace file you need

to look at is called prod_ora_2367574.trc and is located in /u01/oracle/admin/PROD/bdump.

There are two possible ways to identify the table on which the affected index is built:

Look for the SQL statement that was executing at the time of the error. This statement should appear at the top of the trace file, under

the heading “Current SQL Statement.” The affected index will belong to one of the tables accessed by that statement.

Search within the trace file for the words Plan Table. This search will return the tables and indexes the Oracle Database optimizer is

using to access the data that will satisfy the query being executed. For example, the query using the plan in Listing 1 is accessing

the testtab1, testtab2, and testtab3 tables and the XC179S1 and XC179PO indexes.

Code Listing 1: Query plan accessing three tables and two indexes

-------------------------------------------------------------------------------

|Id |Operation |Name |Rows |Bytes |Cost |Time |

-------------------------------------------------------------------------------

|0 |SELECT STATEMENT | | | |883K | |

|1 | WINDOW NOSORT | |2506K | 318M |883K |04:31:56|

|2 | SORT GROUP BY | |2506K | 318M |883K |04:31:56|

|3 | HASH JOIN RIGHT OUTER | |2506K | 318M |837K |03:20:05|

|4 | VIEW |index$_join$_006| 8777 | 257K | 35 |00:00:01|

|5 | HASH JOIN | | | | | |

|6 | INDEX FAST FULL SCAN |XC179S1 | 8777 | 257K | 18 |00:00:01|

|7 | INDEX FAST FULL SCAN |XC179P0 | 8777 | 257K | 25 |00:00:01|

|8 | VIEW | |2506K | 245M |837K |03:20:04|

|9 | HASH JOIN OUTER | |2506K | 296M |837K |03:20:04|

|10 | HASH JOIN | |2506K | 184M |454K |02:49:32|

|11 | TABLE ACCESS FULL |TESTTAB1 |2484K | 102M |361K |01:26:13|

|12 | TABLE ACCESS FULL |TESTTAB2 | 21M | 688M | 49K |00:12:37|

|13 | TABLE ACCESS FULL |TESTTAB3 | 94M |4326M |146K |00:35:51|

For each of the tables used by the SQL statement that was executing at the time of the error, execute the following statement:

SQL>analyze table <tablename> validate

structure cascade;

This will check to ensure that every value in the index is also in the table, and vice versa. If it finds a mismatch, it will report

ORA-1499 table/Index Cross Reference

Failure - see trace file

The trace file will be in the location indicated by the user_dump_dest or diagnostic_dest initialization parameter and will contain information

similar to

row not found in index

tsn: 8

rdba: 0x04d01348

You can then find the index with by using the query in Listing 2. Replace the &rdba and &tsn values in Listing 2 with the appropriate values.

For this example, the &rdba value is the rdba from the trace file with the 0x portion removed and &tsn is the tablespace number (tsn) from

the trace file. (&rdba in this case would be 04d01348, and &tsn would be 8.)

Code Listing 2: Find the index

SELECT owner, segment_name, segment_type, partition_name

FROM DBA_SEGMENTS

WHERE header_file = (SELECT file#

FROM v$datafile

WHERE rfile# =

dbms_utility.data_block_address_file(to_

number('&rdba','XXXXXXXX'))

AND ts#= &tsn)

AND header_block = dbms_utility.data_block_address_block(to_

number('&rdba','XXXXXXXX'));

Once you have identified the index, drop and re-create it. It is important to drop and re-create the index

rather than rebuilding it online, because only re-creating it will reinitialize the values in the index.

Ygl Shanghai

Unit 1502, Kerry

Everbright City Tower 2,

218 West Tianmu Road

Shanghai, China

Tel No : 0086-22-63538210

Fax No: 0086-21-63549862

Ygl Beijing Tower

Yuan Da 5 Area

Ban Jing Road

Hai Dian District

Beijing

Tel No : 0086-10 8859 7040

Fax No: 0086-10 5198 8637

Ygl Hong Kong

Unit 2402, 24/F Nanyang Plaza,

57, Hung To Road

Kwun Tong

Kowloon

Hong Kong

Tel No : 852-2609 1338

Fax No : 852-2607 3042

Ygl Penang

No. 16, China Street

10200 Penang

Malaysia

Tel No : 604-2610619

Fax No : 604-2625599

Ygl Research & Development Centre

No. 5, Lintang Bayan Lepas 1

Bayan Lepas Industrial Park

Phase 4, 11900 Bayan Lepas

Penang, Malaysia

Tel No : 604-6303373

Fax No : 604-6303377

Ygl Kuala Lumpur

Suite 9-10,

Wisma UOA II, Jalan Pinang

50450 Kuala Lumpur

Malaysia

Tel No : 603-2166 5928

Fax No : 603-2166 5926

E-mail: [email protected]

Ygl Singapore

55, Market Street

#10-00

Singapore 48941

Tel No : 65-65213030

Fax No : 65-65213001

ORA-7445 [xxxxxx] [SIGBUS] [OBJECT SPECIFIC HARDWARE ERROR]. This ORA-7445

error can occur with many different functions (in place of xxxxxx). For example, the following

alert.log excerpt shows the failing function as ksxmcln.

/u01/app/oracle/admin/prod/bdump/

prod_smon_8201.trc:

ORA-7445: exception encountered:

core dump [ksxmcln()+0] [SIGBUS]

[object specific hardware error]

[6822760] [] []

The important part of this error is the ”object specific hardware error” argument, which indicates

that there were insufficient operating system resources to complete the action. The most common

resources involved are swap and memory.

To diagnose the cause of an ORA-7445 error, you should first check the operating system error

log; for example, in Linux this error log is /var/log/messages. Within the error log, look for informa-

tion with the same time stamp as the ORA-7445 error (this will be in the alert.log next to the error

message). You will often find an error message similar to

Jun 9 19:005:05 PRODmach1 genunix:

[ID 470503 kern.warning]

WARNING: Sorry, no swap space to grow

stack for pid 9632

If no errors are reported in the operating system error log with the same time stamp as the ORA-7445 error, check the ORA-7445 trace file. If there is a statement in the trace file under the head-ing “Current SQL Statement,” execute that statement again to try to reproduce the error. If the error is reproduced, run the statement again while monitoring OS resources with standard UNIX monitoring tools such as sar or vmstat (contact your system administrator if you are not sure which to use). Once you’ve identified the resource that affects the running of the statement, increase the amount of that resource available to Oracle Database.

Summary

By following the instructions in this article, you should be able to resolve some errors that are

caused by underlying physical issues such as file corruption or insufficient swap space. But be-

cause ORA-600 and ORA-7445 errors are internal, many cannot be resolved by user-led trouble-

shooting.

For those Oracle Database users with Oracle support contracts, however, additional knowledge

content is available via My Oracle Support. Most notably, the ORA-600/ORA-7445 lookup tool

[Knowledge Article 153788.1], shown in Figure 1, enables you to enter the first argument to an

ORA-600 or ORA-7445 error message and use that information to identify known defects, work-

arounds, and other knowledge targeted specifically to that error/argument combination.

09 Aug 2011 Singapore celebrated the Republic’s 46th birthday

aug newsletter 2011

Documents

error message ora

error internal

internal error code

bus error

internal error messages

external error messages

database code

oracle database signals