pointers and null pointer

8/8/2019 Pointers and Null Pointer

http://slidepdf.com/reader/full/pointers-and-null-pointer 1/24

Pointer a pointing to variable b . Note that b

contains a number(hexadecimal 00000011= decimal 17), where a

contains the address of b in memory (hexadecimal1008). In this case boththe address and the data

are contained in 32 bitwords.

Pointer (computing)

From Wikipedia, the free encyclopedia

In computer science, a pointer is a programminglanguage data type whose value refers directly to (or"points to") another value stored elsewhere in thecomputer memory using its address. For high-levelprogramming languages, pointers effectively take theplace of general purpose registers in low levellanguages such as assembly language or machinecode—but, in contrast, occupies part of the availablememory. A pointer references a location in memory,and obtaining the value at the location a pointer refersto is known as dereferencing the pointer. A pointer is

a simple, less abstracted implementation of the moreabstracted reference data type (although it is not asdirectly usable as a C++ reference). Several languagessupport some type of pointer, although some are morerestricted than others.

Pointers to data significantly improve performance forrepetitive operations such as traversing strings, lookuptables, control tables and tree structures. Inparticular, it is often much cheaper in time and spaceto copy and dereference pointers than it is to copy andaccess the data to which the pointers point.

Pointers are also used to hold the addresses of entrypoints for called subroutines in procedural programming and for run-timelinking to dynamic link libraries (DLLs). In Object-oriented programming,pointers to functions are used for binding methods, often using what are called virtual method tables.

While "pointer" has been used to refer to references in general, it more properlyapplies to data structures whose interface explicitly allows the pointer to bemanipulated (arithmetically via pointer arithmetic ) as a memory address, as

opposed to a magic cookie or capability where this is not possible.[citation needed ]

Because pointers allow both protected and unprotected access to memoryaddresses, there are risks associated with using them particularly in the lattercase. For general information about references, see reference (computerscience).

Pointer (computing) - Wikipedia, the free encyclopedia http://en.wikipedia.org/w/index.php?title=Pointe

de 24 03/02/10 1



Contents

1 Formal description2 Quotations

3 Use in data structures4 Use in control tables5 Architectural roots6 Uses

6.1 C pointers6.2 C arrays6.3 C linked list6.4 Pass-by-address using pointers6.5 Memory-mapped hardware

7 Typed pointers and casting8 Making pointers safer9 Null pointer10 Double indirection11 Wild pointers12 Wild branch13 Simulation using an array index14 Support in various programming languages

14.1 Ada14.2 BASIC14.3 C and C++

14.4 C#14.5 D14.6 Eiffel14.7 Fortran14.8 Modula-214.9 Oberon14.10 Pascal

15 See also16 References17 External links

Formal description

In computer science, a pointer is a kind of reference.


2 de 24 03/02/10 1



A data primitive (or just primitive ) is any datum that can be read from or writtento computer memory using one memory access (for instance, both a byte andword are primitives).

A data aggregate (or just aggregate ) is a group of primitives that are logicallycontiguous in memory and that are viewed collectively as one datum (forinstance, an aggregate could be 3 logically contiguous bytes, the values of whichrepresent the 3 coordinates of a point in space); when an aggregate is entirelycomposed of the same type of primitive, the aggregate can be called an array ; ina sense, a multi-byte word primitive is an array of bytes, and some programs usewords in this way.

In the context of these definitions, a byte is the smallest primitive; each memoryaddress specifies a different byte. The memory address of the first byte of adatum is considered the memory address (or base memory address ) of the entiredatum.

A memory pointer (or just pointer ) is a primitive, the value of which is intendedto be used as a memory address; it is said that a pointer points to a memory address . It is also said that a pointer points to a datum [in memory] when thepointer's value is the datum's memory address.

More generally, a pointer is a kind of reference, and it is said that a pointer references a datum stored somewhere in memory ; to obtain that datum is to dereference the pointer . The feature that separates pointers from other kinds of reference is that a pointer's value is meant to be interpreted as a memoryaddress, which is a rather 'low-level' concept.

References serve as a level of indirection: A pointer's value determines whichmemory address (that is, which datum) is to be used in a calculation. Becauseindirection is a fundamental aspect of algorithms, pointers are often expressed asa fundamental data type in programming languages; in statically (or strongly)typed programming languages, the type of a pointer determines the type of thedatum to which the pointer points.

Quotations

“I do consider assignment statements and pointer variables to be

among computer science's most valuable treasures. ”

— Donald Knuth (Stanford Professor Emeritus of the Art of Computer

Programming), "Structured Programming with go to Statements"[1]

Use in data structures


3 de 24 03/02/10 1



When setting up data structures like lists, queues and trees, it is necessary tohave pointers to help manage how the structure is implemented and controlled.Typical examples of pointers are start pointers, end pointers, and stack pointers.These pointers can either be absolute (the actual physical address or a virtualaddress in virtual memory) or relative (an offset from an absolute start address("base") that typically uses less bits than a full address, but will usually require

one additional arithmetic operation to resolve).

A two-byte offset, containing a 16-bit, unsigned integer, can be used to providerelative addressing for up to 64 kilobytes of a data structure. This can easily beextended to 128K,256K or 512K if the address pointed to is forced to be on ahalf-word, word or double-word boundary (but, requiring an additional "shiftleft" bitwise operation—by 1,2 or 3 bits—in order to adjust the offset by a factorof 2,3 or 4, before its addition to the base address).

A one byte offset, such as the hexadecimal ASCII value of a character (eg. X'29')can be used to point to an alternative integer value (or index) in an array (eg

X'01'). In this way, characters can be very efficiently translated from 'raw data'to a usable sequential index and then to an absolute address without a lookuptable.

Use in control tables

Control tables, that are used to control program flow, usually make extensive useof pointers. The pointers, usually embedded in a table entry, may, for instance,be used to hold the entry points to subroutines to be executed, based on certainconditions defined in the same table entry. The pointers can however be simply

indexes to other separate, but associated, tables comprising an array of theactual addresses or the addresses themselves (depending upon the programminglanguage constructs available). They can also be used to point (back) to earliertable entries (as in loop processing) or forward to skip some table entries (as in aswitch or "early" exit from a loop). For this latter purpose, the "pointer" maysimply be the table entry number itself and can be transformed into an actualaddress by simple arithmetic.

Architectural roots

Pointers are a very thin abstraction on top of the addressing capabilitiesprovided by most modern architectures. In the simplest scheme, an address , or anumeric index, is assigned to each unit of memory in the system, where the unitis typically either a byte or a word, effectively transforming all of memory into a very large array. Then, if we have an address, the system provides an operationto retrieve the value stored in the memory unit at that address (usually utilizingthe machines general purpose registers).


4 de 24 03/02/10 1



In the usual case, a pointer is large enough to hold more addresses than thereare units of memory in the system. This introduces the possibility that a programmay attempt to access an address which corresponds to no unit of memory ,either because not enough memory is installed (i.e. beyond the range of availablememory) or the architecture does not support such addresses. The first casemay, in certain platforms such as the Intel x86 architecture, be called a

segmentation fault (segfault). The second case is possible in the currentimplementation of AMD64, where pointers are 64 bit long and addresses onlyextend to 48 bits. There, pointers must conform to certain rules (canonicaladdresses), so if a noncanonical pointer is dereferenced, the processor raises ageneral protection fault.

On the other hand, some systems have more units of memory than there areaddresses. In this case, a more complex scheme such as memory segmentation orpaging is employed to use different parts of the memory at different times. Thelast incarnations of the x86 architecture support up to 36 bits of physicalmemory addresses, which were mapped to the 32-bit linear address space

through the PAE paging mechanism. Thus, only 1/16 of the possible totalmemory may be accessed at a time. Another example in the same computerfamily was the 16-bit protected mode of the 80286 processor, which, thoughsupporting only 16 MiB of physical memory, could access up to 1 GiB of virtualmemory, but the combination of 16-bit address and segment registers madeaccessing more than 64 KiB in one data structure cumbersome. Somerestrictions of ANSI pointer arithmetic may have been due to the segmentedmemory models of this processor family.

In order to provide a consistent interface, some architectures provide memory-mapped I/O, which allows some addresses to refer to units of memory while

others refer to device registers of other devices in the computer. There areanalogous concepts such as file offsets, array indices, and remote objectreferences that serve some of the same purposes as addresses for other types of objects.

Uses

Pointers are directly supported without restrictions in languages such as PL/I, C,C++, Pascal, and most assembly languages. They are primarily used forconstructing references, which in turn are fundamental to constructing nearly

all data structures, as well as in passing data between different parts of aprogram.

In functional programming languages that rely heavily on lists, pointers andreferences are managed abstractly by the language using internal constructslike cons.

When dealing with arrays, the critical lookup operation typically involves a stage


5 de 24 03/02/10 1



called address calculation which involves constructing a pointer to the desireddata element in the array. If the data elements in the array have lengths that aredivisible by powers of two, this arithmetic is usually much more efficient.Padding is frequently used as a mechanism for ensuring this is the case, despitethe increased memory requirement. In other data structures, such as linked lists,pointers are used as references to explicitly tie one piece of the structure to

another.

Pointers are used to pass parameters by reference. This is useful if theprogrammer wants a function's modifications to a parameter to be visible to thefunction's caller. This is also useful for returning multiple values from a function.

Pointers can also be used to allocate and deallocate dynamic variables andarrays in memory. Since a variable will often become redundant after it hasserved its purpose, it is a waste of memory to keep it, and therefore it is goodpractice to deallocate it (using the original pointer reference) when it is nolonger needed. Failure to do so may result in a memory leak (where available

free memory gradually, or in severe cases rapidly, diminishes because of anaccumulation of numerous redundant memory blocks).

C pointers

The basic syntax to define a pointer is

int *money;

This declares money as a pointer to an integer. Since the contents of memory arenot guaranteed to be of any specific value in C, care must be taken to ensurethat the address that money points to is valid. This is why it is sometimessuggested to initialize the pointer to NULL (however initialising pointersunnecessarily can mask compiler analyses and hide bugs).

int *money = NULL;

If a NULL pointer is dereferenced then a runtime error will occur and execution

will stop, usually with a segmentation fault.

Once a pointer has been declared, the next logical step is for it to point atsomething:


6 de 24 03/02/10 1



int a = 5;int *money = NULL; money = &a;

This assigns the value of money to be the address of a. For example, if a is storedat memory location of 0x8130 then the value of money will be 0x8130 after theassignment. To dereference the pointer, an asterisk is used again:

*money = 8;

This means take the contents of money (which is 0x8130), "locate" that address inmemory and set its value to 8. If a is later accessed again, its new value will be 8.

This example may be more clear if memory is examined directly. Assume that a islocated at address 0x8130 in memory and money at 0x8134; also assume this is a32-bit machine such that an int is 32-bits wide. The following is what would be inmemory after the following code snippet is executed

int a = 5;int *money = NULL;

Address Contents0x8130 0x00000005

0x8134 0x00000000

(The NULL pointer shown here is 0x00000000.) By assigning the address of a tomoney

money = &a;

yields the following memory values

Address Contents


7 de 24 03/02/10 1



0x8130 0x00000005

0x8134 0x00008130

Then by dereferencing money by coding

*money = 8;

the computer will take the contents of money (which is 0x8130), 'locate' thataddress, and assign 8 to that location yielding the following memory.

Address Contents

0x8130 0x00000008

0x8134 0x00008130

Clearly, accessing a will yield the value of 8 because the previous instructionmodified the contents of a by way of the pointer money.

C arrays

In C, array indexing is formally defined in terms of pointer arithmetic; that is,the language specification requires that array[i] be equivalent to *(array +

i).[2] Thus in C, arrays can be thought of as pointers to consecutive areas of

memory (with no gaps),[2]

and the syntax for accessing arrays is identical forthat which can be used to dereference pointers. For example, an array array canbe declared and used in the following manner:

int array[5]; /* Declares 5 contiguous (per Plauger Standard C int *ptr = array; /* Arrays can be used as pointers */ ptr[0] = 1; /* Pointers can be indexed with array syntax */ *(array + 1) = 2; /* Arrays can be dereferenced with pointer syntax

*(1 + array) = 3; /* Pointer addition is commutative */ 2[array] = 4; /* Subscript operator is commutative */

This allocates a block of five integers and names the block array, which acts as apointer to the block. Another common use of pointers is to point to dynamicallyallocated memory from malloc which returns a consecutive block of memory of noless than the requested size that can be used as an array.


8 de 24 03/02/10 1



While most operators on arrays and pointers are equivalent, it is important tonote that the sizeof operator will differ. In this example, sizeof(array) willevaluate to 5*sizeof(int) (the size of the array), while sizeof(ptr) will evaluateto sizeof(int*), the size of the pointer itself.

Default values of an array can be declared like:

int array[5] = {2,4,3,1,5};

If you assume that array is located in memory starting at address 0x1000 on a32-bit little-endian machine then memory will contain the following (values arein hexadecimal, like the addresses):

0 1 2 3

1000 2 0 0 01004 4 0 0 0

1008 3 0 0 0

100C 1 0 0 0

1010 5 0 0 0

Represented here are five integers: 2, 4, 3, 1, and 5. These five integers occupy32 bits (4 bytes) each with the least-significant byte stored first (this is a little-

endian CPU architecture) and are stored consecutively starting at address0x1000.

The syntax for C with pointers is:

array means 0x1000array+1 means 0x1004 (note that the "+1" really means to add one timesthe size of an int (4 bytes) not literally "plus one")*array means to dereference the contents of array. Considering thecontents as a memory address (0x1000) , look up the value at that location(0x0002).

array[i] means the ith index of array which is translated into *(array + i)

The last example is how to access the contents of array. Breaking it down:

array + i is the memory location of the ith

element of array*(array + i) takes that memory address and dereferences it to access the value.


9 de 24 03/02/10 1



E.g. array[3] is synonymous with *(array+3), meaning *(0x1000 +3*sizeof(int)), which says "dereference the value stored at 0x100C", in thiscase 0x0001.

C linked list

Below is an example of the definition of a linked list in C.

/* the empty linked list is* represented by NULL or some * other signal value */ #define EMPTY_LIST NULL struct link { /* the data of this link */ void *data;

/* the next link; EMPTY_LIST if this is the last link */ struct link *next;};

Note that this pointer-recursive definition is essentially the same as thereference-recursive definition from the Haskell programming language:

data Link a = Nil

| Cons a (Link a)

Nil is the empty list, and Cons a (Link a) is a cons cell of type a with anotherlink also of type a.

The definition with references, however, is type-checked and does not usepotentially confusing signal values. For this reason, data structures in C areusually dealt with via wrapper functions, which are carefully checked forcorrectness.

Pass-by-address using pointers

Pointers can be used to pass variables by their address, allowing their value tobe changed. For example:


0 de 24 03/02/10 1



/* a copy of the int n is changed */ void not_alter(int n) {

n = 360;}

/* the actual variable passed (by address) is changed */ void alter(int *n) { *n = 120;} void func(void) { int x = 24;

/*pass x's address as the argument*/ alter(&x);

/* x now equal to 120 */

not_alter(x);

/* x still equal to 120 */ }

Memory-mapped hardware

On some computing architectures, pointers can be used to directly manipulatememory or memory-mapped devices.

Assigning addresses to pointers is an invaluable tool when programmingmicrocontrollers. Below is a simple example declaring a pointer of type int andinitialising it to a hexadecimal address in this example the constant 0x7FFF:

int *hardware_address = (int *)0x7FFF;

In the mid 80s, using the BIOS to access the video capabilities of PCs was slow.

Applications that were display-intensive typically used to access CGA videomemory directly by casting the hexadecimal constant 0xB8000000 to a pointer toan array of 80 unsigned 16-bit int values. Each value consisted of an ASCII codein the low byte, and a colour in the high byte. Thus, to put the letter 'A' at row 5,column 2 in bright white on blue, one would write code like the following:


1 de 24 03/02/10 1



#define VID ((unsigned short (*)[80])0xB8000000) void foo() {

VID[4][1] = 0x1F00 | 'A';}

Typed pointers and casting

In many languages, pointers have the additional restriction that the object theypoint to has a specific type. For example, a pointer may be declared to point toan integer; the language will then attempt to prevent the programmer frompointing it to objects which are not integers, such as floating-point numbers,eliminating some errors.

For example, in C

int *money;char *bags;

money would be an integer pointer and bags would be a char pointer. Thefollowing would yield a compiler warning of "assignment from incompatiblepointer type" under GCC

bags = money;

because money and bags were declared with different types. To suppress thecompiler warning, it must be made explicit that you do indeed wish to make theassignment by typecasting it

bags = (char *)money;

which says to cast the integer pointer of money to a char pointer and assign tobags.

A 2005 draft of the C standard requires that casting a pointer derived from onetype to one of another type should maintain the alignment correctness for both

types (6.3.2.3 Pointers, par. 7):[3]


2 de 24 03/02/10 1



char *external_buffer = "abcdef";int *internal_data; internal_data = (int *)external_buffer; // UNDEFINED BEHAVIOUR if " // is not correctly aligned

In languages that allow pointer arithmetic, arithmetic on pointers takes intoaccount the size of the type. For example, adding an integer number to a pointerproduces another pointer that points to an address that is higher by that numbertimes the size of the type. This allows us to easily compute the address of elements of an array of a given type, as was shown in the C arrays exampleabove. When a pointer of one type is cast to another type of a different size, theprogrammer should expect that pointer arithmetic will be calculated differently.In C, for example, if the money array starts at 0x2000 and sizeof(int) is 4 byteswhereas sizeof(char) is 1 bytes, then (money+1) will point to 0x2004 but

(bags+1) will point to 0x2001. Other risks of casting include loss of data when"wide" data is written to "narrow" locations (e.g. bags[0]=65537;), unexpectedresults when bit-shifting values, and comparison problems, especially withsigned vs unsigned values.

Although it is impossible in general to determine at compile-time which casts aresafe, some languages store run-time type information which can be used toconfirm that these dangerous casts are valid at runtime. Other languages merelyaccept a conservative approximation of safe casts, or none at all.

Making pointers safer

Because pointers allow a program to access objects that are not explicitlydeclared beforehand, they enable a variety of programming errors. However, thepower they provide is so great that it can be difficult to do some programmingtasks without them. To help deal with their problems, many languages havecreated objects that have some of the useful features of pointers, while avoidingsome of their pitfalls.

One major problem with pointers is that as long as they can be directlymanipulated as a number, they can be made to point to unused addresses or to

data which is being used for other purposes. Many languages, including mostfunctional programming languages and recent imperative languages like Java,replace pointers with a more opaque type of reference, typically referred to assimply a reference , which can only be used to refer to objects and notmanipulated as numbers, preventing this type of error. Array indexing ishandled as a special case.

A pointer which does not have any address assigned to it is called a wild pointer.


3 de 24 03/02/10 1



Any attempt to use such uninitialized pointers can cause unexpected behaviour,either because the initial value is not a valid address, or because using it maydamage the runtime system and other unrelated parts of the program. The resultis often a segmentation fault or storage violation.

In systems with explicit memory allocation, it is possible to create a danglingpointer by deallocating the memory region it points into. This type of pointer isdangerous and subtle because a deallocated memory region may contain thesame data as it did before it was deallocated but may be then reallocated andoverwritten by unrelated code, unknown to the earlier code. It is claimed thatlanguages with garbage collection prevent this type of error (becausedeallocation is performed automatically) but the pointer itself is not removed bythe garbage collector and it may point to irrelevant and unpredictable data if re-used at any time after it has been deallocated.

Some languages, like C++, support smart pointers, which use a simple form of reference counting to help track allocation of dynamic memory in addition to

acting as a reference. In the absence of reference cycles, where an object refersto itself indirectly through a sequence of smart pointers, these eliminate thepossibility of dangling pointers and memory leaks. Delphi strings supportreference counting natively.

Null pointer

A null pointer has a reserved value, often but not necessarily the value zero,indicating that it refers to no object. Null pointers are used routinely,particularly in C and C++ where the compile-time constant NULL is used (though

the integer literal for zero is preferred in C++), to represent conditions such asthe lack of a successor to the last element of a linked list, while maintaining aconsistent structure for the list nodes. This use of null pointers can be comparedto nullable types, null values in relational databases and to the Nothing value inan option type. Nullable references were invented by C.A.R. Hoare in 1965 aspart of the Algol W language. Hoare later (2009) described his invention as a

"billion-dollar mistake":[4][5]

I call it my billion-dollar mistake. It was the invention of the null reference in1965. At that time, I was designing the first comprehensive type system forreferences in an object oriented language (ALGOL W). My goal was to ensure

that all use of references should be absolutely safe, with checking performedautomatically by the compiler. But I couldn't resist the temptation to put in a

null reference, simply because it was so easy to implement. This has led toinnumerable errors, vulnerabilities, and system crashes, which have probablycaused a billion dollars of pain and damage in the last forty years.

Since a null-valued pointer does not refer to a meaningful object, an attempt todereference a null pointer usually causes a run-time error. If this error is left


4 de 24 03/02/10 1



unhandled, the program terminates immediately. In the case of C, executionhalts with a segmentation fault because the literal address of NULL is neverallocated to a running program. In Java, access to a null reference triggers aNullPointerException (http://java.sun.com/javase/6/docs/api/java/lang/NullPointerException.html) , which can be caught by error handlingcode, but the preferred practice is to ensure that such exceptions never occur.

In safe languages a possibly-null pointer can be replaced with a tagged unionwhich enforces explicit handling of the exceptional case; in fact, a possibly-nullpointer can be seen as a tagged pointer with a computed tag. Other languages,such as Objective-C, allow messages to be sent to a nil address (the value of apointer that does not point to a valid object) without causing the program to beinterrupted; the message will simply be ignored, and the return value (if any) is

nil or 0, depending on the type[6].

In C and C++ programming, two null pointers are guaranteed to compare equal; ANSI C guarantees that any null pointer will be equal to 0 in a comparison withan integer type; furthermore the macro NULL is defined as a null pointer

constant, that is value 0 (either as an integer type or converted to a pointer tovoid), so a null pointer will compare equal to NULL.

A null pointer should not be confused with an uninitialized pointer: a nullpointer is guaranteed to compare unequal to any valid pointer, whereasdepending on the language and implementation an uninitialized pointer mighthave either an indeterminate (random or meaningless) value or might beinitialised to an initial constant (possibly but not necessarily NULL).

In most C programming environments malloc returns a null pointer if it is unableto allocate the memory region requested, which notifies the caller that there is

insufficient memory available. However, some implementations of malloc allowmalloc(0) with the return of a null pointer and instead indicate failure by bothreturning a null pointer and setting errno to an appropriate value.

Computer systems based on a tagged architecture are able to distinguish inhardware between a NULL dereference and a legitimate attempt to access aword or structure at address zero.

In some programming language environments (at least one proprietary Lispimplementation, for example) the value used as the null pointer (called nil inLisp) may actually be a pointer to a block of internal data useful to the

implementation (but not explicitly reachable from user programs), thus allowingthe same register to be used as a useful constant and a quick way of accessingimplementation internals. This is known as the nil vector .

Double indirection


5 de 24 03/02/10 1



In some languages a pointer can reference another pointer, requiring twodereference operations to get to the original value. While each level of indirection may add a performance cost, it is sometimes necessary in order toprovide correct behavior for complex data structures. For example, in C it istypical to define a linked list in terms of an element that contains a pointer to thenext element of the list:

struct element{ struct element * next; int value;}; struct element * head = NULL;

This implementation uses a pointer to the first element in the list as a surrogatefor the entire list. If a new value is added to the beginning of the list, head has bechanged to point to the new element. Since C arguments are always passed by value, using double indirection allows the insertion to be implemented correctly,and has the desirable side-effect of eliminating special case code to deal withinsertions at the front of the list:

// Given a sorted list at *head, insert the element item at the firs// location where all earlier elements have lesser or equal value.void insert(struct element **head, struct element *item)

{ struct element ** p; // p points to a pointer to an element

for (p = head; *p != NULL; p = &(*p)->next) { if (item->value <= (*p)->value) break; }

item->next = *p; *p = item;}

// Caller does this:insert(&head, item);

In this case, if the value of item is less than that of head, the caller's head isproperly updated to the address of the new item.


6 de 24 03/02/10 1



Wild pointers

Wild pointers are pointers that have not been initialized (that is, a wild pointerdoes not have any address assigned to it) and may make a program crash orbehave oddly. In the Pascal or C programming languages, pointers that are notspecifically initialized may point to unpredictable addresses in memory.

The following example code shows a wild pointer:

int func(void){ char *p1 = malloc(sizeof(char)); /* (undefined) value of some pl char *p2; /* wild (uninitialized) pointer */ *p1 = 'a'; /* This is OK, assuming malloc() has not returne

*p2 = 'b'; /* This invokes undefined behavior */ }

Here, p2 may point to anywhere in memory, so performing the assignment *p2 ='b' will corrupt an unknown area of memory that may contain sensitive data.

Wild branch

Where a pointer is used as the address of the entry point to a program or start of a subroutine and is also either uninitialized or corrupted, if a call or jump isnevertheless made to this address, a "wild branch" is said to have occurred. The

consequences are usually unpredictable and the error may present itself inseveral different ways depending upon whether or not the pointer is a "valid"address and whether or not there is (coincidentally) a valid instruction (/opcode)at that address. The detection of a wild branch can present one of the mostdifficult and frustrating debugging exercises since much of the evidence mayalready have been destroyed beforehand or by execution of one or moreinappropriate instructions at the branch location. If available, an instruction setsimulator can usually not only detect a wild branch before it takes effect, butalso provide a complete or partial trace of its history.

Simulation using an array index It is possible to simulate pointer behaviour using an index to an (normallyone-dimensional) array.

Primarily for languages which do not support pointers explicitly but do supportarrays, the array can be thought of and processed as if it were the entire memoryrange (within the scope of the particular array) and any index to it can be


7 de 24 03/02/10 1



thought of as equivalent to a general purpose register in assembly language(that points to the individual bytes but whose actual value is relative to the startof the array, not its absolute address in memory). Assuming the array is, say, acontiguous 16 megabyte character data structure, individual bytes (or a stringof contiguous bytes within the array) can be directly addressed and manipulatedusing the name of the array with a 31 bit unsigned integer as the simulated

pointer (this is quite similar to the C arrays example shown above). Pointerarithmetic can be simulated by adding or subtracting from the index, withminimal additional overhead compared to genuine pointer arithmetic.

It is even theoretically possible, using the above technique, together with asuitable instruction set simulator to simulate any machine code or theintermediate (byte code) of any processor/language in another language thatdoes not support pointers at all (for example java / javascript). To achieve this,the binary code can initially be loaded into contiguous bytes of the array for thesimulator to "read", interpret and action entirely within the memory contained of the same array. If necessary, to completely avoid buffer overflow problems,

bounds checking can usually be actioned for the compiler (or if not, hand codedin the simulator).

Support in various programming languages

Ada

Ada is a strongly typed language where all pointers are typed and only safe typeconversions are permitted. All pointers are by default initialized to null, and anyattempt to access data through a null pointer causes an exception to be raised.

Pointers in Ada are called access types . Ada 83 did not permit arithmetic onaccess types (although many compiler vendors provided for it as a non-standardfeature), but Ada 95 supports “safe” arithmetic on access types via the packageSystem.Storage_Elements.

BASIC

Originally, BASIC did not support pointers. Newer dialects of BASIC, such asFreeBASIC or BlitzMax, have exhaustive pointer implementations, however.

In FreeBASIC, maths on ANY pointers (equivalent to C's void*) are treated asthough the ANY pointer was a byte width. ANY pointers cannot be dereferenced,as in C. Also, casting between ANY and any other type's pointers will not generateany warnings.


8 de 24 03/02/10 1



dim as integer f = 257dim as any ptr g = @fdim as integer ptr i = gassert(*i = 257)assert( (g + 4) = (@f + 1) )

C and C++

In C and C++ pointers are variables that store addresses and can be null . Eachpointer has a type it points to, but one can freely cast between pointer types,although the behaviour is implementation-defined. A special pointer type calledthe “void pointer” allows pointing to any variable type, but is limited by the factthat it cannot be dereferenced directly. The address can be directly manipulatedby casting a pointer to and from an integral type of sufficient size. While earlierstandards did not have an integral type that was guaranteed to be large enough,

C99 has added the uintptr_t type defined in <stdint.h>.

C++ fully supports C pointers and C typecasting. It also supports a new group of typecasting operators to help catch some unintended dangerous casts atcompile-time. The C++ standard library also provides auto ptr, a sort of smartpointer which can be used in some situations as a safe alternative to primitive Cpointers. C++ also supports another form of reference, quite different from apointer, called simply a reference or reference type .

Pointer arithmetic, that is, the ability to modify a pointer's target address witharithmetic operations (as well as magnitude comparisons), is restricted by the

language standard to remain within the bounds of a single array object (or justafter it), though many non-segmented architectures will allow for more lenientarithmetic. Adding or subtracting from a pointer moves it by a multiple of thesize of the datatype it points to. For example, adding 1 to a pointer to 4-byteinteger values will increment the pointer by 4. This has the effect of incrementing the pointer to point at the next element in a contiguous array of integers—which is often the intended result. Pointer arithmetic cannot be performed on void pointers because the void type has no size, and thus thepointed address can not be added to, although gcc and other compilers willperform byte arithmetic on void* as a non-standard extension. For working'directly' with bytes they usually cast pointers to BYTE*, or unsigned char* if BYTE isn't defined in the standard library used.

Pointer arithmetic provides the programmer with a single way of dealing withdifferent types: adding and subtracting the number of elements required insteadof the actual offset in bytes. (though the char pointer , char being defined asalways having a size of one byte, allows the element offset of pointer arithmeticto in practice be equal to a byte offset) In particular, the C definition explicitly


9 de 24 03/02/10 1



declares that the syntax a[n], which is the n-th element of the array a, isequivalent to *(a+n), which is the content of the element pointed by a+n. Thisimplies that n[a] is equivalent to a[n].

While powerful, pointer arithmetic can be a source of computer bugs. It tends toconfuse novice programmers, forcing them into different contexts: an expressioncan be an ordinary arithmetic one or a pointer arithmetic one, and sometimes itis easy to mistake one for the other. In response to this, many modern high levelcomputer languages (for example Java) do not permit direct access to memoryusing addresses. Also, the safe C dialect Cyclone addresses many of the issueswith pointers. See C programming language for more criticism.

The void pointer , or void*, is supported in ANSI C and C++ as a genericpointer type. A pointer to void can store an address to any data type, and, in C,is implicitly converted to any other pointer type on assignment, but it must beexplicitly cast if dereferenced inline. K&R C used char* for the “type-agnosticpointer” purpose (before ANSI C).

int x = 4;void* q = &x;int* p = q; /* void* implicity converted to int*: valid C, but not int i = *p;int j = *(int*)q; /* when dereferencing inline, there is no implicit

C++ does not allow the implicit conversion of void* to other pointer types, noteven in assignments. This was a design decision to avoid careless and even

unintended casts, though most compilers only output warnings, not errors, whenencountering other ill casts.

int x = 4;void* q = &x;// int* p = q; This fails in C++: there is no implicit conversion frint* a = (int*)q; // C-style castint* b = static_cast<int*>(q); // C++ cast

In C++, there is no void& (reference to void) to complement void* (pointer to void), because references behave like aliases to the variables they point to, andthere can never be a variable whose type is void.

C#

In the C# programming language, pointers are supported only under certainconditions: any block of code including pointers must be marked with the unsafe


20 de 24 03/02/10 1



keyword. Such blocks usually require higher security permissions thanpointerless code to be allowed to run. The syntax is essentially the same as inC++, and the address pointed can be either managed or unmanaged memory.However, pointers to managed memory (any pointer to a managed object) mustbe declared using the fixed keyword, which prevents the garbage collector frommoving the pointed object as part of memory management while the pointer is in

scope, thus keeping the pointer address valid.

An exception to this is from using the IntPtr structure, which is a safe managedequivalent to int*, and does not require unsafe code. This type is often returnedwhen using methods from the System.Runtime.InteropServices, for example:

// Get 16 bytes of memory from the process' unmanaged memory IntPtr pointer = System.Runtime.InteropServices.Marshal.AllocHGlobal // Do something with the allocated memory

// Free the allocated memory System.Runtime.InteropServices.Marshal.FreeHGlobal(pointer);

The .NET framework includes many classes and methods in the System andSystem.Runtime.InteropServicesnamespaces (such as the Marshal class) whichconvert .NET types (for example, System.String) to and from many unmanagedtypes and pointers (for example, LPWSTR or void *) to allow communication withunmanaged code.

D

The D programming language is a derivative of C and C++ which fully supportsC pointers and C typecasting. However D also offers numerous constructs suchas foreach loops, out function parameters, reference types, and advanced arrayhandling which replace pointers for most routine programming tasks.

Eiffel

The Eiffel object-oriented language supports pointers in the form of references,

which are typed and do not allow any form of pointer arithmetic. The ECMA standard for Eiffel includes an "attached type" mechanism that claims toguarantee void safety.

Fortran

Fortran-90 introduced a strongly-typed pointer capability. Fortran pointerscontain more than just a simple memory address. They also encapsulate the


21 de 24 03/02/10 1



lower and upper bounds of array dimensions, strides (for example, to supportarbitrary array sections), and other metadata. An association operator , => is usedto associate a POINTER to a variable which has a TARGET attribute. The Fortran-90ALLOCATE statement may also be used to associate a pointer to a block of memory.For example, the following code might be used to define and create a linked liststructure:

type real_list_t real :: sample_data(100) type (real_list_t), pointer :: next => null ()end type type (real_list_t), target :: my_real_listtype (real_list_t), pointer :: real_list_temp real_list_temp => my_real_list

doread (1,iostat=ioerr) real_list_temp%sample_data

if (ioerr /= 0) exit allocate (real_list_temp%next)real_list_temp => real_list_temp%next

end do

Fortran-2003 adds support for procedure pointers. Also, as part of the C Interoperability feature, Fortran-2003 supports intrinsic functions for convertingC-style pointers into Fortran pointers and back.

Modula-2

Pointers are implemented very much as in Pascal, as are VAR parameters inprocedure calls. Modula 2 is even more strongly typed than Pascal, with fewerways to escape the type system. Some of the variants of Modula 2 (such asModula-3) include garbage collection.

Oberon

Much as with Modula-2, pointers are available. There are still fewer ways toevade the type system and so Oberon and its variants are still safer with respectto pointers than Modula-2 or its variants. As with Modula-3, garbage collection isa part of the language specification.

Pascal


22 de 24 03/02/10 1



Pascal implements pointers in a straightforward, limited, and relatively safe way.It catches mistakes such as dereferencing a pointer into the wrong datatype;however, a pointer can be cast from one pointer type to another.

Pointer arithmetic is unrestricted; adding or subtracting from a pointer moves itby that number of bytes in either direction, but using the Inc or Dec standardprocedures with it moves the pointer by the size of the datatype it is declared topoint to.

Trying to dereference a null pointer, named nil in Pascal, or a pointerreferencing unallocated memory, raises an exception in protected mode.Parameters may be passed using pointers (as var parameters) but areautomatically handled by the static compilation system.

See also

Address constant, a pointer in IBM/360 / Z/Architecture AssemblerlanguageBounded pointerBuffer overflowHazard pointerOpaque pointerPointer swizzlingReference (computer science)Static code analysisStorage violation Variable (programming)

References

^ http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf 1.

^ a b Plauger, P J; Brodie, Jim. ANSI and ISO Standard C Programmer's Reference .

Redmond, WA: Microsoft Press. pp. 108, 51. ISBN 1556153597. "An array type does

not contain additional holes because all other types pack tightly when composed into

arrays [at page 51] "

2.

^ WG14 N1124 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf) , C -

Approved standards: ISO/IEC 9899 - Programming languages - C (http://www.open-

std.org/jtc1/sc22/wg14/www/standards.html) , 2005-05-06.

3.

^ Tony Hoare (2009). "Null References: The Billion Dollar Mistake"

(http://qconlondon.com/london-2009/presentation

/Null+References:+The+Billion+Dollar+Mistake) . QCon London.

http://qconlondon.com/london-2009/presentation

/Null+References:+The+Billion+Dollar+Mistake.

4.

^ Tony Hoare (2009-08-25). "Null References: The Billion Dollar Mistake"5.


23 de 24 03/02/10 1



(http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-

Tony-Hoare) . InfoQ.com. http://www.infoq.com/presentations/Null-References-

The-Billion-Dollar-Mistake-Tony-Hoare.

^ http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/Articles

/ocObjectsClasses.html#//apple_ref/doc/uid/TP30001163-CH11-SW7 - The Objective-C

2.0 Programming Language , section "Sending Messages to nil".

6.

External links

Pointers and Memory (http://cslibrary.stanford.edu/) Introduction topointers – Stanford Computer Science Education Library0pointer.de (http://0pointer.de/) A terse list of minimum length sourcecodes that dereference a null pointer in several different programminglanguagesTutorial: C++ pointers by Alf P. Steinbach (http://alfps.izfree.com/tutorials/pointers/) This webpage contains a 171 pages pdf file discussing pointers

in C++"The C book" - containing pointer examples in ANSI C(http://publications.gbdirect.co.uk/c_book/chapter5/pointers.html)

Retrieved from "http://en.wikipedia.org/wiki/Pointer_(computing)"Categories: Articles with example C code | Data types | Primitive types

This page was last modified on 24 January 2010 at 07:22.Text is available under the Creative Commons Attribution-ShareAlikeLicense; additional terms may apply. See Terms of Use for details.Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a

non-profit organization.


pointers and null pointer

Documents