In programming languages, a pointer is a datatype whose value is used to refer to ("points to") another value stored elsewhere in the computer memory. Obtaining the value that a pointer refers to is called dereferencing the pointer.
Pointers are a very thin abstraction on top of the addressing capabilities provided by most modern architectures. In the simplest scheme, an address, or a numeric index, is assigned to each unit of memory in the system, where the unit is typically either a byte or a word, effectively transforming all of memory into a very large array. Then, if we have an address, the system provides an operation to retrieve the value stored in the memory unit at that address. Pointers are datatypes which hold addresses.
In the usual case, a pointer is large enough to hold more different addresses than there are units of memory in the system. This introduces the possibility that a program may attempt to access an address which corresponds to no unit of memory, called a segmentation fault. On the other hand, some systems have more units of memory than there are addresses. In this case, a more complex scheme such as segmentation or paging is employed to use different parts of the memory at different times.
In order to provide a consistent interface, some architectures provide memory-mapped I/O, which allows some addresses to refer to units of memory while others refer to device registers of other devices in the computer. There are analogous concepts such as file offsets, array indices, and remote object references that serve some of the same purposes as addresses for other types of objects.
Pointers, which are directly supported without restrictions in C, C++, and most assembly languages, are fundamental in constructing nearly all data structures, as well as in passing data between different parts of a program.
When dealing with arrays, the critical lookup operation typically involves a stage called address calculation which involves constructing a pointer to the desired data element in the array. In other data structures, such as linked lists, pointers are explicitly used to tie one piece of the structure to another.
In many languages, pointers have the additional restriction that the object they point to has a specific type. For example, a pointer may be declared to point to an integer; the language will then attempt to prevent the programmer from pointing it to objects which are not integers, such as floating-point numbers, eliminating some errors.
However, few languages strictly enforce pointer types, because programmers often run into situations where they want to treat an object of one type as though it were of another type. For these cases, it is possible to typecast, or cast, the pointer. Some casts are always safe, while other casts are dangerous, possibly resulting in segmentation faults, overflows, or incorrect behavior later on. Some languages store run-time type information which can be used to confirm that these dangerous casts are valid at runtime.
Because pointers are so close to the hardware, they enable a variety of programming errors. However, the power they provide is so great that it's difficult to do anything useful without them. To help deal with their problems, many languages have created objects that have some of the useful features of pointers, while avoiding some of their pitfalls.
One major problem with pointers is that, as long as they can be directly manipulated as a number, they can be made to point to unused addresses or to data which is being used for other purposes. Many languages, including most functional programming languages and recent imperative languages like Java, replace pointers with references, which can only be used to refer to objects and not manipulated as numbers, preventing this type of error.
In systems with explicit memory allocation, it's possible to create a "dangling," or invalid pointer, by deallocating the memory region it points into. This type of pointer is dangerous and subtle, because a deallocated memory region looks the same as it did before, but can be reused at any time by unrelated code. Languages with garbage collection prevent this type of error.
Some languages, like C++, support smart pointers, which help track allocation of dynamic memory in addition to acting as a reference, effectively creating objects with lifetimes limited to a static scope. This helps prevent memory leaks and use of dangling pointers.
The null pointer is a pointer with a reserved value indicating that it refers to no object. Null pointers are used routinely, particularly in C, to represent exceptional conditions such as the lack of a successor to the last element of a linked list, while maintaining a consistent structure for the list nodes. This use of null pointers can be compared to the use of null values in relational databases. Because it refers to nothing, an attempt to dereference a null pointer causes a run-time error that usually terminates the programming immediately. In safe languages a possibly null pointer can be replaced with a tagged union which enforces explicit handling of the exceptional case.
In C, pointers are variables that store addresses and can be null. Each pointer has a type it points to, but one can freely cast between pointer types. A special pointer type called the void pointer points to an object of unknown type. The address can be directly manipulated by casting a pointer to and from an integer. Pointer arithmetic is unrestricted; adding or subtracting from a pointer moves it by a multiple of the size of the datatype it points to.
C++ is a derivative of C which fully supports C pointers and C typecasting. It also supports a new group of typecasting operators to help catch some unintended dangerous casts at compile-time. The C++ standard library also provides autoptr, a sort of smart pointer which can be used in some situations as a safe alternative to primitive C pointers.
Ada is a strongly typed language where all pointers are typed and only safe type conversions are permitted. All pointers are by default initialized to null, and any attempt to access data through a null pointer causes an exception to be raised. Pointers in Ada are called access types. Ada-83 did not permit arithmetic on access types (although many compiler vendors provided for it as a non-standard feature), but Ada-95 supports "safe" arithmetic on access types via the package System.Storage_Elements.
Architectural roots
Uses
Typed pointers and casting
Making pointers safer
The null pointer
Pointer support in various programming languages
C
C++
Ada
External Links