Main Page | See live article | Alphabetical index

Array

An array, also known as a vector or list, is one of the simplest data structures in computer programming. Arrays hold a fixed number of equally-sized data elements, generally of the same data type. Individual elements are accessed by index using a consecutive range of integers, as opposed to an associative array. Most programming languages have arrays as a built-in data type. Some arrays are multi-dimensional, meaning they are indexed by a fixed number of integers, for example by a tuple of two integers; one- and two-dimensional arrays are the most common.

Table of contents
1 Advantages and Disadvantages
2 Uses
3 Representations of Multi-dimensional Arrays
4 External Links

Advantages and Disadvantages

Arrays permit efficient (constant time, O(1)) random access but not efficient insertion and deletion of elements (which are O(n), where n is the size of the array). This is in contrast to linked lists, which have the opposite trade-off. Consequently, arrays are most appropriate for storing a fixed amount of data which will be accessed in an unpredictable fashion, and linked lists are best for a list of data which will be accessed sequentially and updated often.

Another advantage of arrays that has become very important on modern architectures is that iterating through an array has good locality of reference, and so is much faster than iterating through (say) a linked list of the same size, which tends to jump around in memory. However, an array can also be accessed in a random way, as is done with large hash tables, and in this case this is not a benefit.

Arrays also are among the most compact data structures; if you store 100 integers in an array, it takes only as much space as the 100 integers, and no more. Any pointer-based data structure, on the other hand, must keep its pointers somewhere, and these occupy space. This effect is magnified the smaller the data elements become; for example, an array of 100 characterss usually takes 100 bytes, while on a 32-bit platform an aligned linked list of 100 characters takes 800 bytes, eight times as many. Conversely, for very large elements, the space difference becomes negligible.

A drawback of the simplicity of arrays is the possibility of referencing a non-existent element by using an index outside the range of valid indices. This is known as exceeding the array bounds. The result is, at best, a program working with incorrect data. At worst, the whole system can crash. This problem is particularly felt in the C programming language, the powerful syntax of which is unfortunately prone to this kind of error. Some other languages have built-in bounds checking, and will refuse to index an array outside of its permitted range.

Uses

Although useful in their own right, arrays also form the basis for several more complex data structures, such as heaps, hash tables, and vlistss, and can be used to represent strings, stacks and queues. They also play a more minor role in many other data structures. All of these applications benefit from the compactness and locality benefits of arrays.

One of the disadvantages of an array is that it has a single fixed size, and although its size can be altered in many environments, this is an expensive operation. Growable arrays or dynamic arrays are arrays which automatically perform this resizing as late as possible, when the programmer attempts to add an element to the end of the array and there is no more space. To average the high cost of resizing over a long period of time (we say it is an amortized cost), they expand by a large amount, and when the programmer attempts to expand the array again, it just uses more of this reserved space.

In the C programming language, one-dimensional character arrays are used to store null-terminated strings, so called because the end of the string is indicated with a special reserved character called a null character ('\\0') (see also C string).

Finally, in some applications where the data is the same or is missing for most values of the indexes, or for large ranges of indexes, space is saved by not storing an array at all, but having an associative array with integer keys. There are many specialized data structures specifically for this purpose, such as Patricia tries and Judy arrays. Example applications include address translation tabless and routing tables.

Representations of Multi-dimensional Arrays

Mapping a one-dimensional array into memory is obvious, since memory is logically itself a (very large) one-dimensional array. When we reach higher-dimensional arrays, however, the problem is no longer obvious. Suppose we want to represent this simple two-dimensional array:

A few common representations include:

123456789

147258369

The first two forms are more compact and have potentially better locality of reference, but are also more limiting; the arrays must be rectangular, meaning that no row or column can contain more elements than any other. Arrays of arrays, on the other hand, allow the creation of ragged arrays, in which the valid range of one index depends on the value of another, or in this case, simply that different rows can be different sizes. Arrays of arrays are also of value in programming languages that only supply one-dimensional arrays as primitives.

In many applications, such as numerical applications working with matrices, we iterate over rectangular two-dimensional arrays, especially two-dimensional arrays, in interesting ways. For example, computing an element of the matrix product AB involves iterating over a row of A and a column of B simultaneously. In mapping the individual array indexes into memory, we wish to exploit locality of reference as much as we can. A compiler can sometimes automatically choose the layout for an array so that sequentially accessed elements are stored sequentially in memory; in our example, it might choose row-major order for A, and column-major order for B. Even more exotic orderings can be used, for example if we iterate over the main diagonal of a matrix.

External Links

See also: Monge array, set, collection