Machine language

A system of codes directly understandable by a computer's CPU is termed this CPU's native or machine language. Although machine code may seem similar to assembly language, they are two different kinds of language: assembly code is written with numbers and simple words, whereas machine code is composed only of the two binary digits 0 and 1. Every CPU has its own machine language, although some overlap considerably. If CPU A understands the full language of CPU B, A is said to be compatible with B. The reverse need not hold: CPU B may not be compatible with CPU A, as A may know a few codes that B does not.
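
The compatibility relation described above can be sketched as a superset test on sets of instruction codes. The two instruction sets and the function name is_compatible are hypothetical, chosen for illustration only; no real CPU is modelled.

```python
# Hypothetical instruction sets, for illustration only.
cpu_a = {"add", "sub", "load", "store", "jump", "mul"}
cpu_b = {"add", "sub", "load", "store", "jump"}


def is_compatible(a, b):
    """Return True if a CPU that understands `a` understands all of `b`."""
    return a >= b  # set superset test


a_runs_b = is_compatible(cpu_a, cpu_b)  # True: A knows every code B knows
b_runs_a = is_compatible(cpu_b, cpu_a)  # False: B lacks "mul"
```

The relation is not symmetric: swapping the arguments changes the answer whenever one CPU knows codes the other lacks.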

The "words" of a machine language are called instructions; each of these gives a basic command to the CPU. A program is just a long list of instructions that are executed by a CPU. Older processors executed instructions one after the other, but newer superscalar processors are capable of executing several instructions at once. Program flow may be influenced by special jump instructions that transfer execution to an instruction other than the following one. Conditional jumps are taken (execution continues at another address) or not (execution continues at the next instruction) depending on some condition.
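
Sequential execution and jumps can be illustrated with a toy interpreter. This is a minimal sketch, not a model of any real CPU: the three opcodes (addi, jump, jump_if_nonzero), the register names, and the tuple format are all invented for the example.

```python
def run(program, registers):
    """Execute a list of (opcode, *operands) tuples; return the registers."""
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1  # by default, continue with the following instruction
        if op == "addi":                # add a constant to a register
            dst, value = args
            registers[dst] += value
        elif op == "jump":              # unconditional jump
            pc = args[0]
        elif op == "jump_if_nonzero":   # conditional jump
            reg, target = args
            if registers[reg] != 0:
                pc = target             # taken: continue at another address
            # not taken: pc already points at the next instruction
        else:
            raise ValueError(f"unknown opcode: {op}")
    return registers


# Loop three times, adding 10 to r1 on every pass.
program = [
    ("addi", "r1", 10),            # address 0
    ("addi", "r0", -1),            # address 1
    ("jump_if_nonzero", "r0", 0),  # address 2: loop back while r0 != 0
]
result = run(program, {"r0": 3, "r1": 0})
print(result)  # {'r0': 0, 'r1': 30}
```

The conditional jump is taken twice and falls through on the third pass, at which point the program counter runs off the end of the list and execution stops.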

An instruction is simply a pattern of bits; different patterns correspond to different commands to the machine. The more readable rendition of a machine language is called assembly language.

Some languages give all their instructions the same number of bits, while the instruction length differs in others. How the patterns are organised depends largely on the specific language. Common to most is the division of an instruction into fields, of which one or more specify the exact operation (for example "add"). Other fields may give the type of the operands, their location, or their value directly (operands contained in an instruction are called immediate).
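
Extracting a field from an instruction usually amounts to a shift and a mask. The sketch below assumes a made-up 16-bit format with a 4-bit opcode, a 4-bit register number, and an 8-bit immediate; the layout and the helper name extract_field are hypothetical.

```python
def extract_field(word, low_bit, width):
    """Return the `width`-bit field of `word` whose lowest bit is `low_bit`."""
    mask = (1 << width) - 1
    return (word >> low_bit) & mask


# Hypothetical 16-bit layout: [ opcode:4 | register:4 | immediate:8 ]
word = 0b0011_0101_00101010
opcode = extract_field(word, 12, 4)    # 3
register = extract_field(word, 8, 4)   # 5
immediate = extract_field(word, 0, 8)  # 42, stored directly in the instruction
```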

As a specific example, let us take a look at the MIPS architecture. Its instructions are always 32 bits long. The general type of instruction is given by the op field, the highest 6 bits. J-type and I-type instructions are fully specified by op. R-type instructions include an additional field, funct, to determine the exact operation. The fields used in these types are:

   6      5     5     5     5      6 bits
[  op  |  rs |  rt |  rd |shamt| funct]  R-type
[  op  |  rs |  rt | address/immediate]  I-type
[  op  |        target address        ]  J-type

rs, rt, and rd indicate register operands; shamt gives a shift amount; and the address or immediate fields contain an operand directly.
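
The three layouts can be turned into a small decoder. This is a sketch under stated assumptions rather than a complete MIPS decoder: it assumes that op 0 marks R-type instructions and that op 2 and 3 are the J-type opcodes, and it treats every other opcode as I-type. The function name decode and the dictionary keys are chosen for the example.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into named fields."""
    op = (word >> 26) & 0x3F  # highest 6 bits
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    if op == 0:  # assumption: op 0 marks an R-type instruction
        return {"type": "R", "op": op, "rs": rs, "rt": rt,
                "rd": (word >> 11) & 0x1F,
                "shamt": (word >> 6) & 0x1F,
                "funct": word & 0x3F}
    if op in (2, 3):  # assumption: these are the J-type opcodes
        return {"type": "J", "op": op, "target": word & 0x03FFFFFF}
    # Simplification: every other opcode is treated as I-type.
    return {"type": "I", "op": op, "rs": rs, "rt": rt,
            "immediate": word & 0xFFFF}


fields = decode(0x00223020)
print(fields["rs"], fields["rt"], fields["rd"], fields["funct"])  # 1 2 6 32
```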

For example, adding registers 1 and 2 and placing the result in register 6 is encoded as:

    
[  op  |  rs |  rt |  rd |shamt| funct]
    0     1     2     6     0     32     decimal
 000000 00001 00010 00110 00000 100000   binary
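
The same encoding can be reproduced by packing the six fields from left to right. The helper name encode_r is an assumption of this sketch; the field widths come from the R-type layout.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack R-type fields into one 32-bit word, highest field first."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value  # make room, then append the field
    return word


word = encode_r(op=0, rs=1, rt=2, rd=6, shamt=0, funct=32)
print(f"{word:032b}")   # 00000000001000100011000000100000
print(f"0x{word:08X}")  # 0x00223020
```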

Loading into register 8 the value from the memory cell 68 cells after the one register 3 points to:

[  op  |  rs |  rt | address/immediate]
   35     3     8           68           decimal
 100011 00011 01000 0000000001000100     binary
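
An I-type word can be packed the same way. The address/immediate field is 16 bits wide, which is what remains of the 32 bits after op, rs, and rt. The helper name encode_i is an assumption of this sketch, and it takes the raw 16-bit field value without modelling how the hardware interprets it.

```python
def encode_i(op, rs, rt, immediate):
    """Pack I-type fields into one 32-bit word, highest field first."""
    fields = [(op, 6), (rs, 5), (rt, 5), (immediate, 16)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value
    return word


word = encode_i(op=35, rs=3, rt=8, immediate=68)
print(f"{word:032b}")   # 10001100011010000000000001000100
print(f"0x{word:08X}")  # 0x8C680044
```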

Jumping to the address 1025:

[  op  |        target address        ]
    2                 1025               decimal
 000010 00000000000000010000000001       binary
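
A J-type word has only two fields: the 6-bit op and a 26-bit target address. The sketch below places the value 1025 in the target field as-is, following the example above; how the processor turns that field into a full address is not modelled. The helper name encode_j is an assumption.

```python
def encode_j(op, target):
    """Pack a 6-bit op and a 26-bit target field into one 32-bit word."""
    if not 0 <= op < (1 << 6):
        raise ValueError(f"op {op} does not fit in 6 bits")
    if not 0 <= target < (1 << 26):
        raise ValueError(f"target {target} does not fit in 26 bits")
    return (op << 26) | target


word = encode_j(op=2, target=1025)
print(f"{word:032b}")   # 00001000000000000000010000000001
print(f"0x{word:08X}")  # 0x08000401
```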

See also

CISC, RISC, VLIW, Endianness.

Further reading

Patterson and Hennessy: Computer Organization and Design. The Hardware/Software Interface. Morgan Kaufmann Publishers. ISBN 1-55860-281-X