This seems to clearly indicate a bug in the system, and it's an even bigger shock to find out that, no, that's just the way it happens to work (except in computer algebra systems). This document explains such issues in detail.
Nearly all computer users understand the concept of a "bit", or in computer terms, a 1 or 0 value encoded by the setting of a switch. It's not much more difficult to see that if you take two bits, you can use them to represent four unique states:
Actually, in some cases 4 bits is a convenient number of bits to deal with, and this collection of bits is called, somewhat painfully, the "nybble". In this document, we will refer to "nybbles" often, but please remember that in reality the term "byte" is common, while the term "nybble" is not.
A nybble can encode 16 different values, such as the numbers 0 to 15. Any arbitrary sequence of bits could be used in principle, but in practice the most natural way is as follows:
Each digit in the number represents a value from 0 to 9, which is ten different possible values, and that's why it's called a decimal or "base-10" number. Each digit also has a "weight" of a power of ten proportional to its position. This sounds complicated, but it's not. It's exactly what you take for granted when you look at a number. You know it without even having to think about it.
Similarly, in the binary number encoding scheme explained above, the value 13 is encoded as:
Use of these prefixes can get a bit confusing since in some documents it can be unclear if, say, "kilo" actually means 1,024 or if it means 1,000. Often in computer discussions it means 1,024 but some writers are sloppy on this point. In any case, we'll see these prefixes often as we continue.
There is a subtlety in this discussion. If we use 16 bits, we can have 65,536 different values, but those values run from 0 to 65,535. People start counting at one; machines start counting from zero, since it's simpler from their point of view. This small and mildly confusing fact trips up even computer mechanics every now and then.
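If it helps to see the off-by-one effect concretely, here is a minimal Python sketch (the variable names are just for illustration):

    # A sketch: n bits give 2**n distinct values, numbered 0 through 2**n - 1.
    n_bits = 16
    count = 2 ** n_bits           # 65,536 different values...
    largest = 2 ** n_bits - 1     # ...but the largest of them is 65,535
    print(count, largest)         # -> 65536 65535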
Anyway, this defines a simple way to count with bits, but it has a few restrictions:
Let's take a side-trip to discuss representation of binary numbers. Computer mechanics often need to write out binary quantities, but in practice writing out a binary number like:
In a hex system, we have 16 digits (0 through 9 followed, by convention, by a through f) and we count up through the sequence as follows:
Each of these number systems are positional systems, but while decimal weights are powers of 10, the octal weights are powers of 8 and the hex weights are powers of 16. For example:
After defining unsigned binary numbers, the next step is to modify this scheme to have negative numbers, or "signed integers". The simplest thing to do would be to reserve one bit to indicate the sign of a number. This "sign bit" would probably be the leftmost bit, though it could be the rightmost for all it matters. If the sign bit is 0, the number is positive, and if the sign bit is 1 the number is negative.
This works, and it is used in a few obscure applications, but although it's the obvious solution from a human point of view, it makes life hard for machines. For example, this scheme gives a positive and negative value for zero! A human might shrug at that, but it gives a machine fits.
The more natural way, from the machine point of view, is to split the range of binary numbers for a given number of bits in half and use the top half to represent negative numbers. For example, 4-bit data gives zero, seven positive numbers, and eight negative numbers:
This has some similarities to the sign-bit scheme in that a negative number has its topmost bit set to "1", but the two concepts are different. In sign-magnitude numbers, a "-5" is:
So now we can represent unsigned and signed integers as binary quantities. Remember that these are just two ways of interpreting a pattern of bits. If a computer has a binary value in memory of, say:
Fixed-point formats are often used in business calculations (such as with spreadsheets or COBOL), where floating-point with insufficient precision is unacceptable when dealing with money. It is also helpful to study the format to see how fractions can be stored in binary.
First, we have to decide how many bits we are using to store the fractional part of a number, and how many we are using to store the integer part. Let's say that we are using a 32-bit format, with 16 bits for the integer and 16 for the fraction.
How are the fractional bits used? They continue the pattern set by the integer bits: if the eight's bit is followed by the four's bit, then the two's bit, then the one's bit, then of course the next bit is the half's bit, then the quarter's bit, then the 1/8's bit, etc.
Examples:
While both unsigned and signed integers are used in digital systems, even a 32-bit integer is not enough to cover the full range of numbers a calculator can handle, and that's not even including fractions. To obtain greater range we have to abandon signed integers and fixed-point numbers and go to a "floating-point" format.
In the decimal system, we are familiar with floating-point numbers of the form:
Bits, bytes, nybbles, and unsigned integers
00 01 10 11
If you have three bits, then you can use them to represent eight unique states: 000 001 010 011 100 101 110 111
With every bit you add, you double the number of states you can represent. Therefore, the expression for the number of states with n bits is 2^n.
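To make the doubling concrete, here is a small Python sketch (a throwaway illustration, not part of any standard) that lists every pattern for a few bit counts:

    from itertools import product

    # Enumerate all bit patterns of length n; there are 2**n of them.
    for n in (1, 2, 3):
        patterns = [''.join(bits) for bits in product('01', repeat=n)]
        print(n, len(patterns), patterns)
    # 1 2 ['0', '1']
    # 2 4 ['00', '01', '10', '11']
    # 3 8 ['000', '001', '010', '011', '100', '101', '110', '111']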
Most computers operate on information 8 bits, or some other power of 2, like 16, 32, or 64 bits, at a time. A group of 8 bits is widely used as a fundamental unit, and has been given the odd name of the "byte". A computer's processor accepts data a byte or multiples of a byte at a time. Memories are organized to store data a byte or multiples of a byte at each addressable location.
0000 = decimal 0    1000 = decimal 8
0001 = decimal 1    1001 = decimal 9
0010 = decimal 2    1010 = decimal 10
0011 = decimal 3    1011 = decimal 11
0100 = decimal 4    1100 = decimal 12
0101 = decimal 5    1101 = decimal 13
0110 = decimal 6    1110 = decimal 14
0111 = decimal 7    1111 = decimal 15
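If you want to check this table yourself, Python's built-in base conversions do the same bookkeeping; this is just a hedged illustration, not anything the text depends on:

    # Interpret a 4-bit pattern as an unsigned integer, and go back again.
    print(int('1101', 2))        # -> 13
    print(format(13, '04b'))     # -> '1101'

    # Reproduce the whole nybble table.
    for value in range(16):
        print(format(value, '04b'), '= decimal', value)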
This is natural because it follows our instinctive way of considering a normal decimal number. For example, given the decimal number:
we automatically interpret this as:
or, using powers-of-10 notation:
Note that any number to the "0th" power is 1.
Each bit can only have a value of 1 or 0, which is two values, making this a "binary", or "base-2" number. Accordingly, the "positional weighting" is as follows:
Notice the values of powers of 2 used here: 1, 2, 4, 8. People who get into the guts of computers generally get to know the powers of 2 up to the 16th power by heart, not because they memorize them but because they use them a great deal:
2^0 = 1        2^8  = 256
2^1 = 2        2^9  = 512
2^2 = 4        2^10 = 1,024
2^3 = 8        2^11 = 2,048
2^4 = 16       2^12 = 4,096
2^5 = 32       2^13 = 8,192
2^6 = 64       2^14 = 16,384
2^7 = 128      2^15 = 32,768
               2^16 = 65,536
Another thing to remember is that, aping the metric system, the value 2^10 = 1,024 is referred to as "kilo", or simply "K", so any higher powers of 2 are often conveniently referred to as multiples of that value:
2^11 = 2 K  = 2,048
2^12 = 4 K  = 4,096
2^13 = 8 K  = 8,192
2^14 = 16 K = 16,384
2^15 = 32 K = 32,768
2^16 = 64 K = 65,536
Similarly, the value 2^20 = 1,024 × 1,024 = 1,048,576 is referred to as a "meg", or simply "M":
2^21 = 2 M
2^22 = 4 M
and the value 2^30 is referred to as a "gig", or simply "G".
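A short Python sketch of the same bookkeeping, using shifts to build the powers of two (the variable names K, M, and G are just the nicknames above):

    # 1 << n is 2**n; "K", "M", and "G" are simply nicknames for powers of two.
    K = 1 << 10     # 1,024
    M = 1 << 20     # 1,048,576
    G = 1 << 30     # 1,073,741,824
    print(K, M, G)
    print(1 << 16 == 64 * K)    # -> True: 65,536 is "64 K"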
Despite these limitations, such "unsigned integer" numbers are very useful in computers, mostly for simply counting things one-by-one. They're very easy for the computer to manipulate. Generally computers use 16-bit and 32-bit unsigned integers, normally referred to as "integer" and "long integer" (or just "long") numbers. An "integer" allows for numbers from 0 to 65,535, while a "long" allows for integer numbers from 0 to 4,294,967,295.

Octal and hex numbers
Writing out a long binary number such as 1001001101010001 is a pain, and inclined to error at that. Generally computer mechanics write binary quantities in a base-8 ("octal") or, much more commonly, a base-16 ("hexadecimal" or "hex") number format.
This is another thing that sounds tricky but is actually simple. If it wasn't, there wouldn't be any point in doing it. In our normal decimal system, we have 10 digits (0 through 9) and count up as follows:
In an octal system, we only have 8 digits (0 through 7) and we count up through the same sequence of numbers as follows:
That is, an octal "10" is the same as a decimal "8", an octal "20" is a decimal 16, and so on.
That is, a hex "10" is the same as a decimal "16" and a hex "20" is the same as a decimal "32".
OK, if you don't understand that very well, don't worry about it too much. The real point is that an octal digit has a perfect correspondence to a 3-bit binary number:
Similarly, a hex digit has a perfect correspondence to a 4-bit binary number:
0000 = hex 0 1000 = hex 8
0001 = hex 1 1001 = hex 9
0010 = hex 2 1010 = hex a
0011 = hex 3 1011 = hex b
0100 = hex 4 1100 = hex c
0101 = hex 5 1101 = hex d
0110 = hex 6 1110 = hex e
0111 = hex 7 1111 = hex f
So it is easy to convert a long binary number, such as 1001001101010001, to octal: 1 001 001 101 010 001 binary = 111521 octal
and easier to convert that number to hex: 1001 0011 0101 0001 = 9351 hex
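As a quick check of these groupings, here is a hedged Python sketch using the same 16-bit value:

    # The same 16-bit pattern, rendered in octal, hex, and decimal.
    bits = 0b1001001101010001
    print(format(bits, 'o'))    # -> '111521'  (octal)
    print(format(bits, 'x'))    # -> '9351'    (hex)
    print(bits)                 # -> 37713     (decimal)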
Converting that number to decimal, on the other hand, takes a lot of figuring (it works out to 37,713 decimal). Octal and hex thus make a convenient way to represent binary "machine" quantities.

Signed integers and two's complement
0000 = decimal 0
0001 = decimal 1
0010 = decimal 2
0011 = decimal 3
0100 = decimal 4
0101 = decimal 5
0110 = decimal 6
0111 = decimal 7
1000 = decimal -8
1001 = decimal -7
1010 = decimal -6
1011 = decimal -5
1100 = decimal -4
1101 = decimal -3
1110 = decimal -2
1111 = decimal -1
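As a sanity check on this table, here is a hedged Python sketch of the interpretation rule (the helper name is my own): if the top bit of an n-bit pattern is set, subtract 2^n to get the negative value.

    def from_twos_complement(pattern, bits=4):
        """Interpret a bit string as an n-bit two's complement integer."""
        value = int(pattern, 2)
        if value >= 1 << (bits - 1):    # top bit set -> negative number
            value -= 1 << bits
        return value

    print(from_twos_complement('0111'))    # ->  7
    print(from_twos_complement('1000'))    # -> -8
    print(from_twos_complement('1111'))    # -> -1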
Now we have a "signed integer" number system, using a scheme known as, for reasons unimportant here, "two's complement". With a 16-bit signed integer number, we can encode numbers from -32,768 to 32,767. With a 32-bit signed integer number, we can encode numbers from -2,147,483,648 to 2,147,483,647. In sign-magnitude numbers, for example, a "-5" is:
1101
while in two's complement numbers, it is: 1011
which in sign-magnitude numbers is "-3". Why two's complement is simpler for machines to work with will be explained in a later section.
1101
-- this could be interpreted as a decimal "13" or a decimal "-3".

Fixed-point numbers
integer bits . fractional bits
0.5 = 1/2 = 00000000 00000000.10000000 00000000
1.25 = 1 1/4 = 00000000 00000001.01000000 00000000
7.375 = 7 3/8 = 00000000 00000111.01100000 00000000
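A hedged sketch of how a 16.16 fixed-point value might be packed and unpacked in Python; the helper names are my own, not a standard API:

    FRACTION_BITS = 16
    SCALE = 1 << FRACTION_BITS      # 65,536

    def to_fixed(x):
        """Store a number as a 16.16 fixed-point integer (rounded to the nearest step)."""
        return int(round(x * SCALE))

    def from_fixed(raw):
        """Recover the numeric value from its 16.16 fixed-point form."""
        return raw / SCALE

    print(format(to_fixed(7.375), '032b'))    # -> 00000000000001110110000000000000
    print(from_fixed(to_fixed(7.375)))        # -> 7.375 (representable exactly)
    print(from_fixed(to_fixed(0.2)))          # -> 0.1999969482421875 (only approximate)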
Now for something tricky: try a fraction like 1/5 (in decimal, this is 0.2). You can't do it exactly. The best you can do is one of these:
And no, you couldn't do it exactly, not even if you had more digits. The point is: some fractions cannot be expressed exactly in binary notation... not unless you use a special trick. The trick is, of course, to store a fraction as two numbers, one for the numerator and one for the denominator, and then use your grade-school arithmetic to add, subtract, multiply, and divide them. However, your grade-school arithmetic will not let you do higher math (such as square roots) with fractions, nor will it help you if the lowest common denominator of two fractions is too big a number to handle. This is why there are advantages to using the fixed-point notation for fractional numbers.

Floating-point numbers
or, more compactly: 1.1030402E5
which means "1.1030402 times 1 followed by 5 zeroes". We have a certain numeric value (1.1030402) known as a "significand", multiplied by a power of 10 (E5, meaning 10^5 or 100,000), known as an "exponent".
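Python accepts exactly this notation, which makes the two pieces easy to check; a throwaway illustration:

    # "E5" means "times 10 to the 5th power".
    value = float('1.1030402E5')
    print(value)           # -> 110304.02
    print('%e' % value)    # -> 1.103040e+05 (significand and exponent again)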
If we have a negative exponent, that means the number is multiplied by a 1 that many places to the right of the decimal point. For example:
The advantage of this scheme is that by using the exponent we can get a much wider range of numbers, even if the number of digits in the significand, or the "numeric precision", is much smaller than the range.
Similar binary floating-point formats can be defined for computers. There are a number of such schemes, but the most popular has been defined by the IEEE (Institute of Electrical and Electronics Engineers, a US professional and standards organization). The IEEE 754 standard specification defines a 64-bit floating-point format with a sign bit, an 11-bit exponent (stored in biased, "excess-1023" form), and a 52-bit significand:
Let's see what this format looks like by showing how such a number would be stored in 8 bytes of memory:
byte 0: S x10 x9 x8 x7 x6 x5 x4
byte 1: x3 x2 x1 x0 m51 m50 m49 m48
byte 2: m47 m46 m45 m44 m43 m42 m41 m40
byte 3: m39 m38 m37 m36 m35 m34 m33 m32
byte 4: m31 m30 m29 m28 m27 m26 m25 m24
byte 5: m23 m22 m21 m20 m19 m18 m17 m16
byte 6: m15 m14 m13 m12 m11 m10 m9 m8
byte 7: m7 m6 m5 m4 m3 m2 m1 m0
where "S" denotes the sign bit, "x" denotes an exponent bit, and "m" denotes a significand bit. Once the bits here have been extracted, they are converted with the computation:
This scheme provides numbers valid out to about 15 decimal digits, with the following range of numbers:
            maximum                     minimum
positive    1.797693134862231E+308      4.940656458412465E-324
negative    -4.940656458412465E-324     -1.797693134862231E+308
The spec also defines several special values that are not ordinary numbers, known as "NaNs", for "Not A Number". These are used by programs to designate overflow errors and the like. You will rarely encounter them and NaNs will not be discussed further here. Some programs also use 32-bit floating-point numbers. The most common scheme uses a 23-bit significand with a sign bit, plus an 8-bit exponent in "excess-127" format, giving 7 valid decimal digits.
byte 0: S x7 x6 x5 x4 x3 x2 x1
byte 1: x0 m22 m21 m20 m19 m18 m17 m16
byte 2: m15 m14 m13 m12 m11 m10 m9 m8
byte 3: m7 m6 m5 m4 m3 m2 m1 m0
The bits are converted to a numeric value with the computation (again for normalized values, with the exponent stored in "excess-127" form):
(-1)^S × 1.m22 m21 ... m0 (binary) × 2^(x - 127)
            maximum              minimum
positive    3.402823E+38         2.802597E-45
negative    -2.802597E-45        -3.402823E+38
Such floating-point numbers are known as "reals" or "floats" in general, but with a number of inconsistent variations, depending on context:
A 32-bit float value is sometimes called a "real32" or a "single", meaning "single-precision floating-point value".
A 64-bit float is sometimes called a "real64" or a "double", meaning "double-precision floating-point value".
The term "real" without any elaboration generally means a 64-bit value, while the term "float" similarly generally means a 32-bit value.
Once again, remember that bits are bits. If you have 8 bytes stored in computer memory, it might be a 64-bit real, two 32-bit reals, or 4 signed or unsigned integers, or some other kind of data that fits into 8 bytes.
The only difference is how the computer interprets them. If the computer stored four unsigned integers and then read them back from memory as a 64-bit real, it almost always would be a perfectly valid real number, though it would be junk data.
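A hedged Python sketch of that point, reinterpreting the same 8 bytes several ways with the standard struct module (the integer values are arbitrary):

    import struct

    # Four 16-bit unsigned integers packed into 8 bytes of "memory"...
    raw = struct.pack('>4H', 16502, 40000, 12345, 1)

    # ...can be read back as one 64-bit real (perfectly legal, but junk)...
    print(struct.unpack('>d', raw)[0])

    # ...or as two 32-bit reals, or as the original four integers.
    print(struct.unpack('>2f', raw))
    print(struct.unpack('>4H', raw))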
So now our computer can handle positive and negative numbers with fractional parts. However, even with floating-point numbers you run into some of the same problems that you did with integers:
If you keep dividing you'll eventually get a result with a negative exponent too big for the real value to hold and have a "numeric underflow". Remember that a negative exponent gives the number of places to the right of the decimal point and means a really small number.
The maximum real value is sometimes called "machine infinity", since that's the biggest value the computer can wrap its little silicon brain around.
This means that if you add a very small number to a very large one, the result is just the large one. The small number was too small to even show up in 15 or 16 digits of resolution, and the computer effectively discards it. If you are performing computations and you start getting really insane answers from things that normally work, you may need to check the range of your data. It's possible to "scale" the values to get more accurate results.
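A hedged one-liner illustration of that effect with Python's 64-bit floats:

    # With roughly 15-16 significant digits, a tiny addend simply vanishes.
    big = 1.0e16
    print(big + 1.0 == big)    # -> True: the 1.0 fell below the resolution
    print(1.0e15 + 1.0)        # -> 1000000000000001.0: one digit lower, it still fits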
It also means that if you do floating-point computations, there's likely to be a small error in the result since some lower digits have been dropped. This effect is unnoticeable in most cases, but if you do some math analysis that requires lots of computations, the errors tend to build up and can throw off the results.
People who use computers for doing math understand these errors very well, and have methods for minimizing the effects of such errors, as well as for estimating how big the errors are.
By the way, this "precision" problem is not the same as the "range" problem at the top of this list. The range issue deals with the maximum size of the exponent, while the resolution issue deals with the number of digits that can fit into the significand.
That is, if you want to do a computation on a decimal fraction that is a neat sum of reciprocal powers of two, such as 0.75, the binary number that represents this fraction will be 0.11, or 1/2 + 1/4, and all will be fine.
Unfortunately, in many cases you can't get a sum of these "reciprocal powers of 2" that precisely matches a specific decimal fraction, and the results of computations will be very slightly off, way down in the very small parts of a fraction. For example, the decimal fraction "0.1" is equivalent to an infinitely repeating binary fraction: 0.000110011 ...
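You can see the leftover error directly in Python, which stores its ordinary floats in the 64-bit binary format described above; a hedged illustration:

    # 0.1 cannot be stored exactly as a binary fraction...
    print('%.20f' % 0.1)       # -> 0.10000000000000000555
    # ...so sums of such values drift very slightly.
    print(0.1 + 0.2 == 0.3)    # -> False
    print(0.1 + 0.2)           # -> 0.30000000000000004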
However, high-level programming languages such as LISP and Python offer an abstract number that may be an expanded type such as rational, bignum, or complex. Programmers in LISP or Python (among others) have some assurance that their programming systems will Do The Right Thing with mathematical operations. Due to operator overloading, mathematical operations on any number -- whether signed, unsigned, rational, floating-point, fixed-point, integral, or complex -- are written exactly the same way. Others, such as Rexx or Java, provide decimal floating-point, which avoids many 'unexpected' results.
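A hedged sketch of the "two numbers per fraction" trick and of decimal arithmetic, using Python's standard fractions and decimal modules:

    from fractions import Fraction
    from decimal import Decimal

    # Exact rational arithmetic: numerator and denominator are kept separately.
    fifth = Fraction(1, 5)
    print(fifth + fifth + fifth + fifth + fifth)    # -> 1

    # Decimal floating-point sidesteps the binary-fraction surprise for 0.1 + 0.2.
    print(Decimal('0.1') + Decimal('0.2'))          # -> 0.3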
So now we have several means for using bits to encode numbers. But what about text? How can a computer store names, addresses, letters to your folks?
Well, if you remember that bits are bits, there's no reason that a set of bits can't be used to represent a character like "A" or "?" or "z" or whatever. Since most computers work on data a byte at a time, it is convenient to use a single byte to represent a single character. For example, we could assign the bit pattern:
There is a standard binary encoding for western text characters, known as the "American Standard Code for Information Interchange (ASCII)". The following table shows the characters represented by ASCII, with each character followed by its value in decimal ("d"), hex ("h"), and octal ("o"):
ch ctl d h o ch d h o ch d h o ch d h o
______________________________________________________________________
NUL ^@ 0 0 0 sp 32 20 40 @ 64 40 100 ` 96 60 140
SOH ^A 1 1 1 ! 33 21 41 A 65 41 101 a 97 61 141
STX ^B 2 2 2 " 34 22 42 B 66 42 102 b 98 62 142
ETX ^C 3 3 3 # 35 23 43 C 67 43 103 c 99 63 143
EOT ^D 4 4 4 $ 36 24 44 D 68 44 104 d 100 64 144
ENQ ^E 5 5 5 % 37 25 45 E 69 45 105 e 101 65 145
ACK ^F 6 6 6 & 38 26 46 F 70 46 106 f 102 66 146
BEL ^G 7 7 7 ' 39 27 47 G 71 47 107 g 103 67 147
BS ^H 8 8 10 ( 40 28 50 H 72 48 110 h 104 68 150
HT ^I 9 9 11 ) 41 29 51 I 73 49 111 i 105 69 151
LF ^J 10 a 12 * 42 2a 52 J 74 4a 112 j 106 6a 152
VT ^K 11 b 13 + 43 2b 53 K 75 4b 113 k 107 6b 153
FF ^L 12 c 14 , 44 2c 54 L 76 4c 114 l 108 6c 154
CR ^M 13 d 15 - 45 2d 55 M 77 4d 115 m 109 6d 155
SO ^N 14 e 16 . 46 2e 56 N 78 4e 116 n 110 6e 156
SI ^O 15 f 17 / 47 2f 57 O 79 4f 117 o 111 6f 157
DLE ^P 16 10 20 0 48 30 60 P 80 50 120 p 112 70 160
DC1 ^Q 17 11 21 1 49 31 61 Q 81 51 121 q 113 71 161
DC2 ^R 18 12 22 2 50 32 62 R 82 52 122 r 114 72 162
DC3 ^S 19 13 23 3 51 33 63 S 83 53 123 s 115 73 163
DC4 ^T 20 14 24 4 52 34 64 T 84 54 124 t 116 74 164
NAK ^U 21 15 25 5 53 35 65 U 85 55 125 u 117 75 165
SYN ^V 22 16 26 6 54 36 66 V 86 56 126 v 118 76 166
ETB ^W 23 17 27 7 55 37 67 W 87 57 127 w 119 77 167
CAN ^X 24 18 30 8 56 38 70 X 88 58 130 x 120 78 170
EM ^Y 25 19 31 9 57 39 71 Y 89 59 131 y 121 79 171
SUB ^Z 26 1a 32 : 58 3a 72 Z 90 5a 132 z 122 7a 172
ESC ^[ 27 1b 33 ; 59 3b 73 [ 91 5b 133 { 123 7b 173
FS ^\ 28 1c 34 < 60 3c 74 \ 92 5c 134 | 124 7c 174
GS ^] 29 1d 35 = 61 3d 75 ] 93 5d 135 } 125 7d 175
RS ^^ 30 1e 36 > 62 3e 76 ^ 94 5e 136 ~ 126 7e 176
US ^_ 31 1f 37 ? 63 3f 77 _ 95 5f 137 DEL 127 7f 177
______________________________________________________________________
The ASCII table above only defines 128 characters, which implies that ASCII characters only need 7 bits. However, since most computers store information in terms of bytes, normally there will be one character stored per byte. The extra bit allows a second set of 128 characters, an "extended" character set, to be defined beyond the 128 defined by ASCII.
In practice, there are a number of different extended character sets, providing such features as math symbols, cute little line-pattern building block characters for building forms, and extension characters for non-English languages. The extensions are not highly standardized and tend to lead to confusion.
This table serves to emphasize one of the main ideas of this document: bits are bits. In this case, you have bits representing characters. You can describe the particular code for a particular character in decimal, octal, or hexadecimal, but it's still the same code. The value that is expressed, whether it is in decimal, octal, or hex, is simply the same pattern of bits.
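A hedged Python sketch of the "same bits, different presentation" point for a single character:

    # One character, one bit pattern, several ways of writing it down.
    code = ord('F')
    print(code)                  # -> 70       (decimal)
    print(format(code, 'x'))     # -> 46       (hex)
    print(format(code, 'o'))     # -> 106      (octal)
    print(format(code, '08b'))   # -> 01000110 (the actual bits)
    print(chr(code))             # -> F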
Of course, you normally want to use many characters at once to display sentences and the like, such as:
Now let's consider a particularly confusing issue for the newcomer: the fact that you can represent a number in ASCII as a string, for example:
Confused? Don't feel too bad, even experienced people get subtly confused with this issue sometimes. The essential point is that the values the computer works on are just sets of bits. For you to actually see the values, you have to get an ASCII representation of them. Or to put it simply: machines work with bits and bytes, humans work with ASCII, and there has to be translation to allow the two to communicate.
8 bits is clearly not enough to allow representation of, say, Japanese characters, since their basic set is a little over 2,000 different characters. As a result, to encode Asian languages such as Japanese or Chinese, computers use a 16-bit code for characters. There are a variety of specs for encoding non-Western characters, the most widely used being "Unicode", which provides character codes for Western, Asian, Indic, Hebrew, and other character sets, including even Egyptian hieroglyphics.

Numbers in Programming Languages
Low-level programmers have to worry about unsigned and signed, fixed and floating-point numbers. They have to write wildly different code, with different opcodes and operands, to add two floating-point numbers compared to the code needed to add two integers.

Encoding text: ASCII and strings
0100 0110 (hex 46)
to the letter "F", for example. The computer sends such "character codes" to its display to print the characters that make up the text you see. ASCII Table
______________________________________________________________________
The strange characters listed in the leftmost column, such as "FF" and "BS", do not correspond to text characters. Instead, they correspond to "control" characters that, when sent to a printer or display device, execute various control functions. For example, "FF" is a "form feed" or printer page eject, "BS" is a backspace, and "BEL" causes a beep ("bell"). In a text editor, they'll just be shown as a little white block or a blank space or (in some cases) little smiling faces, musical notes, and other bizarre items. To type them in, in many applications you can hold down the CTRL key and press an appropriate code. For example, pressing CTRL and entering "G" gives CTRL-G, or "^G" in the table above, the BEL character.
This of course is simply represented as a sequence of ASCII codes, represented in hex below:
Computers store such "strings" of ASCII characters as "arrays" of consecutive memory locations. Some applications include a binary value as part of the string to show how many characters are stored in it. More commonly, applications use a special character, usually a NUL (the character with ASCII code 0), as a "terminator" to indicate the end of the string. Most of the time users will not need to worry about these details, as the application takes care of them automatically, though if you are writing programs that manipulate characters and strings you will have to understand how they are implemented.
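A hedged sketch of both conventions, using Python byte strings to stand in for raw memory; the exact layout is up to the application:

    text = 'Hi!'
    codes = text.encode('ascii')

    # Length-prefixed layout: one byte holding the character count, then the codes.
    length_prefixed = bytes([len(codes)]) + codes
    print(length_prefixed.hex())    # -> 03486921

    # NUL-terminated layout: the character codes followed by a zero byte.
    nul_terminated = codes + b'\x00'
    print(nul_terminated.hex())     # -> 48692100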
When a computer displays this value, it actually sends the following ASCII codes, represented in hex, to the display:
The confusion arises because the computer could store the value 1.537E3 as, say, a 32-bit real, in which you get a pattern of 4 bytes that make up the exponent and significand and all that. To display the 32-bit real, the computer has to translate it to the ASCII string just shown above, as an "ASCII numeric representation", or just "ASCII number". If it just displayed the 32-bit binary real number directly, you'd get four "garbage" characters.
But, now to get really confusing, suppose you wanted to view the bits of a 32-bit real directly, bypassing conversion to the ASCII string value. Then the computer would display something like:
The trick is that to display these values, the computer uses the ASCII characters for "1", "0", and " " (space character), which have the hex values 31, 30, and 20 respectively.
It could also display the bits as an octal or hex ASCII value. We often get queries from users saying they are dealing with "hex numbers". On investigation it usually proves that they are manipulating binary values that are displayed as hex numbers in ASCII.