Number Systems
Introduction
The goal of this handout is to make you comfortable with the binary
number system. We will correlate your previous knowledge of the decimal number
system to the binary number system. That will lay the foundations on which our
discussion of variousbi representation schemes for
numbers (both integer and real numbers) will be based.
Decimal Number System
the decimal numberCounting as we have been taught
since ki
). Indecany
numbersystemsystem,.Decimalgiven
mtheansbasebase 10 (the prefix
(often referred to), theasradixnumber of digits that can be used to count is fixed. For
example in the base 10 number system, the digits that can be used to
count are 0,1,2,3,4,5,6,7,8,9.
Generalizing that for any base b, the first b digits (starting with 0)
represent the digits that are used to count. When a number ≥ b has to be represented, the place
values are used.
Example
1. Consider the number 1234. It can be represented as
1*103+ 2*102 + 3*101 + 4*100
|
(1)
|
Where:
- 1 is in
the thousand’s place
- 2 is in
the hundred’s place
- 3 is in
the ten’s place
- 4 is in
the one’s place.
The equation (1) is an expanded representation of 1234. The expanded
representation has the advantage of making the base of the number system
explicit.
Example
2. Consider the number 1234.567. It is represented as
1*103+ 2*102 + 3*101 + 4*100+ 5*10-1+ 6*10-2+ 7*10-3
|
(2)
|
Where:
- 5 is in
the tenth’s place
- 6 is in
the hundredth’s place
- 7 is in
the thousandth’s place
In equation (2), the representation includes digits both to the left and
to the right of the decimal point.
Binary Number System
Binary means base 2 (the prefix bi).
Based on our earlier discussion of the decimal number system, the digits that can be used to count in this number
system are 0 and 1. The 0,1 used in the binary system are called nary digits (bits)
The bit is the smallest piece of information that can be stored in
a computer. It can have one of two values 0 or 1. Think of a bit as a switch
that can be either on or off. For example,
Bit
|
Value
|
0
|
OFF / FALSE
|
1
|
ON/ TRUE
|
Table
1. Interpreting Bit Values
|
|
From
the hardware perspective, ON and OFF can be represented as voltage levels
|
|
(typically
0V for logic 0 and +3.3 to +5V for logic 1). Since only two values can be
|
|
stored
in a bit, we combine a series of bits to represent more information. Again
the
|
|
concept
of place values is applicable here as well.
|
|
Example
3. Consider the binary number 1101. It can be represented as
|
|
1*23+ 1*22 + 0*21 + 1*20
|
(3)
|
This expanded notation also gives you the means of converting binary
numbers directly into the equivalent decimal number.
8+4+0+1=13
Example 4. Consider the binary number 1101.101. It
can be represented as:
1*23+ 1*22 + 0*21 + 1*20+ 1*2-1+ 0*2-2+ 1*2-3
|
(4)
|
The same
notation is applicable to real numbers represented in binary notation. The
equivalent decimal number is
13 + 0.5 + 0 + 0.125 = 13.625
To represent larger numbers, we
have to group series of bits. Two of these groupings are
of importance:
- Nibble A nibble
is a group of four bits
- Byte A byte is
a group of eight bits
subtraction. Again, we will correlate the
addition and subtraction operations in the decimal number system to the
binary number system.
|
Example 5. Consider the operation 145 + 256
|
Carry 1
|
Byte Big-Endian
Little-Endian
|
0
|
00000100
|
00000001
|
1
|
00000001
|
00000100
|
Table 2. Big-Endian versus Little-Endian Representation of
00000100 00000001
|
The endian system may sometimes
be used to represent the bit order within a byte. In this context, equation
(5) refers to the little -endian
ordering of bits within the byte.
|
Operations on Numbers
|
The byte is the smallest addressable unit in most
computers. The key components of a byte are shown below in figure 1.
msb
|
lsb
|
|||||
a7 a6
|
a5
|
A4
|
a3
|
a2 a1 a0
|
||
Figure 1. Byte
|
||||||
The most
significant bit (msb) is the bit with the highest place value, while the least significant bit (lsb)
denotes the bit position that has the lowest place value. To convert the
byte to the equivalent integer
number, the formula in (5) is used.
a7 * 27 + a6 * 26 + a5 * 25 + a4 * 24 + a3 * 23 + a2 * 22 + a1 * 21 + a0 * 20 (5)
Self Exercise: What is the value
of the bit pattern 11111111 ?
Answer: 255
Self- Exercise: What is the range
of values represented by an 8-bit binary number?
Answer: 0 to 255.
Note: When we look at negative number
representation using the same 8-bit number, the range will change.
When we want to represent a value larger than 255,
we have to group bytes together to form words. The size of words is
dependent on the underlying processor, but is usually an even number of bytes
(typically 4 bytes).
In the big-endian system, the byte with the
largest significance is stored first (big-end-first). Conversely, in the little-endian
system, the byte with the least significance is stored first
(little-end-first). The number 1025 in binary is 00000100 00000001.
The sequencing of bytes to form larger numbers
leads to the issue of which is the first byte in the sequence. Many mainframe
computers, particularly IBM mainframes, use a big-endian architecture. Most
modern computers, including PCs, use the little -endian system. The PowerPC
system is bi-endian because it can
understand both systems
The term endian comes from Swift's
"Gulliver's Travels" via the famous paper "On Holy Wars and a
Plea for Peace" by Danny Cohen, USC/ISI IEN 137, 1980-04-01. The
Lilliputians, being very small, had correspondingly small political problems.
The Big-
Little -andEndianEndian parties debated over whether
soft-boiled eggs should be opened at the big end or the little end.
Carry 1
145
+ 256
401
The algorithm works by starting at the least significant digit and
working from right to left. At each place position, the digits are added and if
the resulting number is a single digit, it is entered in the same place
position in the sum. If on the other hand, the summation operation results in 2
digits, the digit of lower significance is entered into the place position of
the sum and the digit of higher significance is added along with the other
digits in the next most significant place (or is carried over to the next
most significant place).
The same principle applies to binary addition.
Rule
|
Step
|
Result
|
Carry
|
1
|
0 + 0
|
0
|
0
|
2
|
0 + 1
|
1
|
0
|
3
|
1 + 0
|
1
|
0
|
4
|
1 + 1
|
0
|
1
|

Table 3.Rules for Binary Addition
Example 6. Consider the
operation:
Carry 1
|
most
significant bit to determine if the
|
||||
11100100
|
|||||
+
|
00000101
|
||||
11101001
|
|||||
Note that at the 2nd bit position, there is a carry of 1 into the 3rd bit position (counting from the
right with 0 being the first position).
Self-Exercise: Convert the numbers above into
decimal to verify that the answer is correct.
Answer: 1 1 1 0 0 1 0 0 = 228, 0
0 0 0 0 1 0 1 = 5, 1 1 1 0 1 0 0 1 = 233 = 228+ 5
Decimal subtraction works very similar to decimal
addition, the numbers are aligned to the same place values and the algorithm
proceeds from right to left. The bottom digit is subtracted from the top digit,
and the result written in the place value position in the result. If the top
digit is less than the bottom digit, then we must 'borrow' from the next place
value position. That means decrementing the top digit in the next significant
position and adding the base to the top digit of this position before
performing the subtraction. This operation gets even more complicated when
there is a ‘0’ in the next significant position.
Example 7. Consider the decimal
operation 445 - 256:
445
256
189
The decimal subtraction algorithm can get very
complicated and the time taken to perform the subtraction can vary greatly.
Since the binary addition algorithm is already
understood
and already implemented in hardware, it can be reused to also perform–2.Substituting
the bit patterns from the subtraction.
The operation x – y can be re -written as x+ (-y).
That brings up the question – How are negative numbers represented in binary
notation?
Negative Numbers
Negative binary numbers are represented by the ‘- ’
sign followed by the magnitude of the number. The computer however does not
have a means of representing signs. The sign has to be captured in the bit
pattern itself.
Signed Magnitude
Representation
The signed magnitude
representation uses the
number is positive or negative. The advantage of this notation is that
by examining the msb alone, it is possible to determine if the number is
positive or negative. The disadvantage however is that one bit pattern is
wasted (there are two possible representations for zero) and subtraction cannot
be performed using addition alone. Table 4. shows the signed magnitude representation
of numbers using 4 bits.
Bit Pattern
|
Number
|
0000
|
0
|
0001
|
1
|
0010
|
2
|
0011
|
3
|
0100
|
4
|
0101
|
5
|
0110
|
6
|
0111
|
7
|
1000
|
-0
|
1001
|
-1
|
1010
|
-2
|
1011
|
-3
|
1100
|
-4
|
1101
|
-5
|
1110
|
-6
|
1111
|
-7
|

Table 4. Numbers using 4-bit signed
magnitude representation
Note: +0 and –0 have different
bit patterns.
Example 8. Consider the following
operation 7
table:
0111
+ 1010
0001
The bit pattern 0001 is 1 but the result should by
5 0101.
One’s Complement
The one’s
complement notation represents a negative number by inverting theagesbits
thatin the sign of the number can be each place. The one’s complement notation
for a 4-bit number is shown in Table 5. Again
the limitations of the sign magnitude
representation are not overcome (there are two bit patterns used to represent 0
and the addition operation cannot be used to perform subtraction). The one’s
complement is important because it is very easy to perform the inversion
operation in hardware and it forms the basis of computing the two’s complement.
Bit
Pattern
|
Number
|
0000
|
0
|
0001
|
1
|
0010
|
2
|
0011
|
3
|
0100
|
4
|
0101
|
5
|
0110
|
6
|
0111
|
7
|
1111
|
-0
|
1110
|
-1
|
1101
|
-2
|
1100
|
-3
|
1011
|
-4
|
1010
|
-5
|
1001
|
-6
|
1000
|
-7
|

Table 5. Numbers
using 1’s complement representation
Example 9. Consider the following operation 7 – 2. Substituting the bit
patterns from the table:
0111
+ 1101
0100
The bit pattern 0100 is 4 but the result should by
5 0101.
Two’s Complement
The two’s complement notation builds on the one’s complement notation.
The algorithm goes as follows:
1. Compute the 1’s complement
2. Add 1 to the result to get the 2’s complement.
The two’s complement notation has
the advant
computed by looking at the msb. The addition operation can be used to
perform subtraction. Also, there is only one bit-pattern to represent ‘0’ so an
extra number can be represented. Table 6 summarizes the 2’s complement notation
for a 4-bit number.
Bit Pattern
|
Number
|
0000
|
0
|
0001
|
1
|
0010
|
2
|
0011
|
3
|
0100
|
4
|
0101
|
5
|
0110
|
6
|
0111
|
7
|
1111
|
-1
|
1110
|
-2
|
1101
|
-3
|
1100
|
-4
|
1011
|
-5
|
1010
|
-6
|
1001
|
-7
|
1000
|
-8
|

Table 6. Numbers
using 2’s complement representation
Example 10. Consider the following operation 7 – 2. Substituting the bit
patterns from the table:
0111
+ 1110
0101
The bit pattern 0101 is 5, which
is the expected result.
The limitation with the 2’s complement notation is that the bit patterns
are not in order i.e. comparing the bit patterns alone does not provide any
information as to which number is larger.
Excess Notation
The excess notation is a means of representing both negative and
positive numbers in a manner in which the order of the bit patterns is
maintained. The algorithm for computing the excess notation bit pattern is as
follows:
1.
Add the
excess value (2N-1, where N
is the number of bits used to represent the number) to the number.
2. Convert the resulting number into binary format.
ic Number
|
The 2N-1
|
is often
referred to as
the
|
for computing
the excess
|
|
representation
of the number (except that there is no magic in it). Table 7 presents all the
|
||||
numbers
that can be represented using the excess-8 notation.
|
||||
Number
|
Excess Number
|
Bit
Pattern
|
||
7
|
15
|
1111
|
||
6
|
14
|
1110
|
||
5
|
13
|
1101
|
||
4
|
12
|
1100
|
||
3
|
11
|
1011
|
||
2
|
10
|
1010
|
||
1
|
9
|
1001
|
||
0
|
8
|
1000
|
||
-1
|
7
|
0111
|
||
-2
|
6
|
0110
|
||
-3
|
5
|
0101
|
||
-4
|
4
|
0100
|
||
-5
|
3
|
0011
|
||
-6
|
2
|
0010
|
||
-7
|
1
|
0001
|
||
-8
|
0
|
0000
|
||

Table 7. Numbers
using the Excess-8 representation
The number of bits used to represent a code in
excess -8 is 4 bits (24-1 = 8).
Also, the bit patterns are in sequence (the largest number that can be represented
has the bit pattern 1111).
Example 11. Consider the following operation 7 – 2.
Substituting the bit patterns from the table:
1111
+ 0110
0101
The result of the addition operation is the
bit-pattern used for 5 in binary. The excess notation representation however
takes longer to compute than the 2’s complement notation. The excess notation
will however play an important part in computing floating-point
representations.
Self-Exercise: What is the range of numbers that
can be represented using the Excess-2N-1 notation?
Answer: 2N-1- 1 to -2N-1
Bias
Notation
The excess notation is a special case of the biased
notation. For instance, excess-8 is biased around 8 (i.e.0 has the bit pattern
associated with decimal 8). Instead of using the magic number, any number
(bias) can be used.
Note: This concept becomes important when we
address the IEEE Single Precision Floating-Point standard.
Floating-Point Notation
The
floating-point notation is used:
a. To represent integers that are larger than the maximum value that
can be held by a bit-pattern (the maximum value that can be held by 8 bits is
255).
b. To represent real numbers.
Large
Integers
Consider a really large number 1,234,567. The
number requires seven places to represent the value. If the number of places
available to represent the number is limited to say four places, certain digits
have to be dropped. The selection of digits to be dropped is based on
git. In this case, we will drop
thethelastvaluethreeassociateddigits‘567’with. the di
The
resulting number is:
1,234,000
The loss of ‘567’ is a loss of precision but if the most
significant digits were to be eliminated, say ‘123’, then the resulting number
is 4,567, which presents an even
greater loss of precision.
Rules for
determining significance (integers):
1. A nonzero digit is always significant
2. The digit '0' is significant if it lies
between other significant digits
3. The digit '0' is never significant if it
precedes all the nonzero digits Self-Exercise: What are the significant digits
in 0012340? Answer: 0012340
1,234,000
can be represented using only four digits as 1234 * 103. This
representation is
exponentialcalledformthe .
The
exponential form consists of two parts:
mantissa – (the
significand) the significant digits (1234)
exponent – the power to which the base is
raised before being multiplied by the mantissa. (3).
In the binary fixed-point notation, the radix position is fixed at a
certain point within the bit pattern. To illustrate the notation, we will
consider an 8-bit bit-pattern (byte) as shown below.
a7 a6 a5 a4 a3 a2 a1 a0

We can define the fixed-point
notation to be:
Two
special forms of representation are:
1.234scientific*106–thenotation
the most
significant digit.
0.1234normalized*107–thenotation
of the most significant digit.
, in which the decimal point is to the right of
, in which the decimal point is to the left
- a7, a6, a5, a4 – contain the integer part in two’s complement
form
- a3, a2, a1, a0 – contain the fractional part in normal binary
form
a7 a6 a5 a4
|
Value
|
a3 a2 a1 a0
|
Value
|
0000
|
0
|
0000
|
0
|
0001
|
1
|
0001
|
0.0625
|
0010
|
2
|
0010
|
0.125
|
0011
|
3
|
0011
|
0.1875
|
Real
Numbers
Mathematically, real numbers are set of rational
and irrational numbers. A rational number≠ is any number that can be
represented as a ratio of two integers (a/b, b 0). Irrational numbers are real
numbers that are not rational i.e. they cannot be represented as a ratio of two
integers (typically numbers whose decimal expansion never ends and never enters
a periodic pattern).
There are
a number of ways of representing a real number in a computer.
1.
One
notation is the fixed point, wherein the decimal point (radix) is fixed at some
position between the digits. The digits to the left of the radix are integer
values and those to the right of the radix are fractions of some fixed unit.
For example: 10.82 = 1 x 101 + 0 x 100 + 8 x 10-1 + 2 x 10-2, the radix is between 0 and 8.
This notation is limited both in the range of numbers that can be represented
as well as in the precision of the numbers that can be represented.
2.
Another
notation is to use rational notation (represent the real number as a ratio of
two integers). This representation is not always possible because all real
numbers are not necessarily rational!
3.
The
floating-point notation is the most common of the representation schemes. It is
based on either the scientific or normalized representation of the number.
Binary
Fixed-Point Notation
0100
|
4
|
0100
|
0.250
|
0101
|
5
|
0101
|
0.3125
|
0110
|
6
|
0110
|
0.375
|
0111
|
7
|
0111
|
0.4375
|
1111
|
-1
|
1000
|
0.5
|
1110
|
-2
|
1001
|
0.5625
|
1101
|
-3
|
1010
|
0.625
|
1100
|
-4
|
1011
|
0.6875
|
1011
|
-5
|
1100
|
0.75
|
1010
|
-6
|
1101
|
0.8125
|
1001
|
-7
|
1110
|
0.875
|
1000
|
-8
|
1111
|
0.9375
|
Table 8. Integer
part values Table 9. Fractional Part Values
Tables
8,9 show the possible values that the different segments of the fixed-point
notation can take.
Self-Exercise: What is the range
of values that the fixed-point notation can take?
Answer: -8.9375 to 7.93725
Self-Exercise: What is the
precision of the notation?
Answer: The precision is 0.0625.
The
notation makes an assumption about the representation (primarily the location
of the radix and the format used for the integer part). For two computers (or
two programs for
x
|
1. Determine the value of the exponent
|
||||||||||||
by
means of standards. At the end of this handout, we will discuss the IEEE
standard
|
2. If the exponent
|
is
negative,
|
|||||||||||
used
for representing floating point numbers.
|
a. Add x
leading 0’s to the fractional part
|
||||||||||||
Example12.
What is the value represented by the bit pattern 11011110? Assume the radix
|
b. Insert
a radix point before the 0’s
|
||||||||||||
point
is between bit positions 3 and 4.
|
x
|
3. If the exponent is positive,
|
|||||||||||
a. Move the radix point to the right places.
|
|||||||||||||
The bit
pattern can be split into the integer part 1101 and the fractional part 1110.
|
4. Convert the binary number into decimal
|
||||||||||||
1101
|
=
|
-3
|
5. Add the sign based on the a7 bit.
|
||||||||||
1110
|
=
|
0.875
|
Self-Exercise:
Why does step 3 in the algorithm above not address adding zeroes?
|
||||||||||
11011110
|
=
|
-3.875
|
Answer:
From the excess table above, it is clear that the maximum positive exponent
|
||||||||||
move to
the right of
|
the
fraction part,
|
is 3.
Moving the radix point right does not
|
|||||||||||
Modified
Significance Rules
|
hence,
padding (adding leading 0’s or trailing 0’s) to the right is not needed.
|
||||||||||||
Excess
Number
|
Actual Value
|
Bit-Pattern
|
|||||||||||
Rules
for determining significance:
|
7
|
3
|
111
|
||||||||||
6
|
2
|
110
|
|||||||||||
1.
|
A ‘1’
bit is always significant
|
5
|
1
|
101
|
|||||||||
2.
|
The bit
|
'0' is
significant if it lies between other significant bits
|
4
|
0
|
100
|
||||||||
3
|
-1
|
011
|
|||||||||||
3. The bit '0' is never significant if it
precedes all 0’s, even if it follows an embedded
|
2
|
-2
|
010
|
||||||||||
1
|
-3
|
001
|
|||||||||||
etween bit positions 3 and
4)radix. (in the example above, the radix is
|
0
|
-4
|
000
|
||||||||||
4. The bit '0' is significant if it follows an
embedded radix point and other significant
|
Table 10. Excess-4 Conversion Table
|
||||||||||||
bits
|
|||||||||||||
Self-Exercise:
What are the significant bits in the bit pattern00.010000?
|
Example
13. What is the decimal value associated with 01111001(assume that the bit
|
||||||||||||
pattern
is in 8-bit floating point format)?
|
|||||||||||||
Answer:
00.010000
|
Answer:
Split the byte into the respective components,
|
||||||||||||
8-bit
Floating Point Notation
|
10010
|
111
|
|||||||||||
The
8-bit floating point notation can be describe based on the byte shown below:
|
Step 1.
Convert the excess-four exponent into its decimal equivalent.
|
||||||||||||
111 =
3, hence the exponent = 3.
|
|||||||||||||
a7 a6 a5 a4 a3 a2 a1 a0
|
|||||||||||||
Step 2.
Since the exponent is positive, move the radix point three place to the
|
|||||||||||||
Where
|
right.
|
||||||||||||
-
|
a7
|
– sign
bit
|
The
binary number = 0.1001 * 23
|
||||||||||
-
|
a6
|
a5
|
a4
|
– exponent in excess-4 notation
|
|||||||||
-
|
a3
|
a2
|
a1
|
a0
|
–
fractional part in normal binary
|
= 100.1
|
|||||||
The
algorithm for converting from 8-bit
floating point to decimal is detailed below:
|
Step 3.
Convert the binary number to decimal, 100.1 = 4.5
|
||||||||||||


Step 4.
Add the sign, hence 01111001 = 4.5
The algorithm for converting from
decimal to 8-bit floating point is
detailed below:
1. Set the sign bit (a7) to 1 if
the number is negative, 0 otherwise
2. Convert the number into binary representation.
3.
Normalize
the binary number (move the radix point to the left of the most significant
bit).
4. Convert the exponent into excess-4 notation
5. Set a6 a5 a4 to the
exponent value.
6. Select the 4 most significant bits and enter them
into the fraction part (a3 a2 a1 a0).
There are some limitations of the 8-bit floating-point notation. Some of
them are listed below:
- The
maximum positive exponent is 3
- The lowest
negative exponent is –4
- The
precision is determined by the exponent
Example 14: Convert 11/4 into
8-bit floating-point notation.
Answer:
Step 1.

0 a6 a5 a4 a3 a2 a1 a0
Step 2. Convert the number into binary form,11/4 = 2 ¾ = 0010.11 Step 3.
Normalize 0010.11 = 0.1011 * 22
Step 4.
Convert the exponent to excess-4, 2 = 110
Step 5.
Fill in the exponent
0 1 1 0 a3 a2 a1 a0
Step 6.
Fill in the fraction
0 1 1 0 1 0 1 1
Hence 2 ¾
= 01101011
IEEE 754
Single Precision Floating Point Notation
Given that the normalization procedure always has a
1 in the first significant bit position, it is possible to use scientific
notation instead and always implicitly assume that the first bit is 1.This
allows us to gain an additional bit of precision in the representation.
Sign
Bit
|
8 bit
exponent in
|
23 bits
to represent a 24 bit
|
|||||||
bias
127 notation
|
mantissa
|
||||||||
Figure 2. IEEE 32
bit floating point notation
The single
precision representation uses 32 bits as shown in Figure 2 above. The
standard also defines a double precision
standard that uses 64 bits. We will only be discussing the single precision
standard.
The method for converting decimal numbers into the
754 standard representation is as follows:
1. If the number is negative
a. set the sign bit to ‘1’
2. If the number is positive
a. Set the sign bit to ‘0’
3. Convert the decimal number into binary form.
4.
Convert
the binary number into scientific notation (move the radix immediately right of
the most significant bit).
5. Convert the exponent into bias-127 notation
6. Fill in the 8 bits demarcated for the exponent
7.
Fill in
the bits (except the most significant bit) into the mantissa from left to
right. Pad any remaining spaces with 0’s.
The exponent is computed as bias-127. The algorithm for computing the biased
exponent is as follows:
1.
Add 127
to the decimal value of the exponent (The exponent value is negative if the
radix is moved to the left).
2. Convert the decimal value into binary
Example 15. Convert 194.375 into
IEEE-754 single precision representation.
Answer:
Step1.
Fill in the sign bit
0
Step 2.
Convert the decimal number into binary.
194
|
=
|
11000010
|
0.375
|
=
|
0.011
|
194.375
|
=
|
11000010.011
|
Step 3.
Convert to scientific notation
1.1000010011
* 27
Step
4.Convert the exponent into bias-127 notation
7 + 127 =
134 = 10000110
Step
5.Fill in the exponent
010000110
Step6.
Fill in the mantissa without the most significant bit.
01000011010000100110000000000000
Hence 194.375 = 0 1 0 0 0 0 1 1 0
1 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
Endnote
This handout was developed for
Unified Engineering, Department of Aeronautics and
Astronautics, MIT. Any errors of
commission or omission are mine
Comments
Post a Comment