Data and the Binary Code System
‘Data’, a plural noun, is the term used to describe information which is stored in and processed by
computers. It is essential to know how such data are represented electronically before we can
begin to understand how they can be communicated between computers, communication devices
(e.g. facsimile machines) or other data storage devices. As a necessary introduction to the concept
of ‘digital’ transmission, this chapter is devoted to a description of the method of representing
textual and numeric information which is called the ‘binary code’.
4.1 THE BINARY CODE
Binary code is a means of representing numbers. Normally, numbers are quoted in decimal (or
ten-state) code. A single digit in decimal code may represent any of ten different unit values,
from nought to nine, and is written as one of the figures 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
Numbers greater than nine are represented by two or more digits: twenty, for example, is
represented by two digits, 20, the first, ‘2’, indicating the number of ‘tens’, so that ‘twice ten’
must be added to ‘0’ units, making twenty in all. In a three-digit decimal number, such as 235,
the first digit indicates the number of ‘hundreds’ (or ‘ten tens’), the second digit the number of
‘tens’ and the third digit the number of ‘units’. The principle extends to numbers of greater
value, comprising four or indeed many more digits.
Consider now another means of representing numbers using only a two-state or binary
code system. In such a system a single digit is restricted to one of two values,
either zero or one. How then are values of two or more to be represented? The answer,
as in the decimal case, is to use more digits. ‘Two’ itself is represented as the two digits,
one-zero or 10. In the binary code scheme, therefore, 10 does not mean ‘ten’ but ‘two’.
The rationale for this is similar to the rationale of the decimal number system with
which we are all familiar.
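The shared rationale of the two systems can be sketched in a few lines of Python. This is only an illustrative sketch: the function name positional_value is our own and does not come from the text. It reads a string of digits from left to right, multiplying the running total by the number of states (ten or two) before adding each new digit.

```python
def positional_value(digits: str, base: int) -> int:
    """Return the value of a digit string read in the given base (2 to 10)."""
    value = 0
    for ch in digits:
        digit = int(ch)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        # Move everything read so far one place to the left, then add the new digit.
        value = value * base + digit
    return value


print(positional_value("235", 10))  # 235: two hundreds, three tens, five units
print(positional_value("10", 2))    # 2: one 'two' and zero units
```

The same function serves both systems; only the number of states per digit changes.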
Networks and Telecommunications: Design and Operation, Second Edition.
Martin P. Clark
Copyright © 1991, 1997 John Wiley & Sons Ltd
ISBNs: 0-471-97346-7 (Hardback); 0-470-84158-3 (Electronic)
In decimal the number one thousand three hundred and forty-five is written ‘1345’.
The rationale is

\[ (1 \times 10^{3}) + (3 \times 10^{2}) + (4 \times 10) + 5 \]
The same number in binary requires many more digits, as follows.
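1345 (decimal) = 10101000001 (binary)

\[
\begin{aligned}
10101000001\ \text{(binary)} &= 1 \times 2^{10} + 0 \times 2^{9} + 1 \times 2^{8} + 0 \times 2^{7} + 1 \times 2^{6} + 0 \times 2^{5} \\
&\quad + 0 \times 2^{4} + 0 \times 2^{3} + 0 \times 2^{2} + 0 \times 2 + 1 \\
&= 1024 + 0 + 256 + 0 + 64 + 0 + 0 + 0 + 0 + 0 + 1 \quad \text{(decimal)} \\
&= 1345
\end{aligned}
\]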
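The conversion of 1345 can be checked with a short Python sketch. The method shown, repeated division by two, is one common way of finding the binary digits and is our own choice of illustration; the function names to_binary and expansion_terms are likewise hypothetical.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to a string of binary digits."""
    if n < 0:
        raise ValueError("only non-negative integers are handled here")
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        bits.append(str(n % 2))  # the remainder is the next least significant bit
        n //= 2
    return "".join(reversed(bits))


def expansion_terms(bits: str) -> list[int]:
    """Return the decimal weight contributed by each bit, leftmost bit first."""
    top = len(bits) - 1
    return [int(b) * 2 ** (top - i) for i, b in enumerate(bits)]


bits = to_binary(1345)
print(bits)                        # 10101000001
print(expansion_terms(bits))       # [1024, 0, 256, 0, 64, 0, 0, 0, 0, 0, 1]
print(sum(expansion_terms(bits)))  # 1345
```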
Any number may be represented in the binary code system, just as any number can be
represented in decimal. All numbers when expressed in binary consist only of 0s and 1s,
arranged as a series of binary digits, a term which is usually shortened to the jargon bits.
The string of bits of a binary number is usually suffixed with a ‘B’, to denote a binary
number. This prevents any confusion that the number might be a decimal one. Thus 41 is written
‘101001B’.
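A program reading the ‘B’ notation might handle it along the following lines. This is a minimal sketch under our own assumptions: the helper names are hypothetical, and only the upper-case ‘B’ suffix described above is accepted.

```python
def parse_b_suffixed(text: str) -> int:
    """Read a binary number written with a trailing 'B', e.g. '101001B'."""
    if not text.endswith("B"):
        raise ValueError("expected a trailing 'B'")
    bits = text[:-1]
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("only the digits 0 and 1 may precede the 'B'")
    return int(bits, 2)


def format_b_suffixed(n: int) -> str:
    """Write a non-negative integer as binary digits followed by 'B'."""
    return format(n, "b") + "B"


print(parse_b_suffixed("101001B"))  # 41
print(format_b_suffixed(41))        # 101001B
```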
4.2 ELECTRICAL REPRESENTATION AND STORAGE OF BINARY CODE NUMBERS
The advantage of the binary code system is the ease with which binary numbers can be
represented electrically. As each digit, or bit, of a binary number may only be either 0
or 1, the entire number can easily be transmitted as a series of ‘off’ or ‘on’ (sometimes
also called space and mark) pulses of electricity. Thus forty-one (101001B) could
be represented as on-off-on-off-off-on, or mark-space-mark-space-space-mark. The
number could be conveyed between two people on opposite sides of a valley by flashing
a torch, either on or off, say every half second. Figure 4.1 illustrates this simple binary
communication system, in which two binary digits (or bits) are conveyed every second.
The speed at which the binary code number, or other information, can be conveyed is
called the information conveyance rate (or more briefly the information rate). In this
example the rate is two bits per second, which can also be expressed as 2 bit/s.

Figure 4.1 A simple binary communication system (a transmitter signalling with a flashing torch)
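The torch example lends itself to a small Python sketch. The function names are our own, and the half-second interval is simply the figure quoted in the example; nothing here models a real transmission line.

```python
def to_signal(bits: str, on: str = "on", off: str = "off") -> str:
    """Spell out a string of bits as a sequence of line states."""
    return "-".join(on if b == "1" else off for b in bits)


def information_rate(seconds_per_bit: float) -> float:
    """Bits conveyed per second when one bit is sent every seconds_per_bit."""
    return 1.0 / seconds_per_bit


def transmission_time(bits: str, seconds_per_bit: float) -> float:
    """Seconds needed to send the whole string of bits."""
    return len(bits) * seconds_per_bit


forty_one = "101001"
print(to_signal(forty_one))                   # on-off-on-off-off-on
print(to_signal(forty_one, "mark", "space"))  # mark-space-mark-space-space-mark
print(information_rate(0.5))                  # 2.0 (bit/s)
print(transmission_time(forty_one, 0.5))      # 3.0 (seconds)
```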
Figure 4.1 illustrates a means of transmitting numbers, or other binary coded data, by
a series of ‘on’ or ‘off’ electrical states. Transmission of data, however, is not in itself
sufficient to permit proper exchange of information between the computers or other
equipment located at either end of the line; some method of data storage is needed as
well. At the sending end the data have to be stored prior to transmission, and at the
receiving end a storage medium is needed not only for the incoming data, but also for
the computer programmes required to interpret them.
4.3 USING THE BINARY CODE TO REPRESENT TEXTUAL INFORMATION
The letters of the alphabet can be stored and transmitted over binary coded
communication systems in the same way as numbers, provided that they have first
been binary-encoded. There are four notable binary coding systems for alphabetic text.
In chronological order these are the Morse code, the Baudot code (used in Telex, and
also known as international alphabet number 2, IA2), EBCDIC (extended binary coded
decimal interchange code), and ASCII (American (national) standard code for
information interchange, also known as international alphabet number 5, IA5). These four
coding schemes are now described briefly.

Figure 4.2 The Morse code (the letters A to Z, the figures 0 to 9 and punctuation marks)
4.4 MORSE CODE
The Morse code system of dots and dashes was intended for use over key and lamp telegraph
systems. It was also used for signalling by heliograph and by flag. Its two binary
elements are dit and da (dot and dash). Thirty-nine characters were coded, as shown in
Figure 4.2. When transmitting, a short pause is inserted to mark the beginning and end
of each character, and between words there is a longer pause. As an example of Morse code,
we see from the figure that the word Morse is transmitted as ‘da da’ (pause) ‘da da da’
(pause) ‘dit da dit’ (pause) ‘dit dit dit’ (pause) ‘dit’ (which would be written as
--/---/.-./.../.).
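The example can be reproduced with a short Python sketch. The table below holds only the 26 letters of the International Morse alphabet, a subset of the characters the text mentions, and the function names are our own. A ‘/’ stands for the short pause between characters, as in the written form above.

```python
# Letters of the International Morse alphabet ('.' is dit, '-' is da).
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def encode_word(word: str) -> str:
    """Write a word in dots and dashes, with '/' marking the pause between characters."""
    return "/".join(MORSE[ch] for ch in word.upper())


def spoken(word: str) -> str:
    """Spell the same word out as 'dit' and 'da' sounds."""
    characters = (
        " ".join("dit" if mark == "." else "da" for mark in MORSE[ch])
        for ch in word.upper()
    )
    return " (pause) ".join(characters)


print(encode_word("Morse"))  # --/---/.-./.../.
print(spoken("Morse"))
```

Note that the characters have different lengths, which is why the pauses are needed to separate them.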
4.5 BAUDOT CODE (ALPHABET IA2)
When the telex system was introduced, the Baudot code (now called the international
alphabet IA2) was developed, with significant advantages over the Morse code for
automatic use. Each character is represented by five binary elements (usually called
mark and space), but seven elements are transmitted in total, because start (space) and
stop (mark) bits are also used. Fixing the number of elements cuts out the need for gaps
or pauses between alphabetic characters, and separate words are delimited without a
break by introducing the space (SP) character (00100). The regular flow of these signals
suits automatic transmitting and receiving devices, and makes them easier to design.
Figure 4.3 illustrates the Baudot code. Thus the sequence of seven bits sent to represent
the letter A is ‘space(start)-mark-mark-space-space-space-mark(stop)’.
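The framing of a character can be sketched in Python as follows. The function names are hypothetical; the only facts taken from the text are that the start element is a space (0), the stop element is a mark (1), and the five elements sent for the letter A are mark-mark-space-space-space.

```python
def frame_character(code_bits: str) -> str:
    """Wrap five code elements in a start (space, 0) and a stop (mark, 1) element."""
    if len(code_bits) != 5 or set(code_bits) - {"0", "1"}:
        raise ValueError("expected exactly five elements, each 0 or 1")
    return "0" + code_bits + "1"


def as_line_states(bits: str) -> str:
    """Name each element as 'mark' (1) or 'space' (0)."""
    return "-".join("mark" if b == "1" else "space" for b in bits)


letter_a = "11000"  # elements of the letter A in the order they are sent
framed = frame_character(letter_a)
print(framed)                  # 0110001
print(as_line_states(framed))  # space-mark-mark-space-space-space-mark
```

Because every framed character occupies exactly seven elements, the receiver can find character boundaries by counting rather than by waiting for a pause.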
Character case                   Pattern
(letters)    (figures)           5 4 3 2 1

A            -                   00011
B            ?                   11001
C            :                   01110
D                                01001
E            3                   00001
F            !                   01101
G            &                   11010
H                                10100
I            8                   00110
J            (Bell)              01011
K            (                   01111
L            )                   10010
M            .                   11100
N            ,                   01100
O            9                   11000
P            0                   10110
Q            1                   10111
R            4                   01010
S            '                   00101
T            5                   10000
U            7                   00111
V            =                   11110
W            2                   10011
X            /                   11101
Y            6                   10101
Z            +                   10001
Shift (figures to letters)       11111
Shift (letters to figures)       11011
Space (SP)                       00100
Carriage Return                  01000
Line Feed                        00010
Blank                            00000

1 = Mark (punch hole on paper tape)
0 = Space (no hole)

Figure 4.3 Baudot code (International Alphabet IA2)
The word Baudot would thus be transmitted in Baudot code as:

order of      B      A      U      D      O      T
transmit      10011  11000  11100  10010  00011  00001

Element 1 of each character is sent first, so each pattern here reads in the reverse
order of the 5-4-3-2-1 columns of Figure 4.3.
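The transmission of the word can be reproduced with a small lookup table in Python. The table is deliberately partial: it contains only the six letters of the example and the space character, each in the order of transmission quoted in the text, and the names used are our own.

```python
# Five-element patterns in order of transmission, taken from the example in the text.
TRANSMIT_ORDER = {
    "A": "11000", "B": "10011", "D": "10010",
    "O": "00011", "T": "00001", "U": "11100",
    " ": "00100",  # the space (SP) character
}
DECODE = {bits: ch for ch, bits in TRANSMIT_ORDER.items()}


def encode(text: str) -> str:
    """Encode text as five-element groups, separated by blanks for readability."""
    return " ".join(TRANSMIT_ORDER[ch] for ch in text.upper())


def decode(stream: str) -> str:
    """Turn blank-separated five-element groups back into characters."""
    return "".join(DECODE[group] for group in stream.split())


sent = encode("Baudot")
print(sent)          # 10011 11000 11100 10010 00011 00001
print(decode(sent))  # BAUDOT
```

The blanks between groups are only a printing convenience; on the line the groups follow one another without a break, each framed by its start and stop elements.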
In passing it is also worth mentioning that the term Baud is commonly used in data
communications as the unit of rate of signal change on the line transmission medium
(the so-called Baud rate). Telex networks usually operate at a rate of 50 Baud (50 signal
changes per second) and they use the Baudot code. As 5 line state changes (from
mark-to-space, space-to-mark, space-to-space or mark-to-mark) are required to convey the
code elements of each character, this produces an information rate of 50 divided by 5,
that is to say 10 alphabetic characters per second, or nearer 7 characters per second once
the start and stop elements sent with each character are counted. This incidentally
corresponds roughly to ordinary human speech, when we are speaking or reading deliberately.
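The arithmetic can be set out as a short Python sketch. The function name is our own. The first figure follows the division by five used in the text; the second is our own assumption that all seven elements of Section 4.5 (start, five code elements, stop) are counted against the 50 Baud line rate.

```python
def characters_per_second(baud: float, elements_per_character: int) -> float:
    """Characters conveyed per second when each one occupies a fixed number of elements."""
    if elements_per_character <= 0:
        raise ValueError("a character must occupy at least one element")
    return baud / elements_per_character


print(characters_per_second(50, 5))            # 10.0, counting the five code elements only
print(round(characters_per_second(50, 7), 2))  # 7.14, counting start and stop elements too
```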
4.6 ASCII
With the advent of semiconductors and the first computers, 1963 saw the development
of a new seven-bit binary code for computer characters. This code encompassed a wider
character range, including not only the alphabetic and numeric characters but also a
range of new control characters which are needed to govern the flow of data in and
around the computers. The code, named ASCII (pronounced ‘Askey’), is now common
in computer systems. The letters stand for American (National) Standard Code for
Information Interchange. It is also known as International Alphabet number 5 (IA5)
and is defined by ITU-T recommendation T.50. Figure 4.4 illustrates it.
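The seven-bit patterns can be inspected with Python's built-in ord function, as in the sketch below. The helper name ascii_bits is our own, and the bits are printed most significant first, which is an assumption about presentation and may not match the column order of the figure.

```python
def ascii_bits(ch: str) -> str:
    """Return the seven-bit ASCII pattern of one character, most significant bit first."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} lies outside the seven-bit ASCII range")
    return format(code, "07b")


print(ascii_bits("A"))   # 1000001 (decimal 65)
print(ascii_bits("a"))   # 1100001 (decimal 97)
print(ascii_bits("\r"))  # 0001101 (carriage return, a control character)
```

Seven bits give 128 distinct patterns, which is what allows upper-case and lower-case letters, figures, punctuation and control characters to share one code without the shift characters that the five-element Baudot code requires.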