INFORMATION SOURCES: Definition and classification. What is information theory in digital communication?

**INTRODUCTION**

As discussed earlier in chapter 1, the purpose of a communication system is to carry information-bearing baseband signals from one place to another over a communication channel. In the last few chapters, we have discussed a number of modulation schemes to accomplish this purpose. But what is the meaning of the word “information”? To answer this question, we need to discuss information theory.

In fact, information theory is a branch of probability theory which may be applied to the study of communication systems. This broad mathematical discipline has made fundamental contributions not only to communications, but also to computer science, statistical physics, and probability and statistics.

Further, in the context of communications, information theory deals with the mathematical modelling and analysis of a communication system rather than with physical sources and physical channels. As a matter of fact, information theory was invented by communication scientists while they were studying the statistical structure of electronic communication equipment. When the communique is readily measurable, such as an electric current, the study of the communication system is relatively easy. But when the communique is information, the study becomes rather difficult. How can we define a measure for an amount of information? And, having defined a suitable measure, how can it be applied to improve the communication of information? Information theory provides answers to these questions. Thus, this chapter is devoted to a detailed discussion of information theory.

**9.2 WHAT IS INFORMATION?**

Before discussing the quantitative measure of information, let us review a basic concept about the amount of information in a message. Some messages produced by an information source contain more information than others. This may be best understood with the help of the following example:

Suppose you are planning a tour to a city located in an area where rainfall is very rare. To learn the weather forecast, you call the weather bureau and may receive one of the following messages:

(i) It would be hot and sunny,

(ii) There would be scattered rain,

(iii) There would be a cyclone with thunderstorm.

It may be observed that the amount of information received is clearly different for the three messages. The first message, for instance, contains very little information because the weather in a desert city in summer is expected to be hot and sunny most of the time. The second message, forecasting scattered rain, contains somewhat more information because it is not an event that occurs often. The forecast of a cyclonic storm contains even more information than the second message, because it is the rarest event in the city. Hence, on a conceptual basis, the amount of information received from the knowledge of the occurrence of an event may be related to the likelihood or probability of occurrence of that event. The message associated with the least likely event thus carries the maximum information. The amount of information in a message depends only upon the uncertainty of the underlying event rather than its actual content.

Now, let us discuss a few important concepts related to information theory in the sections to follow.
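The weather example can be made concrete with a short Python sketch. The three probabilities below are assumed values chosen only for illustration (they do not come from the text), and the measure used, log2(1/P), is the logarithmic measure defined later in Section 9.5. The helper name `information_bits` is likewise hypothetical.

```python
import math

# Assumed forecast probabilities for a desert city (illustrative values only).
forecast_probability = {
    "hot and sunny": 0.90,
    "scattered rain": 0.09,
    "cyclone with thunderstorm": 0.01,
}


def information_bits(p):
    """Information, in bits, carried by an event of probability p."""
    return math.log2(1.0 / p)


# The rarer the forecast, the larger the information it carries.
for forecast, p in forecast_probability.items():
    print(f"{forecast:<28} P = {p:.2f}   I = {information_bits(p):.3f} bits")
```

Under these assumed probabilities, the ordering of the three values matches the qualitative argument above: the least likely forecast carries the most information.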

**9.3 INFORMATION SOURCES**

**(i) Definition**

An information source may be viewed as an object which produces an event, the outcome of which is selected at random according to a probability distribution. A practical source in a communication system is a device which produces messages, and it can be either analog or discrete. In this chapter, we deal mainly with discrete sources, since analog sources can be transformed into discrete sources through the use of sampling and quantization techniques, described in chapter 10. As a matter of fact, a discrete information source is a source which has only a finite set of symbols as possible outputs. The set of source symbols is called the source alphabet, and the elements of the set are called **symbols** or **letters.**

**(ii) Classification of Information Sources**

DO YOU KNOW?

A discrete information source consists of a discrete set of letters or alphabet of symbols. In general, any message emitted by the source consists of a string or sequence of symbols.

Information sources can be classified as having memory or being memoryless. A source with memory is one for which the current symbol depends on the previous symbols. A memoryless source is one for which each symbol produced is independent of the previous symbols.

A discrete memoryless source (DMS) can be characterized by the list of the symbols, the probability assignment to these symbols, and the specification of the rate of generating these symbols by the source.
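The three items that characterize a DMS (symbol list, probability assignment, and symbol rate) map naturally onto a small data structure. The following is a minimal sketch only; the class name `DiscreteMemorylessSource`, the method `emit`, and the example alphabet and probabilities are assumptions introduced here for illustration. Memorylessness is modelled by drawing every symbol independently from the same distribution.

```python
import random
from dataclasses import dataclass


@dataclass
class DiscreteMemorylessSource:
    """Hypothetical model of a DMS: alphabet, probabilities, symbol rate."""
    symbols: list          # source alphabet
    probabilities: list    # P(x_i) for each symbol, must sum to 1
    symbol_rate: float     # symbols generated per second

    def __post_init__(self):
        if len(self.symbols) != len(self.probabilities):
            raise ValueError("one probability is required per symbol")
        if abs(sum(self.probabilities) - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to 1")

    def emit(self, duration_s, rng=None):
        """Return the symbols produced in duration_s seconds.

        Each symbol is drawn independently of the previous ones,
        which is what makes the source memoryless.
        """
        rng = rng or random.Random()
        count = int(self.symbol_rate * duration_s)
        return rng.choices(self.symbols, weights=self.probabilities, k=count)


# Assumed example: four symbols emitted at 10 symbols per second.
source = DiscreteMemorylessSource(
    symbols=["x1", "x2", "x3", "x4"],
    probabilities=[0.5, 0.25, 0.125, 0.125],
    symbol_rate=10.0,
)
message = source.emit(2.0, rng=random.Random(0))
print(message)
```

A source with memory would need extra state (for example, the previously emitted symbol) to condition each draw, which this sketch deliberately omits.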

**9.4 INFORMATION CONTENT OF A DISCRETE MEMORYLESS SOURCE (DMS)**

The amount of information contained in an event is closely related to its uncertainty. Messages describing events with a high probability of occurrence convey relatively little information. We note that if an event is certain (that is, the event occurs with probability 1), it conveys zero information. Thus, a mathematical measure of information should be a function of the probability of the outcome and should satisfy the following axioms:

(i) Information should be proportional to the uncertainty of an outcome.

(ii) Information contained in independent outcomes should add.

**9.5 INFORMATION CONTENT OF A SYMBOL (i.e., LOGARITHMIC MEASURE OF INFORMATION)**

**(i) Definition**

Let us consider a discrete memoryless source (DMS) denoted by X and having alphabet {x_{1}, x_{2}, … x_{m}}. The information content of a symbol x_{i}, denoted by *I*(x_{i}) is defined by

$$I(x_i) = \log_b \frac{1}{P(x_i)} = -\log_b P(x_i) \qquad \ldots(9.1)$$

where $P(x_i)$ is the probability of occurrence of symbol $x_i$.
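The definition in equation (9.1) translates directly into code. The sketch below is illustrative only; the function name `information_content` and its `base` parameter are names chosen here, not part of the text.

```python
import math


def information_content(p, base=2.0):
    """Return I(x_i) = log_b(1 / P(x_i)) for a symbol of probability p.

    base=2 gives the result in bits, which is the usual choice.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in the interval (0, 1]")
    return math.log(1.0 / p, base)


print(information_content(0.5))    # a symbol with probability 1/2
print(information_content(0.25))   # a symbol with probability 1/4
print(information_content(1.0))    # a certain symbol
```

The probability 0 is rejected because the measure grows without bound as the probability tends to zero.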

**(ii) Properties of I(x_{i})**

The information content of a symbol $x_i$, denoted by $I(x_i)$, satisfies the following properties:

$$I(x_i) = 0 \quad \text{for } P(x_i) = 1 \qquad \ldots(9.2)$$

$$I(x_i) \ge 0 \qquad \ldots(9.3)$$

$$I(x_i) > I(x_j) \quad \text{if } P(x_i) < P(x_j) \qquad \ldots(9.4)$$

$$I(x_i, x_j) = I(x_i) + I(x_j) \quad \text{if } x_i \text{ and } x_j \text{ are independent} \qquad \ldots(9.5)$$

**(iii) Unit of I(x_{i})**

The unit of $I(x_i)$ is the bit (binary unit) if $b = 2$, the Hartley or decit if $b = 10$, and the nat (natural unit) if $b = e$. It is standard to use $b = 2$. Here the unit bit (abbreviated “b”) is a measure of information content and is not to be confused with the term ‘bit’ meaning binary digit. Conversion between these units can be achieved using the following relationship:

$$\log_2 a = \frac{\ln a}{\ln 2} = \frac{\log_{10} a}{\log_{10} 2} \qquad \ldots(9.6)$$
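The change-of-base relationship in (9.6) implies a fixed conversion factor between any two units of information. The following sketch shows one way to apply it; the table `BASES` and the function `convert_information` are hypothetical names introduced for this illustration.

```python
import math

# Logarithm base associated with each unit of information.
BASES = {"bit": 2.0, "nat": math.e, "hartley": 10.0}


def convert_information(value, from_unit, to_unit):
    """Convert an amount of information between bits, nats and hartleys.

    Since log_b(a) = ln(a) / ln(b), a quantity measured with base b1
    is rescaled to base b2 by the factor ln(b1) / ln(b2).
    """
    return value * math.log(BASES[from_unit]) / math.log(BASES[to_unit])


one_bit_in_nats = convert_information(1.0, "bit", "nat")
one_hartley_in_bits = convert_information(1.0, "hartley", "bit")
print(f"1 bit     = {one_bit_in_nats:.4f} nat")
print(f"1 hartley = {one_hartley_in_bits:.4f} bit")
```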

**(iv) A Few Important Points about I(x_{i})**

According to our concept, the information content or amount of information of a symbol $x_i$, denoted by $I(x_i)$, must be inversely related to $P(x_i)$. Also, from our intuitive concept, $I(x_i)$ must satisfy the following requirements:

(i) $I(x_i)$ must approach 0 as $P(x_i)$ approaches 1. For example, consider the message ‘The sun will rise in the east’. This message does not contain any information since the sun will rise in the east with probability 1.

(ii) The information content $I(x_i)$ must be a non-negative quantity since each message contains some information. In the worst case, $I(x_i)$ can be equal to zero.

(iii) The information content of a message having higher probability of occurrence is less than the information content of a message having lower probability.

Now let us discuss a few numerical examples to illustrate the above concepts.

**EXAMPLE 9.1. A source produces one of four possible symbols during each interval, having probabilities $P(x_1) = \frac{1}{2}$, $P(x_2) = \frac{1}{4}$, $P(x_3) = P(x_4) = \frac{1}{8}$. Obtain the information content of each of these symbols.**

**Solution:** We know that the information content $I(x_i)$ of a symbol $x_i$ is given by

$$I(x_i) = \log_2 \frac{1}{P(x_i)}$$

Thus, we can write

$$I(x_1) = \log_2 \frac{1}{1/2} = \log_2 2 = 1 \text{ bit}$$

$$I(x_2) = \log_2 \frac{1}{1/4} = \log_2 2^2 = 2 \text{ bits}$$

$$I(x_3) = \log_2 \frac{1}{1/8} = \log_2 2^3 = 3 \text{ bits}$$

$$I(x_4) = \log_2 \frac{1}{1/8} = 3 \text{ bits}$$

**Ans.**
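The arithmetic of Example 9.1 can be checked numerically. This sketch takes the symbol probabilities as 1/2, 1/4, 1/8 and 1/8, which are the values implied by the worked answers of 1, 2, 3 and 3 bits; the variable names are chosen here for illustration.

```python
import math
from fractions import Fraction

# Symbol probabilities implied by the worked answers of Example 9.1.
probabilities = {
    "x1": Fraction(1, 2),
    "x2": Fraction(1, 4),
    "x3": Fraction(1, 8),
    "x4": Fraction(1, 8),
}

# A valid probability assignment must sum to exactly 1.
total_probability = sum(probabilities.values())

# I(x_i) = log2(1 / P(x_i)), evaluated for every symbol.
information = {
    symbol: math.log2(float(1 / p)) for symbol, p in probabilities.items()
}

for symbol, bits in information.items():
    print(f"I({symbol}) = {bits:g} bits")
```

Using `Fraction` keeps the probabilities exact, so the check that they sum to 1 does not depend on floating-point rounding.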

**EXAMPLE 9.2. Calculate the amount of information if it is given that $P(x_i) = \frac{1}{4}$.**

**Solution:** We know that the amount of information $I(x_i)$ of a discrete symbol $x_i$ is given by

$$I(x_i) = \log_2 \frac{1}{P(x_i)}$$

The above expression may be written as under:

$$I(x_i) = \frac{\log_{10} \left[ 1/P(x_i) \right]}{\log_{10} 2}$$

Substituting the given value of $P(x_i)$ in the above expression, we obtain

$$I(x_i) = \frac{\log_{10} 4}{\log_{10} 2} = 2 \text{ bits}$$

**Ans.**

**EXAMPLE 9.3. Calculate the amount of information if binary digits (binits) occur with equal likelihood in a binary PCM system.**

**Solution:** We know that in binary PCM there are only two binary levels, i.e., 1 or 0. Since these two binary levels occur with equal likelihood, their probabilities of occurrence will be

$$P(x_1) \ (\text{'0' level}) = P(x_2) \ (\text{'1' level}) = \frac{1}{2}$$

Therefore, the amount of information content will be given as

$$I(x_1) = \log_2 \frac{1}{1/2} \quad \text{and} \quad I(x_2) = \log_2 \frac{1}{1/2}$$

or

$$I(x_1) = \log_2 2 \quad \text{and} \quad I(x_2) = \log_2 2$$

or

$$I(x_1) = I(x_2) = \frac{\log_{10} 2}{\log_{10} 2} = 1 \text{ bit}$$

Hence, the correct identification of a binary digit (binit) in binary PCM carries 1 bit of information.

**Ans.**

**EXAMPLE 9.4. In a binary PCM, if ‘0’ occurs with probability $\frac{1}{4}$ and ‘1’ occurs with probability $\frac{3}{4}$, then calculate the amount of information carried by each binit.**

**Solution:** Here, it is given that the binit ‘0’ has $P(x_1) = \frac{1}{4}$ and the binit ‘1’ has $P(x_2) = \frac{3}{4}$.

Then, the amount of information is given as

$$I(x_i) = \log_2 \frac{1}{P(x_i)}$$

With $P(x_1) = \frac{1}{4}$, we have

$$I(x_1) = \log_2 4 = \frac{\log_{10} 4}{\log_{10} 2} = 2 \text{ bits}$$

and with $P(x_2) = \frac{3}{4}$, we have

$$I(x_2) = \log_2 \frac{4}{3} = \frac{\log_{10} (4/3)}{\log_{10} 2} = 0.415 \text{ bits}$$

**NOTE:** Here, it may be observed that binit ‘0’ has probability $\frac{1}{4}$ and carries 2 bits of information, whereas binit ‘1’ has probability $\frac{3}{4}$ and carries 0.415 bits of information. This reveals the fact that if the probability of occurrence is less, the information carried is more, and vice versa.

**EXAMPLE 9.5. If there are $M$ equally likely and independent symbols, then prove that the amount of information carried by each symbol will be $I(x_i) = N$ bits, where $M = 2^N$ and $N$ is an integer.** (Important)

DO YOU KNOW?

If the units of information are taken to be binary digits or bits, then the average information rate represents the minimum average number of bits per second needed to represent the output of the source as a binary sequence.

**Solution:** Since it is given that all the $M$ symbols are equally likely and independent, the probability of occurrence of each symbol must be $\frac{1}{M}$.

We know that the amount of information $I(x_i)$ of a discrete symbol $x_i$ is given by

$$I(x_i) = \log_2 \frac{1}{P(x_i)} \qquad \ldots(i)$$

Here, the probability of each message is $P(x_i) = \frac{1}{M}$.

Hence, equation (i) takes the following form:

$$I(x_i) = \log_2 M \qquad \ldots(ii)$$

Further, it is given that $M = 2^N$. Substituting this value of $M$ in equation (ii), we obtain

$$I(x_i) = \log_2 2^N = N \log_2 2 = N \text{ bits}$$

**NOTE:** Hence, the amount of information carried by each symbol will be $N$ bits. We know that $M = 2^N$. This means that there are $N$ binary digits (binits) in each symbol. This indicates that when the symbols are equally likely and coded with an equal number of binary digits (binits), the information carried by each symbol (measured in bits) is numerically the same as the number of binits used for each symbol.
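The result of Example 9.5 can be spot-checked for a range of alphabet sizes. This is a numerical illustration rather than a proof, and the function name `information_per_symbol` is an assumption made here.

```python
import math


def information_per_symbol(m_symbols):
    """Bits carried by each of m_symbols equally likely symbols."""
    probability = 1.0 / m_symbols
    return math.log2(1.0 / probability)


# For M = 2**N the information per symbol should equal N bits.
results = []
for n_binits in range(1, 11):
    m_symbols = 2 ** n_binits
    results.append((n_binits, m_symbols, information_per_symbol(m_symbols)))

for n_binits, m_symbols, bits in results:
    print(f"N = {n_binits:2d}   M = {m_symbols:4d}   I = {bits:g} bits")
```

When $M$ is not a power of two, the same function returns a non-integer number of bits, which is why the statement restricts $N$ to an integer.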

**EXAMPLE 9.6. Prove the statement stated as under:**

**“If a receiver knows the message being transmitted, the amount of information carried will be zero.”**

**Solution:** Here it is stated that the receiver “knows” the message. This means that only one message is transmitted. Thus, the probability of occurrence of this message will be $P(x_i) = 1$, because there is only one message and its occurrence is certain (the probability of a certain event is 1). The amount of information carried by this type of message will be

$$I(x_i) = \log_2 \frac{1}{P(x_i)} = \log_2 1 \qquad (\text{substituting } P(x_i) = 1)$$

or

$$I(x_i) = 0 \text{ bits}$$

This proves the statement that if the receiver knows the message, the amount of information carried will be zero.

Also, as $P(x_i)$ is decreased from 1 to 0, $I(x_i)$ increases monotonically from 0 to infinity. This shows that the amount of information conveyed is greater when the receiver correctly identifies less likely messages.

**EXAMPLE 9.7. If $I(x_1)$ is the information carried by symbol $x_1$ and $I(x_2)$ is the information carried by symbol $x_2$, then prove that the amount of information carried compositely due to $x_1$ and $x_2$ is $I(x_1, x_2) = I(x_1) + I(x_2)$.**

**Solution:** We know that the amount of information is expressed as

$$I(x_i) = \log_2 \frac{1}{P(x_i)}$$

The individual amounts of information carried by symbols $x_1$ and $x_2$ are

$$I(x_1) = \log_2 \frac{1}{P(x_1)} \qquad \ldots(i)$$

$$I(x_2) = \log_2 \frac{1}{P(x_2)} \qquad \ldots(ii)$$

Here $P(x_1)$ is the probability of symbol $x_1$ and $P(x_2)$ is the probability of symbol $x_2$. Since messages $x_1$ and $x_2$ are independent, the probability of the composite message is $P(x_1) P(x_2)$. Therefore, the information carried compositely due to symbols $x_1$ and $x_2$ will be

$$I(x_1, x_2) = \log_2 \frac{1}{P(x_1) P(x_2)} = \log_2 \left[ \frac{1}{P(x_1)} \cdot \frac{1}{P(x_2)} \right]$$

or

$$I(x_1, x_2) = \log_2 \frac{1}{P(x_1)} + \log_2 \frac{1}{P(x_2)}$$

Using equations (i) and (ii), we can write the RHS of the above equation as

$$I(x_1, x_2) = I(x_1) + I(x_2)$$

**Hence Proved.**
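The additivity result of Example 9.7 can be illustrated numerically. The two probabilities below are assumed values chosen for illustration, and independence is modelled by taking the joint probability as the product of the individual probabilities, exactly as in the proof.

```python
import math


def information_bits(p):
    """Information, in bits, carried by an event of probability p."""
    return math.log2(1.0 / p)


# Assumed probabilities of two independent symbols (illustrative values).
p_x1 = 0.25
p_x2 = 0.125

# For independent symbols the joint probability is the product.
p_joint = p_x1 * p_x2

joint_information = information_bits(p_joint)
sum_of_information = information_bits(p_x1) + information_bits(p_x2)

print(f"I(x1, x2)     = {joint_information:g} bits")
print(f"I(x1) + I(x2) = {sum_of_information:g} bits")
```

If the symbols were not independent, the joint probability would no longer be the product of the individual probabilities, and the two printed values would in general differ.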