Channel Capacity¶
Channel capacity represents the fundamental limit of a communication channel, defining the maximum rate at which information can be transmitted with an arbitrarily small probability of error.
Channel Capacity of DMC¶
We consider a discrete memoryless channel (DMC), such as the binary symmetric channel (BSC) with crossover probability $p$ (the probability of a bit being flipped).
The capacity is closely tied to the channel’s error characteristics.
Specifically, for the BSC, the capacity is

$$ C = 1 - H_b(p), $$

where

$$ H_b(p) = -p\log_2 p - (1-p)\log_2(1-p) $$

is the binary entropy function.
Recall that $H_b(p)$ measures the uncertainty of a binary random variable that takes one of its two values with probability $p$.
Consequently, $C$ quantifies the maximum rate (in bits per channel use) at which reliable communication is possible over the BSC, capturing the reduction in uncertainty about the input given the output.
General Definition of Channel Capacity¶
More generally, the capacity of any channel is defined as

$$ C = \max_{p(x)} I(X;Y), $$

where $I(X;Y)$ is the mutual information between the input random variable $X$ (with alphabet $\mathcal{X}$) and the output random variable $Y$ (with alphabet $\mathcal{Y}$).
Formally,

$$ I(X;Y) = \sum_{x \in \mathcal{X}}\sum_{y \in \mathcal{Y}} p(x)\,p(y \mid x)\log_2\frac{p(y \mid x)}{p(y)}, $$

where $p(y) = \sum_{x \in \mathcal{X}} p(x)\,p(y \mid x)$, $p(x)$ is the input probability mass function (PMF), and $p(y \mid x)$ is the conditional probability of $Y = y$ given $X = x$.
The maximization is performed over all possible input PMFs $p(x)$, subject to

$$ p(x) \ge 0 \quad \text{for all } x \in \mathcal{X}, \qquad \sum_{x \in \mathcal{X}} p(x) = 1, $$

ensuring $p(x)$ is a valid PMF.
The resulting capacity $C$, in bits per channel use (assuming base-2 logarithms), is the highest achievable rate at which the error probability can be made arbitrarily small through appropriate coding.
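As a quick numerical illustration (a minimal sketch, not taken from the referenced text), the following Python snippet evaluates $I(X;Y)$ for an arbitrary DMC transition matrix and brute-forces the maximization over binary input PMFs; the function names and the example crossover probability are illustrative choices.

```python
import numpy as np

def mutual_information(p_x, P_y_given_x):
    """I(X;Y) in bits for input PMF p_x and transition matrix P_y_given_x[x, y]."""
    p_xy = p_x[:, None] * P_y_given_x          # joint PMF p(x, y)
    p_y = p_xy.sum(axis=0)                     # output PMF p(y)
    mask = p_xy > 0                            # skip 0 log 0 terms
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask]))

# Example: BSC with crossover probability p = 0.1.
p = 0.1
P = np.array([[1 - p, p],
              [p, 1 - p]])

# Brute-force the maximization over binary input PMFs [q, 1-q].
qs = np.linspace(1e-6, 1 - 1e-6, 1001)
rates = [mutual_information(np.array([q, 1 - q]), P) for q in qs]
C = max(rates)
print(f"C ≈ {C:.4f} bits/use (theory: {1 + p*np.log2(p) + (1-p)*np.log2(1-p):.4f})")
```

The maximum is attained at the uniform input, as expected for this symmetric channel.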
Units of Channel Capacity¶
Bits per transmission (bits/channel use) if $\log_2$ is used. This is standard in digital communication.
Nats per transmission if the natural logarithm ($\ln$, base $e$) is used, where $1$ nat $= \log_2 e \approx 1.443$ bits.
If the channel transmits one symbol every $\tau_s$ seconds, then the capacity in bits per second or nats per second is $C/\tau_s$.
This expresses the rate on a time-based rather than a per-use basis, which is important for continuous transmission.
Shannon’s Noisy Channel Coding Theorem¶
Shannon’s Second Theorem (1948), also known as the Noisy Channel Coding Theorem, underpins the concept of channel capacity:
Noisy Channel Coding Theorem¶
Reliable communication over a DMC is possible if the transmission rate $R$ (in bits per channel use) is less than the capacity $C$.
Conversely, if $R > C$, reliable communication is impossible, because the error probability cannot be made arbitrarily small regardless of the coding scheme.
It means that there exists a coding scheme (often with sufficiently long block codes) that achieves negligible error for $R < C$, but no such scheme exists for $R > C$.
This result establishes $C$ as the ultimate limit for reliable communication and provides a theoretical benchmark against which practical systems are measured.
It also guides the design of error-correcting codes and modulation schemes, ensuring transmission rates do not exceed the channel’s fundamental capabilities.
Example: Capacity of the BSC¶
This example is based on [Proakis’s book, Example 6.5-1].
For the BSC with input and output alphabets $\{0, 1\}$ and crossover probability $p$ (the probability of a bit being flipped), the capacity is maximized by using a uniform input distribution, i.e., $P(X=0) = P(X=1) = \tfrac{1}{2}$.
Due to the channel's symmetry:
Output entropy:

$$ H(Y) = 1 \text{ bit}. $$

(Since $P(Y=0) = P(Y=1) = \tfrac{1}{2}$.)
Conditional entropy:

$$ H(Y \mid X) = H_b(p), $$

since $H(Y \mid X=0) = H(Y \mid X=1) = H_b(p)$ by symmetry.
Hence,

$$ C = I(X;Y) = H(Y) - H(Y \mid X) = 1 - H_b(p). $$

Expanding $H_b(p)$ as $-p\log_2 p - (1-p)\log_2(1-p)$, we get

$$ C = 1 + p\log_2 p + (1-p)\log_2(1-p). $$
Special Cases
$p = 0$ (no errors): $C = 1$ bit/channel use (the maximum for a binary channel).
$p = \tfrac{1}{2}$ (completely random output): $C = 0$, meaning no information is transmitted (the input and output are independent).
For $p > \tfrac{1}{2}$, one can simply relabel the outputs (interpreting 0 as 1 and vice versa), reducing the effective crossover probability to $1 - p$.
This shows $C(p)$ is symmetric about $p = \tfrac{1}{2}$.
When plotted against the signal-to-noise ratio (SNR) per bit, typically denoted $E_b/N_0$, the capacity increases monotonically, indicating better reliability as SNR rises.
In practical analyses (e.g., BPSK modulation with hard decisions), the crossover probability is related to the SNR via the Gaussian tail function, $p = Q\!\left(\sqrt{2E_b/N_0}\right)$.
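The following Python sketch combines these two relations, computing the hard-decision BSC capacity $1 - H_b(p)$ with $p = Q(\sqrt{2E_b/N_0})$; it assumes coherent BPSK over AWGN with hard decisions, and the helper names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def Hb(p):
    """Binary entropy function in bits (with Hb(0) = Hb(1) = 0)."""
    p = np.clip(p, 1e-300, 1 - 1e-16)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_capacity_from_ebn0_db(ebn0_db):
    """C = 1 - Hb(p) with p = Q(sqrt(2 Eb/N0)) for hard-decision BPSK over AWGN."""
    ebn0 = 10 ** (ebn0_db / 10)
    p = norm.sf(np.sqrt(2 * ebn0))   # Q-function = survival function of N(0, 1)
    return 1 - Hb(p)

for snr_db in [0, 2, 4, 6, 8, 10]:
    print(f"Eb/N0 = {snr_db:2d} dB -> C = {bsc_capacity_from_ebn0_db(snr_db):.4f} bits/use")
```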
Capacity of the Discrete-Time Binary-Input AWGN Channel¶
We consider a binary-input additive white Gaussian noise (AWGN) channel with inputs $X \in \{+A, -A\}$ (e.g., BPSK signaling) and continuous output $Y = X + N$, where $N \sim \mathcal{N}(0, \sigma^2)$.
The conditional probability density function (PDF) is

$$ p(y \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(y-x)^2}{2\sigma^2}\right), \qquad x \in \{+A, -A\}. $$

By symmetry, the capacity is again maximized by a uniform input PMF: $P(X=+A) = P(X=-A) = \tfrac{1}{2}$.
Then, the channel capacity is

$$ C = I(X;Y)\,\Big|_{P(X=+A)=P(X=-A)=\frac{1}{2}}. $$

However, computing $I(X;Y)$ directly from the continuous output is more conveniently done using

$$ I(X;Y) = \sum_{x \in \{+A,-A\}} P(x)\int_{-\infty}^{\infty} p(y \mid x)\log_2\frac{p(y \mid x)}{p(y)}\,dy, $$

with $P(x) = \tfrac{1}{2}$.
Thus,

$$ C = \frac{1}{2}\sum_{x \in \{+A,-A\}}\int_{-\infty}^{\infty} p(y \mid x)\log_2\frac{p(y \mid x)}{p(y)}\,dy, $$

where

$$ p(y) = \frac{1}{2}\,p(y \mid +A) + \frac{1}{2}\,p(y \mid -A) $$

is the overall output PDF.
Simplified Form and Behavior¶
While the integral does not have a simple closed-form solution, it can be written as

$$ C = 1 - \int_{-\infty}^{\infty}\phi(u)\,\log_2\!\left(1 + e^{-2\gamma(\gamma+u)}\right)du, $$

where

$$ \phi(u) = \frac{1}{\sqrt{2\pi}}\,e^{-u^2/2} $$

is the standard Gaussian PDF and $\gamma = A/\sigma$ is the signal-to-noise amplitude ratio.
Due to symmetry ($p(y \mid -A)$ relates closely to $p(-y \mid +A)$), the capacity depends only on $\gamma$, often expressed as $E_b/N_0$ (energy per bit to noise power spectral density ratio) in communication systems.
As $\gamma$ increases, $C$ grows from 0 (at very low SNR) to 1 bit per symbol (at high SNR), reflecting the channel's ability to approach error-free operation.
For example, for target rates $C = \tfrac{1}{2}$ or $C = \tfrac{1}{3}$ bits per channel use, specific $E_b/N_0$ thresholds (e.g., 0.188 dB and -0.496 dB, respectively) indicate the minimum required SNR; these values are typically found via numerical methods or simulation.
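A possible numerical check (assuming the integral form given above, with noise variance $\sigma^2 = N_0/2$ and $\gamma^2 = 2C\,E_b/N_0$ for a rate-$C$ BPSK system) is sketched below; the function names are illustrative, and the printed thresholds should land near the quoted values.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def bi_awgn_capacity(gamma):
    """Capacity (bits/use) of the binary-input AWGN channel, gamma = A / sigma."""
    integrand = lambda u: (np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)
                           * np.logaddexp(0.0, -2 * gamma * (gamma + u)) / np.log(2))
    loss, _ = quad(integrand, -np.inf, np.inf)
    return 1 - loss

def min_ebn0_db(target_rate):
    """Smallest Eb/N0 (dB) at which the binary-input AWGN capacity reaches target_rate."""
    # With sigma^2 = N0/2 and A^2 = Es = C * Eb, gamma^2 = 2 C Eb / N0.
    g = brentq(lambda gamma: bi_awgn_capacity(gamma) - target_rate, 1e-3, 10)
    return 10 * np.log10(g**2 / (2 * target_rate))

print(f"C = 1/2 -> Eb/N0 >= {min_ebn0_db(0.5):+.3f} dB")    # approx +0.19 dB
print(f"C = 1/3 -> Eb/N0 >= {min_ebn0_db(1/3):+.3f} dB")    # approx -0.50 dB
```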
Capacity of Symmetric Channels¶
In certain channel models, such as the Binary Symmetric Channel (BSC) and the binary-input AWGN channel, the channel transition probabilities exhibit symmetry.
A symmetric channel is characterized by a transition probability matrix $\mathbf{P}$ (with entries $p(y \mid x)$) that satisfies:
Each row of matrix $\mathbf{P}$ is a permutation of any other row.
Each column of matrix $\mathbf{P}$ is a permutation of any other column.
As an example, the BSC with crossover probability $p$ has transition matrix

$$ \mathbf{P} = \begin{bmatrix} 1-p & p \\ p & 1-p \end{bmatrix}. $$

Row 1, $[\,1-p \;\; p\,]$, is a permutation of Row 2, $[\,p \;\; 1-p\,]$, and similarly for the columns.
This symmetry implies that the channel treats all inputs equivalently up to a possible relabeling of outputs.
Capacity of Symmetric Channels Under a Uniform Input¶
For symmetric channels, the capacity-achieving input distribution is uniform, i.e., $p(x) = \dfrac{1}{|\mathcal{X}|}$ for each $x \in \mathcal{X}$.
The resulting capacity is

$$ C = \log_2 |\mathcal{Y}| - H(\mathbf{r}), $$

where $|\mathcal{Y}|$ is the size of the output alphabet, and $\mathbf{r}$ is any row of the transition matrix $\mathbf{P}$.
Because each row is just a permutation of the others, $\mathbf{r}$ has the same entropy regardless of which row is chosen:

$$ H(\mathbf{r}) = -\sum_{y \in \mathcal{Y}} p(y \mid x)\log_2 p(y \mid x) \quad \text{(identical for every } x\text{)}. $$

Thus, for the BSC ($|\mathcal{Y}| = 2$) with crossover probability $p$, each row is $[\,1-p \;\; p\,]$ or $[\,p \;\; 1-p\,]$. Since $H(\mathbf{r}) = H_b(p)$, we get

$$ C = \log_2 2 - H_b(p) = 1 - H_b(p), $$

confirming the known BSC capacity.
It is noted that output symmetry (as in binary-input AWGN) is sufficient to guarantee a uniform capacity-achieving input, even if the channel does not have a finite transition matrix.
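As a small illustration of the formula $C = \log_2|\mathcal{Y}| - H(\mathbf{r})$, the following sketch computes the capacity of a symmetric DMC directly from its transition matrix; the function name is illustrative.

```python
import numpy as np

def symmetric_channel_capacity(P):
    """C = log2|Y| - H(row) for a symmetric DMC with transition matrix P[x, y]."""
    row = P[0]                                  # any row has the same entropy
    row = row[row > 0]                          # skip 0 log 0 terms
    H_row = -np.sum(row * np.log2(row))
    return np.log2(P.shape[1]) - H_row

p = 0.1
P_bsc = np.array([[1 - p, p],
                  [p, 1 - p]])
print(symmetric_channel_capacity(P_bsc))        # 1 - Hb(0.1) ≈ 0.531 bits/use
```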
Conditions for Capacity-Achieving Distributions¶
For a general discrete memoryless channel (DMC), the capacity

$$ C = \max_{p(x)} I(X;Y) $$

is achieved by an input distribution $p^*(x)$ that satisfies the following (sometimes referred to as the Kuhn–Tucker conditions in information theory):

$$ I(x;Y) = C \quad \text{for all } x \text{ with } p^*(x) > 0, $$

$$ I(x;Y) \le C \quad \text{for all } x \text{ with } p^*(x) = 0. $$

Here, the pointwise (per-input) mutual information is

$$ I(x;Y) = \sum_{y \in \mathcal{Y}} p(y \mid x)\log_2\frac{p(y \mid x)}{p(y)}, $$

and

$$ p(y) = \sum_{x \in \mathcal{X}} p^*(x)\,p(y \mid x). $$
These conditions ensure that:
For any input that is actually used (i.e., $p^*(x) > 0$), the information rate $I(x;Y)$ must equal the channel capacity $C$.
If an input is not used (i.e., $p^*(x) = 0$), its information rate cannot exceed $C$.
In symmetric channels, the uniform distribution satisfies these conditions because $I(x;Y)$ is the same for all $x$.
For non-symmetric channels, finding the optimal distribution often requires iterative methods (e.g., the Blahut–Arimoto algorithm) if uniform inputs do not satisfy the conditions.
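A compact sketch of the Blahut–Arimoto iteration is given below; it follows the standard multiplicative update and stops when the upper and lower bounds on $C$ agree to within a tolerance. The example transition matrix is hypothetical.

```python
import numpy as np

def blahut_arimoto(P, tol=1e-9, max_iter=10_000):
    """Capacity (bits/use) and optimal input PMF of a DMC with transition matrix P[x, y]."""
    n_in = P.shape[0]
    p = np.full(n_in, 1.0 / n_in)                # start from the uniform input PMF
    for _ in range(max_iter):
        q = p @ P                                # output PMF q(y)
        # D(x) = sum_y P(y|x) log2( P(y|x) / q(y) ), with 0 log 0 = 0
        ratio = np.where(P > 0, P / q, 1.0)
        D = np.sum(P * np.log2(ratio), axis=1)
        c = 2.0 ** D
        I_lower = np.log2(p @ c)                 # lower bound on C
        I_upper = np.log2(c.max())               # upper bound on C
        p = p * c / (p @ c)                      # multiplicative update of the input PMF
        if I_upper - I_lower < tol:
            break
    return I_lower, p

# Example: a non-symmetric binary-input, ternary-output channel (hypothetical numbers).
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.1, 0.8]])
C, p_opt = blahut_arimoto(P)
print(f"C ≈ {C:.5f} bits/use with input PMF {np.round(p_opt, 4)}")
```

For symmetric channels the iteration terminates essentially immediately at the uniform input, consistent with the conditions above.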
Capacity of the Discrete-Time AWGN Channel (Power-Constrained)¶
Consider a discrete-time AWGN channel:

$$ Y_i = X_i + N_i, $$

where $N_i$ are i.i.d. Gaussian random variables with mean 0 and variance $\sigma^2$.
The inputs must satisfy the average power constraint

$$ \frac{1}{n}\sum_{i=1}^{n}\mathbb{E}\!\left[X_i^2\right] \le P. $$
Sphere Packing Argument¶
For $n$ uses of the channel, the transmitted vector $\mathbf{x} = (x_1, \dots, x_n)$ must lie within an $n$-dimensional sphere of radius $\sqrt{nP}$.
Because the noise vector $\mathbf{n}$ is Gaussian with per-component variance $\sigma^2$, the received vector

$$ \mathbf{y} = \mathbf{x} + \mathbf{n} $$

lies (with high probability) in a sphere of radius $\sqrt{n(P+\sigma^2)}$, while each codeword is surrounded by a noise sphere of radius $\sqrt{n\sigma^2}$.
Approximating the maximum number of distinguishable codewords (messages) by the ratio of volumes leads to

$$ M \approx \frac{\left(\sqrt{n(P+\sigma^2)}\right)^{n}}{\left(\sqrt{n\sigma^2}\right)^{n}} = \left(1 + \frac{P}{\sigma^2}\right)^{n/2}. $$

Hence, the achievable rate (in bits per channel use) is

$$ R = \frac{1}{n}\log_2 M \approx \frac{1}{2}\log_2\!\left(1 + \frac{P}{\sigma^2}\right). $$

This is precisely the channel capacity

$$ C = \frac{1}{2}\log_2\!\left(1 + \frac{P}{\sigma^2}\right) $$

for an average power constraint $P$.
Achievability follows by choosing $X$ to be Gaussian, $X \sim \mathcal{N}(0, P)$, which maximizes $h(Y)$ and hence $I(X;Y)$.
Formally,

$$ C = \max_{p(x):\,\mathbb{E}[X^2]\le P} I(X;Y) = \max_{p(x)}\bigl[h(Y) - h(N)\bigr] = \frac{1}{2}\log_2\!\bigl(2\pi e(P+\sigma^2)\bigr) - \frac{1}{2}\log_2\!\bigl(2\pi e\sigma^2\bigr) = \frac{1}{2}\log_2\!\left(1 + \frac{P}{\sigma^2}\right), $$

where $h(\cdot)$ denotes differential entropy, and the Gaussian input maximizes $h(Y)$ subject to the power constraint.
It is noted that the capacity is in bits per channel use, and can be converted to bits per second by multiplying with the sampling rate.
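As a sanity check (a sketch under the stated model, with assumed values for $P$ and $\sigma^2$), one can estimate $I(X;Y)$ by Monte Carlo for a Gaussian input and compare it with $\tfrac{1}{2}\log_2(1 + P/\sigma^2)$:

```python
import numpy as np

rng = np.random.default_rng(0)
P, sigma2, n = 1.0, 0.5, 1_000_000               # signal power, noise variance, samples

x = rng.normal(0.0, np.sqrt(P), n)               # capacity-achieving Gaussian input
y = x + rng.normal(0.0, np.sqrt(sigma2), n)      # AWGN channel output

# I(X;Y) = E[ log2 p(y|x)/p(y) ], with p(y|x) = N(x, sigma2) and p(y) = N(0, P + sigma2)
log_p_y_given_x = -0.5 * np.log(2 * np.pi * sigma2) - (y - x) ** 2 / (2 * sigma2)
log_p_y = -0.5 * np.log(2 * np.pi * (P + sigma2)) - y ** 2 / (2 * (P + sigma2))
I_mc = np.mean(log_p_y_given_x - log_p_y) / np.log(2)

print(f"Monte Carlo I(X;Y) ≈ {I_mc:.4f} bits/use")
print(f"Theory 0.5*log2(1 + P/sigma2) = {0.5 * np.log2(1 + P / sigma2):.4f} bits/use")
```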
Capacity of the Band-Limited Waveform AWGN Channel (Power-Constrained)¶
Now consider a continuous-time AWGN channel with:
Bandwidth $W$ (Hz).
Input power constraint $P$.
Noise power spectral density $\dfrac{N_0}{2}$ (two-sided).
It can be shown (via sampling or dimensionality arguments) that this setup is equivalent to $2W$ discrete-time uses per second of an AWGN channel with:
Power constraint per use: $\dfrac{P}{2W}$.
Noise variance: $\dfrac{N_0}{2}$.
Per-Use Capacity¶
The discrete-time channel capacity per use is

$$ C_{\text{use}} = \frac{1}{2}\log_2\!\left(1 + \frac{P/(2W)}{N_0/2}\right) = \frac{1}{2}\log_2\!\left(1 + \frac{P}{N_0 W}\right) \quad \text{bits/use}. $$

Because there are $2W$ uses per second, the total channel capacity in bits per second is

$$ C = 2W\,C_{\text{use}} = W\log_2\!\left(1 + \frac{P}{N_0 W}\right) \quad \text{b/s}. $$

This is Shannon's well-known capacity formula for a band-limited AWGN channel.
The term $\dfrac{P}{N_0 W}$ is the signal-to-noise ratio (SNR).
Capacity Behavior¶
Increasing power $P$: As $P \to \infty$, capacity $C \to \infty$.
However, the increase is logarithmic ($C \approx W\log_2\frac{P}{N_0 W}$ for large $P$), meaning there are diminishing returns from simply increasing power.
Increasing bandwidth $W$:
Positive effect: More uses per second ($2W$) increase capacity.
Negative effect: Increasing $W$ lowers the SNR $\frac{P}{N_0 W}$, because the in-band noise power $N_0 W$ is proportional to $W$.
For fixed $P$, letting $W \to \infty$:
Using $\ln(1+x) \approx x$ for small $x$, we get

$$ \log_2\!\left(1 + \frac{P}{N_0 W}\right) \approx \frac{P}{N_0 W}\log_2 e. $$

Hence,

$$ C_\infty = \lim_{W\to\infty} W\log_2\!\left(1 + \frac{P}{N_0 W}\right) = \frac{P}{N_0}\log_2 e \approx 1.44\,\frac{P}{N_0}. $$

So, infinite bandwidth yields a finite limiting capacity, determined by $\dfrac{P}{N_0}$.
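The following short sketch (with assumed values for $P$ and $N_0$) evaluates $C = W\log_2(1 + P/(N_0 W))$ for increasing bandwidths and shows the approach to the finite limit $C_\infty = (P/N_0)\log_2 e$:

```python
import numpy as np

P, N0 = 1.0, 1e-3                                # watts, watts/Hz (assumed values)
C_inf = (P / N0) * np.log2(np.e)                 # infinite-bandwidth limit ≈ 1.44 P/N0

for W in [1e2, 1e3, 1e4, 1e5, 1e6]:              # bandwidth in Hz
    C = W * np.log2(1 + P / (N0 * W))
    print(f"W = {W:9.0f} Hz -> C = {C:8.1f} b/s  ({100 * C / C_inf:5.1f}% of C_inf)")
print(f"C_inf = {C_inf:.1f} b/s")
```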
Bandwidth Efficiency of Band-Limited Waveform AWGN Channel with Power Constraint¶
The bandwidth efficiency of a communication system quantifies how effectively the available bandwidth is utilized to transmit data.
It is defined as the ratio of the bit rate $R$ (in bits per second, b/s) to the bandwidth $W$ (in Hertz, Hz):

$$ r = \frac{R}{W} \quad \text{(b/s/Hz)}. $$

Often denoted by $r$, this metric indicates the number of bits transmitted per second per unit of bandwidth and thus provides insight into the spectral efficiency of a signaling scheme.
To investigate the trade-off between bandwidth efficiency and power efficiency, we begin with the capacity of a band-limited AWGN channel under power constraint $P$ and noise power spectral density $N_0/2$:

$$ C = W\log_2\!\left(1 + \frac{P}{N_0 W}\right). $$

For reliable communication, the bit rate must satisfy $R < C$, ensuring that errors can be made arbitrarily small through suitable coding techniques.
Substituting the expression for $C$ into $R < C$:

$$ R < W\log_2\!\left(1 + \frac{P}{N_0 W}\right). $$

Dividing both sides by the bandwidth $W$ yields

$$ \frac{R}{W} < \log_2\!\left(1 + \frac{P}{N_0 W}\right). $$

Since $r = R/W$, we obtain

$$ r < \log_2\!\left(1 + \frac{P}{N_0 W}\right). $$

This inequality reveals a fundamental relationship between the bandwidth efficiency $r$ and the SNR $\frac{P}{N_0 W}$, identifying the maximum achievable $r$ for given values of power $P$ and bandwidth $W$.
Relating Bandwidth Efficiency to Power Efficiency¶
To express this connection in terms of power efficiency, we introduce the energy per bit to noise power spectral density ratio, $E_b/N_0$.
For a signaling scheme with bit rate $R$ and power $P$, the energy per symbol is linked to the symbol duration $T$ by $\mathcal{E} = P\,T$ (assuming one symbol per bit for simplicity, so that $T = T_b = 1/R$).
For $M$-ary signaling, each symbol carries $\log_2 M$ bits so that $T = (\log_2 M)\,T_b$, but here we focus directly on the bit rate.
In that case,

$$ E_b = P\,T_b = \frac{P}{R}. $$

Hence,

$$ \frac{P}{N_0 W} = \frac{R\,E_b}{N_0 W} = r\,\frac{E_b}{N_0}. $$

Substituting $P/(N_0 W) = r\,E_b/N_0$ into the earlier capacity inequality $r < \log_2\!\left(1 + \frac{P}{N_0 W}\right)$:

$$ r < \log_2\!\left(1 + r\,\frac{E_b}{N_0}\right). $$

Rearranging this:

$$ \frac{E_b}{N_0} > \frac{2^{r} - 1}{r}. $$

This formula specifies the minimum required $E_b/N_0$ for reliable communication at a given bandwidth efficiency $r$.
It demonstrates that higher $r$ (i.e., more bits per second per Hz) necessitates a larger $E_b/N_0$, underscoring the trade-off between spectral resources and power resources.
Minimum for Reliable Communication¶
To identify the absolute minimum $E_b/N_0$ that enables reliable transmission, we examine the limiting case $r \to 0$ (vanishingly small bandwidth efficiency). From

$$ \frac{E_b}{N_0} > \frac{2^{r} - 1}{r}, $$

use the Taylor expansion $2^{r} = e^{r\ln 2} \approx 1 + r\ln 2$ for small $r$. Thus,

$$ \frac{2^{r} - 1}{r} \approx \ln 2. $$

Therefore,

$$ \left(\frac{E_b}{N_0}\right)_{\min} = \ln 2 \approx 0.693. $$

This is known as the Shannon limit, representing the fundamental boundary below which no communication system can achieve arbitrarily low error probability.
Converting to decibels via $10\log_{10}(\ln 2) \approx -1.59$ dB, we see this limit applies to any physical communication scheme.
Reaching the limit requires $r \to 0$, which corresponds to $W \to \infty$ when $R$ is fixed; effectively, one leverages vast bandwidth to reduce the SNR per Hz but compensates with sophisticated coding across many dimensions.
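The trade-off and the limiting value can be tabulated directly from $E_b/N_0 > (2^r - 1)/r$; the sketch below (function name illustrative) prints the minimum $E_b/N_0$ in dB for several bandwidth efficiencies and the $r \to 0$ Shannon limit:

```python
import numpy as np

def min_ebn0_db(r):
    """Minimum Eb/N0 (dB) for reliable communication at bandwidth efficiency r (b/s/Hz)."""
    return 10 * np.log10((2 ** r - 1) / r)

for r in [4, 2, 1, 0.5, 0.1, 0.01]:
    print(f"r = {r:5.2f} b/s/Hz -> Eb/N0 >= {min_ebn0_db(r):+6.2f} dB")

print(f"Shannon limit (r -> 0): {10 * np.log10(np.log(2)):+.2f} dB")   # ≈ -1.59 dB
```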
Achieving Capacity With Orthogonal Signals¶
Next, consider a system employing orthogonal waveforms (e.g., frequency-shift keying, FSK) over an AWGN channel.
The probability of error when distinguishing among $M$ orthogonal signals, each having energy $\mathcal{E}_s$, is

$$ P_e = 1 - \int_{-\infty}^{\infty}\bigl[1 - Q(y)\bigr]^{M-1}\,\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{\bigl(y - \sqrt{2\mathcal{E}_s/N_0}\bigr)^2}{2}\right)dy, $$

where

$$ Q(x) = \int_{x}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-t^2/2}\,dt $$

is the Gaussian tail function, and $\sqrt{2\mathcal{E}_s/N_0}$ is the normalized signal amplitude that depends on the SNR.
Since this integral typically lacks a closed-form solution, it is evaluated numerically to assess error performance as a function of $\mathcal{E}_s/N_0$ (or, equivalently, $E_b/N_0$ with $\mathcal{E}_s = E_b\log_2 M$).
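A numerical evaluation of this error-probability integral is sketched below (assuming coherent detection of $M$ equal-energy orthogonal signals, with $\mathcal{E}_s = E_b\log_2 M$); the function name and the chosen $E_b/N_0$ are illustrative.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def pe_orthogonal(M, ebn0_db):
    """Symbol error probability of M-ary orthogonal signaling with coherent detection."""
    k = np.log2(M)
    es_n0 = k * 10 ** (ebn0_db / 10)             # Es/N0 = (log2 M) * Eb/N0
    shift = np.sqrt(2 * es_n0)
    # P_correct = integral of [1 - Q(y)]^(M-1) * phi(y - sqrt(2 Es/N0)) dy
    integrand = lambda y: norm.cdf(y) ** (M - 1) * norm.pdf(y - shift)
    p_correct, _ = quad(integrand, -np.inf, np.inf)
    return 1 - p_correct

for M in [2, 4, 16, 64]:
    print(f"M = {M:3d}: Pe ≈ {pe_orthogonal(M, ebn0_db=4):.3e} at Eb/N0 = 4 dB")
```

The results illustrate the usual behavior of orthogonal signaling: for a fixed $E_b/N_0$ above the Shannon limit, the error probability decreases as $M$ grows.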
Infinite Bandwidth Scenario¶
When $W \to \infty$, the channel capacity approaches

$$ C_\infty = \frac{P}{N_0}\log_2 e \quad \text{b/s}, $$

as derived from the infinite-bandwidth limit.
If we let $M = 2^{RT}$ (with $T$ denoting the signaling duration), the bit rate $R = (\log_2 M)/T$ can approach various fractions of $C_\infty$.
Upper bounds on $P_e$ are often given for two regimes:
Low-rate regime ($0 \le R \le \tfrac{C_\infty}{4}$): The exponent is $\tfrac{C_\infty}{2} - R$, which remains positive if $R < \tfrac{C_\infty}{2}$.
As $T \to \infty$ (and thus $M$ becomes large), $P_e \to 0$.
High-rate regime ($\tfrac{C_\infty}{4} \le R \le C_\infty$): The exponent is $\bigl(\sqrt{C_\infty} - \sqrt{R}\bigr)^2$, which is positive if $R < C_\infty$.
Again, $P_e \to 0$ as $T \to \infty$.
Since $P_e \to 0$ in both regimes, reliable communication is possible whenever $R < C_\infty$, and the error probability decreases exponentially as $T$ increases.
Orthogonal signaling can achieve this capacity in the limit $M \to \infty$ by exploiting the large-dimensional signal space, thereby approaching the Shannon limit $E_b/N_0 = \ln 2 \approx -1.59$ dB.
The Channel Reliability Function¶
In digital communication over an infinite-bandwidth additive white Gaussian noise (AWGN) channel using $M$-ary orthogonal signals (e.g., frequency-shift keying), the probability of error decreases exponentially with the signaling duration $T$.
This behavior is captured by exponential bounds that provide insight into how reliably information can be transmitted at a given rate.
For such a system, an upper bound on the error probability is expressed as:

$$ P_e \le K_u\,2^{-T\,E(R)}. $$

Here, $T$ is the duration of the signal (related to $M$ and the bit rate $R$ via $T = \frac{\log_2 M}{R}$ for binary encoding of the messages), $K_u$ is a constant, and $E(R)$ is the channel reliability function that dictates the exponential rate of decay of $P_e$ as $T$ increases.
The reliability function depends on the communication rate $R$ (in bits per second) and the channel's infinite-bandwidth capacity

$$ C_\infty = \frac{P}{N_0}\log_2 e $$

(derived previously in the limit $W \to \infty$), where $P$ is the signal power and $N_0/2$ is the noise power spectral density.
Specifically,

$$ E(R) = \begin{cases} \dfrac{C_\infty}{2} - R, & 0 \le R \le \dfrac{C_\infty}{4}, \\[2mm] \left(\sqrt{C_\infty} - \sqrt{R}\right)^2, & \dfrac{C_\infty}{4} \le R \le C_\infty. \end{cases} $$
Low-Rate Regime ($0 \le R \le \tfrac{C_\infty}{4}$)
The exponent is linear, $E(R) = \tfrac{C_\infty}{2} - R$, indicating that $P_e$ decays exponentially as long as $R < \tfrac{C_\infty}{2}$.
At $R = \tfrac{C_\infty}{4}$, $E(R) = \tfrac{C_\infty}{4}$.
High-Rate Regime ($\tfrac{C_\infty}{4} \le R \le C_\infty$)
The exponent becomes quadratic,

$$ E(R) = \left(\sqrt{C_\infty} - \sqrt{R}\right)^2, $$

reflecting a slower decay as $R$ approaches $C_\infty$.
At $R = C_\infty$, $E(R) = 0$, and $P_e$ no longer decreases exponentially.
Comparison With the Union Bound¶
For comparison, the union bound on $P_e$ provides a simpler but looser exponential factor [Proakis, Eq. 4.4-17]:

$$ P_e < 2^{-T\left(\frac{C_\infty}{2} - R\right)}. $$

This bound uses a linear exponent, $\tfrac{C_\infty}{2} - R$, which is valid up to $R = \tfrac{C_\infty}{2}$, where it becomes zero.
Beyond this point, the union bound becomes trivial ($P_e \le 1$), lacking the nuanced behavior of $E(R)$ in the higher-rate regime.
The looseness arises because the union bound overestimates error events by summing pairwise error probabilities, whereas $E(R)$ more accurately accounts for their joint probability.
Exponential Tightness of Gallager’s Bound¶
The bound $P_e \le K_u\,2^{-T\,E(R)}$, derived by Gallager (1965), is exponentially tight, meaning it represents the best possible exponential upper bound.
There is no alternative reliability function $E_1(R)$ with $E_1(R) > E(R)$ for some $R$ that still bounds $P_e$ correctly.
This tightness establishes $E(R)$ as the definitive measure of error decay for the infinite-bandwidth AWGN channel.
Matching Upper and Lower Bounds¶
The error probability is bounded both above and below:

$$ K_l\,2^{-T\,E(R)} \le P_e \le K_u\,2^{-T\,E(R)}, $$

where $K_u$ is the upper-bound constant, and $K_l$ is a positive constant ensuring that the same exponent $E(R)$ applies to both bounds.
These constants $K_u$ and $K_l$ have only a weak dependence on $T$, as shown by

$$ \lim_{T\to\infty}\frac{\log_2 K_u}{T} = \lim_{T\to\infty}\frac{\log_2 K_l}{T} = 0. $$

Hence, $\log_2 K_u$ and $\log_2 K_l$ grow sublinearly with $T$ (e.g., as constants or slowly varying functions of $T$), so their effect on the exponent becomes negligible for large $T$.
The dominant term in the exponent is $T\,E(R)$, confirming that $E(R)$ governs the asymptotic error behavior.
Because orthogonal signals are asymptotically optimal as $M \to \infty$ (approaching the capacity $C_\infty$), this reliability function applies universally.
The lower bound implies that no signaling scheme can achieve a faster exponential decay than $E(R)$, making it a fundamental characteristic of digital signaling over the infinite-bandwidth AWGN channel.
For $R < C_\infty$, $E(R) > 0$, and therefore $P_e \to 0$ as $T \to \infty$, in alignment with the noisy channel coding theorem.
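To visualize how the reliability function compares with the union-bound exponent, the following sketch (with a normalized $C_\infty$ and illustrative function names) evaluates both exponents across the rate range:

```python
import numpy as np

def reliability_exponent(R, C_inf):
    """E(R) for the infinite-bandwidth AWGN channel with orthogonal signaling."""
    R = np.asarray(R, dtype=float)
    low = C_inf / 2 - R                              # 0 <= R <= C_inf/4
    high = (np.sqrt(C_inf) - np.sqrt(R)) ** 2        # C_inf/4 <= R <= C_inf
    return np.where(R <= C_inf / 4, low, high)

def union_bound_exponent(R, C_inf):
    """Looser linear exponent from the union bound; zero at R = C_inf/2 and beyond."""
    return np.maximum(C_inf / 2 - np.asarray(R, dtype=float), 0.0)

C_inf = 1.0                                          # normalized capacity
for rho in [0.1, 0.25, 0.5, 0.75, 1.0]:              # rho = R / C_inf
    R = rho * C_inf
    E = float(reliability_exponent(R, C_inf))
    Eu = float(union_bound_exponent(R, C_inf))
    print(f"R/C_inf = {rho:4.2f}: E(R) = {E:.4f}, union-bound exponent = {Eu:.4f}")
```

The two exponents coincide for $R \le C_\infty/4$, while only $E(R)$ remains positive up to $R = C_\infty$, reflecting the discussion above.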