
Channel Modeling

End-to-End System Overview

Channel Encoding

In digital communication systems, the discrete channel encoder plays a critical role in preparing a binary information sequence for transmission over a noisy channel.

The primary function of the encoder is to introduce redundancy into the binary sequence in a controlled and systematic manner.

This redundancy refers to additional bits that do not carry new information but are strategically added to enable the receiver to detect and correct errors caused by noise and interference during transmission.

Without such redundancy, the receiver would have no means to distinguish between the intended signal and the distortions introduced by the channel, rendering reliable communication impossible in the presence of noise.

The encoding process involves segmenting the input binary information sequence into blocks of $k$ bits, where $k$ represents the number of information bits per block.

Each unique $k$-bit sequence is then mapped to a corresponding $n$-bit sequence, known as a codeword, where

$$\boxed{n > k}$$

This mapping ensures that each possible $k$-bit input has a distinct $n$-bit output, preserving the uniqueness of the information while embedding redundancy.

For example, a simple repetition code might map a single bit (e.g., $k = 1$, input $0$) to a three-bit codeword (e.g., $n = 3$, output $000$), repeating the bit to add redundancy.
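
As a quick illustration (not part of the original text), here is a minimal Python sketch of this $k = 1$, $n = 3$ repetition code with a majority-vote decoder:

```python
import numpy as np

def repetition_encode(bits, n=3):
    """Repeat each information bit n times (a rate-1/n repetition code)."""
    return np.repeat(np.asarray(bits), n)

def repetition_decode(coded, n=3):
    """Majority vote over each block of n received bits."""
    blocks = np.asarray(coded).reshape(-1, n)
    return (blocks.sum(axis=1) > n // 2).astype(int)

bits = np.array([0, 1, 1])
coded = repetition_encode(bits)     # [0 0 0 1 1 1 1 1 1]
coded[1] ^= 1                       # flip one bit to emulate a channel error
print(repetition_decode(coded))     # [0 1 1] -- the single error is corrected
```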

Code Rate

The amount of redundancy introduced by this encoding process is quantified by the ratio $n/k$.

This ratio indicates how many total bits ($n$) are transmitted for each information bit ($k$).

A higher $n/k$ implies more redundancy; for instance, if $k = 4$ and $n = 7$, then

$$\frac{n}{k} = \frac{7}{4} = 1.75$$

meaning 1.75 bits are sent per information bit, with 0.75 bits being redundant.

The reciprocal of this ratio, $k/n$, is defined as the code rate, denoted $R_c$:

$$\boxed{R_c = \frac{k}{n}}$$

The code rate measures the efficiency of the encoding scheme, representing the fraction of the transmitted bits that carry actual information.

For the example above ($k = 4$, $n = 7$),

$$R_c = \frac{4}{7} \approx 0.571$$

meaning approximately 57.1% of the transmitted bits are information, and the remaining 42.9% are redundancy.

A code rate closer to 1 indicates less redundancy (higher efficiency), while a lower $R_c$ indicates more redundancy (better error protection but lower efficiency).

The choice of $R_c$ balances the trade-off between data throughput and error resilience, depending on the channel’s noise characteristics.

Modulation and Interface to the Channel

The binary sequence output from the channel encoder, consisting of $n$-bit codewords, is passed to the modulator, which serves as the interface between the digital system and the physical communication channel (e.g., a wireless medium or optical fiber).

The modulator’s role is to convert the discrete binary sequence into a continuous-time waveform suitable for transmission over the channel.

In its simplest form, the modulator employs binary modulation, where each bit in the sequence is mapped to one of two distinct waveforms, $s_1(t)$ or $s_2(t)$.

For example, in binary phase-shift keying (BPSK), $s_1(t)$ and $s_2(t)$ might be sinusoidal signals differing in phase (e.g., $s_1(t) = A \cos(2\pi f_c t)$ and $s_2(t) = -A \cos(2\pi f_c t)$), transmitted over a symbol duration $T$.

This one-to-one mapping occurs at a rate determined by the bit rate of the encoded sequence.
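
A minimal Python sketch of such a BPSK modulator, assuming illustrative values for the amplitude $A$, carrier frequency $f_c$, and sampling rate, and adopting one common convention (bit 1 maps to $+A\cos(2\pi f_c t)$):

```python
import numpy as np

def bpsk_waveform(bits, A=1.0, fc=4.0, T=1.0, fs=100.0):
    """Map each bit to +/- A*cos(2*pi*fc*t) over one symbol interval T."""
    t = np.arange(0, T, 1 / fs)                 # time samples within one symbol
    carrier = A * np.cos(2 * np.pi * fc * t)
    # bit 1 -> +carrier (s1), bit 0 -> -carrier (s2); an assumed convention
    return np.concatenate([carrier if b == 1 else -carrier for b in bits])

x = bpsk_waveform([0, 1, 1])   # waveform spanning 3 symbol durations
print(x.shape)                  # (300,)
```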

Alternatively, the modulator can operate on blocks of $q$ bits at a time, using $M$-ary modulation, where $M = 2^q$ represents the number of possible waveforms.

Each unique $q$-bit block is mapped to one of $M$ distinct waveforms.

For instance, if $q = 2$, then $M = 2^2 = 4$, and the modulator might use four waveforms (e.g., in quadrature phase-shift keying, QPSK), such as $s_1(t), s_2(t), s_3(t), s_4(t)$, each corresponding to a 2-bit sequence $(00, 01, 10, 11)$.

This increases the data rate per symbol—since each waveform carries $q$ bits—but requires a more complex receiver to distinguish between the $M$ signals.
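
As an illustration, a hypothetical Gray-coded QPSK mapping (the text does not fix a particular bit-to-waveform assignment) might look like this in Python:

```python
import numpy as np

# A hypothetical Gray-coded QPSK mapping: each 2-bit block -> one of M=4 phases.
QPSK_MAP = {
    (0, 0): np.exp(1j * np.pi / 4),
    (0, 1): np.exp(1j * 3 * np.pi / 4),
    (1, 1): np.exp(1j * 5 * np.pi / 4),
    (1, 0): np.exp(1j * 7 * np.pi / 4),
}

def qpsk_modulate(bits):
    """Group bits into pairs (q=2) and map each pair to a unit-energy symbol."""
    pairs = np.asarray(bits).reshape(-1, 2)
    return np.array([QPSK_MAP[tuple(p)] for p in pairs])

print(qpsk_modulate([0, 0, 1, 0]))  # two complex constellation points
```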

At the receiving end, the transmitted waveform is corrupted by channel effects (e.g., noise, fading, or interference), resulting in a channel-corrupted waveform.

The demodulator processes this received signal and converts each waveform back into a form that estimates the transmitted data symbol.

In binary modulation, the demodulator might output a scalar value (e.g., a voltage level) indicating whether $s_1(t)$ or $s_2(t)$ was more likely sent.

In $M$-ary modulation, it might produce a vector in a signal space (e.g., coordinates in a constellation diagram) representing one of the $M$ possible symbols.

This output serves as an estimate of the original binary or $M$-ary data symbol, though it may still contain errors due to channel noise.

Mathematical Pipeline of the End-to-End Process

Source Input (Information Bits):

The source input is represented as a binary sequence:

$$\vec{u} = \{ u_1, u_2, \dots, u_{L} \}, \quad u_i \in \{0, 1\}$$

This sequence is segmented into $L/k$ blocks, each containing $k$ bits.

Channel Encoding:

Each $k$-bit block $\vec{u}_i$ is encoded into an $n$-bit codeword $\vec{c}_i$ using a generator matrix:

$$\vec{c}_i = \vec{u}_i \mathbf{G}, \quad \vec{c}_i \in \{0, 1\}^n, \quad \forall i$$

where $\mathbf{G} \in \mathbb{F}_2^{k \times n}$ is the generator matrix over the binary field.

The resulting output bitstream is:

$$\vec{c} = \{ c_1, c_2, \dots, c_{N} \}, \quad N = \frac{L}{k} \times n$$

Bitstream Grouping for Modulation:

The bitstream is grouped into $N/q$ symbols, where each symbol $\vec{b}_j$ consists of $q$ bits:

$$\vec{b}_j = \{ c_{(j-1)q+1}, \dots, c_{jq} \}, \quad \vec{b}_j \in \{0, 1\}^q$$

If $N \not\equiv 0 \pmod{q}$, the bitstream is padded with zeros to ensure its length is divisible by $q$.
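
A small Python sketch of this grouping-and-padding step (the helper name and example stream are illustrative):

```python
import numpy as np

def group_bits(bitstream, q):
    """Zero-pad the bitstream to a multiple of q, then split into q-bit symbols."""
    bits = np.asarray(bitstream)
    pad = (-len(bits)) % q        # number of zeros needed; 0 if already divisible
    bits = np.concatenate([bits, np.zeros(pad, dtype=int)])
    return bits.reshape(-1, q)

stream = [1, 0, 1, 1, 1, 0, 0]    # 7 bits, not a multiple of q=3
print(group_bits(stream, 3))      # rows [1 0 1], [1 1 0], [0 0 0]: two zeros padded
```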

Symbol Mapping (8-QAM):

Each $q$-bit group is mapped to a complex constellation point using the mapping function $\mu$:

$$s_j = \mu(\vec{b}_j), \quad \mu: \{0,1\}^q \rightarrow \mathbb{C}$$

where $\mu$ assigns $q$-bit groups to points in an 8-QAM constellation.
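
Since the text does not specify the 8-QAM constellation, the following Python sketch assumes one plausible layout (two amplitude rings of four phases each) purely to illustrate the mapping $\mu$:

```python
import numpy as np

# A hypothetical 8-QAM constellation; the actual mapping mu is system-dependent.
RING = {0: 1.0, 1: 2.0}                       # first bit selects the amplitude ring
def mu(b):
    """Map a 3-bit group to a complex 8-QAM point (assumed layout)."""
    amp = RING[b[0]]
    phase = np.pi / 4 + (np.pi / 2) * (2 * b[1] + b[2])  # remaining bits set the phase
    return amp * np.exp(1j * phase)

symbols = [mu(b) for b in [(1, 0, 1), (1, 1, 0), (0, 0, 0)]]
print(np.round(symbols, 3))
```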

Transmission:

The simplified transmitted signal is constructed as:

$$x(t) = \Re\left\{ \sum_j s_j \, e^{j 2\pi f_c t} \right\}$$

where $f_c$ denotes the carrier frequency, and $\Re\{\cdot\}$ represents the real part of the complex expression.

Typically, for passband transmission, $x(t) = \Re\bigl\{ \sum_j s_j \, p(t - jT_s) \, e^{j 2\pi f_c t} \bigr\}$, where $p(t)$ is a pulse-shaping filter and $T_s$ is the symbol duration.
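
A simplified Python sketch of this passband construction, assuming a rectangular pulse $p(t)$ and illustrative values for $f_c$, $T_s$, and the sampling rate:

```python
import numpy as np

def passband_signal(symbols, fc=10.0, Ts=1.0, fs=100.0):
    """Build x(t) = Re{ sum_j s_j p(t - j*Ts) exp(j*2*pi*fc*t) } using a
    rectangular pulse p(t) (a simplifying assumption for illustration)."""
    n_per_sym = int(Ts * fs)
    t = np.arange(len(symbols) * n_per_sym) / fs
    baseband = np.repeat(np.asarray(symbols), n_per_sym)  # rectangular pulse shaping
    return np.real(baseband * np.exp(1j * 2 * np.pi * fc * t))

x = passband_signal([1 + 1j, -1 - 1j, 2j])
print(x[:5])
```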

Example: (7,4) Linear Block Code with 8-QAM Modulation

System Parameters

Using a (7,4) Linear Block Code.

A (7,4) linear block code introduces redundancy to facilitate error correction:

For each 4-bit message, the encoder generates a 7-bit codeword. For example, in systematic encoding:

$$\text{Input} = 1011 \;\rightarrow\; \text{Output} = 1011xyz$$

where $xyz$ are parity bits computed from the message bits using the generator matrix.
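
For concreteness, the following Python sketch uses the standard Hamming(7,4) generator matrix as an assumed example; the text itself does not fix a particular $\mathbf{G}$:

```python
import numpy as np

# Assumed systematic generator matrix G = [I_4 | P] for a (7,4) code
# (standard Hamming(7,4) parity rules; the text does not specify G).
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])

def encode74(msg):
    """Systematic encoding: codeword = msg * G over GF(2)."""
    return (np.asarray(msg) @ G) % 2

print(encode74([1, 0, 1, 1]))   # first 4 bits = message, last 3 = parity
```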

Bitstream Formation

The encoder processes a sequence of message bits, dividing it into 4-bit blocks.

Each block is encoded into a 7-bit codeword; in this example, three 4-bit blocks yield the codewords 1011100, 0010110, and 1100101.

These codewords are concatenated into a continuous binary stream:

$$\text{Bitstream} = 101110000101101100101$$

This bitstream is then forwarded to the modulator.

Modulator Input: Preparing for 8-QAM

For 8-QAM, the modulator uses $M = 8 = 2^3$ waveforms, so each symbol carries $q = 3$ bits.

Mapping Encoded Bits to 8-QAM Symbols

Since the codeword length (7 bits) is not a multiple of 3, the bitstream is treated as continuous and grouped into 3-bit segments:

$$\text{Bitstream: } 101110000101101100101$$

Grouped into 3-bit segments:

$$101,\ 110,\ 000,\ 101,\ 101,\ 100,\ 101$$

In this example, the total bit count (21 bits) is divisible by 3, so no padding is required. Each 3-bit group is mapped to a unique 8-QAM constellation point, where each symbol is defined by its amplitude and phase for carrier modulation.

Modulation and Transmission

Each 3-bit group modulates a carrier wave, setting the carrier's amplitude and phase according to the group's constellation point.

Receiver Side (Overview)

At the receiver, the waveform is demodulated, each received symbol is detected as one of the eight constellation points, and the (7,4) decoder uses the parity bits to correct errors and recover the message bits.

Additional Steps

Depending on the system, optional preprocessing or enhancements may be applied between encoding and modulation.

Transmission:

Each symbol is transmitted over the channel using its associated I-Q modulation.

Table 1: Summary of the Pipeline

| Stage | Operation |
| --- | --- |
| Encoder | (7,4) linear block coding |
| Bitstream | Concatenate 7-bit codewords |
| Modulator | Group bits into 3-bit segments |
| Symbol Mapping | Map 3-bit groups to 8-QAM constellation |
| Channel | Transmit analog modulated signal |
| Receiver | Demodulate + decode |

Detection and Decision-Making

The output of the demodulator is fed to the detector, which interprets the estimate and makes a decision regarding the transmitted symbol.

Hard Decision

In the simplest scenario, for binary modulation, the detector determines whether the transmitted bit was a $0$ or a $1$ based on the scalar output of the demodulator.

For example, if the demodulator provides a value $y$ and a threshold $\tau$ is established (e.g., $\tau = 0$ in BPSK), the detector decides:

$$\begin{cases} y > \tau: & \text{Bit is } 1 \\ y \leq \tau: & \text{Bit is } 0 \end{cases}$$

This binary decision is referred to as a hard decision, as it commits definitively to one of two possible outcomes without retaining any ambiguity.
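
A minimal Python sketch of this thresholding rule, assuming the BPSK convention that bit 1 corresponds to a positive amplitude:

```python
import numpy as np

def hard_decision(y, tau=0.0):
    """Threshold the demodulator output: y > tau -> 1, else 0
    (assumes bit 1 was sent as +A and bit 0 as -A)."""
    return (np.asarray(y) > tau).astype(int)

y = np.array([0.8, -0.3, 1.2, -0.1])   # noisy demodulator outputs
print(hard_decision(y))                 # [1 0 1 0]
```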

The detection process can be regarded as a form of quantization.

In the hard-decision case, the continuous output of the demodulator (e.g., a real-valued voltage) is quantized into one of two levels, analogous to binary quantization.

The decision boundary (e.g., $\tau$) divides the output space into two regions, each corresponding to a specific bit value.

More generally, the detector can quantize the demodulator output into $Q \geq 2$ levels, forming a $Q$-ary detector. When $M$-ary modulation is employed (with $M = 2^q$ waveforms), the number of quantization levels must satisfy $Q \geq M$ to distinguish all possible symbols.

For instance, in QPSK ($M = 4$), a hard-decision detector might utilize $Q = 4$ to map the vector output of the demodulator to one of four symbols.

Soft Decision

In the extreme case, if no quantization is performed ($Q = \infty$), the detector forwards the unquantized, continuous output directly to the subsequent stage, preserving all information provided by the demodulator.

When $Q > M$, the detector offers greater granularity than the number of transmitted symbols, resulting in a soft decision.

For example, in QPSK ($M = 4$), a detector with $Q = 8$ might assign the demodulator output to one of eight levels, providing finer resolution within each symbol’s decision region.

Soft decisions retain more information regarding the likelihood of the received signal rather than enforcing a definitive choice, which can enhance error correction in subsequent decoding processes.
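
As an illustration of a soft output, the following Python sketch computes the BPSK log-likelihood ratio over AWGN (assuming the mapping bit 1 → +1, bit 0 → −1); its sign gives the hard decision and its magnitude conveys reliability:

```python
import numpy as np

def bpsk_llr(y, sigma2):
    """Soft decision for BPSK over AWGN: the log-likelihood ratio
    log[ p(y|bit=1) / p(y|bit=0) ] = 2y / sigma^2 under the assumed mapping."""
    return 2 * np.asarray(y) / sigma2

y = np.array([0.9, -0.1, 0.2])
print(bpsk_llr(y, sigma2=0.5))   # [3.6 -0.4 0.8]: the first bit is far more reliable
```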

The quantized output of the detector—whether hard or soft—is then passed to the channel decoder.

The decoder leverages the redundancy introduced by the encoder (e.g., the additional bits in each $n$-bit codeword) to correct errors induced by channel disturbances.

For hard decisions, the decoder operates with binary or $Q$-ary symbols, employing techniques such as Hamming distance minimization.

For soft decisions, it can utilize probabilistic methods, such as maximum likelihood decoding or log-likelihood ratios, to exploit the additional information, typically achieving superior performance in noisy conditions.

Channel Models

A communication channel serves as the medium through which information is transmitted from a sender to a receiver.

The channel’s behavior is mathematically modeled to predict and optimize system performance.

A general communication channel is characterized by three key components: an input alphabet $\mathcal{X}$, an output alphabet $\mathcal{Y}$, and a conditional probability law $P[y \mid x]$ describing how outputs depend on inputs.

Probabilistic Channel Model.

These components, $(\mathcal{X}, \mathcal{Y})$ and the conditional probability, provide a complete probabilistic model of the channel, enabling the analysis of how reliably information can be transmitted.

The channel’s characteristics determine the strategies required for encoding, modulation, and decoding to achieve effective communication.

Memoryless Channels

A channel is classified as memoryless if its output at any given time depends solely on the input at that same time, with no influence from previous inputs or outputs.

Mathematically, a channel is memoryless if the conditional probability of the output sequence $\vec{y}$ given the input sequence $\vec{x}$ factors into a product of individual conditional probabilities:

$$P[\vec{y} \mid \vec{x}] = \prod_{i=1}^{n} P[y_i \mid x_i] \quad \text{for all } n$$

Here, $P[y_i \mid x_i]$ is the probability of receiving output $y_i$ given input $x_i$ at time index $i$, and the product form indicates that each output $y_i$ is statistically independent of all other inputs $x_j$ ($j \neq i$) and outputs $y_j$ ($j \neq i$), conditioned on $x_i$.

This property implies that the channel has no “memory” of past transmissions; the effect of an input $x_i$ on the output $y_i$ is isolated to that specific time instance.

In other words, for a memoryless channel, the output at time $i$ depends solely on the input at time $i$, and the channel’s behavior at each time step is governed by the same conditional probability distribution $P[y_i \mid x_i]$.

This simplifies analysis and design, as the channel can be characterized by a single-symbol transition probability rather than a complex sequence-dependent model.

The simplest and most widely studied memoryless channel model is the binary symmetric channel (BSC).

In the BSC, both the input and output alphabets are binary, i.e., $\mathcal{X} = \mathcal{Y} = \{0, 1\}$.

The BSC can be defined with the crossover probability $p$ as:

$$P[y_i \mid x_i] = \begin{cases} 1 - p, & \text{if } y_i = x_i \\ p, & \text{if } y_i \neq x_i \end{cases}$$

This model is particularly suitable for systems employing binary modulation (where bits are mapped to two waveforms) and hard decisions at the detector (where the receiver makes a definitive choice between 0 and 1).

The BSC captures the essence of a basic digital communication channel with symmetric error characteristics, making it a foundational concept in information theory and coding.
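
A minimal Python simulation of the BSC (the function name and parameter values are illustrative):

```python
import numpy as np

def bsc(x, p, rng):
    """Binary symmetric channel: flip each bit independently with probability p."""
    flips = rng.random(len(x)) < p
    return np.bitwise_xor(np.asarray(x), flips.astype(int))

rng = np.random.default_rng(0)
x = np.zeros(100_000, dtype=int)
y = bsc(x, p=0.1, rng=rng)
print(y.mean())   # empirical crossover rate, close to p = 0.1
```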

The Binary Symmetric Channel (BSC) Model

The binary symmetric channel (BSC) model emerges when a communication system is considered as a composite channel, incorporating the modulator, the physical waveform channel, and the demodulator/detector as an integrated unit.

This abstraction is particularly relevant for systems that combine a binary modulator (each bit mapped to one of two waveforms), an additive-noise waveform channel, and a demodulator/detector that makes hard binary decisions.

In this setup, the physical channel is modeled as an additive noise channel, where the transmitted waveform is perturbed by random noise (e.g., additive white Gaussian noise, AWGN).

The demodulator and detector together transform the noisy waveform back into a binary sequence.

The resulting composite channel operates in discrete time, with a binary input sequence (from the encoder) and a binary output sequence (from the detector).

This end-to-end system abstracts the continuous-time waveform transmission into a discrete-time model, simplifying analysis.

The BSC model assumes that the combined effects of modulation, channel noise, demodulation, and detection can be represented as a single discrete-time channel with binary inputs and binary outputs.

This abstraction is valid when the noise affects each transmitted bit independently, and the detector’s hard decisions align with the binary nature of the input, making the BSC an appropriate and widely utilized model for such systems.

Characteristics of the Binary Symmetric Channel

The composite channel, modeled as a binary symmetric channel (BSC), is fully characterized by its binary input and output alphabets, $\mathcal{X} = \mathcal{Y} = \{0, 1\}$, together with a single parameter: the crossover probability $p$.

For the BSC, the channel noise and disturbances are assumed to cause statistically independent errors in the transmitted binary sequence, with an average probability of error $p$, known as the crossover probability.

The conditional probabilities are symmetric and defined as:

$$\begin{aligned} P[Y = 0 \mid X = 1] &= P[Y = 1 \mid X = 0] = p, \\ P[Y = 1 \mid X = 1] &= P[Y = 0 \mid X = 0] = 1 - p. \end{aligned}$$

These probabilities can be interpreted as follows: a transmitted bit is received in error with probability $p$ and received correctly with probability $1 - p$.

The symmetry arises because the error probability $p$ is identical in both directions ($0 \to 1$ and $1 \to 0$), and the correct-reception probability is $1 - p$.

Since the channel is memoryless, these probabilities apply independently to each transmitted bit, consistent with:

$$P[\vec{y} \mid \vec{x}] = \prod_{i=1}^{n} P[y_i \mid x_i].$$

The BSC is often depicted diagrammatically as a transition model with two inputs and two outputs, connected by arrows labeled with probabilities $p$ and $1 - p$.

Channel diagram of the BSC considered above.

The cascade of the binary modulator, waveform channel, and binary demodulator/detector is thus reduced to this equivalent discrete-time channel, the BSC.

This model simplifies the analysis of error rates and informs the design of error-correcting codes, as $p$ (typically $0 < p < 0.5$) quantifies the channel’s reliability.

Discrete Memoryless Channels (DMC)

The binary symmetric channel (BSC), discussed previously, is a specific instance of a broader class of channel models known as the discrete memoryless channel (DMC).

A DMC is characterized by two key properties: it is discrete, with finite input and output alphabets $\mathcal{X}$ and $\mathcal{Y}$, and it is memoryless, with each output symbol depending only on the current input symbol.

A practical example of a DMC arises in a communication system using an $M$-ary memoryless modulation scheme.

Here, the modulator maps each input symbol from $\mathcal{X}$ (with $\lvert\mathcal{X}\rvert = M$) to one of $M$ distinct waveforms (e.g., in $M$-ary phase-shift keying, M-PSK).

The detector processes the received waveform and produces an output symbol from $\mathcal{Y}$, consisting of $Q$-ary symbols (e.g., after hard or soft quantization, where $Q \geq M$ ensures all $M$ inputs can be distinguished).

The composite channel—comprising the modulator, physical channel, and detector—is thus a DMC, as the modulation and detection processes preserve the discrete and memoryless nature of the system.

The input-output behavior of the DMC is fully described by a set of conditional probabilities $P[y \mid x]$, where $x \in \mathcal{X}$ and $y \in \mathcal{Y}$.

There are $M \times Q$ such probabilities, one for each possible input-output pair.

For instance, if $M = 2$ (binary input) and $Q = 2$ (binary output), as in the BSC, there are $2 \times 2 = 4$ probabilities (e.g., $P[0\mid0], P[1\mid0], P[0\mid1], P[1\mid1]$).

These conditional probabilities can be organized into a probability transition matrix $\mathbf{P} = [p_{ij}]$, where $p_{ij} = P[y_j \mid x_i]$ is the probability of receiving output $y_j$ when input $x_i$ is transmitted.

The matrix $\mathbf{P}$ has dimensions $\lvert\mathcal{X}\rvert \times \lvert\mathcal{Y}\rvert$ (e.g., $2 \times 2$ for the BSC), and each row sums to 1 (i.e., $\sum_{j} p_{ij} = 1$ for each $i$), since these rows represent probability distributions over $\mathcal{Y}$ for a given $x_i$.
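
For example, the BSC's transition matrix and its row-sum property can be checked directly in Python (an illustrative sketch):

```python
import numpy as np

# Transition matrix of a BSC with crossover probability p = 0.1:
# rows are inputs x_i, columns are outputs y_j, entries p_ij = P[y_j | x_i].
p = 0.1
P = np.array([[1 - p, p],
              [p, 1 - p]])

assert np.allclose(P.sum(axis=1), 1.0)   # each row is a probability distribution

# Simulate one channel use: given input x, draw the output from row x of P.
rng = np.random.default_rng(0)
x = 1
y = rng.choice(len(P[x]), p=P[x])
print(y)
```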

This matrix, often illustrated as in the following figure, provides a compact representation of the DMC’s statistical behavior and facilitates analysis of error rates and channel capacity.

Channel diagram of the considered DMC.

Discrete-Input, Continuous-Output Channels

In contrast to the DMC, the discrete-input, continuous-output channel model relaxes the constraint on the output alphabet while retaining a discrete input.

This model is defined by a discrete, finite input alphabet $\mathcal{X}$ and a continuous output alphabet $\mathcal{Y} = \mathbb{R}$.

This configuration defines a composite discrete-time memoryless channel, consisting of the modulator, physical channel, and detector.

The channel takes a discrete input $X \in \mathcal{X}$ and produces a continuous output $Y \in \mathbb{R}$.

Its behavior is characterized by a set of conditional probability density functions (PDFs):

$$p(y \mid x), \quad x \in \mathcal{X},\; y \in \mathbb{R}.$$

For each input symbol $x$, $p(y \mid x)$ is a PDF over the real line, describing the likelihood of observing a particular output value $y$ given $x$.

Unlike the DMC’s discrete probabilities $P[y \mid x]$, here $p(y \mid x)$ is a continuous function, and the probability of $Y$ falling in an interval $[a, b]$ is:

$$\int_{a}^{b} p(y \mid x)\, \mathrm{d}y,$$

with:

$$\int_{-\infty}^{\infty} p(y \mid x)\, \mathrm{d}y = 1 \quad \text{for each } x.$$

This model is relevant when the receiver retains the full resolution of the received signal (e.g., soft-decision outputs) rather than forcing a discrete decision, providing more information for subsequent decoding processes.

Additive White Gaussian Noise (AWGN) Channel

The additive white Gaussian noise (AWGN) channel is one of the most fundamental examples of a discrete-input, continuous-output memoryless channel in communication theory.

Assuming a specific signal mapping, the channel is modeled by:

$$Y = X + N,$$

where $X \in \mathcal{X}$ is the (discrete) transmitted symbol, $N$ is a zero-mean Gaussian noise random variable with variance $\sigma^2$, independent of $X$, and $Y$ is the continuous-valued output.

The term white indicates that the noise has a flat power spectral density (i.e., it is uncorrelated across time), while Gaussian refers to its normal distribution.

For a given input $x$, the output $Y$ is a Gaussian random variable with mean $x$ and variance $\sigma^2$, thus:

$$p(y \mid x) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp \Bigl(- \frac{(y - x)^{2}}{2 \sigma^2}\Bigr).$$
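
A minimal Python sketch of this discrete-input, continuous-output channel and its conditional density (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def awgn_channel(x, sigma):
    """Discrete-input, continuous-output AWGN channel: Y = X + N."""
    x = np.asarray(x, dtype=float)
    return x + rng.normal(0.0, sigma, size=x.shape)

def gaussian_pdf(y, x, sigma):
    """Conditional density p(y | x) of the AWGN channel output."""
    return np.exp(-(y - x) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

x = np.array([-1.0, 1.0, 1.0])      # e.g., BPSK amplitudes as discrete inputs
y = awgn_channel(x, sigma=0.5)
print(y)                             # continuous (soft) channel outputs
print(gaussian_pdf(y, x, 0.5))       # likelihood of each output given its input
```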

Multiple Inputs and Outputs

Consider a sequence of $n$ inputs $X_i$, $i = 1, 2, \ldots, n$. The corresponding outputs are:

$$Y_i = X_i + N_i, \quad i = 1, 2, \ldots, n,$$

where each $N_i$ is an independent, identically distributed (i.i.d.) Gaussian noise term,

$$N_i \sim \mathcal{N}(0, \sigma^2).$$

Because the channel is memoryless, the noise in each output $Y_i$ depends only on $X_i$. Formally,

$$p(y_1, y_2, \ldots, y_n \mid x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(y_i \mid x_i).$$

Substituting the Gaussian PDF yields:

$$p(y_1, y_2, \ldots, y_n \mid x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \sigma^2}} \exp \Bigl(- \frac{(y_i - x_i)^{2}}{2 \sigma^2}\Bigr).$$

This factorization confirms the channel’s memoryless nature, as the joint PDF of the output sequence is the product of individual PDFs, each depending only on the corresponding input.

Role of AWGN Channels

The AWGN channel is a cornerstone of communication theory, providing an accurate model for systems where thermal noise dominates, such as satellite links and wireless channels.

Its importance extends to analyzing modulation schemes (e.g., BPSK, QPSK) with continuous outputs prior to any quantization, forming the basis for many fundamental results in digital communications.

The Discrete-Time AWGN Channel

A discrete-time (continuous-input, continuous-output) additive white Gaussian noise (AWGN) channel is one in which both the input and output take values in the set of all real numbers:

$$\mathcal{X} = \mathcal{Y} = \mathbb{R}.$$

Unlike channels with discrete alphabets, this model permits continuous-valued inputs and outputs, corresponding to a situation with no quantization at either the transmitter or the receiver.

Input–Output Relationship

At each discrete time instant $i$, an input $x_i \in \mathbb{R}$ is transmitted over the channel, producing the received symbol:

$$y_i = x_i + n_i,$$

where $n_i$ represents additive noise.

The noise samples $\{n_i\}$ are independent, identically distributed (i.i.d.) zero-mean Gaussian random variables with variance $\sigma^2$.

Hence, the PDF of each $n_i$ is:

$$p_{N_i}(n_i) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp \Bigl(- \frac{n_i^{2}}{2 \sigma^2}\Bigr).$$

Given an input $x_i$, the output $y_i$ is a Gaussian random variable with mean $x_i$ and variance $\sigma^2$.

Thus, its conditional PDF is:

$$p(y_i \mid x_i) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp \Bigl(- \frac{(y_i - x_i)^{2}}{2 \sigma^2}\Bigr).$$

Power Constraint

A key practical limitation in this channel model is the power constraint on the input, expressed as an expected power limit:

$$\mathbb{E}\bigl[X^{2}\bigr] \le P,$$

which ensures that the transmitter does not exceed a certain average power $P$. Note that $P$ represents average power (energy per unit time), not total energy.

For a sequence of $n$ input symbols:

$$\vec{x} = (x_{1}, x_{2}, \ldots, x_{n}),$$

the time-average power is:

$$\frac{1}{n} \sum_{i=1}^{n} x_{i}^{2} = \frac{1}{n} \|\vec{x}\|^{2},$$

where:

$$\|\vec{x}\|^{2} = \sum_{i=1}^{n} x_{i}^{2}$$

is the squared Euclidean norm of $\vec{x}$.

As $n$ grows large, the law of large numbers implies that, with high probability, the time-average power $\frac{1}{n}\|\vec{x}\|^{2}$ converges to $\mathbb{E}[X^2]$.

Thus, the constraint:

$$\frac{1}{n} \|\vec{x}\|^{2} \le P$$

arises naturally.

In simpler terms:

$$\sum_{i=1}^{n} x_{i}^{2} \le n P.$$
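
A small Python sketch of enforcing this constraint by rescaling a block of inputs (the helper name is illustrative):

```python
import numpy as np

def enforce_power_constraint(x, P):
    """Scale a block of channel inputs, if needed, so that sum(x_i^2) <= n*P."""
    x = np.asarray(x, dtype=float)
    avg_power = np.mean(x**2)
    if avg_power > P:
        x = x * np.sqrt(P / avg_power)   # shrink onto the sphere of radius sqrt(n*P)
    return x

rng = np.random.default_rng(0)
x = rng.normal(0, 2.0, size=1000)        # average power around 4
x = enforce_power_constraint(x, P=1.0)
print(np.mean(x**2))                      # now approximately 1.0
```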

Geometric Interpretation

Geometrically, the set of all allowable input sequences $\vec{x}$ lies within an $n$-dimensional sphere of radius $\sqrt{nP}$ centered at the origin, since:

$$\bigl(\sqrt{nP}\bigr)^{2} = nP \quad\Longleftrightarrow\quad \sum_{i=1}^{n} x_{i}^{2} \le nP.$$

This spherical boundary in $n$-dimensional space is crucial for understanding both the channel capacity and the design of signal constellations under energy constraints.

The AWGN Waveform Channel

The AWGN waveform channel describes a physical communication medium in which both the input and output are continuous-time waveforms, rather than discrete symbols.

This can be interpreted as a continuous-time, continuous-input, continuous-output AWGN channel.

To highlight the core behavior of the physical channel, the modulator and demodulator are treated as separate from the channel model, directing attention solely to the process of waveform transmission.

Suppose the channel has a bandwidth $W$, characterized by an ideal frequency response:

$$C(f) = 1 \quad \text{for} \quad |f| \leq W$$

and

$$C(f) = 0 \quad \text{otherwise}.$$

This indicates that the channel perfectly transmits signals whose frequency components lie in the interval $[-W, +W]$ and suppresses those outside this range.

The input waveform $x(t)$ is assumed to be band-limited, such that its Fourier transform satisfies:

$$X(f) = 0 \quad \text{for} \quad |f| > W,$$

ensuring conformity with the channel’s bandwidth.

At the channel output, the waveform $y(t)$ is given by:

$$y(t) = x(t) + n(t),$$

where $n(t)$ is a sample function of an additive white Gaussian noise (AWGN) process.

The noise has a power spectral density:

$$\frac{N_0}{2} \quad \text{(W/Hz)},$$

indicating that its power is distributed uniformly across all frequencies.

For a channel of bandwidth $W$, the noise power confined within the interval $[-W, +W]$ is:

$$\sigma^2 = \int_{-W}^{W} \frac{N_0}{2}\, df = \frac{N_0}{2} \times 2W = N_0 W.$$

As will be clarified later, the discrete-time equivalent of this channel provides a simpler perspective through sampling.

Power Constraint and Signal Representation

The input waveform $x(t)$ must obey a power constraint:

$$\mathbb{E}[x^2(t)] \leq P,$$

which restricts the expected instantaneous power of $x(t)$ to $P$.

For ergodic processes, where time averages equal ensemble averages, this is expressed as:

$$\lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\, dt \leq P.$$

Interpreted over an interval of length $T$, this stipulates that the average energy per unit time cannot exceed $P$.

Consequently, this condition aligns with that represented via $\mathbb{E}[x^2(t)]$.

To analyze the channel in probabilistic terms, $x(t)$, $y(t)$, and $n(t)$ are expanded in terms of a complete set of orthonormal functions $\{\phi_j(t)\}$.

When a signal has bandwidth $W$ and duration $T$, its dimension in signal space can be approximated by $2WT$.

This approximation follows from the sampling theorem: a signal band-limited to $W$ Hz is completely determined by samples taken at the Nyquist rate of $2W$ samples per second, so over a duration $T$ it is described by approximately $2WT$ independent values.

Thus, the signal space effectively has $2W$ dimensions per second.

Orthonormal Expansion

Using this orthonormal set, the waveforms can be written as:

$$x(t) = \sum_{j} x_j \phi_j(t), \qquad n(t) = \sum_{j} n_j \phi_j(t), \qquad y(t) = \sum_{j} y_j \phi_j(t),$$

where $\{\phi_j(t),\ j = 1, 2, \ldots, 2WT\}$ are orthonormal basis functions (e.g., sinc functions or prolate spheroidal wave functions) satisfying:

$$\int \phi_i(t) \phi_j(t)\, dt = \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases}$$

The expansion coefficients are:

$$x_j = \int x(t) \phi_j(t)\, dt, \quad n_j = \int n(t) \phi_j(t)\, dt, \quad y_j = \int y(t) \phi_j(t)\, dt,$$

representing the projections of the signals onto these basis functions.

Since $y(t) = x(t) + n(t)$, substituting the expansions into this relationship results in:

$$\sum_{j} y_j \phi_j(t) = \sum_{j} x_j \phi_j(t) + \sum_{j} n_j \phi_j(t).$$

By orthonormality, matching coefficients across the sums yields:

$$y_j = x_j + n_j.$$

Because $n(t)$ is white Gaussian noise with power spectral density $\frac{N_0}{2}$, the noise coefficients $n_j$ are independent and identically distributed (i.i.d.) Gaussian random variables with zero mean and variance $\sigma^2 = \frac{N_0}{2}$.

Hence, each dimension of the expansion carries a noise variance of $\frac{N_0}{2}$, consistent with the total noise power spread over the channel’s bandwidth.
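
The coefficient-level relationship can be illustrated with a finite-dimensional analogy in Python, where an orthonormal matrix stands in for the basis $\{\phi_j(t)\}$ (an assumption made purely for illustration):

```python
import numpy as np

# Discrete analogy of the orthonormal expansion: the columns of an
# orthonormal matrix Phi play the role of the basis functions phi_j(t).
rng = np.random.default_rng(0)
dim = 8
Phi, _ = np.linalg.qr(rng.normal(size=(dim, dim)))   # orthonormal basis

x_coef = rng.normal(size=dim)                # chosen signal coefficients x_j
x = Phi @ x_coef                             # synthesize the "waveform"
n = rng.normal(0, 1.0, size=dim)             # white Gaussian noise samples
y = x + n

y_coef = Phi.T @ y                           # project y onto the basis
n_coef = Phi.T @ n
print(np.allclose(y_coef, x_coef + n_coef))  # True: y_j = x_j + n_j per dimension
```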

Equivalent Discrete-Time Channel

The AWGN waveform channel can be reduced to a discrete-time model in which each output coefficient $y_j$ is related to the corresponding input coefficient $x_j$ through:

$$y_j = x_j + n_j.$$

The conditional probability density function (PDF) for each output symbol given the input symbol is:

$$p(y_j \mid x_j) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp \Bigl(-\frac{(y_j - x_j)^2}{2\sigma^2}\Bigr) = \frac{1}{\sqrt{\pi N_0}} \exp \Bigl(-\frac{(y_j - x_j)^2}{N_0}\Bigr),$$

because:

$$\sigma^2 = \frac{N_0}{2} \quad \text{and} \quad \sqrt{2\pi \sigma^2} = \sqrt{2\pi \times \frac{N_0}{2}} = \sqrt{\pi N_0}.$$

Since the noise coefficients $n_j$ are independent for different values of $j$, the overall channel is memoryless, which gives:

$$p(y_1, y_2, \ldots, y_N \mid x_1, x_2, \ldots, x_N) = \prod_{j=1}^{N} p(y_j \mid x_j).$$

Vector AWGN Model

From the relationship:

$$y_j = x_j + n_j, \quad \text{with } n_j \sim \mathcal{N}\bigl(0, \sigma^2 = N_0/2\bigr) \quad \text{for } j = 1, 2, \dots, N,$$

this can be rewritten in a compact vector form:

$$\boxed{\vec{y} = \vec{x} + \vec{n}, \quad \vec{n} \sim \mathcal{N}\Bigl(\vec{0},\ \frac{N_0}{2} \mathbf{I}\Bigr)}$$

where $\vec{y} = (y_1, \ldots, y_N)$, $\vec{x} = (x_1, \ldots, x_N)$, and $\vec{n} = (n_1, \ldots, n_N)$ are the output, input, and noise coefficient vectors, and $\mathbf{I}$ is the $N \times N$ identity matrix.
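
A one-line realization of this vector model in Python, with illustrative values for $N_0$ and $N$:

```python
import numpy as np

rng = np.random.default_rng(0)
N0, N = 2.0, 4                                   # noise PSD parameter, block length

x = np.array([1.0, -1.0, 1.0, 1.0])              # input coefficient vector
n = rng.multivariate_normal(np.zeros(N), (N0 / 2) * np.eye(N))
y = x + n                                         # vector AWGN model: y = x + n
print(y)
```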

Power Constraint and Parseval’s Theorem

The continuous-time power constraint translates directly to the discrete coefficients.

By Parseval’s theorem, for a signal of duration $T$:

$$\frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\, dt = \frac{1}{T} \sum_{j=1}^{2WT} x_j^2.$$

In this interval of length $T$, there are $2WT$ coefficients, so the average power per coefficient is:

$$\mathbb{E}[X^2] = \frac{1}{2WT} \sum_{j=1}^{2WT} \mathbb{E}[X_j^2].$$

Hence:

$$\lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\, dt = \lim_{T \to \infty} \frac{1}{T} \sum_{j=1}^{2WT} x_j^2 = 2W\, \mathbb{E}[X^2] \leq P.$$

Solving for $\mathbb{E}[X^2]$, one obtains:

$$\mathbb{E}[X^2] \leq \frac{P}{2W}.$$

Accordingly, a waveform channel of bandwidth $W$ and input power $P$ behaves like $2W$ uses per second of a discrete-time AWGN channel whose noise variance is $\sigma^2 = \frac{N_0}{2}$.

This equivalence establishes the connection between the continuous-time channel and its discrete-time counterpart.