
Performance Analysis of Linear Block Codes

Weight and Distance for Linear Block Codes

The weight of a codeword $\vec{c}$ in a code $\mathcal{C}$, denoted $w(\vec{c})$, is the number of its nonzero components.

Since the all-zero vector $\vec{0}$ is a codeword in every linear block code (due to linearity), each such code contains exactly one codeword of weight zero.

The Hamming distance between two codewords $\vec{c}_1, \vec{c}_2 \in \mathcal{C}$, denoted $d(\vec{c}_1, \vec{c}_2)$, is the number of positions in which they differ.

Notably, a codeword's weight is its Hamming distance from $\vec{0}$, i.e., $w(\vec{c}) = d(\vec{c}, \vec{0})$, linking these two concepts directly.

Relationship between Weight and Distance

In linear block codes, the distance between codewords $\vec{c}_1$ and $\vec{c}_2$ equals the weight of their difference: $d(\vec{c}_1, \vec{c}_2) = w(\vec{c}_1 - \vec{c}_2)$.

Since $\vec{c}_1 - \vec{c}_2$ is itself a codeword (due to closure under addition), this establishes a one-to-one correspondence between weights and distances.

Consequently, the set of distances from any codeword $\vec{c}$ to all other codewords matches the set of weights of all nonzero codewords, a property independent of the chosen $\vec{c}$.

This uniformity simplifies analysis: the distance profile looks the same from every codeword.

In binary codes, subtraction and addition are equivalent modulo 2, so $\vec{c}_1 - \vec{c}_2 = \vec{c}_1 + \vec{c}_2$, reinforcing this relationship.
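
As a quick illustration, here is a minimal Python sketch of these definitions; the two vectors are arbitrary binary examples, not codewords of any particular code.

```python
import numpy as np

def weight(c):
    """Hamming weight: number of nonzero components."""
    return int(np.count_nonzero(c))

def distance(c1, c2):
    """Hamming distance: number of positions where c1 and c2 differ."""
    return int(np.count_nonzero(np.asarray(c1) != np.asarray(c2)))

c1 = np.array([1, 0, 1, 1, 0])
c2 = np.array([0, 0, 1, 0, 1])

# For binary vectors, subtraction mod 2 is the same as addition mod 2 (XOR),
# so d(c1, c2) = w(c1 + c2).
assert distance(c1, c2) == weight((c1 + c2) % 2)
print(distance(c1, c2))  # 3
```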

Minimum Distance and Minimum Weight

The minimum distance of a code, $d_{\min}$, is the smallest Hamming distance between any two distinct codewords Proakis (2007, Eq. (7.2-11)):

$$d_{\min} \triangleq \min_{\substack{\vec{c}_1, \vec{c}_2 \in \mathcal{C} \\ \vec{c}_1 \neq \vec{c}_2}} d(\vec{c}_1, \vec{c}_2)$$

It determines the code's error-correcting capability: a code with minimum distance $d_{\min}$ can correct up to $t = \lfloor (d_{\min} - 1)/2 \rfloor$ errors.

The minimum weight, $w_{\min}$, is the smallest weight among nonzero codewords Proakis (2007, Eq. (7.2-11)):

$$w_{\min} \triangleq \min_{\substack{\vec{c} \in \mathcal{C} \\ \vec{c} \neq \vec{0}}} w(\vec{c})$$

For linear block codes, $d_{\min} = w_{\min}$, as the distance between any two codewords is the weight of some nonzero codeword, a key property simplifying code evaluation.
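
The equality $d_{\min} = w_{\min}$ is easy to verify by brute force for small codes. The sketch below enumerates all codewords of a hypothetical (5, 2) code (the generator matrix is chosen purely for illustration) and computes both quantities:

```python
import itertools
import numpy as np

# Hypothetical (5, 2) generator matrix, chosen only for illustration.
G = np.array([[1, 0, 1, 1, 0],
              [0, 1, 0, 1, 1]])
k, n = G.shape

# Enumerate all 2^k codewords c = w G (mod 2).
codewords = [tuple(np.mod(np.array(w) @ G, 2))
             for w in itertools.product([0, 1], repeat=k)]

# Minimum distance over all distinct pairs of codewords.
d_min = min(sum(a != b for a, b in zip(c1, c2))
            for c1, c2 in itertools.combinations(codewords, 2))

# Minimum weight over all nonzero codewords.
w_min = min(sum(c) for c in codewords if any(c))

assert d_min == w_min  # holds for every linear block code
print(d_min, w_min)    # 3 3
```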

Relationship between Minimum Weight and Parity Check Matrix $\mathbf{H}$

The minimum weight $w_{\min}$ (or equivalently $d_{\min}$) of a linear block code is closely tied to the parity check matrix $\mathbf{H}$.

A vector $\vec{c} \in \{0, 1\}^n$ is a codeword if and only if $\vec{c}\,\mathbf{H}^{\sf T} = \vec{0}$.

For a codeword of minimum weight, the $w_{\min}$ columns of $\mathbf{H}$ corresponding to its nonzero positions sum to the zero vector, so they are linearly dependent.

Conversely, fewer than $d_{\min}$ columns of $\mathbf{H}$ cannot be linearly dependent, since such a dependency would correspond to a nonzero codeword of weight less than $d_{\min}$.

Thus, $d_{\min}$ is the smallest number of linearly dependent columns of $\mathbf{H}$; equivalently, every set of $d_{\min} - 1$ or fewer columns of $\mathbf{H}$ is linearly independent, a structural characterization of the code's strength.
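
This characterization can be checked by exhaustive search. The sketch below finds the smallest set of linearly dependent columns over GF(2) of $\mathbf{H} = [\mathbf{P}^{\sf T} \,|\, \mathbf{I}_{n-k}]$, the parity check matrix of the hypothetical (5, 2) code from the previous sketch:

```python
import itertools
import numpy as np

# Parity check matrix of the hypothetical (5, 2) code above: H = [P^T | I3].
H = np.array([[1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 1, 0, 0, 1]])

def smallest_dependent_set(H):
    """Smallest number of columns of H (over GF(2)) that sum to zero."""
    n = H.shape[1]
    for size in range(1, n + 1):
        for cols in itertools.combinations(range(n), size):
            if not np.any(np.mod(H[:, list(cols)].sum(axis=1), 2)):
                return size
    return None

print(smallest_dependent_set(H))  # 3, i.e. d_min = 3
```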

Relationship Between Hamming Distance and Euclidean Distance of the Codewords

In binary antipodal signaling, such as BPSK modulation, the binary components (0 and 1) of a codeword $\vec{c} \in \mathcal{C}$ are mapped to $-\sqrt{\mathcal{E}_c}$ and $+\sqrt{\mathcal{E}_c}$, respectively, where $\mathcal{E}_c$ is the energy per codeword component.

For the modulated sequence $\vec{s}_m$ corresponding to codeword $\vec{c}_m$, the components are:

$$s_{mj} = (2c_{mj} - 1)\sqrt{\mathcal{E}_c}, \quad 1 \leq j \leq n, \quad 1 \leq m \leq M$$

This mapping transforms binary differences into signal amplitudes; here $M = 2^k$ is the number of codewords.

The squared Euclidean distance between two modulated sequences $\vec{s}_m$ and $\vec{s}_{m'}$ relates to the Hamming distance $d(\vec{c}_m, \vec{c}_{m'})$ as Proakis (2007, Eq. (7.2-14)):

$$d_{\vec{s}_m,\vec{s}_{m'}}^2 = 4\mathcal{E}_c \, d(\vec{c}_m, \vec{c}_{m'})$$

Each differing bit contributes a distance of $2\sqrt{\mathcal{E}_c}$, which squares to $4\mathcal{E}_c$.

Consequently, the minimum Euclidean distance $d_{\mathcal{E}_{\min}}$ of the modulated sequences is Proakis (2007, Eq. (7.2-15)):

$$d_{\mathcal{E}_{\min}}^2 = 4\mathcal{E}_c d_{\min} = 4 R_c \mathcal{E}_b d_{\min}$$

where $d_{\min}$ is the minimum Hamming distance, $R_c = k/n$ is the code rate, and $\mathcal{E}_b$ is the energy per information bit (so that $\mathcal{E}_c = R_c \mathcal{E}_b$), linking physical signal separation to code structure.
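
A short numerical check of Eq. (7.2-14), assuming the BPSK mapping above and an arbitrary pair of binary words:

```python
import numpy as np

E_c = 1.0  # energy per codeword component (assumed value)

def bpsk(c):
    """Map bits {0, 1} to amplitudes {-sqrt(Ec), +sqrt(Ec)}."""
    return (2 * np.asarray(c) - 1) * np.sqrt(E_c)

c1 = np.array([1, 0, 1, 1, 0, 1, 0])
c2 = np.array([1, 1, 0, 1, 0, 0, 0])

d_hamming = int(np.count_nonzero(c1 != c2))
d_euclid_sq = float(np.sum((bpsk(c1) - bpsk(c2)) ** 2))

# Each differing position contributes (2*sqrt(Ec))^2 = 4*Ec.
assert np.isclose(d_euclid_sq, 4 * E_c * d_hamming)
print(d_hamming, d_euclid_sq)  # 3 12.0
```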

The Weight Distribution Polynomial

An $(n, k)$ linear block code comprises $2^k$ codewords with weights ranging from 0 to $n$.

The presence of the all-zero codeword ensures one codeword of weight 0, while nonzero codewords have weights from $d_{\min}$ (the minimum distance) to $n$.

The weight distribution polynomial (WEP), or weight enumeration function (WEF), denoted $A(Z)$, quantifies this distribution Proakis (2007, Eq. (7.2-16)):

$$A(Z) \triangleq \sum_{i=0}^{n} A_i Z^i = 1 + \sum_{i=d_{\min}}^{n} A_i Z^i$$

Here, $A_i$ is the number of codewords of weight $i$.

The polynomial encodes the code's weight profile, starting with $A_0 = 1$ (for $\vec{0}$) and detailing the frequency of each nonzero weight, a critical tool for analyzing error-correcting performance.
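
Computing the weight distribution by enumerating all $2^k$ codewords is straightforward for small codes. A sketch, reusing the hypothetical (5, 2) generator matrix from the earlier sketches:

```python
import itertools
from collections import Counter
import numpy as np

def weight_distribution(G):
    """Count codewords of each weight for a generator matrix G over GF(2)."""
    k, n = G.shape
    counts = Counter()
    for w in itertools.product([0, 1], repeat=k):
        c = np.mod(np.array(w) @ G, 2)
        counts[int(c.sum())] += 1
    return counts  # counts[i] = A_i

G = np.array([[1, 0, 1, 1, 0],
              [0, 1, 0, 1, 1]])
A = weight_distribution(G)
print(sorted(A.items()))  # [(0, 1), (3, 2), (4, 1)] -> A(Z) = 1 + 2Z^3 + Z^4
```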

Input-Output Weight Enumeration Function

The input-output weight enumeration function (IOWEF), denoted $B(Y, Z)$, extends the WEP by incorporating both codeword weights and the weights of the corresponding information sequences Proakis (2007, Eq. (7.2-25)):

$$B(Y, Z) \triangleq \sum_{i=0}^{n} \sum_{j=0}^{k} B_{ij} Y^j Z^i$$

where $B_{ij}$ is the number of codewords of weight $i$ generated by information sequences of weight $j$.

This bivariate polynomial relates input and output weights, with:

$$A_i = \sum_{j=0}^{k} B_{ij}$$

summing contributions across all input weights to yield the WEP’s coefficients.

For linear block codes, $B(0, 0) = B_{00} = 1$ (the zero input maps to the zero codeword), and:

$$A(Z) = B(Y, Z) \Big|_{Y=1}$$

collapsing B(Y,Z)B(Y, Z) to A(Z)A(Z) by summing over all input weights, providing a comprehensive view of code behavior.
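
The IOWEF coefficients can be tabulated the same way as the weight distribution, tracking the input weight alongside the codeword weight; the sketch below also verifies $A_i = \sum_j B_{ij}$ for the hypothetical (5, 2) code:

```python
import itertools
from collections import Counter
import numpy as np

def iowef(G):
    """B[(i, j)] = number of weight-i codewords produced by weight-j inputs."""
    k, n = G.shape
    B = Counter()
    for w in itertools.product([0, 1], repeat=k):
        c = np.mod(np.array(w) @ G, 2)
        B[(int(c.sum()), sum(w))] += 1
    return B

G = np.array([[1, 0, 1, 1, 0],
              [0, 1, 0, 1, 1]])
B = iowef(G)

# Check A_i = sum_j B_ij: summing over input weights recovers the WEP.
A = Counter()
for (i, j), count in B.items():
    A[i] += count
print(sorted(B.items()))  # [((0, 0), 1), ((3, 1), 2), ((4, 2), 1)]
print(sorted(A.items()))  # [(0, 1), (3, 2), (4, 1)]
```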

Conditional Weight Enumeration Function

The conditional weight enumeration function (CWEF), denoted $B_j(Z)$, focuses on codewords tied to information sequences of a specific weight $j$ Proakis (2007, Eq. (7.2-28)):

$$B_j(Z) \triangleq \sum_{i=0}^{n} B_{ij} Z^i$$

It enumerates the weight distribution of all codewords generated by inputs of weight jj, offering a detailed breakdown by input weight.

The relationship to the IOWEF is derived as:

$$B_j(Z) = \frac{1}{j!} \left. \frac{\partial^j}{\partial Y^j} B(Y, Z) \right|_{Y=0}$$

This partial derivative extracts the coefficient of $Y^j$ in $B(Y, Z)$, isolating the contribution of inputs of weight $j$.

Together, these functions provide a layered analysis of the code’s structure, connecting input patterns to output characteristics.
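
The derivative formula can be verified symbolically, for instance with sympy; the sketch below applies it to the IOWEF of the (7, 4) code worked out in the example that follows:

```python
import sympy as sp

Y, Z = sp.symbols('Y Z')

# IOWEF of the (7, 4) code from the example below.
B = (1 + 3*Y*Z**3 + 3*Y**2*Z**3 + Y**3*Z**3
       + Y*Z**4 + 3*Y**2*Z**4 + 3*Y**3*Z**4 + Y**4*Z**7)

# CWEF: B_j(Z) = (1/j!) * d^j B / dY^j, evaluated at Y = 0.
for j in range(5):
    B_j = sp.diff(B, Y, j).subs(Y, 0) / sp.factorial(j)
    print(f"B_{j}(Z) =", sp.expand(B_j))
# B_1(Z) = 3*Z**3 + Z**4, B_2(Z) = 3*Z**3 + 3*Z**4, ...
```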

EXAMPLE: (7, 4) Linear Block Code Weight Distribution

For the (7, 4) linear block code from a prior example, there are $2^4 = 16$ codewords, with weights ranging from 0 to 7.

By generating all codewords from the information sequences $\vec{w} = (w_1, w_2, w_3, w_4)$, the minimum distance is verified to be $d_{\min} = 3$.

The weight distribution reveals one codeword of weight 0 (the all-zero codeword), one of weight 7, seven of weight 3, and seven of weight 4.

Thus, the weight distribution polynomial $A(Z)$ is:

$$A(Z) = 1 + 7Z^3 + 7Z^4 + Z^7$$

This polynomial encapsulates the frequency of each weight, providing a concise summary of the code’s structure and its error-correcting potential based on weight diversity.

Input-Output Weight Enumeration Function

For the same code, the input-output weight enumeration function (IOWEF) $B(Y, Z)$ details both input and output weights.

The nonzero coefficients are $B_{00} = 1$, $B_{31} = 3$, $B_{32} = 3$, $B_{33} = 1$, $B_{41} = 1$, $B_{42} = 3$, $B_{43} = 3$, and $B_{74} = 1$, yielding:

$$B(Y, Z) = 1 + 3YZ^3 + 3Y^2Z^3 + Y^3Z^3 + YZ^4 + 3Y^2Z^4 + 3Y^3Z^4 + Y^4Z^7$$

The conditional weight enumeration functions (CWEF) are:

$$B_0(Z) = 1, \quad B_1(Z) = 3Z^3 + Z^4, \quad B_2(Z) = 3Z^3 + 3Z^4, \quad B_3(Z) = Z^3 + 3Z^4, \quad B_4(Z) = Z^7$$

These functions map input weights (0 to 4) to output weight distributions, offering a detailed view of how information sequence weights influence codeword weights.
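
These coefficients can be reproduced by enumeration. Since the prior example's generator matrix is not restated here, the sketch below assumes one systematic (7, 4) Hamming generator matrix that is consistent with the stated coefficients:

```python
import itertools
from collections import Counter
import numpy as np

# Assumed systematic generator matrix G = [I4 | P]; the parity part P is an
# assumption, chosen to match the coefficients stated in the example.
G = np.hstack([np.eye(4, dtype=int),
               np.array([[1, 1, 0],
                         [0, 1, 1],
                         [1, 1, 1],
                         [1, 0, 1]])])

B = Counter()
for w in itertools.product([0, 1], repeat=4):
    c = np.mod(np.array(w) @ G, 2)
    B[(int(c.sum()), sum(w))] += 1  # (codeword weight i, input weight j)

print(sorted(B.items()))
# [((0, 0), 1), ((3, 1), 3), ((3, 2), 3), ((3, 3), 1),
#  ((4, 1), 1), ((4, 2), 3), ((4, 3), 3), ((7, 4), 1)]
```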

Error Probability of Linear Block Codes

Linear block codes involve two key error probabilities.

The block error probability (or word error probability) is the probability of transmitting a codeword $\vec{c}_m$ and decoding it as a different codeword $\vec{c}_{m'}$, reflecting whole-codeword errors.

The bit error probability is the probability that an individual information bit is received incorrectly, focusing on per-bit accuracy.

These metrics assess different aspects of code performance, with block errors impacting entire messages and bit errors affecting data integrity at a finer level.

Block Error Probability

Due to the linearity of the code, the set of distances from any codeword $\vec{c}_m$ to all other codewords is the same for every choice of $\vec{c}_m$.

Thus, the analysis can assume, without loss of generality, that the all-zero codeword $\vec{0}$ is transmitted.

The block error probability $P_e$ is the probability that the receiver decodes to some nonzero codeword $\vec{c}_m \neq \vec{0}$.

This is bounded by the sum of the pairwise error probabilities $P_{\vec{0} \rightarrow \vec{c}_m}$, each the probability of mistaking $\vec{0}$ for $\vec{c}_m$:

$$P_e \leq \sum_{\substack{\vec{c}_m \in \mathcal{C} \\ \vec{c}_m \neq \vec{0}}} P_{\vec{0} \rightarrow \vec{c}_m}$$

This union bound accounts for all possible error events, providing an upper limit on the likelihood of decoding errors.

Upper Bound of $P_e$

The pairwise error probability $P_{\vec{0} \rightarrow \vec{c}_m}$ depends on the Hamming distance from $\vec{0}$ to $\vec{c}_m$, which equals $w(\vec{c}_m)$, and varies with the modulation.

For codewords of equal weight, $P_{\vec{0} \rightarrow \vec{c}_m}$ is identical, leading to Proakis (2007, Eq. (7.2-34)):

$$\boxed{ P_e \leq \sum_{i=d_{\min}}^{n} A_i P_2(i) }$$

where $P_2(i)$ is the pairwise error probability between two codewords at Hamming distance $i$, and $A_i$ is the number of codewords of weight $i$.

Using the Bhattacharyya bound for a memoryless channel:

$$P_{\vec{0} \to \vec{c}_m} \leq \prod_{i=1}^{n} \sum_{y_i \in \mathcal{Y}} \sqrt{p(y_i|0)\, p(y_i|c_{mi})}$$

Defining $\Delta = \sum_{y \in \mathcal{Y}} \sqrt{p(y|0)\, p(y|1)}$, and noting that each position where $c_{mi} = 1$ contributes a factor $\Delta$ while each position where $c_{mi} = 0$ contributes $\sum_{y} p(y|0) = 1$, this simplifies to:

$$P_{\vec{0} \to \vec{c}_m} = P_2(w(\vec{c}_m)) \leq \Delta^{w(\vec{c}_m)}$$

Thus:

$$P_e \leq \sum_{i=d_{\min}}^{n} A_i \Delta^i = A(\Delta) - 1$$

Since $\Delta \leq 1$ (which follows from $\sum_{y} \left( \sqrt{p(y|0)} - \sqrt{p(y|1)} \right)^2 \geq 0$), and $\Delta^i \leq \Delta^{d_{\min}}$ for $i \geq d_{\min}$, a looser bound is Proakis (2007, Eq. (7.2-43)):

$$\boxed{ P_e \leq (2^k - 1)\Delta^{d_{\min}} }$$
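
As a numerical illustration, the sketch below evaluates both bounds for the (7, 4) example code on a binary symmetric channel, for which the Bhattacharyya parameter is $\Delta = 2\sqrt{p(1-p)}$; the crossover probability $p$ is an assumed value:

```python
import numpy as np

# Bhattacharyya parameter for a BSC with crossover probability p (assumed).
p = 0.01
Delta = 2 * np.sqrt(p * (1 - p))

# Weight distribution of the (7, 4) example code: A_3 = 7, A_4 = 7, A_7 = 1.
A = {3: 7, 4: 7, 7: 1}
k, d_min = 4, 3

P_e_union = sum(A_i * Delta**i for i, A_i in A.items())  # A(Delta) - 1
P_e_loose = (2**k - 1) * Delta**d_min                    # looser bound

# The union bound is the tighter of the two.
print(f"{P_e_union:.3e}  {P_e_loose:.3e}")
```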

Bit Error Probability

Bit error probabilities vary across the $k$ positions of an information sequence.

The average bit error probability $P_b$ is defined as the mean error rate per information bit.

Assuming $\vec{0}$ is transmitted, the probability of decoding to a given codeword of weight $i$ is bounded by $P_2(i)$, and there are $B_{ij}$ codewords of weight $i$ generated by information sequences of weight $j$.

The expected number of bit errors is then bounded as Proakis (2007, Eq. (7.2-44)):

$$\bar{b} \leq \sum_{j=0}^{k} j \sum_{i=d_{\min}}^{n} B_{ij} P_2(i)$$

Each such error event corrupts $j$ information bits, hence the weighting by the input weight $j$.

Since $B_{ij} = 0$ for $0 < i < d_{\min}$ and the $j = 0$ term vanishes, the bound extends over all $i$:

$$\bar{b} \leq \sum_{j=0}^{k} j \sum_{i=0}^{n} B_{ij} P_2(i)$$

The average bit error probability is Proakis (2007, Eq. (7.2-46)):

$$P_b = \frac{\bar{b}}{k} \leq \frac{1}{k} \sum_{j=0}^{k} j \sum_{i=0}^{n} B_{ij} P_2(i) \leq \frac{1}{k} \sum_{j=0}^{k} j \sum_{i=0}^{n} B_{ij} \Delta^i$$

This normalizes the expected number of bit errors by $k$, using $P_2(i) \leq \Delta^i$ to bound the error rate, reflecting the code's per-bit reliability.
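
Finally, the bit error bound can be evaluated from the $B_{ij}$ table of the (7, 4) example code under the same assumed BSC model:

```python
import numpy as np

# B_ij coefficients of the (7, 4) example code, keyed as {(i, j): B_ij}.
B = {(3, 1): 3, (3, 2): 3, (3, 3): 1,
     (4, 1): 1, (4, 2): 3, (4, 3): 3,
     (7, 4): 1}
k = 4

p = 0.01                          # BSC crossover probability (assumed)
Delta = 2 * np.sqrt(p * (1 - p))  # Bhattacharyya parameter

# P_b <= (1/k) * sum_j j * sum_i B_ij * Delta^i
P_b = sum(j * B_ij * Delta**i for (i, j), B_ij in B.items()) / k
print(f"P_b <= {P_b:.3e}")
```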

References
  1. Proakis, J. (2007). Digital Communications (5th ed.). McGraw-Hill Professional.