
Hard Decision Decoding

Soft decision bounds assume unquantized matched filter outputs, optimizing performance but requiring $M = 2^k$ correlation metrics, which becomes computationally intensive for large $M$.

Hard decision decoding (HDD) quantizes these outputs digitally, reducing complexity.

By making binary decisions (0 or 1) per bit, HDD trades some performance for practicality, especially beneficial when the number of codewords is large, simplifying receiver design at the cost of precision.

HDD with Binary Decision

In HDD, each bit’s matched filter output is quantized to two levels (0 or 1), forming a binary symmetric channel (BSC) with crossover probability given by Proakis (2007, Eq. (7.5-1)):

$$p = Q\left(\sqrt{\frac{2 \mathcal{E}_c}{N_0}}\right) = Q\left(\sqrt{2 \gamma_b R_c}\right)$$

Here, $Q(\cdot)$ is the Gaussian tail function, and $\mathcal{E}_c = R_c \mathcal{E}_b$ ties error probability to SNR and code rate, modeling the AWGN channel’s effect post-quantization.
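
As a numerical illustration, a minimal Python sketch of this relationship; the code rate and SNR values below are arbitrary assumptions, not taken from the text:

```python
import math

def qfunc(x):
    # Gaussian tail function: Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * math.erfc(x / math.sqrt(2))

# Assumed example: a rate-1/2 code evaluated at a few SNR-per-bit values (in dB).
Rc = 0.5
for gamma_b_dB in (2, 4, 6, 8):
    gamma_b = 10 ** (gamma_b_dB / 10)        # Eb/N0 on a linear scale
    p = qfunc(math.sqrt(2 * gamma_b * Rc))   # BSC crossover probability
    print(f"gamma_b = {gamma_b_dB} dB -> p = {p:.3e}")
```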

Minimum-Distance (Maximum-Likelihood) Decoding

In HDD, the decoder receives $n$ bits and compares them to all $M$ possible codewords, selecting the one with the smallest Hamming distance to the received sequence.

This minimum-distance decoding rule is optimal for a BSC, minimizing block error probability by choosing the most likely transmitted codeword based on bit differences, aligning with maximum-likelihood principles under binary quantization.

Minimum-Distance Decoding Rule Implementation

A straightforward but inefficient method computes error vectors $\vec{e}_m = \vec{y} + \vec{c}_m$ (modulo-2) for all $M$ codewords $\vec{c}_m$, where $\vec{y}$ is the received sequence.

Each $\vec{e}_m$ represents the error pattern transforming $\vec{c}_m$ into $\vec{y}$, with its weight (number of 1s) equaling the Hamming distance.

Selecting the $\vec{c}_m$ yielding the smallest-weight $\vec{e}_m$ implements minimum-distance decoding, though this exhaustive approach scales poorly with $M$.
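
A brute-force sketch of this rule in Python; the codebook used here is the (5, 2) code introduced later in this section, chosen only for illustration, and the received vector is an arbitrary assumption:

```python
import numpy as np

def min_distance_decode(y, codewords):
    """Exhaustive minimum-distance decoding: add y to every codeword mod 2
    and pick the codeword whose error pattern has the smallest weight."""
    y = np.asarray(y, dtype=int)
    errors = (codewords + y) % 2        # e_m = y + c_m (mod 2)
    weights = errors.sum(axis=1)        # Hamming weight of each error pattern
    best = int(np.argmin(weights))
    return codewords[best], errors[best]

# Assumed toy codebook: the (5, 2) code used later in this section.
C = np.array([[0,0,0,0,0],
              [0,1,0,1,1],
              [1,0,1,0,1],
              [1,1,1,1,0]])
y = np.array([1,0,1,0,0])               # assumed received sequence
c_hat, e_hat = min_distance_decode(y, C)
print("decoded codeword:", c_hat, "error pattern:", e_hat)
```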

Syndrome

A more efficient HDD method uses the parity check matrix $\mathbf{H}$.

If $\vec{c}_m$ is transmitted and $\vec{y} = \vec{c}_m + \vec{e}$ is received (where $\vec{e}$ is the error vector), the syndrome is:

$$\vec{s} = \vec{y} \, \mathbf{H}^{\sf T} = \vec{c}_m \, \mathbf{H}^{\sf T} + \vec{e} \, \mathbf{H}^{\sf T} = \vec{e} \, \mathbf{H}^{\sf T}$$

Since $\vec{c}_m \, \mathbf{H}^{\sf T} = \vec{0}$, $\vec{s}$ depends only on $\vec{e}$.

This $(n - k)$-dimensional vector identifies parity check failures, enabling error pattern detection without exhaustively testing all codewords, significantly enhancing decoding efficiency.
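
A short sketch of the syndrome computation; the parity check matrix below is the one implied by the systematic (5, 2) generator matrix used later ($\mathbf{H} = [\mathbf{P}^{\sf T} \,|\, \mathbf{I}_{n-k}]$ for $\mathbf{G} = [\mathbf{I}_k \,|\, \mathbf{P}]$), so treat it as an assumption here:

```python
import numpy as np

def syndrome(y, H):
    # s = y H^T (mod 2); depends only on the error pattern, not the codeword
    return (np.asarray(y, dtype=int) @ H.T) % 2

# Assumed parity check matrix of the (5, 2) code used later in this section.
H = np.array([[1,0,1,0,0],
              [0,1,0,1,0],
              [1,1,0,0,1]])

c = np.array([1,0,1,0,1])               # a valid codeword
e = np.array([0,0,0,0,1])               # a single-bit error
print(syndrome(c, H))                   # [0 0 0]: no parity failures
print(syndrome((c + e) % 2, H))         # equals the syndrome of e alone
```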

Detectable and Correctable Capability of HDD

The syndrome $\vec{s} = \vec{e} \, \mathbf{H}^{\sf T}$ reflects the error pattern $\vec{e}$ rather than the transmitted codeword, as $\vec{c}_m \, \mathbf{H}^{\sf T} = \vec{0}$.

A zero syndrome ($\vec{s} = \vec{0}$) indicates $\vec{e}$ is a codeword, leading to an undetected error if $\vec{e} \neq \vec{0}$.

Of the $2^n - 1$ nonzero error patterns, $2^k - 1$ match nonzero codewords and are undetectable.

The remaining $2^n - 2^k$ are detectable, as their syndromes are nonzero, but not all are correctable.

With only $2^{n-k}$ distinct syndromes, multiple error patterns map to the same syndrome, limiting correction to one per coset.

Maximum-likelihood (ML) decoding seeks the least-weight error pattern per syndrome, optimizing error correction within this constraint.

Standard Array

A standard array is a decoding table for an $(n, k)$ code, listing all $2^k$ codewords in the first row, starting with $\vec{c}_1 = \vec{0}$.

The second row begins with $\vec{e}_2$, the minimum-weight non-codeword sequence, followed by $\vec{c}_m + \vec{e}_2$ for each codeword $\vec{c}_m$.

Subsequent rows follow similarly: select the minimum-weight sequence $\vec{e}_i$ not yet listed, placing it in the first column, and complete the row with $\vec{c}_m + \vec{e}_i$.

This process continues until all $2^n$ sequences of length $n$ are included, systematically organizing received sequences by error patterns for decoding purposes.

Standard Array Structure

The resulting table is a $2^{n-k} \times 2^k$ array:

$$
\begin{array}{llcl} \vec{c}_1 = \vec{0} & \vec{c}_2 & \ldots & \vec{c}_{2^k} \\ \vec{e}_2 & \vec{c}_2 + \vec{e}_2 & \ldots & \vec{c}_{2^k} + \vec{e}_2 \\ \vec{e}_3 & \vec{c}_2 + \vec{e}_3 & \ldots & \vec{c}_{2^k} + \vec{e}_3 \\ \vdots & \vdots & \ddots & \vdots \\ \vec{e}_{2^{n-k}} & \vec{c}_2 + \vec{e}_{2^{n-k}} & \ldots & \vec{c}_{2^k} + \vec{e}_{2^{n-k}} \end{array}
$$

Each row, a coset, groups $2^k$ sequences sharing the same error pattern (first column), called the coset leader.

By construction, coset leaders have the lowest weight in their coset, aligning with ML decoding’s preference for minimal error correction.
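
The construction can be sketched directly: enumerate all length-$n$ sequences in order of increasing weight, group them into cosets by syndrome, and keep the first (lowest-weight) member of each coset as its leader. A minimal sketch under the same assumed $\mathbf{H}$ of the (5, 2) code below; note that when several patterns of equal weight share a coset, the tie-break here may differ from the array printed in the example:

```python
import itertools
import numpy as np

def coset_leaders(H):
    """First column of the standard array: for each syndrome, keep the
    lowest-weight sequence producing it (the coset leader)."""
    n = H.shape[1]
    leaders = {}
    # Enumerate all 2^n sequences, lowest Hamming weight first.
    for seq in sorted(itertools.product((0, 1), repeat=n), key=sum):
        e = np.array(seq)
        s = tuple(int(v) for v in (e @ H.T) % 2)
        leaders.setdefault(s, e)            # first (lowest-weight) hit wins
    return leaders

H = np.array([[1,0,1,0,0],
              [0,1,0,1,0],
              [1,1,0,0,1]])                 # assumed H of the (5, 2) code below
for s, e in coset_leaders(H).items():
    print("syndrome", s, "-> leader", e)
```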

EXAMPLE: Standard Array for (5, 2) Code

For a (5, 2) systematic code with generator matrix:

$$\mathbf{G} = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \end{bmatrix}$$

the standard array is:

$$
\begin{array}{cccc} 00000 & 01011 & 10101 & 11110 \\ 00001 & 01010 & 10100 & 11111 \\ 00010 & 01001 & 10111 & 11100 \\ 00100 & 01111 & 10001 & 11010 \\ 01000 & 00011 & 11101 & 10110 \\ 10000 & 11011 & 00101 & 01110 \\ 11000 & 10011 & 01101 & 00110 \\ 10010 & 11001 & 00111 & 01100 \end{array}
$$

With $d_{\min} = 3$, the coset leaders comprise one weight-0 pattern (00000), five weight-1 patterns (e.g., 00001), and two weight-2 patterns (11000 and 10010), filling the $2^{5-2} = 8$ rows; the remaining double-error patterns cannot serve as coset leaders because only 8 cosets exist.
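
A quick sketch that reproduces the codewords and $d_{\min}$ from this $\mathbf{G}$ (the message ordering is an arbitrary choice):

```python
import itertools
import numpy as np

G = np.array([[1,0,1,0,1],
              [0,1,0,1,1]])

# All 2^k codewords: every binary message times G, reduced mod 2.
codewords = np.array([(np.array(m) @ G) % 2
                      for m in itertools.product((0, 1), repeat=G.shape[0])])
print(codewords)

# For a linear code, d_min equals the smallest nonzero codeword weight.
weights = codewords.sum(axis=1)
print("d_min =", weights[weights > 0].min())   # -> 3
```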

Error Detection

For a transmitted codeword $\vec{c}_m$ and error pattern $\vec{e}_i$ (a coset leader), the received sequence is $\vec{y} = \vec{c}_m + \vec{e}_i$.

The syndrome is:

$$\vec{s} = \vec{y} \, \mathbf{H}^{\sf T} = \vec{e}_i \, \mathbf{H}^{\sf T}$$

All sequences in a coset share the same syndrome, as it depends solely on $\vec{e}_i$.

With $2^{n-k}$ cosets, each has a unique syndrome, establishing a one-to-one mapping between cosets (or their leaders) and syndromes, enabling precise error pattern identification within the coset framework.

Decoding with Syndromes

Decoding $\vec{y}$ involves finding the least-weight error $\vec{e}_i$ such that $\vec{s} = \vec{e}_i \, \mathbf{H}^{\sf T}$.

Each syndrome uniquely identifies a coset, and the coset leader $\vec{e}_i$ (the lowest-weight member) is the ML error estimate.

Adding $\vec{e}_i$ to $\vec{y}$ yields the decoded codeword $\vec{c}_m = \vec{y} + \vec{e}_i$. Only coset leaders (up to $2^{n-k} - 1$ nonzero patterns) are correctable.

Of $2^n - 1$ nonzero error patterns, $2^k - 1$ are undetectable (codewords), $2^n - 2^k$ are detectable, and $2^{n-k} - 1$ are correctable, delineating the code’s error-handling limits.

Example: Decoding Error in (5, 2) Code

For the (5, 2) code with the standard array previously defined, consider an error vector $\vec{e} = (1 \, 0 \, 1 \, 0 \, 0)$.

The syndrome, computed as $\vec{s} = \vec{e} \, \mathbf{H}^{\sf T} = (0 \, 0 \, 1)$, corresponds to the coset leader $\hat{\vec{e}} = (0 \, 0 \, 0 \, 0 \, 1)$ from the syndrome table.

Adding $\hat{\vec{e}}$ to the received sequence $\vec{y}$ yields an incorrect codeword, as the actual error (weight 2) differs from the decoded error (weight 1), illustrating a decoding failure due to syndrome ambiguity.

Syndromes and Coset Leaders

The syndrome table for this code is presented in Table 1.

Table 1: Syndrome to Error Pattern Mapping

| Syndrome | Error Pattern |
| --- | --- |
| 000 | 00000 |
| 001 | 00001 |
| 010 | 00010 |
| 100 | 00100 |
| 011 | 01000 |
| 101 | 10000 |
| 110 | 11000 |
| 111 | 10010 |

This code corrects all five single-error patterns (weight 1) and two specific double-error patterns (11000, 10010), limited by the $2^{5-2} = 8$ syndromes.

MATLAB’s error detection and correction tools can simulate such scenarios, confirming the code’s capabilities.
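A Python sketch of the same check; the syndrome-to-leader mapping below is Table 1 entered by hand, $\mathbf{H}$ is the assumed parity check matrix of this code, and the error vector is the double-error pattern from the example above:

```python
import numpy as np

H = np.array([[1,0,1,0,0],
              [0,1,0,1,0],
              [1,1,0,0,1]])                 # assumed H of the (5, 2) code

# Table 1, entered by hand: syndrome tuple -> coset leader.
leaders = {
    (0,0,0): (0,0,0,0,0), (0,0,1): (0,0,0,0,1), (0,1,0): (0,0,0,1,0),
    (1,0,0): (0,0,1,0,0), (0,1,1): (0,1,0,0,0), (1,0,1): (1,0,0,0,0),
    (1,1,0): (1,1,0,0,0), (1,1,1): (1,0,0,1,0),
}

c = np.array([0,0,0,0,0])                   # transmitted codeword
e = np.array([1,0,1,0,0])                   # actual double error (not a coset leader)
y = (c + e) % 2
s = tuple(int(v) for v in (y @ H.T) % 2)    # syndrome = (0, 0, 1)
e_hat = np.array(leaders[s])                # decoder picks the weight-1 leader 00001
c_hat = (y + e_hat) % 2
print("decoded:", c_hat, "correct?", np.array_equal(c_hat, c))  # False: decoding failure
```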

Error Detection Capability

A zero syndrome ($\vec{s} = \vec{0}$) indicates the received sequence is a valid codeword, one of $2^k$ possibilities.

Since $d_{\min}$ is the minimum separation between codewords, an error pattern of weight $d_{\min}$ can transform one codeword into another, resulting in an undetected error.

For fewer than $d_{\min}$ errors, the syndrome is nonzero, signaling detection.

Thus, an $(n, k)$ code detects up to $d_{\min} - 1$ errors, enabling use with automatic repeat-request (ARQ) systems to request retransmission upon detection.

Error Correction Capability

The error correction capability hinges on $d_{\min}$, constrained by the $2^{n-k}$ syndromes.

Viewing the $2^k$ codewords as points in an $n$-dimensional space, each is the center of a sphere of Hamming radius $t$.

The maximum $t$ avoiding sphere overlap is:

$$\boxed{ t = \left\lfloor \frac{1}{2}(d_{\min} - 1) \right\rfloor }$$

Sequences within radius $t$ of a codeword are corrected to it, ensuring unique decoding.

Thus, the code corrects up to $t$ errors, determined by $d_{\min}$, balancing the trade-off between detection and correction.

Trade-off Between Detection and Correction ($e_d$ and $e_c$)

Correcting $t$ errors implies detecting at least $t$, but a code can detect more errors by reducing correction capability.

For $d_{\min} = 7$, $t = \left\lfloor \frac{6}{2} \right\rfloor = 3$ errors are correctable.

Reducing the sphere radius to 2 allows detection of 4 errors, correcting only 2; or to 1, detecting 5 and correcting 1.

Generally, for $d_{\min}$, the number of detectable errors $e_d$ and correctable errors $e_c$ satisfy:

$$e_d + e_c \leq d_{\min} - 1, \quad e_c \leq e_d$$

This flexibility allows tailoring the code for detection-heavy (e.g., ARQ) or correction-focused applications, optimizing performance based on channel needs.
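
A tiny sketch enumerating admissible $(e_c, e_d)$ pairs for a given $d_{\min}$, listing for each correction capability the largest detectable count ($d_{\min} = 7$ matches the example above):

```python
def detect_correct_pairs(d_min):
    # All (e_c, e_d) with e_c + e_d = d_min - 1 and e_c <= e_d,
    # from full correction down to detection-only operation.
    return [(e_c, d_min - 1 - e_c) for e_c in range((d_min - 1) // 2 + 1)]

print(detect_correct_pairs(7))   # [(0, 6), (1, 5), (2, 4), (3, 3)]
```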

Block and Bit Error Probability for HDD

In hard decision decoding (HDD) of linear binary block codes, error probability bounds are derived based solely on error correction.

The optimum decoder for a binary symmetric channel (BSC) correctly decodes if the number of errors is less than half the minimum distance $d_{\min}$, though this is not a strict necessity.

The guaranteed correctable error count is:

$$t = \left\lfloor \frac{1}{2}(d_{\min} - 1) \right\rfloor$$

This $t$ represents the maximum number of errors the code can always correct, leveraging $d_{\min}$ to ensure distinct codeword spheres in Hamming space.

Error Probability Bounds

Given the BSC’s memoryless nature, bit errors are independent, with the probability of $m$ errors in $n$ bits given by Proakis (2007, Eq. (7.5-5)):

$$P(m, n) = \binom{n}{m} p^m (1 - p)^{n-m}$$

where $p$ is the crossover probability.

The block error probability $P_e$ is upper-bounded by the probability of exceeding $t$ errors, given by Proakis (2007, Eq. (7.5-6)):

$$P_e \leq \sum_{m=t+1}^{n} P(m, n)$$

For high SNR (small $p$), this approximates to the dominant term:

$$P_e \approx \binom{n}{t + 1} p^{t+1}(1 - p)^{n-t-1}$$

This binomial approximation highlights the steep decline in error probability as $p$ decreases, reflecting the code’s robustness at high SNR.
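
A numeric sketch of the bound and its dominant-term approximation; the code parameters (a single-error-correcting code of length 15) and the crossover probabilities are assumptions chosen for illustration:

```python
from math import comb

def block_error_bound(n, t, p):
    # P_e <= sum_{m=t+1}^{n} C(n, m) p^m (1-p)^(n-m)
    return sum(comb(n, m) * p**m * (1 - p)**(n - m) for m in range(t + 1, n + 1))

def dominant_term(n, t, p):
    # High-SNR approximation: the m = t+1 term dominates for small p
    return comb(n, t + 1) * p**(t + 1) * (1 - p)**(n - t - 1)

n, t = 15, 1                      # assumed example: length-15 code correcting t = 1 error
for p in (1e-2, 1e-3, 1e-4):
    print(p, block_error_bound(n, t, p), dominant_term(n, t, p))
```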

Perfect Code and Quasi-Perfect Code

Equality in the block error bound holds for perfect codes, where every one of the $2^n$ possible sequences lies within Hamming distance $t$ of exactly one codeword, fully packing the Hamming space.

Hamming codes ($n = 2^{n-k} - 1$, $d_{\min} = 3$, $t = 1$) exemplify this, correcting all single errors precisely.

Quasi-perfect codes have disjoint spheres of radius $t$ around the $M$ codewords, with every sequence at most $t + 1$ from a codeword.

Their error probability is Proakis (2007, Eq. (7.5-14)):

$$P_e = \sum_{m=t+2}^{n} P(m, n) + \left[ \binom{n}{t+1} - \beta_{t+1} \right] p^{t+1}(1 - p)^{n-t-1}$$

Here, $\beta_{t+1}$ adjusts for sequences at distance $t + 1$ that are not uniquely correctable, refining the bound for such codes.

Alternative Upper and Lower Bounds of $P_e$

Considering two codewords at distance $d_{\min}$, the lower bound on $P_e$ is the probability of mistaking one for its nearest neighbor Proakis (2007, Eq. (7.5-15)):

$$P_e \geq \sum_{m=\left\lfloor d_{\min}/2 \right\rfloor +1}^{d_{\min}} \binom{d_{\min}}{m} p^m (1 - p)^{d_{\min}-m}$$

This reflects the minimum error rate from confusing closest codewords.

The upper bound multiplies this by the number of possible incorrect codewords Proakis (2007, Eq. (7.5-16)):

$$P_e \leq (2^k - 1) \sum_{m=\left\lfloor d_{\min}/2 \right\rfloor +1}^{d_{\min}} \binom{d_{\min}}{m} p^m (1 - p)^{d_{\min}-m}$$

For large $M = 2^k$, these bounds widen significantly, as the upper bound scales with the codebook size, making them less tight but still illustrative of $P_e$’s range.
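
Both bounds can be sketched directly from the formulas; only $d_{\min}$, $k$, and $p$ enter, and the values below are arbitrary assumptions for illustration:

```python
from math import comb

def nearest_neighbor_term(d_min, p):
    # Probability of drifting past a codeword at distance d_min:
    # sum over m > floor(d_min/2) of C(d_min, m) p^m (1-p)^(d_min - m)
    return sum(comb(d_min, m) * p**m * (1 - p)**(d_min - m)
               for m in range(d_min // 2 + 1, d_min + 1))

def pe_lower(d_min, p):
    return nearest_neighbor_term(d_min, p)

def pe_upper(d_min, k, p):
    # Union-style bound: scale by the number of competing codewords
    return (2**k - 1) * nearest_neighbor_term(d_min, p)

d_min, k, p = 7, 4, 1e-2          # assumed example values
print(pe_lower(d_min, p), pe_upper(d_min, k, p))
```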

General Bounds on Block and Bit Error Probabilities with HDD

For hard decision decoding (HDD), the parameter $\Delta = \sqrt{4p(1 - p)}$ (where $p$ is the BSC crossover probability) enables general bounds.

The block error probability $P_e$ is bounded using the weight enumerating polynomial $A(Z)$ Proakis (2007, Eq. (7.5-17)):

$$P_e \leq (A(Z) - 1) \Big|_{Z=\sqrt{4p(1 - p)}}$$

A simpler bound leverages the minimum distance Proakis (2007, Eq. (7.5-18)):

$$P_e \leq (2^k - 1) \left[4p(1 - p)\right]^{d_{\min}/2}$$

The bit error probability $P_b$ uses the input-output weight enumerating function $B(Y, Z)$ Proakis (2007, Eq. (7.5-19)):

$$P_b \leq \frac{1}{k} \frac{\partial}{\partial Y} B(Y, Z) \Big|_{Y=1, Z=\sqrt{4p(1-p)}}$$

These bounds quantify HDD performance, reflecting the impact of quantization on error rates.
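
As an illustration with the (5, 2) code from earlier: its codeword weights are 0, 3, 3, 4, so its weight enumerator is $A(Z) = 1 + 2Z^3 + Z^4$ (this polynomial is derived here from the example, not quoted from the text). A minimal sketch of the two block-error bounds:

```python
from math import sqrt

def pe_weight_enumerator_bound(weight_counts, p):
    """P_e <= (A(Z) - 1) evaluated at Z = sqrt(4p(1-p)),
    where weight_counts maps codeword weight w -> number of codewords A_w."""
    Z = sqrt(4 * p * (1 - p))
    return sum(A_w * Z**w for w, A_w in weight_counts.items() if w > 0)

def pe_simple_bound(k, d_min, p):
    # Looser bound: (2^k - 1) [4p(1-p)]^(d_min / 2)
    return (2**k - 1) * (4 * p * (1 - p)) ** (d_min / 2)

# (5, 2) code from earlier: one codeword of weight 0, two of weight 3, one of weight 4.
weights = {0: 1, 3: 2, 4: 1}
for p in (1e-2, 1e-3):
    print(p, pe_weight_enumerator_bound(weights, p), pe_simple_bound(2, 3, p))
```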

References
  1. Proakis, J. (2007). Digital Communications (5th ed.). McGraw-Hill Professional.