Block Codes¶
Codewords¶
A $q$-ary block code, denoted $\mathcal{C}$, consists of a set of $M$ vectors, referred to as codewords.
Each codeword is of fixed length $n$ and is represented as:
$$\mathbf{c}_m = (c_{m1}, c_{m2}, \ldots, c_{mn})$$
where $m$ is the index of the codeword ($1 \le m \le M$), and each component $c_{mi}$ (for $1 \le i \le n$) is a symbol from an alphabet of size $q$.
The code can be mathematically expressed as:
$$\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M\} \subseteq \mathcal{X}^n$$
where $\mathcal{X}$ is the alphabet with $|\mathcal{X}| = q$, and $\mathcal{X}^n$ denotes the set of all possible sequences of length $n$ over $\mathcal{X}$.
Alphabet and Symbols¶
The alphabet $\mathcal{X}$ comprises $q$ distinct symbols, such as $\{0, 1\}$ for a binary code ($q = 2$) or $\{0, 1, 2, 3\}$ for a quaternary code ($q = 4$).
Each component of a codeword, $c_{mi}$, is drawn from $\mathcal{X}$; the choice of alphabet determines the code's structure and its applicability in various systems.
Binary Codes¶
When $q = 2$, the alphabet is binary, consisting of the symbols $\{0, 1\}$, and the code is termed a binary code.
In this case, each codeword is a vector of $n$ bits:
$$\mathbf{c}_m = (c_{m1}, c_{m2}, \ldots, c_{mn}), \qquad c_{mi} \in \{0, 1\}$$
For example, a binary codeword of length $n = 5$ might be $\mathbf{c} = (1, 0, 1, 1, 0)$.
Relation Between $q$-ary and Binary Representation¶
If the alphabet size is a power of 2, i.e., $q = 2^k$ for some positive integer $k$, each $q$-ary symbol can be represented as a $k$-bit binary sequence.
For instance, if $q = 4$ (so $k = 2$), the alphabet can be mapped as:
$$0 \mapsto 00, \quad 1 \mapsto 01, \quad 2 \mapsto 10, \quad 3 \mapsto 11$$
This mapping enables efficient encoding of $q$-ary symbols into binary form for storage or transmission.
Transformation of Nonbinary Codes to Binary Codes¶
A nonbinary code with block length $N$ consists of codewords with $q$-ary symbols.
Such a code can be transformed into a binary code of length $n = kN$ by representing each $q$-ary symbol (where $q = 2^k$) as its $k$-bit binary equivalent.
For example, consider a quaternary code ($q = 4$, $k = 2$) with block length $N = 3$ and a codeword $\mathbf{c} = (3, 1, 2)$.
Using the mapping above, $3 \mapsto 11$, $1 \mapsto 01$, and $2 \mapsto 10$, the codeword is transformed into a binary sequence of length $n = kN = 6$:
$$\mathbf{c}_b = (1, 1, 0, 1, 1, 0)$$
Formally, a nonbinary code $\mathcal{C}$ (with $q = 2^k$) is converted to a binary code $\mathcal{C}_b$, where each symbol in $\mathcal{C}$ is expanded into $k$ bits.
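This symbol-by-symbol expansion is straightforward to automate. The following sketch (the helper name `qary_to_binary` and the most-significant-bit-first convention are assumptions chosen to match the mapping above) converts a $q$-ary codeword into its binary image:

```python
def qary_to_binary(codeword, q):
    """Expand each q-ary symbol into its k-bit binary equivalent (q = 2**k)."""
    k = q.bit_length() - 1
    assert q == 2 ** k, "alphabet size must be a power of 2"
    bits = []
    for symbol in codeword:
        # Most-significant bit first, k bits per symbol.
        bits.extend((symbol >> shift) & 1 for shift in range(k - 1, -1, -1))
    return bits

# The quaternary example from the text: (3, 1, 2) with q = 4 (k = 2).
print(qary_to_binary([3, 1, 2], 4))  # [1, 1, 0, 1, 1, 0]
```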
Example of a $q$-ary Block Code¶
A $q$-ary block code $\mathcal{C}$ is defined with:
Alphabet: $\mathcal{X} = \{0, 1, 2\}$, $q = 3$ (ternary).
Block length: $n = 2$.
Number of codewords: $M = q^n = 3^2 = 9$.
The code consists of all possible sequences of length $2$:
$$\mathcal{C} = \{(0,0),\ (0,1),\ (0,2),\ (1,0),\ (1,1),\ (1,2),\ (2,0),\ (2,1),\ (2,2)\}$$
Each codeword $\mathbf{c}_m = (c_{m1}, c_{m2})$, with $1 \le m \le 9$, satisfies $c_{mi} \in \mathcal{X}$, where $i \in \{1, 2\}$.
This code has no error-correcting capability (minimum Hamming distance 1), but it exemplifies a $q$-ary block code structure.
Conversion to Binary Block Code¶
To convert $\mathcal{C}$ to a binary block code $\mathcal{C}_b$, each ternary symbol is mapped to a 2-bit sequence ($k = \lceil \log_2 3 \rceil = 2$):
$$0 \mapsto 00, \quad 1 \mapsto 01, \quad 2 \mapsto 10$$
Each codeword of length $n = 2$ becomes a binary codeword of length:
$$n_b = kn = 2 \times 2 = 4$$
Transforming the codewords:
$(0,0) \mapsto (0,0,0,0)$,
$(0,1) \mapsto (0,0,0,1)$,
$(0,2) \mapsto (0,0,1,0)$,
$(1,0) \mapsto (0,1,0,0)$,
$(1,1) \mapsto (0,1,0,1)$,
$(1,2) \mapsto (0,1,1,0)$,
$(2,0) \mapsto (1,0,0,0)$,
$(2,1) \mapsto (1,0,0,1)$,
$(2,2) \mapsto (1,0,1,0)$.
The binary code is:
$$\mathcal{C}_b = \{(0,0,0,0),\ (0,0,0,1),\ (0,0,1,0),\ (0,1,0,0),\ (0,1,0,1),\ (0,1,1,0),\ (1,0,0,0),\ (1,0,0,1),\ (1,0,1,0)\}$$
Parameters of $\mathcal{C}_b$¶
Alphabet: $\{0, 1\}$, $q = 2$.
Block length: $n_b = 4$.
Number of codewords: $M = 9$.
The code $\mathcal{C}_b$, with $M = 9 < 2^4 = 16$, preserves the size $M$ and has a minimum Hamming distance of 1, consistent with the original code.
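The conversion and its parameters can be checked numerically. The sketch below (variable names are illustrative) enumerates the ternary code, applies the 2-bit mapping, and confirms $M = 9$ and $d_{\min} = 1$:

```python
from itertools import product

MAP = {0: (0, 0), 1: (0, 1), 2: (1, 0)}  # 2-bit image of each ternary symbol

C = list(product(range(3), repeat=2))    # all 9 ternary length-2 words
Cb = [MAP[a] + MAP[b] for a, b in C]     # binary code of length 4

def hamming(x, y):
    # Number of positions where the two words differ.
    return sum(xi != yi for xi, yi in zip(x, y))

d_min = min(hamming(x, y) for x in Cb for y in Cb if x != y)
print(len(Cb), d_min)  # 9 1
```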
Review of Hamming Distance¶
The Hamming distance is a metric used to measure the difference between two codewords of equal length.
It quantifies the number of positions at which their corresponding symbols differ, and it underpins the assessment of a code's error-detecting and error-correcting capabilities.
Definition¶
For two codewords $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ of length $n$ over an alphabet $\mathcal{X}$, the Hamming distance, denoted $d_H(\mathbf{x}, \mathbf{y})$, is defined as:
$$d_H(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{n} \delta(x_i, y_i)$$
where:
$$\delta(x_i, y_i) = \begin{cases} 1, & x_i \ne y_i \\ 0, & x_i = y_i \end{cases}$$
Thus, $d_H(\mathbf{x}, \mathbf{y})$ counts the number of indices $i$ (for $1 \le i \le n$) where $x_i \ne y_i$.
Example¶
Consider a ternary ($q = 3$) block code with codewords of length $n = 3$ over $\mathcal{X} = \{0, 1, 2\}$.
Let two codewords be:
$$\mathbf{x} = (0, 1, 2), \qquad \mathbf{y} = (0, 2, 1)$$
Compute the Hamming distance:
Position 1: $x_1 = 0$, $y_1 = 0$, so $\delta(x_1, y_1) = 0$.
Position 2: $x_2 = 1$, $y_2 = 2$, so $\delta(x_2, y_2) = 1$.
Position 3: $x_3 = 2$, $y_3 = 1$, so $\delta(x_3, y_3) = 1$.
Thus:
$$d_H(\mathbf{x}, \mathbf{y}) = 0 + 1 + 1 = 2$$
The Hamming distance between $\mathbf{x}$ and $\mathbf{y}$ is 2, indicating they differ in two positions.
Note that the Hamming distance of 2 refers to the distance between two specific codewords, $\mathbf{x}$ and $\mathbf{y}$, which differ in two positions.
This does not imply that 2 is the minimum distance of the code, as other pairs in the (unspecified) code could have smaller distances.
Application in Error Detection and Correction¶
The minimum Hamming distance of a code, $d_{\min}$, determines its error-correcting capability.
A code can detect up to $d_{\min} - 1$ errors and correct up to $t = \lfloor (d_{\min} - 1)/2 \rfloor$ errors, where $d_{\min}$ is the minimum distance over all pairs of distinct codewords.
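These rules are easy to verify on a toy code. The sketch below (the 3-codeword repetition-style ternary code is hypothetical, chosen only for illustration) computes $d_{\min}$ and the resulting detection and correction guarantees:

```python
def hamming(x, y):
    # Number of positions where the two words differ.
    return sum(a != b for a, b in zip(x, y))

# Hypothetical ternary code used only for illustration.
code = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
d_min = min(hamming(x, y)
            for i, x in enumerate(code) for y in code[i + 1:])

detect = d_min - 1           # guaranteed detectable errors
correct = (d_min - 1) // 2   # guaranteed correctable errors
print(d_min, detect, correct)  # 3 2 1
```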
$(n, k)$ Code¶
In a binary block code of length $n$, there are $2^n$ possible codewords, representing all possible combinations of $n$ bits over the alphabet $\{0, 1\}$.
From this set, $M = 2^k$ codewords, where $k < n$, are selected to form an $(n, k)$ code.
Each codeword uniquely corresponds to a $k$-bit information block, with the $n - k$ redundant bits enabling error detection or correction.
This encoding process, often implemented via a generator matrix or systematic encoding, defines the code's structure.
The code rate, given by:
$$R_c = \frac{k}{n}$$
quantifies the fraction of information bits per codeword, reflecting the trade-off between data efficiency and error-correcting capability.
For a general $q$-ary code over an alphabet of $q$ symbols, there are $q^n$ possible codewords.
Selecting $M = q^k$ codewords, where $k < n$, allows the encoding of $k$-symbol information blocks, with each symbol carrying $\log_2 q$ bits of information.
This generalizes the binary case ($q = 2$), offering increased flexibility in code design for diverse applications, such as high-density data transmission or robust error correction in nonbinary channels.
The code rate remains $R_c = k/n$, and the choice of $q$ influences the code's error-correcting power, often measured by the minimum Hamming distance.
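As a quick numerical illustration of these counts (the parameter values $q = 4$, $n = 6$, $k = 3$ are arbitrary, not taken from the text):

```python
import math

# Illustrative parameter check for a q-ary (n, k) code (values are arbitrary).
q, n, k = 4, 6, 3
M = q ** k                       # number of codewords selected from q**n
rate = k / n                     # code rate R_c = k/n
bits_per_symbol = math.log2(q)
info_bits = k * bits_per_symbol  # information bits carried per codeword

print(M, rate, info_bits)  # 64 0.5 6.0
```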
Connection Between the $(n, k)$ Code and the $q$-ary Block Code¶
We can say that the $(n, k)$ code is a specific parameterization of a $q$-ary block code tailored for error control.
Recall the general structure as:
$q$-ary Block Code: A $q$-ary block code consists of $M$ codewords, each a vector of length $n$ (or $N$, in the nonbinary setting above), with components drawn from an alphabet of $q$ symbols.
The code is a subset of all possible sequences:
$$\mathcal{C} \subseteq \mathcal{X}^n, \qquad |\mathcal{C}| = M$$
The number of codewords $M$ and the block length ($n$ or $N$) are general parameters, and the code may be binary ($q = 2$) or nonbinary ($q > 2$).
$(n, k)$ Code: The $(n, k)$ code is a specific type of block code where the number of codewords is explicitly defined as $M = q^k$ for a $q$-ary alphabet, and each codeword has length $n$.
Thus, it is a $q$-ary block code with:
$$M = q^k, \qquad \mathcal{C} \subseteq \mathcal{X}^n$$
The parameters $n$ and $k$ define the codeword length and the size of the information block, respectively.
Thus, we can see that the $(n, k)$ code is a $q$-ary block code with a constrained size, where the number of codewords is $M = q^k$.
The $q$-ary block code is a more general concept, allowing any $M \le q^n$, whereas the $(n, k)$ code fixes $M = q^k$ to encode exactly $k$ $q$-ary symbols of information, introducing structure for error control.
Weight (Overview)¶
A key codeword parameter is its weight, defined as the number of nonzero elements it contains, reflecting its density of active symbols.
Each codeword in a code has a weight, and the collection of all codeword weights forms the weight distribution of the code.
If all codewords share the same weight, the code is classified as a fixed-weight or constant-weight code.
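As a sketch, the weight distribution of a small hypothetical binary code (the four codewords below are illustrative, not defined in the text) can be tallied as follows:

```python
from collections import Counter

# Hypothetical code used only to illustrate the weight distribution.
code = [(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)]

def weight(c):
    # Weight = number of nonzero components.
    return sum(s != 0 for s in c)

distribution = Counter(weight(c) for c in code)
print(sorted(distribution.items()))  # [(0, 1), (2, 1), (3, 2)]
```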
Linear Block Codes¶
Linear block codes (LBCs), a prominent subset of block codes, have been extensively studied over recent decades due to their advantageous properties.
Their linearity ensures easier implementation and analysis, as operations follow algebraic rules, reducing computational complexity.
LBCs perform comparably to the broader class of block codes, making them practical without significant loss in efficacy.
An LBC $\mathcal{C}$ is a $k$-dimensional subspace within an $n$-dimensional space, termed an $(n, k)$ code.
For binary LBCs, $\mathcal{C}$ contains $2^k$ sequences of length $n$, closed under addition: if $\mathbf{c}_1, \mathbf{c}_2 \in \mathcal{C}$, then $\mathbf{c}_1 \oplus \mathbf{c}_2 \in \mathcal{C}$.
This closure ensures the all-zero vector is always a codeword (since $\mathbf{c} \oplus \mathbf{c} = \mathbf{0}$), a property stemming from the linear structure.
Review of Arithmetic in GF(2)¶
In coding theory, particularly for binary linear block codes, the underlying algebraic structure is the finite field $\mathrm{GF}(2) = \{0, 1\}$.
Arithmetic operations (addition, subtraction, multiplication, and division, where applicable) are computed with the result taken modulo 2, ensuring that outputs remain in the set $\{0, 1\}$.
The operations in $\mathrm{GF}(2)$ are defined as follows:
Addition (mod 2):
$$0 + 0 = 0, \quad 0 + 1 = 1, \quad 1 + 0 = 1, \quad 1 + 1 = 0$$
This is equivalent to the exclusive OR (XOR) operation:
$$a + b = a \oplus b$$
because $1 + 1 = 2 \equiv 0 \pmod{2}$, and the remaining cases coincide with XOR directly.
Addition is commutative and associative, and the additive identity is 0.
Each element is its own additive inverse: $0 + 0 = 0$, $1 + 1 = 0$.
This is often denoted by the symbol $\oplus$ in coding theory.
Subtraction: Since $\mathrm{GF}(2)$ has characteristic 2, subtraction is identical to addition:
$$a - b = a + b$$
For example, $1 - 1 = 1 + 1 = 0$.
Multiplication (mod 2):
$$0 \cdot 0 = 0, \quad 0 \cdot 1 = 0, \quad 1 \cdot 0 = 0, \quad 1 \cdot 1 = 1$$
This is equivalent to the logical AND operation.
The multiplicative identity is 1.
Division is only defined for non-zero divisors (i.e., 1), and $1 \div 1 = 1$.
Modulo 2: Taking an integer modulo 2 maps it to $\{0, 1\}$: even integers map to 0 and odd integers to 1.
For example, in vector operations, all components are 0 or 1, and sums like $1 + 1 = 0$ arise.
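Because addition and multiplication in $\mathrm{GF}(2)$ coincide with XOR and AND, the full operation tables can be verified in a few lines (function names are illustrative):

```python
# GF(2) arithmetic reduces to bitwise logic; a quick self-check of the tables.
def gf2_add(a, b):
    return a ^ b   # addition mod 2 == XOR

def gf2_mul(a, b):
    return a & b   # multiplication mod 2 == AND

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
assert [gf2_add(a, b) for a, b in pairs] == [0, 1, 1, 0]
assert [gf2_mul(a, b) for a, b in pairs] == [0, 0, 0, 1]
# Subtraction equals addition: a - b == a + b in GF(2).
assert all(gf2_add(a, b) == (a - b) % 2 for a, b in pairs)
print("GF(2) tables verified")
```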
Generator Matrix for LBC¶
In an LBC, the mapping from information sequences of length $k$ to codewords of length $n$ is represented by a $k \times n$ generator matrix $\mathbf{G}$, where:
$$\mathbf{c} = \mathbf{u}\mathbf{G}$$
Here, $\mathbf{u} = (u_1, u_2, \ldots, u_k)$ is a $k$-bit information sequence, and $\mathbf{c}$ is its corresponding codeword.
The matrix $\mathbf{G}$ encapsulates the linear transformation, enabling systematic encoding by defining how information bits generate codewords, streamlining both design and hardware realization.
The rows of $\mathbf{G}$, denoted $\mathbf{g}_i$ for $1 \le i \le k$, are the codewords corresponding to the standard basis vectors of the information space (e.g., $\mathbf{u} = (1, 0, \ldots, 0)$ yields $\mathbf{g}_1$, etc.), structured as:
$$\mathbf{G} = \begin{bmatrix} \mathbf{g}_1 \\ \mathbf{g}_2 \\ \vdots \\ \mathbf{g}_k \end{bmatrix}$$
Thus, a codeword is:
$$\mathbf{c} = \sum_{i=1}^{k} u_i \mathbf{g}_i$$
where summation occurs in the Galois field $\mathrm{GF}(2)$ (modulo-2 arithmetic), ensuring binary operations.
The code is the row space of $\mathbf{G}$, encompassing all linear combinations of its rows.
Two LBCs, $\mathcal{C}_1$ and $\mathcal{C}_2$, are equivalent if their generator matrices share the same row space, possibly after column permutation, indicating that code properties depend on the subspace, not the specific matrix representation.
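The encoding $\mathbf{c} = \mathbf{u}\mathbf{G}$ is a single matrix product reduced mod 2. A minimal sketch (the $2 \times 4$ generator matrix below is an arbitrary example, not one defined in the text):

```python
import numpy as np

# Encoding c = uG over GF(2); G here is an arbitrary illustrative choice.
G = np.array([[1, 0, 1, 1],
              [0, 1, 0, 1]])          # k = 2, n = 4

def encode(u, G):
    return np.mod(np.array(u) @ G, 2)  # matrix product reduced mod 2

print(encode([1, 1], G))  # [1 1 1 0]
```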
Parity Check Bits¶
When the generator matrix of a linear block code is structured as in Proakis (2007, Eq. (7.2-4)):
$$\mathbf{G} = [\mathbf{I}_k \mid \mathbf{P}]$$
where $\mathbf{I}_k$ is a $k \times k$ identity matrix and $\mathbf{P}$ is a $k \times (n - k)$ matrix, the code is classified as systematic.
In systematic codes, the first $k$ components of a codeword directly correspond to the information sequence, explicitly retaining the original data.
The subsequent $n - k$ components, termed parity check bits, are generated from the information bits using $\mathbf{P}$ and provide redundancy to mitigate transmission errors.
It is noteworthy that any linear block code can be converted into an equivalent systematic form, as represented by the above equation, through elementary row operations (such as row swapping or addition) and column permutations, preserving the code’s essential characteristics while achieving this structured format.
Example¶
To demonstrate the conversion of a linear block code into an equivalent systematic form, consider a binary linear block code (codeword length $n = 4$, information bits $k = 2$) with the generator matrix:
$$\mathbf{G} = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$
This matrix is not in systematic form, as it lacks the structure $[\mathbf{I}_k \mid \mathbf{P}]$, where $\mathbf{I}_k$ is the $k \times k$ identity matrix.
Perform Elementary Row Operations
The objective is to transform the first $k = 2$ columns into the identity matrix $\mathbf{I}_2$.
The first column of $\mathbf{G}$ is already $(1, 0)^T$.
To achieve $(0, 1)^T$ in the second column, we need a 0 in position (1, 2) (currently 1) and a 1 in position (2, 2) (already 1).
Perform the row operation $\mathbf{r}_1 \leftarrow \mathbf{r}_1 + \mathbf{r}_2$ (in $\mathrm{GF}(2)$, addition is XOR):
The resulting matrix is:
$$\mathbf{G}' = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$
The first two columns now form $\mathbf{I}_2$, so the matrix is in systematic form:
$$\mathbf{G}' = [\mathbf{I}_2 \mid \mathbf{P}], \qquad \mathbf{P} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$$
Note that no column permutations were required here, as the first two columns were transformed into $\mathbf{I}_2$ using row operations.
Preservation of Code Characteristics
The matrix $\mathbf{G}'$ generates codewords equivalent to those of $\mathbf{G}$. For an information sequence $\mathbf{u} = (u_1, u_2)$, the codeword using $\mathbf{G}'$ is:
$$\mathbf{c}' = \mathbf{u}\mathbf{G}' = (u_1,\ u_2,\ u_1 + u_2,\ u_1)$$
The original matrix produces:
$$\mathbf{c} = \mathbf{u}\mathbf{G} = (u_1,\ u_1 + u_2,\ u_2,\ u_1)$$
Because only row operations were used, the two matrices have the same row space: the codewords span the same code (only the mapping from $\mathbf{u}$ to codeword differs), preserving the error-correcting properties, such as the minimum distance.
This example shows how a single row operation transforms $\mathbf{G}$ into the systematic $\mathbf{G}'$, maintaining the code's essential characteristics.
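The row-space argument can be checked by brute force. The sketch below uses a non-systematic $2 \times 4$ generator (the specific entries are an illustrative reconstruction), applies the row operation $\mathbf{r}_1 \leftarrow \mathbf{r}_1 + \mathbf{r}_2$, and confirms both matrices generate the same four codewords:

```python
import numpy as np
from itertools import product

# Verify that a GF(2) row operation preserves the code (row space).
G = np.array([[1, 1, 0, 1],
              [0, 1, 1, 0]])
Gs = G.copy()
Gs[0] = (Gs[0] + Gs[1]) % 2   # r1 <- r1 + r2 over GF(2)

def span(M):
    # All codewords uM (mod 2) for every binary information vector u.
    k = M.shape[0]
    return {tuple(np.mod(np.array(u) @ M, 2))
            for u in product([0, 1], repeat=k)}

assert span(G) == span(Gs)    # same 4 codewords
print(sorted(span(Gs)))
```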
Dual Code and Parity Check Matrix¶
The orthogonal complement of a code $\mathcal{C}$ consists of all $n$-dimensional binary vectors orthogonal to every codeword in $\mathcal{C}$.
Since $\mathcal{C}$ constitutes a $k$-dimensional subspace within the $n$-dimensional binary vector space, its orthogonal complement forms an $(n - k)$-dimensional subspace.
This subspace defines an $(n, n - k)$ code, denoted $\mathcal{C}^\perp$, referred to as the dual code of $\mathcal{C}$.
The generator matrix of $\mathcal{C}^\perp$, an $(n - k) \times n$ matrix, has rows orthogonal to those of $\mathbf{G}$, the generator matrix of $\mathcal{C}$.
This matrix, known as the parity check matrix of $\mathcal{C}$ and denoted $\mathbf{H}$, satisfies the condition that for every codeword $\mathbf{c} \in \mathcal{C}$:
$$\mathbf{c}\mathbf{H}^T = \mathbf{0}$$
This orthogonality property enables $\mathbf{H}$ to serve as a mechanism for verifying membership in $\mathcal{C}$, leveraging the algebraic structure of linear codes for error detection.
Parity Check Matrix Properties¶
A binary $n$-dimensional vector $\mathbf{c}$ satisfies $\mathbf{c}\mathbf{H}^T = \mathbf{0}$ if and only if it resides in the orthogonal complement of the row space of $\mathbf{H}$, which corresponds exactly to $\mathcal{C}$.
Thus, this equation establishes a necessary and sufficient condition for $\mathbf{c}$ to be a codeword, rendering $\mathbf{H}$ an effective tool for codeword identification.
Given that the rows of $\mathbf{G}$ are codewords, it follows that:
$$\mathbf{G}\mathbf{H}^T = \mathbf{0}$$
For systematic codes where $\mathbf{G} = [\mathbf{I}_k \mid \mathbf{P}]$, the parity check matrix is expressed as in Proakis (2007, Eq. (7.2-7)):
$$\mathbf{H} = [-\mathbf{P}^T \mid \mathbf{I}_{n-k}]$$
In the binary field (GF(2)), where modulo-2 arithmetic applies, $-\mathbf{P}^T = \mathbf{P}^T$, simplifying to $\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_{n-k}]$.
This configuration ensures that the product $\mathbf{G}\mathbf{H}^T = \mathbf{I}_k\mathbf{P} + \mathbf{P}\,\mathbf{I}_{n-k} = \mathbf{P} + \mathbf{P} = \mathbf{0}$ holds, as the interaction between $\mathbf{P}$ and the identity blocks satisfies the orthogonality condition.
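A minimal sketch of this construction (the $\mathbf{P}$ below is an arbitrary $2 \times 2$ example) builds $\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_{n-k}]$ from a systematic $\mathbf{G} = [\mathbf{I}_k \mid \mathbf{P}]$ and checks $\mathbf{G}\mathbf{H}^T = \mathbf{0}$ over GF(2):

```python
import numpy as np

# Construct H = [P^T | I_{n-k}] from a systematic G = [I_k | P]
# and verify the orthogonality G H^T = 0 over GF(2).
k, n = 2, 4
P = np.array([[1, 1],
              [0, 1]])
G = np.hstack([np.eye(k, dtype=int), P])
H = np.hstack([P.T, np.eye(n - k, dtype=int)])

assert np.all(np.mod(G @ H.T, 2) == 0)  # every row of G is orthogonal to H
print(H)
```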
Example¶
We illustrate the concepts of the dual code and parity check matrix by considering a linear block code $\mathcal{C}$ over $\mathrm{GF}(2)$, with codeword length $n = 4$ and information bits $k = 2$.
The dual code is a $(4, 2)$ code.
Define the Code
Let the generator matrix of $\mathcal{C}$ be:
$$\mathbf{G} = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}$$
This is in systematic form, $\mathbf{G} = [\mathbf{I}_2 \mid \mathbf{P}]$, where:
$$\mathbf{P} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$
The codewords are generated as $\mathbf{c} = \mathbf{u}\mathbf{G}$ for $\mathbf{u} \in \{0, 1\}^2$. Computing all codewords:
$\mathbf{u} = (0, 0)$: $\mathbf{c} = (0, 0, 0, 0)$
$\mathbf{u} = (1, 0)$: $\mathbf{c} = (1, 0, 1, 1)$
$\mathbf{u} = (0, 1)$: $\mathbf{c} = (0, 1, 0, 1)$
$\mathbf{u} = (1, 1)$: $\mathbf{c} = (1, 1, 1, 0)$
Thus:
$$\mathcal{C} = \{(0,0,0,0),\ (1,0,1,1),\ (0,1,0,1),\ (1,1,1,0)\}$$
This is a 2-dimensional subspace of $\{0, 1\}^4$.
Find the Dual Code
Note that all arithmetic is mod 2.
The dual code $\mathcal{C}^\perp$ consists of all $\mathbf{x} = (x_1, x_2, x_3, x_4)$ such that $\mathbf{x} \cdot \mathbf{c} = 0$ for all $\mathbf{c} \in \mathcal{C}$.
Since $\mathcal{C}$ is spanned by the rows of $\mathbf{G}$, $\mathbf{x}$ must be orthogonal to:
Row 1: $(1, 0, 1, 1)$, so $x_1 + x_3 + x_4 = 0$
Row 2: $(0, 1, 0, 1)$, so $x_2 + x_4 = 0$
Let $x_3 = a$, $x_4 = b$, with $a, b \in \{0, 1\}$. Then $x_2 = b$, $x_1 = a + b$, so $\mathbf{x} = (a + b,\ b,\ a,\ b)$.
The codewords are:
$(a, b) = (0, 0)$: $\mathbf{x} = (0, 0, 0, 0)$
$(a, b) = (1, 0)$: $\mathbf{x} = (1, 0, 1, 0)$
$(a, b) = (0, 1)$: $\mathbf{x} = (1, 1, 0, 1)$
$(a, b) = (1, 1)$: $\mathbf{x} = (0, 1, 1, 1)$
Thus:
$$\mathcal{C}^\perp = \{(0,0,0,0),\ (1,0,1,0),\ (1,1,0,1),\ (0,1,1,1)\}$$
This is a $(4, 2)$ code, with dimension $n - k = 2$.
Construct the Parity Check Matrix
The parity check matrix $\mathbf{H}$ of $\mathcal{C}$ is the generator matrix of $\mathcal{C}^\perp$, an $(n - k) \times n = 2 \times 4$ matrix. For a systematic $\mathbf{G} = [\mathbf{I}_2 \mid \mathbf{P}]$, we have:
$$\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_2]$$
Thus:
$$\mathbf{H} = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}$$
The rows $(1, 0, 1, 0)$ and $(1, 1, 0, 1)$ are in $\mathcal{C}^\perp$.
Verify Orthogonality
For every $\mathbf{c} \in \mathcal{C}$, we must have $\mathbf{c}\mathbf{H}^T = \mathbf{0}$, where:
$$\mathbf{H}^T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Verify for each codeword:
$\mathbf{c} = (0, 0, 0, 0)$: $\mathbf{c}\mathbf{H}^T = (0, 0)$
$\mathbf{c} = (1, 0, 1, 1)$: $\mathbf{c}\mathbf{H}^T = (1 + 1,\ 1 + 1) = (0, 0)$
$\mathbf{c} = (0, 1, 0, 1)$: $\mathbf{c}\mathbf{H}^T = (0,\ 1 + 1) = (0, 0)$
$\mathbf{c} = (1, 1, 1, 0)$: $\mathbf{c}\mathbf{H}^T = (1 + 1,\ 1 + 1) = (0, 0)$
All satisfy $\mathbf{c}\mathbf{H}^T = \mathbf{0}$.
Error Detection
The matrix $\mathbf{H}$ enables error detection. For a received vector $\mathbf{r}$, the syndrome is $\mathbf{s} = \mathbf{r}\mathbf{H}^T$. If $\mathbf{s} \ne \mathbf{0}$, then $\mathbf{r} \notin \mathcal{C}$. For example, test $\mathbf{r} = (1, 0, 0, 1)$:
$$\mathbf{s} = \mathbf{r}\mathbf{H}^T = (1,\ 1 + 1) = (1, 0)$$
This indicates $\mathbf{s} \ne \mathbf{0}$, detecting an error.
This example demonstrates that $\mathbf{H}$ is the parity check matrix of $\mathcal{C}$, generates $\mathcal{C}^\perp$, and supports error detection, as described.
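The syndrome test is a one-line computation. A sketch (assuming a systematic $(4, 2)$ code with $\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_2]$; the specific matrix entries are one consistent illustrative choice):

```python
import numpy as np

# Syndrome check s = r H^T over GF(2) for a (4, 2) systematic code.
H = np.array([[1, 0, 1, 0],
              [1, 1, 0, 1]])

def syndrome(r):
    return np.mod(np.array(r) @ H.T, 2)

print(syndrome([1, 0, 1, 1]))  # [0 0]  -> valid codeword
print(syndrome([1, 0, 0, 1]))  # [1 0]  -> error detected
```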
EXAMPLE: (7, 4) Linear Block Code¶
Consider a (7, 4) linear block code with the generator matrix:
$$\mathbf{G} = \begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1
\end{bmatrix}$$
This structure, featuring a 4×4 identity matrix followed by a 4×3 matrix $\mathbf{P}$, identifies it as a systematic code, where the first four bits of each codeword mirror the information sequence.
The corresponding parity check matrix is $\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_3]$:
$$\mathbf{H} = \begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1
\end{bmatrix}$$
For an information sequence $\mathbf{u} = (u_1, u_2, u_3, u_4)$, the codeword is computed as:
$$\mathbf{c} = \mathbf{u}\mathbf{G} = (u_1,\ u_2,\ u_3,\ u_4,\ u_1 + u_2 + u_3,\ u_2 + u_3 + u_4,\ u_1 + u_2 + u_4)$$
Here, the first four components copy the input, while the last three (parity bits) are linear combinations over GF(2), enhancing error detection capability.
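Enumerating all $2^4 = 16$ codewords confirms the code's minimum distance; for a linear code, $d_{\min}$ equals the minimum nonzero codeword weight. The $\mathbf{P}$ used below is a standard (7, 4) Hamming-code choice assumed for this sketch:

```python
import numpy as np
from itertools import product

# Enumerate a systematic (7, 4) code and confirm its minimum distance.
P = np.array([[1, 0, 1],
              [1, 1, 1],
              [1, 1, 0],
              [0, 1, 1]])   # assumed Hamming-code parity block
G = np.hstack([np.eye(4, dtype=int), P])

codewords = [tuple(np.mod(np.array(u) @ G, 2))
             for u in product([0, 1], repeat=4)]

weights = sorted(sum(c) for c in codewords if any(c))
d_min = weights[0]   # for a linear code, d_min = minimum nonzero weight
print(len(codewords), d_min)  # 16 3
```

A minimum distance of 3 means this code detects up to 2 errors and corrects 1, matching the capability rules stated earlier.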
- Proakis, J. (2007). Digital Communications (5th ed.). McGraw-Hill Professional.