Coding Theory MATH422

Chapter 1 - Introduction to Codes

This is where most (though not all) students realize this course isn't about programming

Code Definitions

A code \(C\) is a set of \(M\) codewords, which are in turn finite sequences of symbols from the code's alphabet \(F_q\). E.g. \(C_0 = \set{000, 010, 101, 111}\) is a code.

A word or vector is any sequence of symbols from \(F_q\), so the set of words of size \(n\) is \(F_q\times F_q \times \dots \times F_q = (F_q)^n\). Not all words are codewords, i.e. \(C \subseteq F_q\times F_q \times \dots \times F_q\). Generally, we endeavour to encode words into codewords.

A code can be written as an \(M\times n\) array where the rows are the codewords of \(C\). E.g. \(C_0\) from earlier is represented by the matrix \(\begin{bmatrix}0&0&0\\0&1&0\\1&0&1\\1&1&1\end{bmatrix}\).

Basic Code Parameters

Definition
An \((n, M, d)\) code has \(M\) codewords of length \(n\) (i.e. a length of \(n\)) with minimum distance \(d\)

Distance

The Hamming distance \(d(\vec{x}, \vec{y})\) between vectors \(\vec{x}, \vec{y} \in (F_q)^n\) is defined as the number of places in which they differ.

The Hamming distance is a distance function because it satisfies

  1. \(d(\vec{x}, \vec{y})=0\) if and only if \(\vec{x} = \vec{y}\)
  2. \(d(\vec{x}, \vec{y})=d(\vec{y}, \vec{x})\) for all \(\vec{x}\) and \(\vec{y}\)
  3. \(d(\vec{x}, \vec{y}) \leq d(\vec{x}, \vec{z}) + d(\vec{z}, \vec{y})\) for all \(\vec{x}, \vec{y}, \vec{z}\) (the triangle inequality)

The minimum distance \(d(C)\) of code \(C\) is the smallest distance between any two codewords in the code. So, \(d(C_0)=1\) because \(000\) and \(010\) have distance \(1\).
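Both definitions translate directly into code. A minimal Python sketch (the function names are my own):

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of positions in which words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

C0 = ["000", "010", "101", "111"]
```

For \(C_0\), `minimum_distance(C0)` returns \(1\), as above.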

Error Detection and Correction

Theorem 1.9

Theorem
A code \(C\) can detect up to \(s\) errors in a given codeword if \(d(C)\geq s+1\), and correct up to \(t\) errors if \(d(C)\geq 2t+1\).

By corollary (1.10) a code \(C\) can detect up to \(d(C)-1\) errors and correct up to \(\displaystyle\left\lfloor\dfrac{d(C)-1}{2}\right\rfloor\) errors

Channels

We assume vector \(\vec{x}\) is transmitted and vector \(\vec{y}\) is received, possibly having been distorted.

A communication channel is \(q\)-ary symmetric if each symbol has the same probability \(p < \frac{1}{2}\) of being received in error, and each of the \(q-1\) possible errors for a given symbol is equally likely.


Pages: 1-10

Chapter 2 - The main coding theory problem

The main problem of coding theory is that I decided to take coding theory

What Makes a Good Code?

"Good" codes generally have

  1. small length \(n\) (for fast transmission)
  2. many codewords \(M\) (to encode many distinct messages)
  3. large minimum distance \(d\) (to detect and correct many errors)

These goals conflict, leading to the main problem of coding theory: optimize one of the parameters \(n, M, d\) of a code given values for the other two.

Equivalent Codes

Two \(q\)-ary codes are equivalent if one can be obtained from the other by

  1. Permutation of the positions of the code → permutation of the columns of the code's matrix
  2. Permutation of the symbols of the code → (internally) relabelling the symbols in a column of the code's matrix

Distances between codewords are invariant under this operation, so equivalent codes have the same parameters \(n, M, d\), and thus have the same error detection and correction capabilities.

LEMMA 2.3 Any \(q\)-ary \((n, M, d)\) code over alphabet \(F_q=\set{0, 1, \dots, q-1}\) is equivalent to an \((n, M, d)\) code containing the zero vector \(\vec{0}=00\dots 0\)

Optimizing \(M\)

We use \(A_q(n, d)\) to denote the largest \(M\) such that a \(q\)-ary \((n, M, d)\) code exists, i.e. the number of codewords that can exist in a code given \(n\) and \(d\).

Aside: the number of \(q\)-ary codes with \(M\) codewords of length \(n\) (for any \(d\)) is \(\displaystyle{q^{n}\choose M}\)

Binary Codes

Theorem 2.7

Theorem
For odd \(d\), a binary \((n, M, d)\) code exists if and only if a binary \((n+1, M, d+1)\) code exists.

Corollary 2.8

Corollary
For odd \(d\), \(A_2(n+1, d+1)=A_2(n,d)\). Thus, for even \(d\), \(A_2(n, d)=A_2(n-1, d-1)\)

Spheres and Sphere Packing

A sphere \(S(\vec{u}, r)\) of radius \(r\) around vector \(\vec{u}\) is the set of vectors in \((F_q)^n\) whose distance from \(\vec{u}\) is at most \(r\), i.e. \(S(\vec{u}, r)=\set{\vec{v} \in (F_q)^{n} \mid d(\vec{u}, \vec{v})\leq r}\)

For a \(t\)-error-correcting code, we have \(d(C)\geq2t+1\), implying that the spheres of radius \(t\) centered on the codewords of \(C\) are disjoint. Thus we can decode a received vector by picking the codeword whose sphere contains it. This is an instance of nearest neighbour decoding.

A sphere of radius \(r\) in \((F_q)^n\) contains \(\displaystyle\sum\limits_{i=0}^{r} {n \choose i}(q-1)^i\) vectors

Sphere Packing Bound

Theorem
A \(q\)-ary \((n, M, 2t+1)\)-code satisfies \(M\times\displaystyle\sum\limits_{i=0}^{t} {n \choose i}(q-1)^{i}\leq q^n\)

Codes that reach the sphere packing bound are perfect codes; the spheres of radius \(t\) centered at a perfect code's codewords "fill" all of \((F_q)^n\) without overlapping.
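The bound is easy to check numerically. A sketch (the helper names `sphere_size` and `is_perfect` are my own):

```python
from math import comb

def sphere_size(n, r, q):
    """Number of vectors of (F_q)^n within Hamming distance r of a fixed vector."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def is_perfect(n, M, d, q):
    """True if an (n, M, d) code with d = 2t+1 meets the sphere-packing bound exactly."""
    t = (d - 1) // 2
    return M * sphere_size(n, t, q) == q ** n
```

E.g. the binary \((7, 16, 3)\) Hamming code is perfect: \(16\cdot(1+7)=2^{7}\).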

Balanced Block Designs

Balanced Block Design

Definition
A balanced block design is a set \(S\) of \(v\) points/varieties with a collection of \(b\) subsets of itself, called blocks. For fixed \(k, r, \lambda\), we have

The parameters \((b, v, r, k, \lambda)\) are not independent; we find the following constraints (among others)

A balanced block design is symmetric if \(v=b\), which implies \(k=r\) as well.

We can describe a balanced block design by an incidence matrix, where the columns correspond to blocks, rows correspond to points, and each entry is \(0\) or \(1\) depending on whether a particular point is in a particular block.

Aside: balanced block designs have applications beyond coding theory, e.g. statistical testing combinations of fertilizers on different crops.


Pages: 11-29

Chapter 3 - Finite Fields

It's pronounced "gal-WUAH"

Recap: Algebraic Structures

Field

Definition
A field \(F\) is a set of elements equipped with addition \(+\) and multiplication \(\cdot\) operations that satisfies the following properties:

  1. Closure under \(+\) and \(\cdot\)
  2. Commutative \(+\) and \(\cdot\)
  3. Associative \(+\) and \(\cdot\)
  4. Distributivity: \(a\cdot(b+c)=a\cdot b+a\cdot c\)

A field must also have the identity elements \(0\) and \(1\), satisfying for all \(a\in F\):

  1. Additive Identity: \(a+0=a\)
  2. Multiplicative Identity: \(a \cdot 1=a\)
  3. Additive inverse: \(-a\) exists where \(a+(-a)=0\)
  4. Multiplicative Inverse: for \(a\neq 0\), \(a^{-1}\) exists where \(a\cdot a^{-1}=1\)

The following properties are implied by this definition: the identities and inverses are unique, \(a\cdot 0=0\) for all \(a\), and there are no zero divisors (\(ab=0\implies a=0\) or \(b=0\)).

Aside: combining \(+\) and \(\cdot\) with inverses lets us define operations like \(-\) and \(\div\)

Abelian Ring (in terms of field)

Definition
An abelian ring is also a set equipped with \(+\) and \(\cdot\) that has the same properties as a field except for the guarantee of multiplicative inverses for all nonzero elements.

Finite Fields: Basic Definitions

A finite field with order \(n\) is a field with a finite number \(n\) of elements.

Theorem 3.2

Theorem
If a field of order \(q\) exists, \(q\) must be a prime power, i.e. \(q=p^h\) for some prime \(p\).

All fields for a given \(q\) share the same structure; the structure in general is known as the Galois field of order \(q\), denoted \(\text{GF}(q)\).

Modular Arithmetic

Integers \(a\) and \(b\) are congruent modulo \(m\) (denoted \(a\equiv b \mod m\)) if \(a=km+b\) for some integer \(k\). Informally, \(a\) and \(b\) are congruent if they have the same remainder when divided by \(m\).

We find that for \(a\equiv a'\) and \(b\equiv b'\), we get \(a+b\equiv a'+b'\) and \(ab\equiv a'b'\), which further implies \(a^n\equiv(a')^n\) (all \(\mod m\)).

Euler Totient Function

We define the Euler totient function or Euler indicator as the function \(\varphi(n) := |\set{m\in \mathbb{N} \mid 1 \leq m \leq n, \text{GCD}(m, n)=1}|\)

As a consequence of the Chinese remainder theorem, \(\varphi(mn)=\varphi(m)\varphi(n)\) if and only if \(\text{GCD}(m, n)=1\).
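A brute-force sketch of \(\varphi\) straight from the definition (illustrative, not efficient):

```python
from math import gcd

def phi(n):
    """Euler totient: count of m in 1..n with GCD(m, n) = 1."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)
```

The multiplicativity above is easy to spot-check, e.g. `phi(3) * phi(4) == phi(12)`.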

Primitive Elements

The order of an element \(\alpha\) of a finite field \(\mathbb{F}\) is the smallest natural number \(e\) such that \(\alpha^e=1\).

The nonzero elements of any finite field can be written as powers of a single element

A primitive element \(\alpha\) is an element of order \(q-1\) in a finite field \(F_q\)

Polynomials

Minimal Polynomials

Every element \(\beta\) of a finite field \(F_{p^r}\) is a root of the equation \(x^{p^r}-x=0\), and hence is a root of some polynomial \(f(x)\in F_p[x]\).

For element \(\beta\in F_{p^r}\), the minimal polynomial of \(\beta\) is the monic polynomial \(m(x)\in F_p[x]\) of least degree with \(\beta\) as a root.

For \(\beta\in F_{p^r}\) and \(f(x)\in F_p[x]\) with \(\beta\) as a root, \(f(x)\) is divisible by the minimal polynomial of \(\beta\).

Primitive Polynomials

A primitive polynomial of a field is the minimal polynomial of a primitive element of the field.

Reciprocal Polynomials

The following statements hold:

  1. If \(\alpha\in F_{q^r}\) is a nonzero root of \(f(x)\in F_p[x]\), then \(\alpha^{-1}\) is a root of the reciprocal polynomial of \(f(x)\)
  2. Polynomial is irreducible \(\iff\) reciprocal polynomial is irreducible
  3. If \(m(x)\) is the minimal polynomial of some nonzero \(\alpha\in F_{p^r}\), then a scalar multiple of the reciprocal polynomial of \(m(x)\) is the minimal polynomial of \(\alpha^{-1}\)
  4. A polynomial is primitive \(\implies\) a scalar multiple of its reciprocal polynomial is primitive

Alternate Interpretation of Finite Fields

Consider \(F_4 = F_2[x]/(x^2+x+1)\), with elements \(\set{0, 1, x, x+1}\). \(\alpha=x\) is a primitive element in this field (since \(\alpha^{2}=\alpha+1\) and \(\alpha^{3}=\alpha^{2}+\alpha=1\)). Since \(\alpha\) satisfies \(x^{2}+x+1=0\), we can "solve" for it using the quadratic formula to find \(\alpha=\dfrac{-1+\sqrt{1-4}}{2}=-\dfrac{1}{2}+i\dfrac{\sqrt{3}}{2}\), i.e. we treat \(\alpha\) like a complex number.

Being a third root of unity is equivalent to being a primitive element of \(F_4\); we can think of \(F_4\) as \(\set{0, 1, x, x+1}\) or \(\set{0, 1, e^{2\pi i/3},e^{-2\pi i/3}}\)

Aside: \(F_4 = F_2[x]/(x^2+x+1)\) (or more accurately, \(F_4 \cong F_2[x]/(x^2+x+1)\)) because, as mentioned, every field with the same number of elements is isomorphic. By the quotient construction of \(F_2[x]/(x^2+x+1)\), it has \(4\) elements (namely \(\set{0, 1, x, x+1}\)), so it behaves the same way as any "other" field with \(4\) elements.

Application: ISBN Codes

An ISBN code is a \(10\)-digit number \(x_1x_2\dots x_{10}\) satisfying \(x_{10}\equiv\displaystyle\sum\limits_{i=1}^{9}ix_i \pmod{11}\) (the symbol \(X\) is used when the check digit is \(10\))
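The check-digit computation is a one-liner. A sketch (`isbn10_check_digit` is my own name):

```python
def isbn10_check_digit(first9):
    """x10 = sum(i * x_i for i = 1..9) mod 11; a result of 10 is written as 'X'."""
    total = sum(i * int(d) for i, d in enumerate(first9, start=1))
    c = total % 11
    return "X" if c == 10 else str(c)
```

For instance, for the first nine digits `030640615` the weighted sum is \(145\) and \(145 \bmod 11 = 2\), so the check digit is \(2\).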


Textbook pages: 31-40, Notes pages: 47-55.

Chapter 4 - Vector Spaces over Finite Fields

Linear Algebra I Speedrun

For future chapters, we will find it useful to perform operations on codewords themselves, specifically the operations defined in a vector space.

For prime power \(q\), we define scalars as \(\text{GF}(q)\) and vectors as \(V(n, q)=\text{GF}(q)^n\). We define vector addition and multiplication as we do for column vectors in linear algebra.

Vector Space Axioms

Definition
A vector space is a set \(V\) (e.g. \(V(n, q)\)) with operations \(+\) and \(\cdot\) satisfying the following properties:

  1. Closure under \(+\)
  2. Associative \(+\)
  3. Additive identity \(\vec{0}\)
  4. Additive inverse \(-\vec{u}\)
  5. Commutative \(+\)
  6. Closure under \(\cdot\)
  7. Distributive law: \(a(\vec{u}+\vec{v}) = a\vec{u}+a\vec{v}, (a+b)\vec{u}=a\vec{u}+b\vec{u}\)
  8. Associative \(\cdot\)
  9. Multiplicative identity

Note that commutative multiplication and multiplicative inverses are not required

Aside: properties 1-5 define a vector space as an abelian group under \(+\).

A subspace is a subset of a vector space that is also a vector space. A subset \(V_0 \subseteq V\) of a vector space is a subspace iff it is closed under \(+\) and \(\cdot\).

A set \(\set{\vec{v}_1, \vec{v}_2, \dots, \vec{v}_r}\) is linearly dependent if there exist scalars \(a_1, a_2, \dots, a_r\), not all zero, such that \(a_1\vec{v}_1+a_2 \vec{v}_2 + \dots + a_r \vec{v}_r=\vec{0}\).

A basis of vector space \(C\) is a linearly independent set of vectors in \(C\) that generate \(C\), i.e. a minimal generating set.

The dimension of \(C\) (denoted \(\dim{C}\)) is \(k\) if a basis for \(C\) has \(k\) vectors.


Pages: 41-45

Chapter 5 - Linear Codes

Linear Code Definitions

Linear Code

Definition
A linear code \(C_\ell\) over \(\text{GF}(q)\) is a subspace of \(V(n, q)\) for positive integer \(n\). So, a linear code is closed under addition and scaling: for any words \(\vec{u}, \vec{v} \in C_\ell\), \(\vec{u}+\vec{v}\in C_\ell\) and \(a\vec{u}\in C_\ell\) for scalar \(a \in \text{GF}(q)\).

A linear \([n, k]\) code is a \(k\)-dimensional subspace of \(V(n, q)\). We may also refer to this as a linear \([n, k, d]\) code to specify minimum distance.

A linear code must contain \(\vec{0}\) by the definition of a vector (sub)space.

Weight

The weight \(w(\vec{v})\) of a vector \(\vec{v}\) in a linear code is the number of non-zero components of \(\vec{v}\), i.e. \(w(\vec{v})=d(\vec{0}, \vec{v})\)

But why Tho?

Advantages of Linear Codes

Disadvantages of Linear Codes

Generator Matrices

The generator matrix \(G_C\) of a linear \([n, k]\) code \(C_\ell\) is a \(k\times n\) matrix whose rows form a basis of \(C_\ell\).

Equivalence of Linear Codes

Two linear codes are equivalent if one can be obtained from the other by permuting the positions of the code or scaling symbols in a fixed position

Theorem 5.4

Theorem
Two \(k\times n\) matrices generate equivalent linear codes over \(\text{GF}(q)\) if one matrix is obtainable from the other by

  1. R1 Permuting the rows
  2. R2 Scaling a row
  3. R3 Adding a scaled row to another row
  4. C1 Permuting the columns
  5. C2 Scaling a column

Standard Form

Standard Form

Definition
The standard form of a generator matrix \(G_C\) for code \(C\) is \([I_k \mid A]\), where \(I_k\) is the \(k\times k\) identity matrix and \(A\) is a \(k\times (n-k)\) matrix.

THEOREM 5.5 Every generator matrix can be brought into standard form (generating an equivalent code) by performing the operations R1, R2, R3, C1, C2.


Pages: 47-54

Chapter 6 - Encoding and Decoding Linear Codes

Encoding

Let \(\vec{u}=u_1 u_2 \dots u_k \in V(k, q)\) be one of the \(q^k\) possible words. We encode \(\vec{u}\) by multiplying it by the generator matrix \(G_C\) for our code \(C\). So, our encoded message is \(\vec{u}G=\displaystyle\sum\limits_{i=1}^{k} u_i \vec{r_i}\) where \(\vec{r_i}\) is the \(i\)th row of \(G_C\).

When \(G\) is in standard form (i.e. \(G=[I_k|A]\)), then \(\vec{u}G=u_1 u_2 \dots u_k \tilde{x}_{k+1}\tilde{x}_{k+2}\dots \tilde{x}_{n}\), where \(\tilde{x}_{k+i}=\displaystyle\sum\limits_{j=1}^{k}a_{ji}u_j\), \(a_{ji}\) being the \((j, i)\)th entry of \(A\). The first \(k\) symbols carry the message; the last \(n-k\) are check symbols.
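Encoding is just a vector-matrix product over \(\text{GF}(q)\). A sketch using a toy binary \([3, 2]\) code (the particular matrix `G` below is my own example):

```python
def encode(u, G, q=2):
    """Encode message u as the codeword u*G, with arithmetic mod q."""
    return [sum(ui * gi for ui, gi in zip(u, col)) % q for col in zip(*G)]

# generator matrix in standard form [I_2 | A] for a binary [3, 2] code
G = [[1, 0, 1],
     [0, 1, 1]]
```

Here `encode([1, 1], G)` gives `[1, 1, 0]`: the first two symbols repeat the message, the last is a check symbol.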

Decoding

For sent vector \(\vec{x}\) and received vector \(\vec{y}\), we define the error vector \(\vec{e}\) as \(\vec{y}-\vec{x}\).

Cosets

For \([n, k]\) linear code over \(\text{GF}(q)\) and vector \(\vec{a}\in V(n, q)\), we define the coset \(\vec{a}+C\) of \(C\) as \(\vec{a}+C=\set{\vec{a}+\vec{x} \mid \vec{x}\in C}\).

Lagrange's Theorem (Theorem 6.4)

Theorem
For \([n, k]\) code \(C\) over \(\text{GF}(q)\):

  1. every vector of \(V(n, q)\) lies in some coset of \(C\)
  2. every coset contains exactly \(q^{k}\) vectors
  3. any two cosets are either disjoint or identical

For a given coset, the vector with the smallest weight is the coset leader.

Slepian Array

Slepian / Standard Array

Definition
The Slepian or standard array of a linear \([n, k]\) code \(C\) is a \(q^{n-k}\times q^k\) array containing each vector in \(V(n, q)\) exactly once, where the rows are the cosets of \(C\), each row beginning with its coset leader.

In particular, we order the entries such that \(A[i, j]=A[i, 1]+A[1, j]\)

The Slepian array can be constructed as follows

  1. List the codewords of \(C\), starting with \(\vec{0}\)
  2. Choose the word \(\vec{a}\in V(n, q)\) of the smallest weight that isn't already in the array. List the coset \(\vec{a}+C\) in the next row, where \(\vec{a}+\vec{x}\) is placed under \(\vec{x}\) for each \(\vec{x}\) in the first row.
  3. Keep repeating 2) until the array is complete.

E.g. the Slepian of \(C_{\ell 0}\) is \(\begin{bmatrix}000& 011& 101& 110\\100&111&001&010\end{bmatrix}\)
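The construction above can be sketched in Python (codewords as tuples; the function name is my own):

```python
from itertools import product

def standard_array(code, n, q=2):
    """Rows are the cosets of the code, each led by a minimum-weight leader."""
    code = [tuple(c) for c in code]
    placed = set(code)
    rows = [code]
    # scan candidate leaders in order of increasing weight
    candidates = sorted(product(range(q), repeat=n),
                        key=lambda v: sum(s != 0 for s in v))
    for a in candidates:
        if a in placed:
            continue
        coset = [tuple((ai + xi) % q for ai, xi in zip(a, x)) for x in code]
        placed.update(coset)
        rows.append(coset)
    return rows
```

For \(C_{\ell 0}\) this produces \(q^{n-k}=2\) rows, the second being the coset of weight-\(1\) words.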

Decoding a Linear Code

Finally

We decode received vector \(\vec{y}\) by finding it in the Slepian array. The vector at the beginning of its row is the error vector \(\vec{e}=\vec{y}-\vec{x}\), so the vector at the top of its column is the nearest neighbour in \(C\), and thus the decoded vector, since \(\vec{x}=\vec{y}-\vec{e}\)

Probability of Error Correction

For binary \([n, k]\) code \(C\) with \(\alpha_i\) coset leaders of weight \(i\) (for \(i\in\set{0, 1, \dots, n}\)), then the probability \(P_\text{corr}(C)\) that an arbitrary codeword is decoded correctly is \(P_{\text{corr}}(C)=\displaystyle\sum\limits_{i=0}^{n} \alpha_i p^{i}(1-p)^{n-i}\), where \(p\) is the probability of a bit being flipped due to channel noise.
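The formula is a direct sum. A sketch (the weight counts \(\alpha_i\) must be read off the code's standard array; `p_corr` is my own name):

```python
def p_corr(alphas, p):
    """P(correct decoding): alphas[i] = number of coset leaders of weight i."""
    n = len(alphas) - 1
    return sum(a * p ** i * (1 - p) ** (n - i) for i, a in enumerate(alphas))
```

E.g. for a binary \([3, 2]\) code with one coset leader of weight \(0\) and one of weight \(1\): \(P_{\text{corr}}=(1-p)^{3}+p(1-p)^{2}\), which is \(0.81\) at \(p=0.1\).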


Pages: 55-61

Chapter 7 - Dual Codes, Parity-Check Matrices and Syndrome Decoding

Dual Codes

The dual code \(C^\perp\) of linear \([n, k]\) code \(C\) is the set of vectors \(\vec{v}\in V(n, q)\) that are orthogonal to every codeword of \(C\), i.e. \(C^\perp=\set{\vec{v}\in V(n, q) \mid \vec{v} \cdot \vec{u}=0 \text{ for all } \vec{u}\in C}\)

LEMMA 7.2 If such \(C\) has generator matrix \(G\), then \(\vec{v}\in C^\perp\) if and only if \(\vec{v}G^T=\vec{0}\), where \(G^T\) is the transpose of \(G\).

THEOREM 7.3 \(C^\perp\) is a linear code of dimension \(n-k\), i.e. \(C^\perp\) is a linear \([n, n-k]\) code.

THEOREM 7.5 For any linear \([n, k]\) code \(C\), \((C^\perp)^\perp=C\)

Parity Check Matrices

The parity-check matrix \(H_C\) for \([n, k]\) code \(C\) is a generator matrix of \(C^\perp\).

The rows of the parity-check matrix are parity checks on the codewords. Namely, they constrain certain linear combinations to be \(0\), encoding the additional structure built into the codewords.

Finding a Parity-check Matrix

Theorem 7.6

Theorem
If \(G_C=[I_k|A]\) is the standard form of a generator matrix for linear \([n, k]\) code \(C\), then the parity-check matrix \(H_C\) is defined as \(H_C=[-A^{T}| I_{n-k}]\)

A parity check matrix \(H_C\) is in standard form if \(H_C=[B\mid I_{n-k}]\)
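Theorem 7.6 gives \(H_C\) mechanically from a standard-form \(G_C\). A sketch (function name is my own):

```python
def parity_check_from_standard(G, q=2):
    """Given G = [I_k | A], return H = [-A^T | I_{n-k}] (entries mod q)."""
    k, n = len(G), len(G[0])
    A = [row[k:] for row in G]
    H = []
    for i in range(n - k):
        row = [(-A[j][i]) % q for j in range(k)]            # -A^T block
        row += [1 if c == i else 0 for c in range(n - k)]   # I_{n-k} block
        H.append(row)
    return H
```

E.g. for binary `G = [[1,0,1],[0,1,1]]` this returns `[[1, 1, 1]]`, and every codeword dotted with that row gives \(0\).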

Syndromes

Syndrome Definitions and Theorems

For vector \(\vec{y}\in V(n, q)\), its syndrome \(S(\vec{y})\) is defined as the \(1\times(n-k)\) row vector \(S(\vec{y})=\vec{y}{H_C}^T\), where \(H_C\) is the parity-check matrix of linear \([n, k]\) code \(C\).

LEMMA 7.8 Two vectors \(\vec{u}\) and \(\vec{v}\) are in the same coset of \(C\) if and only if they have same syndrome, i.e. \(S(\vec{u})=S(\vec{v})\)

Syndrome Decoding

For large \(n\), array decoding is inefficient because it requires searching all \(q^{n}\) entries of the array. Syndrome decoding leverages LEMMA 7.8: computing \(S(\vec{y})\) and looking it up among the \(q^{n-k}\) syndromes immediately identifies the coset of \(\vec{y}\).

First, we must augment the standard array by appending the syndrome \(S(\vec{e})\) of each coset leader \(\vec{e}\) to the end of its corresponding row.

Syndrome Decoding

Algorithm
The syndrome decoding algorithms is as follows for received vector \(\vec{y}\)

  1. Calculate the syndrome \(S(\vec{y})=\vec{y}H^T\) of \(\vec{y}\).
  2. Locate \(S(\vec{y})\) in the syndromes column of the array.
  3. In the row where \(S(\vec{y})\) is located, find \(\vec{y}\) and decode as normal, i.e. the column header of this column is the decoded vector.
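The steps above can be sketched as follows, with a syndrome → coset-leader table standing in for the augmented array (names are my own):

```python
def syndrome(y, H, q=2):
    """S(y) = y * H^T, computed row by row, mod q."""
    return tuple(sum(yi * hi for yi, hi in zip(y, row)) % q for row in H)

def syndrome_decode(y, H, leaders, q=2):
    """leaders maps each syndrome to its coset leader (the presumed error)."""
    e = leaders[syndrome(y, H, q)]
    return tuple((yi - ei) % q for yi, ei in zip(y, e))
```

E.g. for the binary \([3, 2]\) code with \(H=[1\;1\;1]\) and leaders \(\set{(0)\mapsto 000, (1)\mapsto 001}\), the received word \(111\) decodes to the codeword \(110\).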

Pages: 67-74

Chapter 8 - Hamming Codes

Hamming codes are a family of linear, single-error-correcting codes over any \(\text{GF}(q)\) with elegant encoding and decoding schemes. Hamming codes are most conveniently defined by their parity check matrices.

Hamming Code Definition

Binary Hamming Code

Definition
The binary Hamming code \(\text{Ham}(r, 2)\) is the code whose parity-check matrix \(H\) has dimensions \(r\times(2^{r}-1)\) and whose columns are the distinct non-zero vectors of \(V(r, 2)\).

The redundancy \(r=n-k\) of the code is the number of check-symbols the code has.

Theorem 8.2

Theorem
For \(r \geq 2\), the binary Hamming code \(\text{Ham}(r, 2)\):

  1. is a \([2^{r}-1, 2^{r}-1-r]\) code
  2. has minimum distance \(3\), and is thus single-error correcting
  3. Is a perfect code

Decoding Hamming Codes

\(\text{Ham}(r, 2)\) being perfect implies that its coset leaders are precisely the \(2^{r}\) vectors of weight \(\leq 1\), and that the syndrome of the weight-\(1\) vector with a \(1\) in position \(j\) is the transpose of the \(j\)th column of \(H\).

Decoding Hamming Codes

Algorithm

  1. Calculate the syndrome \(S(\vec{y})=\vec{y}H^T\) of the received vector \(\vec{y}\)
  2. If \(S(\vec{y})=\vec{0}\), then (we assume) \(\vec{y}\) was the codeword sent, so no error occurred
  3. If \(S(\vec{y})\ne\vec{0}\), we assume one error occurred; \(S(\vec{y})\), read as a binary number, indicates the position of the error (provided the columns of \(H\) are listed in increasing binary order).
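A sketch for \(r=3\), with the columns of \(H\) in increasing binary order so the syndrome literally spells the error position:

```python
# parity-check matrix for Ham(3, 2): column j reads j in binary (MSB in top row)
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def ham_decode(y):
    """Correct at most one bit error in a received 7-bit word y."""
    s = [sum(yi * hi for yi, hi in zip(y, row)) % 2 for row in H]
    pos = 4 * s[0] + 2 * s[1] + s[2]   # syndrome read as a binary number
    y = list(y)
    if pos:                            # nonzero syndrome: flip the indicated bit
        y[pos - 1] ^= 1
    return y
```

E.g. flipping bit \(5\) of the codeword \(1111111\) gives a word that decodes back to \(1111111\).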

Extended Binary Hamming Codes

We obtain the extended binary Hamming code \(\hat{\text{Ham}}(r, 2)\) by adding a parity-check to \(\text{Ham}(r, 2)\). These codes are no better at decoding completely (in fact, they are worse because they use an extra bit), but provide more error detection, making them better for incomplete decoding.

The parity-check matrix \(\hat{H_C}\) for \(\hat{\text{Ham}}(r, 2)\) is created by right-appending a column of \(0\)s, then bottom-appending a row of \(1\)s to the parity-check matrix \(H_C\) for \(\text{Ham}(r, 2)\).

The decoding process is as follows: compute the syndrome of the received vector with respect to \(\hat{H_C}\). If it is \(\vec{0}\), assume no errors; if its last entry is \(1\), assume a single error and correct it as before; if it is nonzero with last entry \(0\), at least two errors have occurred, so request retransmission.

Relating \(d(C)\) and Linear Independence

Theorem 8.4

Theorem
For \([n, k]\) linear code \(C\) over \(\text{GF}(q)\) with parity-check matrix \(H_C\), any \(d(C)-1\) columns of \(H_C\) are linearly independent, but some set of \(d(C)\) columns of \(H_C\) is linearly dependent.

\(q\)-ary Hamming Codes

To achieve \(d(C)\geq k\), any \(k-1\) columns of \(H_C\) must be linearly independent. So, for given redundancy \(r\), an \([n, n-r, k]\) code can be constructed by finding \(n\) nonzero vectors in \(V(r, q)\) such that any \(k-1\) of them are linearly independent.

For \(q=3\), each vector \(\vec{v}\in V(r, q)\) has \(q-1=2\) nonzero scalar multiples, so the nonzero vectors can be partitioned into \(\dfrac{q^{r}-1}{q-1}\) equivalence classes, where \(\vec{u}\sim\vec{v} \iff \vec{u}=\lambda\vec{v}\) for some nonzero \(\lambda\), i.e. \(\vec{u}\) and \(\vec{v}\) are linearly dependent. We form \(H_C\) by taking one column from each equivalence class.

Finding \(H_C\) for a \(q\)-ary Hamming code

Algorithm
A parity-check matrix \(H_C\) for \(\text{Ham}(r, q)\) can be formed by listing all the nonzero \(r\)-tuples in \(V(r,q)\) whose first nonzero entry is \(1\).

THEOREM 8.6 \(\text{Ham}(r, q)\) is a perfect single-error-correcting code.
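The column-listing rule above can be sketched directly (function name is my own):

```python
from itertools import product

def hamming_columns(r, q):
    """Nonzero r-tuples over GF(q) whose first nonzero entry is 1:
    the columns of a parity-check matrix for Ham(r, q)."""
    cols = []
    for v in product(range(q), repeat=r):
        nonzero = [s for s in v if s != 0]
        if nonzero and nonzero[0] == 1:
            cols.append(v)
    return cols
```

The count comes out to \(\dfrac{q^{r}-1}{q-1}\), e.g. \(4\) columns for \(\text{Ham}(2, 3)\) and \(7\) for \(\text{Ham}(3, 2)\).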

Decoding \(q\)-ary Hamming Codes

Hamming codes are perfect, single-error-correcting codes, so the nonzero coset leaders are the vectors of weight \(1\) in \(V(n, q)\). So, \(S(\vec{y})=\vec{0}\) implies no errors and \(S(\vec{y})\ne \vec{0}\) implies an (assumed) single error. A coset leader looks like \(0\dots0b0\dots0\), where the \(b\) is in the \(j\)th entry. So, \(S(\vec{y})=b\,{H_j}^{T}\), where \(H_j\) is the \(j\)th column of \(H_C\). So, the error is corrected by subtracting \(b\) from the \(j\)th entry of \(\vec{y}\).

Shortening Codes

We can shorten a code \(C\) of length \(n\) to code \(C'\) of length \(n-1\) by selecting any codewords in \(C\) with symbol \(\lambda\) at position \(j\) (both fixed), then deleting the \(j\)th entry from each word to form \(C'\).


Pages: 81-90

Chapter 12 - Cyclic Codes

Insert Ring Cycle pun here

A linear code \(C\) is a cyclic code if each cyclic shift of a codeword is also a codeword, i.e. for any codeword \(a_1a_2a_3\dots a_n=\vec{a}\in C\), \(a_na_1a_2\dots a_{n-1}\in C\) as well.

Often, for non-cyclic \(C\), we can find an equivalent cyclic \(C'\) by permuting coordinates.

Polynomials

\(\text{GF}(q)[x]\), now denoted \(F[x]\), is the set of polynomials with coefficients in \(F\). \(f(x)=f_0+f_1 x + \dots + f_m x^{m}\in F[x]\) (with \(f_m\neq 0\)) has degree \(m\), denoted \(\deg f(x)=m\), and leading coefficient \(f_m\). \(F[x]\) is a commutative ring and a vector space over \(F\), but not a field, since multiplicative inverses do not generally exist.

Division Algorithm

For any polynomials \(a(x), b(x)\in F[x]\) with \(b(x)\neq 0\), there exist a unique quotient \(q(x)\) and remainder \(r(x)\) such that \(a(x)=q(x)b(x)+r(x)\), where \(\deg r(x)<\deg b(x)\) or \(r(x)=0\).

The Ring of Polynomials \(\mod f(x)\)

Polynomials \(g(x)\) and \(h(x)\) are congruent \(\mod f(x)\), denoted \(g(x)\equiv h(x) \mod f(x)\), if \(g(x)-h(x)\) is divisible by \(f(x)\), i.e. \(f(x)\mid [g(x)-h(x)]\).

We define \(F[x]/f(x)\) as the ring of polynomials over \(F\) modulo \(f(x)\). This ring's elements are the polynomials \(p(x)\in F[x]\) with \(\deg{p(x)}<\deg{f(x)}\) (i.e. "smaller" polynomials), and addition and multiplication are carried out \(\mod f(x)\).

Reducibility

Polynomial \(f(x)\) is reducible over \(F\) iff there exist \(a(x), b(x)\in F[x]\) satisfying \(\deg a(x), \deg b(x) < \deg f(x)\) where \(f(x)=a(x)b(x)\). Informally, \(f(x)\) is reducible if it can be "reduced" into smaller factors.

Lemma 12.3: Useful Observations for Factoring Polynomials

Lemma

  1. \(f(a)=0\) for \(a\in F\) if and only if \((x-a)\) divides \(f(x)\)
  2. a polynomial of degree \(2\) or \(3\) is irreducible over \(F\) if and only if it has no root in \(F\)
  3. over \(\text{GF}(2)\), \((x+1)\) divides \(f(x)\) if and only if \(f(x)\) has an even number of nonzero terms

Cyclic Codes

Definition and Characterization as Polynomials

We consider the ring \(R_n := F[x]/(x^{n}-1)\), i.e. polynomials modulo \(x^{n}-1\).

Theorem 12.6 - Characterizing cyclic codes

Theorem
A code \(C\) in \(R_n\) is a cyclic code if and only if, for polynomials in \(C\):

  1. \(a(x), b(x)\in C \implies a(x)+b(x)\in C\)
  2. \(a(x)\in C\) and \(r(x)\in R_n \implies r(x)a(x)\in C\)

i.e. \(C\) is an ideal of \(R_n\).

For polynomial \(f(x)\) in \(R_n\), we define a cyclic code \(\langle f(x) \rangle\) as the subset of \(R_n\) consisting of all (polynomial) multiples of \(f(x)\), reduced \(\mod x^{n}-1\), i.e. \(\langle f(x)\rangle=\set{r(x)f(x)\mid r(x)\in R_n}\)

Generator Polynomials

Theorem 12.9

Theorem
If \(C\) is a non-zero cyclic code in \(R_n\), then

  1. there is a unique monic polynomial \(g(x)\) of smallest degree in \(C\)
  2. \(C=\langle g(x)\rangle\)
  3. \(g(x)\) divides \(x^{n}-1\)

This \(g(x)\) is the generator polynomial of \(C\)

We find all the cyclic codes in \(R_3\) are \(V(3, 2)\) in its entirety, \(\set{000, 110, 011, 101}\), \(\set{000, 111}\), \(\set{000}\) with respective generator polynomials \(1\), \(x+1\), \(x^2+x+1\), \(x^{3}-1=0\)

Theorem 12.12

Theorem
If \(C\) is a cyclic code with generator polynomial \(g(x)=g_0 + g_1 x + \dots + g_r x^r\), then \(\dim C=n-r\) and the generator matrix \(G_C\) of \(C\) is the \((n-r)\times n\) matrix \(G=\begin{bmatrix}g_0 & g_1 &g_2 &\dots &g_r&0&0&\dots &0\\0&g_0&g_1&\dots&g_{r-1} & g_r &0&\dots &0\\ \vdots &\vdots &\vdots & &\vdots &\vdots &\vdots &\ddots &\vdots \\ 0 & 0 & 0 & \dots & 0 & g_0 & g_1 & \dots & g_r \end{bmatrix}\), where row \(i\) (for \(i=0, 1, \dots, n-r-1\)) is formed by shifting \((g_0, g_1, \dots, g_r, 0, \dots, 0)\) right by \(i\) places.

To find all the \(q\)-ary cyclic codes of length \(n\), we factor \(x^{n}-1\) into monic irreducible factors over \(\text{GF}(q)\); the generator polynomials of the cyclic codes in \(R_n\) are exactly the monic divisors of \(x^{n}-1\), i.e. the products of subsets of these factors.

Check Polynomials and Parity-Check Matrices

The check polynomial of \(C\) generated by \(g(x)\) is the polynomial \(h(x)\) such that \(g(x)h(x)=x^{n}-1\).

A polynomial \(c(x)\in R_n\) is a codeword of \(C\) if and only if \(c(x)h(x)=0\) in \(R_n\), where \(h(x)\) is the check polynomial of \(C\).

Note: Although it is true that \(\dim{\langle h(x) \rangle}=\dim C^{\perp}=n-k\), \(h(x)\) does not generate the dual code \(C^\perp\) of \(C\). Namely, the fact that \(h(x)c(x)=0\) in \(R_n\) does not carry the same meaning as the corresponding vectors in \(V(n, q)\) being orthogonal.

If \(C\) is an \([n, k]\) cyclic code with check polynomial \(h(x)=h_0 + h_1 x + \dots + h_k x^k\), then a parity-check matrix for \(C\) is the \((n-k)\times n\) matrix whose rows are the shifts of \((h_k, h_{k-1}, \dots, h_0, 0, \dots, 0)\), i.e. the coefficients of \(h(x)\) written in reverse order.

Hamming Codes Are Cyclic

Irreducible polynomial \(p(x)\) of degree \(r\) is a primitive polynomial iff \(x\) is a primitive element in \(F[x]/p(x)\). Informally, a primitive element of a finite field is an element that "generates" every nonzero member of the field when raised to successive powers (defined formally in Chapter 3).

If \(p(x)\) is a primitive polynomial of degree \(r\) over \(\text{GF}(2)\), then the cyclic code \(\langle p(x) \rangle\) is the Hamming code \(\text{Ham}(r, 2)\).

More generally, \(\text{Ham}(r, q)\) is a cyclic code if \(r\) and \(q-1\) are relatively prime.


Pages 141-144, 146-153

Chapter ??? - Extras

primitive elements

minimal polynomials

ideals

Chapter xx - BCH Codes

Chernousov goes off the rails!

Hamming codes are cool, but they can only correct one error. We need more…

Basic Definitions

BCH Code Definition

Definition
Let element \(\alpha\) be of order \(n\) in a finite field \(F_{q^s}\). The BCH code of length \(n\) and design distance \(d\) is the cyclic code generated by the product of the distinct minimal polynomials in \(F_q[x]\) of the elements \(\alpha, \alpha^{2}, \dots, \alpha^{d-1}\).

To encode message word \(a_0a_1\dots a_{k-1}\), we represent it by the polynomial \(f(x)=a_0+a_1x+\dots+a_{k-1}x^{k-1}\) and multiply it by the generator polynomial \(g(x)\) to get codeword \(c(x)=f(x)g(x)\).

Example

The field \(F_{2^4}=F_2[x]/(x^4+x+1)\) has primitive element \(\alpha=x\). We can construct a \([15, 7]\) code that can correct \(2\) errors by finding a generator polynomial \(g(x)\) with roots \(\alpha, \alpha^{2},\alpha^{3},\alpha^{4}\). We find such a \(g(x)\) in the product of the minimal polynomials of \(\alpha\) and \(\alpha^3\): \(g(x)=m_1(x)m_3(x)=(x^4+x+1)(x^4+x^3+x^2+x+1)=x^8+x^7+x^6+x^4+1\).
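The product can be verified with a short sketch over \(\text{GF}(2)\) (polynomials represented as sets of exponents with coefficient \(1\); `gf2_mul` is my own name):

```python
def gf2_mul(f, g):
    """Multiply two GF(2) polynomials given as sets of exponents."""
    out = set()
    for i in f:
        for j in g:
            out ^= {i + j}   # coefficients add mod 2: symmetric difference
    return out

m1 = {4, 1, 0}           # x^4 + x + 1, minimal polynomial of alpha
m3 = {4, 3, 2, 1, 0}     # x^4 + x^3 + x^2 + x + 1, minimal polynomial of alpha^3
```

`gf2_mul(m1, m3)` gives `{8, 7, 6, 4, 0}`, i.e. \(x^8+x^7+x^6+x^4+1\) as claimed.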

Decoding BCH Codes

Decoding BCH codes is polynomially analogous to syndrome decoding: for sent codeword \(c(x)\) and received codeword \(y(x)\), we define the error polynomial \(e(x)=y(x)-c(x)\).

The first syndrome \(S_1\) is computed by substituting \(\alpha\) into \(y(x)\): \(S_1 := y(\alpha)=c(\alpha)+e(\alpha)=e(\alpha)\), since \(\alpha\) is a root of \(g(x)\) and hence of \(c(x)=f(x)g(x)\).

We define each subsequent syndrome (up to syndrome \(d-1\)) using the corresponding power of \(\alpha\): for \(1\leq k\leq d-1\), \(S_k=y(\alpha^{k})=c(\alpha^k)+e(\alpha^k)=e(\alpha^k)=e_1^{k}+e_2^{k}+\dots+e_t^{k}\), where (for binary codes) \(e_i=\alpha^{p_i}\) and \(p_i\) is the position of the \(i\)th error.

BCH Decoding Scheme

Procedure
To decode a BCH code, we must determine if there is a value of \(t\) and choices of field elements \(e_1, e_2, \dots, e_t\) that are consistent with all the syndromes, i.e. \(S_1, S_2, \dots, S_{d-1}\).

We define the error locator polynomial \(\sigma(x) := (e_1 x-1)(e_2 x-1)\dots(e_t x-1)=b_t x^{t}+ b_{t-1}x^{t-1}+\dots+b_1 x+1\).

Example

We wish to decode the \([15, 7]\) code generated by \(g(x)=x^8+x^7+x^6+x^4+1\) from the example above. Assume our message is \(110\,0000\), so we transmit \(110 \, 011 \, 100 \, 100 \, 000\) (i.e. \(c(x)=(1+x)g(x)=1+x+x^4+x^5+x^6+x^9\)). Say we receive \(110 \, 010 \, 101 \, 100 \, 000\), which has \(2\) errors.

Chapter yy - Golay Codes

Hamming didn't finish his homework

There exist other nontrivial codes, with parameters different from those of the Hamming codes, that also satisfy the sphere-packing bound with equality \(\displaystyle M\sum\limits_{k=0}^{t}{n\choose k}(q-1)^k=q^n\) (i.e. are perfect codes).

Non-Hamming Triples

Golay discovered three other triples \((n, M, d)\) satisfying the sphere-packing bound that are not parameters of a Hamming code: \((23, 2^{12}, 7)\) and \((90, 2^{78}, 5)\) (binary), and \((11, 3^{6}, 5)\) (ternary).

Perfect linear codes with parameters \((23, 2^{12}, 7)\) (binary) and \((11, 3^{6}, 5)\) (ternary) exist; these are the Golay codes. (No code with parameters \((90, 2^{78}, 5)\) exists.)

These triples also appear to be combinatorial results in addition to coding-theoretical (algebraic) ones

The \([23, 12, 7]\) Binary Golay Code and Its Extension \(G_{24}\)

It is convenient to extend the Golay \([23, 12, 7]\) code to the extended Golay \([24, 12, 8]\) code by adding an extra parity bit that makes the weight of every codeword even.

\[G = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \end{bmatrix}\]

The generator matrix \(G_{24}\) of the extended Golay \([24, 12, 8]\) code (I didn't want to copy this into TeX myself)

The extended Golay \([24, 12]\) code generated by \(G_{24}\) has minimum distance \(8\).

Self-Orthogonality

Any two rows of the matrix representing \(G_{24}\) are orthogonal to each other.

A linear code \(C\) is self-orthogonal iff \(C\subseteq C^\perp\) and self-dual iff \(C=C^\perp\)

Since \(C_{24}\) is self-orthogonal and \(C_{24}\), \(C_{24}^\perp\) have the same dimension, \(C_{24}=C_{24}^\perp\), so the parity-check matrix \(H_{24}=[A^{T}\mid I_{12}]\) is also a generator matrix for \(C_{24}\).

The \([11, 6]\) Ternary Golay Code and Its Extension \(G_{12}\)

\[G_{12}=\begin{bmatrix} 1&0&0&0&0&0&0&1&1&1&1&1 \\ 0&1&0&0&0&0&1&0&1&2&2&1 \\ 0&0&1&0&0&0&1&1&0&1&2&2 \\ 0&0&0&1&0&0&1&2&1&0&1&2 \\ 0&0&0&0&1&0&1&2&2&1&0&1 \\ 0&0&0&0&0&1&1&1&2&2&1&0 \end{bmatrix}\]