
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 49, NO. 9, SEPTEMBER 2003 2081

The Structure of Tail-Biting Trellises: Minimality and Basic Principles

Ralf Koetter, Member, IEEE, and Alexander Vardy, Fellow, IEEE

Abstract—Basic structural properties of tail-biting trellises are investigated. We start with rigorous definitions of various types of minimality for tail-biting trellises. We then show that biproper and/or nonmergeable tail-biting trellises are not necessarily minimal, even though every minimal tail-biting trellis is biproper. Next, we introduce the notion of linear (or group) trellises and prove, by example, that a minimal tail-biting trellis for a binary linear code need not have any linearity properties whatsoever. We observe that a trellis, either tail-biting or conventional, is linear if and only if it factors into a product of elementary trellises. Using this result, we show how to construct, for any given linear code $\mathbb{C}$, a tail-biting trellis that minimizes the product of state-space sizes among all possible linear tail-biting trellises. We also prove that every minimal linear tail-biting trellis for $\mathbb{C}$ arises from a certain $n \times n$ characteristic matrix, and show how to compute this matrix in time $O(n^2)$ from any basis for $\mathbb{C}$. Furthermore, we devise a linear-programming algorithm that starts with the characteristic matrix and produces a linear tail-biting trellis for $\mathbb{C}$ which minimizes the maximum state-space size. Finally, we consider a generalized product construction for tail-biting trellises, and use it to prove that a linear code and its dual have the same state-complexity profiles.

Index Terms—Block codes, characteristic matrix, codes on graphs, convolutional codes, linearity, minimal trellises, tail-biting trellises, trellis complexity.

I. INTRODUCTION

TRELLIS representations of linear block codes have received much attention in recent years [3], [14], [16], [17], [21], [31], [35]. Such representations not only illuminate code structure, but also often lead to efficient trellis-based decoding algorithms. It is now well known that, given a specific coordinate ordering, there exists a unique, up to isomorphism, minimal (conventional) trellis for any linear block code. The minimal trellis for a linear code $\mathbb{C}$ simultaneously minimizes all the conceivable measures of trellis complexity, and can be easily constructed from a generator matrix or a parity-check matrix for $\mathbb{C}$.

On the other hand, much less is known about tail-biting trellises. Tail-biting trellis representations are interesting for several reasons. First, the complexity of a tail-biting trellis may be much lower than the complexity of the best possible conventional trellis. It is shown in [3], [35] that the number of states in a tail-biting trellis for a linear code $\mathbb{C}$ can be as low as the square root of the number of states in the minimal conventional trellis for $\mathbb{C}$. Second, tail-biting trellises may be considered as the simplest form of a factor graph with cycles. The recent development of iterative decoding techniques [35], [15] has led to a vivid interest in factor graphs. While the performance of iterative decoding on general graphs with cycles remains somewhat of a mystery, iterative decoding of tail-biting trellises is by now reasonably well understood [1], [7], [23], [26].

Manuscript received June 5, 2002; revised May 11, 2003. This work was supported by the David and Lucile Packard Foundation and by the National Science Foundation. The material in this paper was presented in part at the IEEE Information Theory Workshop, Killarney, Ireland, July 1998 and in part at the IEEE International Symposium on Information Theory, Lausanne, Switzerland, July 2002.

R. Koetter is with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: [email protected]).

A. Vardy is with the Department of Electrical and Computer Engineering, the Department of Computer Science, and the Center for Wireless Communications, University of California, San Diego, La Jolla, CA 92093 USA (e-mail: vardy@kilimanjaro.ucsd.edu).

Communicated by S. Litsyn, Associate Editor for Coding Theory.

Digital Object Identifier 10.1109/TIT.2003.815769

Thus, the major remaining problem with tail-biting trellises is that of efficient construction. Given a linear block code $\mathbb{C}$ over the finite field $\mathbb{F}_q$, how can one construct a minimal tail-biting trellis for $\mathbb{C}$? Although several examples of such trellises are known [3], the understanding of minimal tail-biting trellises is not on par with that of conventional trellises. Our goal in this work is to lay out some of the foundations of the theory of tail-biting trellises.

We start with a definition of both conventional and tail-biting trellises in the next section. We then define reduced and one-to-one tail-biting trellises. In addition to the usual labeling of edges in a trellis, we will also need to deal with labelings of vertices. Thus, we introduce the concepts of a labeled trellis and the corresponding label code. We further define a representation code of a labeled trellis $T$ as a subset of the label code of $T$, from which the entire trellis can be uniquely reconstructed. This definition leads to the notion of a representation matrix for a (linear) trellis, which is introduced in Section IV.

In Section III, we present rigorous definitions of various types of minimality for tail-biting trellises. The fact that the minimal conventional trellis for a linear code is unique makes it possible to define minimality in a number of equivalent ways: any reasonable definition leads to the same unique minimal trellis (cf. [31], [33]). The situation is considerably more involved for tail-biting trellises. Minimal tail-biting trellises are usually not unique, and special care needs to be taken to define minimality itself. We thus distinguish between different types of minimality that correspond to different (partial) orders on the set of trellises for a given code. We also show that every minimal tail-biting trellis is biproper; on the other hand, biproper and/or nonmergeable tail-biting trellises are not necessarily minimal. It follows that iterative merging of vertices in a nonminimal trellis leads to minimality for conventional trellises but not for tail-biting trellises.

In Section IV, we introduce the notion of linear (or group) trellises. Loosely speaking, linearity means that the set of edge/vertex label sequences is closed under componentwise operation in the appropriate algebraic domain: a field or a group.


In [11], we show how to decide whether a given (unlabeled) trellis is linear: we provide an algorithm that constructs an appropriate labeling of the vertices if the trellis is linear, and halts if it is not. An interesting outcome of this investigation is that trellis linearity is inherently a graph-theoretic property: we give an example of two trellises over the trivial alphabet $A = \{0\}$, one of which is linear while the other is not. Another surprising observation is that a minimal tail-biting trellis for a linear code may be nonlinear: we construct a minimal trellis for a simple linear code over $\mathbb{F}_2$ that has no linearity properties¹ whatsoever.

We then discuss the relationship between trellis linearity and the product construction. The product construction for conventional trellises was introduced by Kschischang and Sorokine in [16]. It was later employed in the context of tail-biting trellises by Calderbank, Forney, and Vardy [3]. One of our key results in [11] is that a trellis, either tail-biting or conventional, is linear if and only if it may be obtained by a product construction. Thus, every linear trellis factors into a product of elementary trellises over a field, and every abelian-group trellis factors into elementary trellises over a group.

It follows that the search for a minimal linear trellis reduces to the task of specifying the appropriate input to the product construction: given a linear code $\mathbb{C}$, we need to specify a set of generators for $\mathbb{C}$ and then specify a span for each generator. In contrast to the case of conventional trellises, when constructing a tail-biting trellis, one can freely choose the span of each generator, and there are at least $n^k$ ways to do so for a code of length $n$ and dimension $k$. Nevertheless, in Section V, we present a general solution to the problem of constructing minimal linear tail-biting trellises for linear codes. To this end, we introduce the notion of a characteristic matrix for a linear code $\mathbb{C}$. This is an $n \times n$ matrix that can be easily computed in time $O(n^2)$ from any basis for $\mathbb{C}$. We show that although minimal tail-biting trellises are generally not unique, every minimal linear tail-biting trellis for $\mathbb{C}$ necessarily arises from the characteristic matrix for $\mathbb{C}$. We also investigate some of the properties of the characteristic matrix, and of the associated span matrix. This investigation enables us to show, in particular, that certain tail-biting trellises for a linear code $\mathbb{C}$ and for its dual $\mathbb{C}^{\perp}$ have the same state-complexity profile.

In Section VI, we start with the characteristic matrix for a linear code $\mathbb{C}$, and show how to construct a linear tail-biting trellis for $\mathbb{C}$ that minimizes the product of the vertex-space sizes among all possible linear trellises for $\mathbb{C}$ (a $\preceq_{\Pi}$-minimal trellis). We also propose a linear-programming algorithm that produces a linear tail-biting trellis for $\mathbb{C}$ which minimizes the maximum vertex-space size (a $\preceq_{\max}$-minimal trellis). As is common in the case of integer-programming problems, we do not have a precise estimate of the running time of the proposed algorithm. Nevertheless, we found that it works extremely well in practice.

Finally, in Section VII, we discuss a generalization of the product construction for tail-biting trellises. The idea is to replace the sum of the edge labels in the original Kschischang–Sorokine construction [16] with an arbitrary function. The generalized product construction enables us to produce a tail-biting trellis for a linear code $\mathbb{C}$ directly from a parity-check matrix for $\mathbb{C}$, rather than from a generator matrix. This leads to the following duality result: given any linear trellis $T$ for $\mathbb{C}$, there exists a dual trellis $T^{\perp}$ that represents the dual code $\mathbb{C}^{\perp}$, such that the state-complexity profile of $T^{\perp}$ is componentwise less than or equal to that of $T$. Forney [6] derived a similar result using quite different tools, in the context of normal (generalized) state realizations of codes on graphs.

¹Examples given in [3] show that a minimal tail-biting trellis for a linear code over $\mathbb{F}_q$ may be a group trellis over the additive group of $\mathbb{F}_q$. Such trellises are still essentially linear: they factor into a product of elementary trellises over a group, as described in Section IV. Our example is fundamentally different in that the minimal trellis we construct is not a group trellis and cannot be factored.

We note that all of the results in this paper deal with trellis representation of linear codes, given a fixed order of the time axis. In the case of conventional trellises, a comprehensive theory has been developed on this subject (cf. [31]), and it is fair to say that minimal conventional trellises for linear codes are by now very well understood. On the other hand, the problem of finding the best permutation of the time axis for a given linear code is still open, even in the case of conventional trellises. The situation is different for tail-biting trellises in that very little was known regarding both problems: the permutation problem as well as the minimality problem for a fixed time axis. While our results here do not answer all the questions that pertain to minimal tail-biting trellises for a fixed time axis, we do provide solutions to some of the main problems in this domain. We thus hope to bring the theory of tail-biting trellises on par with that of conventional trellises.

II. PRELIMINARIES

We start with the definitions of conventional and tail-biting trellises. We also introduce a number of concepts related to tail-biting trellises that will be used throughout the paper.

An edge-labeled directed graph is a triple $(V, E, A)$, consisting of a set $V$ of vertices, a finite set $A$ called the alphabet, and a set $E$ of ordered triples $(v, a, v')$, with $v, v' \in V$ and $a \in A$, called edges. We say that an edge $(v, a, v') \in E$ begins at $v$, ends at $v'$, and has label $a$.

Definition 2.1: A conventional trellis $T = (V, E, A)$ of depth $n$ is an edge-labeled directed graph with the following property: the vertex set $V$ can be partitioned as

$$V = V_0 \cup V_1 \cup \cdots \cup V_n \tag{1}$$

such that every edge in $E$ begins at a vertex of $V_i$ and ends at a vertex of $V_{i+1}$, for some $i = 0, 1, \ldots, n-1$. The sets $V_0, V_1, \ldots, V_n$ are called the vertex classes of $T$. The ordered index set $\mathcal{I} = \{0, 1, \ldots, n\}$ induced by the partition in (1) is called the time axis for $T$.

If every vertex in $T$ lies on at least one path from a vertex in $V_0$ to a vertex in $V_n$, we say that $T$ is reduced. The sequence of edge labels along each path of length $n$ in $T$ defines an ordered $n$-tuple over the label alphabet $A$. We say that $T$ represents a block code $\mathbb{C}$ of length $n$ over $A$, if $\mathbb{C}$ is precisely the set of all such $n$-tuples.

Tail-biting trellises have been traditionally used (cf. [19]) as a means of terminating a convolutional code without incurring a rate loss. More recently, tail-biting trellises for block codes were considered in [3], [8], [10], [18], [26], [25], [27], [28], [35], and other works. Such trellises may be viewed [3], [35] as a generalization of a conventional trellis to a circular time axis.

Definition 2.2: A tail-biting trellis $T = (V, E, A)$ of depth $n$ is an edge-labeled directed graph with the following property: the set $V$ can be partitioned into $n$ vertex classes

$$V = V_0 \cup V_1 \cup \cdots \cup V_{n-1} \tag{2}$$

such that every edge in $E$ either begins at a vertex of $V_i$ and ends at a vertex of $V_{i+1}$, for some $i = 0, 1, \ldots, n-2$, or begins at a vertex of $V_{n-1}$ and ends at a vertex of $V_0$.

As for conventional trellises, we refer to the set of indices $\mathcal{I} = \{0, 1, \ldots, n-1\}$ for the vertex partition in (2) as the time axis for $T$. Due to the tail-biting nature of the definition, it is convenient to identify $\mathcal{I}$ with $\mathbb{Z}_n$, the ring of integers modulo $n$. Thus, when dealing with tail-biting trellises, all index arithmetic will be implicitly performed modulo $n$. In particular, an index interval $[i, j]$ is defined as follows:

$$[i, j] = \begin{cases} \{i, i+1, \ldots, j\} & \text{if } i \le j \\ \{i, i+1, \ldots, n-1, 0, \ldots, j\} & \text{if } i > j. \end{cases} \tag{3}$$

We refer to such intervals as closed cyclic intervals. Later (for the definition of span in Section IV), we will also need semi-open cyclic intervals $(i, j]$, defined as $(i, j] = [i, j] \setminus \{i\}$.
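The interval definition in (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the semi-open interval is assumed here to be the closed interval with its left endpoint removed.

```python
def closed_cyclic_interval(i, j, n):
    """Closed cyclic interval [i, j] in Z_n, following the two cases of (3)."""
    i, j = i % n, j % n
    if i <= j:
        return list(range(i, j + 1))
    # i > j: wrap around the end of the time axis
    return list(range(i, n)) + list(range(0, j + 1))


def semi_open_cyclic_interval(i, j, n):
    """Semi-open interval (i, j], assumed to be [i, j] without the index i."""
    return [t for t in closed_cyclic_interval(i, j, n) if t != i % n]


print(closed_cyclic_interval(1, 3, 5))     # no wrap-around
print(closed_cyclic_interval(3, 1, 5))     # wraps through n - 1 and 0
print(semi_open_cyclic_interval(3, 1, 5))
```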

A cycle of length $m$ in a tail-biting trellis $T$ is a subgraph of $T$ consisting of a closed path through $m$ vertices. It is easy to see that the length of any cycle is a multiple of the depth $n$, and we will be mostly concerned with cycles of length exactly $n$. From now on, whenever we refer to cycles in $T$ we mean cycles of length $n$, unless specified otherwise.

Clearly, any cycle in $T$ contains exactly one vertex from each vertex class, and each vertex in a cycle has degree two. The notion of a cycle in a tail-biting trellis is analogous to the notion of a path in a conventional trellis. In particular, we say that a tail-biting trellis $T$ is reduced if every vertex and every edge in $T$ belongs to at least one cycle. Notice that, in contrast to conventional trellises (cf. [14], [33]), the edges of a reduced tail-biting trellis are explicitly required to belong to a cycle. For example, the trellis $T$ in Fig. 1 is reduced as a conventional trellis, but not reduced as a tail-biting² trellis, although every vertex of $T$ belongs to a cycle. Cutting $T$ at any position produces a reduced conventional trellis.

The set of edge labels along a cycle in $T$ is an $n$-tuple over the label alphabet $A$. Notice that this $n$-tuple is naturally ordered only up to cyclic shifts. In order to make it into a vector, we need to specify where the corresponding cycle starts. We henceforth postulate that all cycles in $T$ start at a vertex of $V_0$, unless stated otherwise. With this, every cycle in $T$ defines a vector in $A^n$. We will refer to such vectors as edge-label sequences in $T$. We say that $T$ represents a block code $\mathbb{C}$ over $A$ if $\mathbb{C}$ is precisely the set of all the edge-label sequences in $T$. Observe that if we were to postulate that cycles in $T$ start at a different time, then the same trellis would represent a different code, namely, a cyclic shift of $\mathbb{C}$ (cf. Section V).

²When depicting tail-biting trellises, we implicitly identify the leftmost vertex class in a figure with the rightmost vertex class. Thus, as a tail-biting trellis, the trellis in Fig. 1 has only 10 vertices.

Fig. 1. Nonreduced tail-biting trellis.
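The dependence of the represented code on the starting time can be checked on a small example. The sketch below assumes a hypothetical data layout, not taken from the paper: `edge_classes[i]` lists the edges `(u, a, v)` that begin in class $V_i$ and end in the next class modulo the depth. The toy trellis is likewise an assumption made for illustration.

```python
def edge_label_sequences(edge_classes, start=0):
    """Edge-label sequences of all cycles of length n that start in class `start`."""
    n = len(edge_classes)
    words = set()

    def walk(first, vertex, steps, labels):
        if steps == n:
            if vertex == first:          # the path closed up into a cycle
                words.add(labels)
            return
        for (u, a, v) in edge_classes[(start + steps) % n]:
            if u == vertex:
                walk(first, v, steps + 1, labels + (a,))

    for v0 in {u for (u, _, _) in edge_classes[start]}:
        walk(v0, v0, 0, ())
    return words


# Hypothetical depth-3 trellis with V_0 = {0}, V_1 = {0, 1}, V_2 = {0}.
toy = [
    [(0, 0, 0), (0, 1, 1)],   # edges from V_0 to V_1
    [(0, 0, 0), (1, 1, 0)],   # edges from V_1 to V_2
    [(0, 0, 0)],              # edges from V_2 back to V_0
]

print(edge_label_sequences(toy))            # cycles read from V_0
print(edge_label_sequences(toy, start=1))   # same trellis, cycles read from V_1
```

Reading the same cycles from $V_1$ instead of $V_0$ turns the codeword 110 into 101, a cyclic shift to the left by one position.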

We let $\mathbb{C}(T)$ denote the code represented by a trellis $T$, either conventional or tail-biting, and refer to $\mathbb{C}(T)$ as the edge-label code of $T$. A trellis $T$ is said to be one-to-one if there is a one-to-one correspondence between cycles in $T$ and codewords in $\mathbb{C}(T)$. For one-to-one trellises, each codeword of $\mathbb{C}(T)$ thus corresponds to a unique cycle of $T$.

In addition to the usual labeling of edges in a trellis, we will also be interested in certain labelings of vertices in a trellis. Specifically, suppose that, for all $i$, each vertex in a vertex class $V_i$ is labeled by a sequence of length $\ell_i$ over $A$, where $\ell_i \ge \lceil \log_{|A|} |V_i| \rceil$. We also require that all vertex labels within the same vertex class are distinct. We will use the term labeled trellis to refer to a trellis endowed with a vertex labeling of this kind.

Let $\ell = n + \ell_0 + \ell_1 + \cdots + \ell_{n-1}$. Then every cycle $\Gamma$ in a labeled tail-biting trellis $T$ defines an ordered sequence of length $\ell$ over $A$, consisting of the labels of edges and vertices in $\Gamma$. We will refer to such a sequence as a label sequence in $T$.

Definition 2.3: The set of all the label sequences in a labeled tail-biting trellis $T$ is called the label code of $T$ and denoted by $\mathbb{S}(T)$; it is a block code of length $\ell$ over $A$.

If $T$ is a conventional labeled trellis, then the label code $\mathbb{S}(T)$ is defined in a similar fashion as the set of edge-vertex label sequences of all the paths of length $n$ in $T$. Notice that every labeled trellis $T$ represents its label code $\mathbb{S}(T)$ in a one-to-one manner: there is a one-to-one correspondence between cycles (or paths) in $T$ and codewords in $\mathbb{S}(T)$. This follows immediately from the fact that vertex labels within each vertex class are distinct.

Given a labeled trellis $T$, it is straightforward to determine its label code by simply reading off the label sequences of all the cycles (paths) in $T$. On the other hand, if $T$ is reduced, then it is also possible to uniquely determine $T$ from its label code. Indeed, given $\mathbb{S}(T)$, we can construct $T$ cycle by cycle; this works since every edge in $T$ belongs to at least one cycle. Thus, we may describe a reduced labeled trellis $T$ by specifying the label code $\mathbb{S}(T)$.

Further notice that a subset of $\mathbb{S}(T)$ is often sufficient to completely describe $T$. Each codeword in $\mathbb{S}(T)$ is a label sequence that uniquely determines a cycle in $T$. If we take enough codewords so that every edge of $T$ is contained in one of the corresponding cycles, then the entire trellis may be reconstructed from these codewords.

Definition 2.4: We say that a subset of $\mathbb{S}(T)$ represents the trellis $T$ if the cycles corresponding to the codewords of this subset cover all the edges of $T$. We call such a subset a representation code for $T$.
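A minimal sketch of this covering condition follows. It assumes, purely for illustration, that a label sequence of a depth-$n$ tail-biting trellis is written as `(v_0, a_0, v_1, a_1, ..., v_{n-1}, a_{n-1})` with one symbol per vertex label, and that `edge_classes[i]` lists the edges `(u, a, v)` beginning in $V_i$. The toy trellis and all names are hypothetical.

```python
def edges_from_label_sequences(sequences, n):
    """Rebuild the edge classes covered by the cycles of the given label sequences."""
    classes = [set() for _ in range(n)]
    for seq in sequences:
        for t in range(n):
            u, a = seq[2 * t], seq[2 * t + 1]
            v = seq[(2 * t + 2) % (2 * n)]   # next vertex label, wrapping around
            classes[t].add((u, a, v))
    return classes


def represents(sequences, edge_classes):
    """True if the cycles of `sequences` cover every edge of the trellis."""
    n = len(edge_classes)
    return edges_from_label_sequences(sequences, n) == [set(c) for c in edge_classes]


# Hypothetical depth-3 labeled trellis: V_0 = {0}, V_1 = {0, 1}, V_2 = {0}.
toy = [
    [(0, 0, 0), (0, 1, 1)],
    [(0, 0, 0), (1, 1, 0)],
    [(0, 0, 0)],
]
zero_cycle = (0, 0, 0, 0, 0, 0)
other_cycle = (0, 1, 1, 1, 0, 0)

print(represents([zero_cycle, other_cycle], toy))  # both cycles together cover all edges
print(represents([zero_cycle], toy))               # one cycle alone misses two edges
```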


Fig. 2. Minimal tail-biting trellises for a (5, 3, 2) binary linear code.

The following example illustrates this definition. When specifying label sequences, we adopt the convention of writing the vertex labels in a smaller font and underlining them.

Example 2.1:Consider the labeled trellis over the alphabet, depicted below. As , this trellis may be

thought of either as a conventional or as a tail-biting trellis.

The edge-label code of this trellis isso that .

However, the set of only three codewords

0 0 0 0 0 1 0 1 0 0 1 0 (4)

is sufficient to represent the trellis, as may be seen by direct verification. Notice that the representation code is not unique. For example, replacing the first two label sequences in (4) by 0 0 0 1 and 0 1 0 0, we obtain another representation code for the same trellis.

III. DEFINITION AND PROPERTIES OF MINIMAL TRELLISES

Let $T = (V, E, A)$ be a trellis, either conventional or tail-biting, of depth $n$. We will refer to the ordered sequence

$$(|V_0|, |V_1|, \ldots, |V_{n-1}|)$$

as the vertex-class profile of $T$. For a given code $\mathbb{C}$, the vertex-class profiles of all possible trellises for $\mathbb{C}$ form a partially ordered set under componentwise comparison. Specifically, we say that a trellis $T$ is smaller than or equal to another trellis $T'$, denoted as $T \preceq_{\Theta} T'$, if

$$|V_i| \le |V'_i| \quad \text{for all } i = 0, 1, \ldots, n-1. \tag{5}$$

If moreover equality does not hold in (5) for all $i$, we say that $T$ is strictly smaller than $T'$ and write $T \prec_{\Theta} T'$. Minimality for conventional trellises was first defined by Muder [22], and this definition is by now well established in trellis theory [21], [31]. The following definition reduces to that of Muder [22] in the case of conventional trellises.
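The componentwise comparison in (5) is easy to state in code. The following sketch uses hypothetical function names and made-up profiles, chosen only to show that two profiles may be incomparable.

```python
def leq_theta(p, q):
    """Componentwise comparison (5) of two vertex-class profiles of equal depth."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same depth")
    return all(a <= b for a, b in zip(p, q))


def lt_theta(p, q):
    """Strictly smaller: (5) holds, with strict inequality in at least one position."""
    return leq_theta(p, q) and tuple(p) != tuple(q)


def comparable(p, q):
    return leq_theta(p, q) or leq_theta(q, p)


# Made-up profiles, for illustration only.
print(lt_theta((1, 2, 2), (1, 2, 4)))      # strictly smaller
print(comparable((1, 2, 2), (2, 1, 2)))    # neither dominates the other
```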

Definition 3.1: A trellis $T$ is minimal under $\preceq_{\Theta}$, or simply minimal, if a trellis $T'$ such that $T' \prec_{\Theta} T$ does not exist.

It is a remarkable fact [14], [22] that in the case of conventional trellises for linear (or rectangular) codes, the partial ordering $\preceq_{\Theta}$ contains, up to graph isomorphism, a unique minimal element. The unique minimal trellis attains the smallest possible vertex count simultaneously at all times, and minimizes all conceivable measures of trellis complexity [31].

Unfortunately, this is not true for tail-biting trellises. Given a nontrivial linear code $\mathbb{C}$ of length $n$, there are always several nonisomorphic tail-biting trellises for $\mathbb{C}$ that are minimal under $\preceq_{\Theta}$, but incomparable to each other. As a simple example, consider the $n$ minimal conventional trellises for the $n$ cyclic shifts of $\mathbb{C}$. After an obvious permutation, we obtain $n$ tail-biting trellises $T_0, T_1, \ldots, T_{n-1}$ for $\mathbb{C}$, such that each $T_i$ contains a single vertex at time $i$. This construction is illustrated in Fig. 2 for a $(5, 3, 2)$ binary linear code. The resulting tail-biting trellises are obviously minimal under $\preceq_{\Theta}$, but incomparable to each other.

Fig. 3. Two tail-biting trellises, (a) and (b), for a (7, 2, 3) binary linear code.

A. Total Minimality Orders

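A total order on a set is an order under which any two elements of the set are comparable. Since for every linear code there are many incomparable $\preceq_{\Theta}$-minimal trellises, a total order on the set of tail-biting trellises is desirable in order to strengthen the notion of minimality introduced in Definition 3.1. In this paper, we focus on the following total orders:

$$T \preceq_{\Pi} T' \iff \prod_{i=0}^{n-1} |V_i| \le \prod_{i=0}^{n-1} |V'_i| \tag{6}$$

$$T \preceq_{\max} T' \iff \max_{i} |V_i| \le \max_{i} |V'_i| \tag{7}$$

We write $T =_{\Pi} T'$ and $T =_{\max} T'$ if the inequalities on the right-hand side of (6) and (7) hold with equality. Similarly, we write $T \prec_{\Pi} T'$ and $T \prec_{\max} T'$ if these inequalities are strict. We say that a trellis $T$ is $\preceq_{\Pi}$-minimal if there is no trellis $T'$ such that $T' \prec_{\Pi} T$. Minimality with respect to the order $\preceq_{\max}$ is defined as follows: $T$ is $\preceq_{\max}$-minimal if there is no trellis $T'$ such that either $T' \prec_{\max} T$, or $T' =_{\max} T$ and $T' \prec_{\Theta} T$.

An order $\preceq$ preserves the order described by $\preceq_{\Theta}$ if $T \preceq_{\Theta} T'$ implies that $T \preceq T'$. It is obvious that the two total orders introduced in (6) and (7) preserve $\preceq_{\Theta}$. This immediately implies the following.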

Proposition 3.1: The set of $\preceq_{\Pi}$-minimal trellises and the set of $\preceq_{\max}$-minimal trellises for a given code are subsets of the set of $\preceq_{\Theta}$-minimal trellises for the same code.
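The relationship between the three notions of minimality can be explored numerically on a finite list of candidate vertex-class profiles. The sketch below is illustrative only: the profiles are made up rather than taken from the paper, the function names are hypothetical, and the tie-breaking rule in `max_minimal` is one reading of the definition of minimality under the max order given above.

```python
from math import prod


def lt_theta(p, q):
    """True if profile p is strictly smaller than q under componentwise comparison."""
    return all(a <= b for a, b in zip(p, q)) and tuple(p) != tuple(q)


def theta_minimal(profiles):
    return [p for p in profiles if not any(lt_theta(q, p) for q in profiles)]


def product_minimal(profiles):
    best = min(prod(p) for p in profiles)
    return [p for p in profiles if prod(p) == best]


def max_minimal(profiles):
    # Assumed rule: q beats p if it has a smaller maximum, or the same
    # maximum and a strictly smaller profile componentwise.
    def beats(q, p):
        return max(q) < max(p) or (max(q) == max(p) and lt_theta(q, p))
    return [p for p in profiles if not any(beats(q, p) for q in profiles)]


# Made-up candidate profiles of depth 4.
candidates = [(1, 2, 4, 2), (2, 2, 2, 2), (2, 4, 4, 2), (4, 2, 2, 2)]

print(theta_minimal(candidates))
print(product_minimal(candidates))
print(max_minimal(candidates))
```

On this made-up list, both product-minimal profiles and the single max-minimal profile are also minimal under componentwise comparison, while one product-minimal profile is not max-minimal.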

Even though a tail-biting trellis that is minimal under the product order $\preceq_{\Pi}$ is typically also minimal under $\preceq_{\max}$, this is not always the case, as is shown in the following example.

Example 3.1:Consider the binary linear code generatedby and . Two tail-biting trellises and

for are shown in Fig. 3(a) and (b), respectively.

We have andSince a conventional trellis for any cyclic shift

of contains at least four vertices at some position,is-minimal. It can be shown that is -minimal. Yet

and .

Remark: In the next section, we introduce linear tail-biting trellises. If $\mathbb{C}$ is a linear code over $\mathbb{F}_q$ and $T$ is a linear trellis for $\mathbb{C}$, then the vertex-class profile $(|V_0|, |V_1|, \ldots, |V_{n-1}|)$ consists of powers of $q$. Thus, in this context, it is convenient to define $s_i = \log_q |V_i|$, and use the logarithmic profile $(s_0, s_1, \ldots, s_{n-1})$, which is known [16], [17], [31] as the state-complexity profile of $T$. The corresponding partial order is obviously equal to the partial order $\preceq_{\Theta}$ defined by (5), so that our definition of minimality remains unchanged. Notice that a $\preceq_{\max}$-minimal trellis, as defined by (7), minimizes the maximum state complexity $\max_i s_i$, whereas a $\preceq_{\Pi}$-minimal trellis, as defined by (6), minimizes the average state complexity $(s_0 + s_1 + \cdots + s_{n-1})/n$.
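As a small illustration of the logarithmic profile, the sketch below converts a vertex-class profile into base-$q$ logarithms using exact integer arithmetic. The function name and the sample profile are hypothetical.

```python
def state_complexity_profile(vertex_profile, q):
    """Base-q logarithm of each vertex-class size; sizes must be powers of q."""
    profile = []
    for size in vertex_profile:
        s, power = 0, 1
        while power < size:
            power *= q
            s += 1
        if power != size:
            raise ValueError(f"{size} is not a power of {q}")
        profile.append(s)
    return profile


# Made-up binary example.
s = state_complexity_profile([1, 2, 4, 2], q=2)
print(s)                 # logarithmic profile
print(max(s))            # maximum state complexity
print(sum(s) / len(s))   # average state complexity
```

Minimizing the product of the vertex-class sizes is the same as minimizing the sum of the logarithms, which is why the product order corresponds to the average state complexity.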

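In addition to $\preceq_{\Pi}$ and $\preceq_{\max}$, several other total orders may be of interest. In the following, we let $E_i$ denote the set of edges that begin in a vertex of $V_i$, and define

$$T \preceq_{\Sigma} T' \quad \text{if} \quad \sum_{i=0}^{n-1} |V_i| \le \sum_{i=0}^{n-1} |V'_i| \tag{8}$$

$$T \preceq_{\Pi}^{E} T' \quad \text{if} \quad \prod_{i=0}^{n-1} |E_i| \le \prod_{i=0}^{n-1} |E'_i| \tag{9}$$

$$T \preceq_{\max}^{E} T' \quad \text{if} \quad \max_{i} |E_i| \le \max_{i} |E'_i| \tag{10}$$

$$T \preceq_{\Sigma}^{E} T' \quad \text{if} \quad \sum_{i=0}^{n-1} |E_i| \le \sum_{i=0}^{n-1} |E'_i| \tag{11}$$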

We shall see that most interesting orders preserve the order $\preceq_{\Theta}$, although some do not. It is obvious that the vertex-sum order in (8) preserves $\preceq_{\Theta}$. Though less obvious, the edge-product order in (9) also preserves $\preceq_{\Theta}$ for linear trellises. We will show in Section IV that for linear trellises

(12)

where if is one-to-one, and otherwise. Hence, for the class of linear trellises, the edge-product order is equal to the vertex-product order.


Fig. 4. Two tail-biting trellises, (a) and (b), for a (7, 4, 2) binary linear code.

Clearly, a tail-biting trellis that is minimal under a total order that preserves $\preceq_{\Theta}$ is necessarily minimal under $\preceq_{\Theta}$. Thus, Proposition 3.1 holds for the vertex-sum and the edge-product orders as well. On the other hand, the edge-max order and the edge-sum order do not preserve $\preceq_{\Theta}$, as can be seen from the following example.

Example 3.2:For a trellis of depth , we let

denote theedge-class profileof . Consider the tail-biting trel-lises and shown in Fig. 4(a) and (b), respectively. It canbe seen by direct verification that these trellises represent thesame binary linear code , generated by ,

, , and . We have

and

so that . On the other hand, the edge-class profiles aregiven by

and

Thus, and .

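To effectively deal with the $\preceq_{\max}^{E}$ and $\preceq_{\Sigma}^{E}$ orders in (10) and (11), we define a partial order $\preceq_{\Theta}^{E}$ based on the edge-class profile of a trellis. That is, we write $T \preceq_{\Theta}^{E} T'$ if $|E_i| \le |E'_i|$ for all $i = 0, 1, \ldots, n-1$. As in (5), we write $T \prec_{\Theta}^{E} T'$ if $T \preceq_{\Theta}^{E} T'$ and $|E_i| < |E'_i|$ for some $i$. It is obvious that all of the edge orders defined in (9)–(11) preserve the order $\preceq_{\Theta}^{E}$, and it follows from (12) that the vertex-product order also preserves $\preceq_{\Theta}^{E}$ for linear trellises.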

We conclude this subsection with the following observation. Let $\sigma$ denote the cyclic shift, that is, $\sigma(c_1, c_2, \ldots, c_n) = (c_2, \ldots, c_n, c_1)$, for $(c_1, c_2, \ldots, c_n) \in A^n$. Let $\sigma(\mathbb{C})$ be the code obtained from $\mathbb{C}$ by a cyclic shift (to the left) by one position. Given a tail-biting trellis $T$, let $\sigma(T)$ denote the trellis obtained from $T$ by the mapping $V_i \mapsto V_{i-1}$ and $E_i \mapsto E_{i-1}$ for all $i$, or, equivalently, by postulating that the cycles in $T$ start in $V_1$ rather than in $V_0$. The following proposition follows immediately from the fact that although $T$ and $\sigma(T)$ represent different codes, they are, essentially, one and the same trellis.

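Proposition 3.2: Let $\preceq$ be any of the minimality orders on the set of tail-biting trellises for a given code, introduced in (5)–(11). Then, $T \preceq T'$ if and only if $\sigma(T) \preceq \sigma(T')$. Thus, $T$ is a $\preceq$-minimal trellis for $\mathbb{C}$ if and only if $\sigma(T)$ is a $\preceq$-minimal trellis for $\sigma(\mathbb{C})$.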
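The effect of a cyclic shift on a code and on a vertex-class profile can be sketched as follows. The names and sample data are hypothetical, and the sketch assumes that shifting a trellis simply rotates its vertex-class profile by one position, in line with the remark that the two trellises are essentially the same.

```python
from math import prod


def shift_left(seq):
    """Cyclic shift to the left by one position."""
    seq = tuple(seq)
    return seq[1:] + seq[:1]


def shift_code(code):
    return {shift_left(c) for c in code}


# Made-up code and profiles, for illustration only.
code = {(0, 0, 0), (1, 1, 0)}
print(shift_code(code))

p, q = (1, 2, 4, 2), (2, 2, 4, 2)
# Componentwise comparison, product and maximum are unaffected by rotating both profiles.
print(all(a <= b for a, b in zip(p, q)),
      all(a <= b for a, b in zip(shift_left(p), shift_left(q))))
print(prod(p) == prod(shift_left(p)), max(p) == max(shift_left(p)))
```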

B. Properties of Minimal Trellises

We first recall some definitions (cf. [14], [31], [33]). Given two vertices $v$ and $v'$ that belong to the same vertex class in a trellis $T$, one can merge these vertices by replacing them with a single vertex that inherits all the edges incident upon $v$ and $v'$. We say that the vertices $v$ and $v'$ are mergeable if merging them does not alter the edge-label code of $T$. A trellis $T$ is said to be nonmergeable if there are no mergeable vertices in $T$. It is known [31], [33] that a conventional trellis for a linear (or, more generally, rectangular) code is minimal if and only if it is nonmergeable. Obviously, any minimal trellis, either conventional or tail-biting, must be nonmergeable. The question is whether a nonmergeable tail-biting trellis must also be minimal. The following example settles this question in the negative.
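The definition of mergeable vertices can be tested mechanically on small trellises. The sketch below is a brute-force illustration under assumed conventions: `edge_classes[i]` lists the edges `(u, a, v)` beginning in $V_i$, cycles start in $V_0$, and the toy trellis is hypothetical rather than one of the trellises in the figures.

```python
def code_of(edge_classes):
    """Edge-label sequences of all cycles of length n starting in V_0."""
    n = len(edge_classes)
    words = set()

    def walk(first, vertex, t, labels):
        if t == n:
            if vertex == first:
                words.add(labels)
            return
        for (u, a, v) in edge_classes[t]:
            if u == vertex:
                walk(first, v, t + 1, labels + (a,))

    for v0 in {u for (u, _, _) in edge_classes[0]}:
        walk(v0, v0, 0, ())
    return words


def merge(edge_classes, t, u, v):
    """Merge vertex v into vertex u within class V_t; u inherits all edges of v."""
    n = len(edge_classes)
    merged = [set(c) for c in edge_classes]
    merged[t] = {(u if a == v else a, lab, b) for (a, lab, b) in merged[t]}
    prev = (t - 1) % n
    merged[prev] = {(a, lab, u if b == v else b) for (a, lab, b) in merged[prev]}
    return merged


def mergeable(edge_classes, t, u, v):
    return code_of(merge(edge_classes, t, u, v)) == code_of(edge_classes)


# Hypothetical depth-3 trellis with V_0 = {0}, V_1 = {0, 1}, V_2 = {0, 1}.
toy = [
    [(0, 0, 0), (0, 1, 1)],
    [(0, 0, 0), (1, 1, 1)],
    [(0, 0, 0), (1, 0, 0)],
]
print(mergeable(toy, 2, 0, 1))  # merging in V_2 keeps the code {000, 110}
print(mergeable(toy, 1, 0, 1))  # merging in V_1 creates new codewords
```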

Example 3.3: Let $\mathbb{C}$ be the single-parity-check code of Fig. 5, consisting of all the even-weight binary vectors of the corresponding length. It can be seen by direct verification that the three tail-biting trellises depicted in Fig. 5(a)–(c) all represent $\mathbb{C}$.

The trellis in Fig. 5(a) is mergeable; merging vertices, as indicated by the dotted lines, produces the trellis in Fig. 5(b). Observe that the trellis in Fig. 5(b) cannot be merged further: merging any two vertices in this trellis would produce codewords of odd weight. Yet, it is not $\preceq_{\Theta}$-minimal, since the trellis in Fig. 5(c) is strictly smaller.

Remark: The fact that a nonmergeable conventional trellis is necessarily minimal makes it possible to construct the unique minimal trellis for a linear code by means of iterative merging of vertices in any nonminimal trellis for the same code. As we have seen, this does not work for tail-biting trellises. Evidently, more "art" is involved in constructing minimal tail-biting trellises than minimal conventional trellises.

Fig. 5. (a) Biproper, (b) nonmergeable, and (c) minimal tail-biting trellises.

A trellis $T$, either conventional or tail-biting, is said to be biproper if the edges beginning at any vertex of $T$ are labeled distinctly and the edges ending at any vertex of $T$ are also labeled distinctly (for conventional trellises, we also need to require that $|V_0| = |V_n| = 1$). In Theorem 4.4 of Section IV, we show that if a linear tail-biting trellis is nonmergeable then it must be biproper (a definition of linearity for trellises is given shortly). This immediately establishes the following structural property of $\preceq_{\Theta}$-minimal linear tail-biting trellises.
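The biproper condition is a purely local check on the edges, as the following sketch shows. It assumes the hypothetical layout in which `edge_classes[i]` lists the edges `(u, a, v)` beginning in $V_i$; the two toy trellises are made up for illustration, and the additional requirement for conventional trellises is not checked.

```python
def is_biproper(edge_classes):
    """True if outgoing edges at each vertex, and incoming edges at each vertex,
    carry pairwise distinct labels."""
    for edges in edge_classes:
        seen_out, seen_in = set(), set()
        for (u, a, v) in set(edges):
            if (u, a) in seen_out or (a, v) in seen_in:
                return False
            seen_out.add((u, a))
            seen_in.add((a, v))
    return True


# Hypothetical depth-3 tail-biting trellises for the code {000, 110}.
with_redundant_vertex = [
    [(0, 0, 0), (0, 1, 1)],
    [(0, 0, 0), (1, 1, 1)],
    [(0, 0, 0), (1, 0, 0)],   # two edges labeled 0 end at the same vertex
]
after_merging = [
    [(0, 0, 0), (0, 1, 1)],
    [(0, 0, 0), (1, 1, 0)],
    [(0, 0, 0)],
]
print(is_biproper(with_redundant_vertex))
print(is_biproper(after_merging))
```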

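Theorem 3.3: Every linear $\preceq_{\Theta}$-minimal tail-biting trellis is biproper.

Proof: Every $\preceq_{\Theta}$-minimal trellis is nonmergeable, and by Theorem 4.4 every nonmergeable linear tail-biting trellis is biproper.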

It is known [31], [33] that a conventional trellis is biproper if and only if it is nonmergeable. However, a biproper tail-biting trellis need not be nonmergeable, even if it is linear. For example, the trellis in Fig. 5(a) is linear and biproper, yet it is mergeable. Thus, for linear tail-biting trellises, we have established the following proper inclusion chain:

$$\{\preceq_{\Theta}\text{-minimal trellises}\} \subset \{\text{nonmergeable trellises}\} \subset \{\text{biproper trellises}\} \tag{13}$$

This is strikingly different from the situation with conventional trellises for linear codes. For conventional trellises, the three classes in (13) coincide and consist of a single (up to isomorphism) element: the unique minimal trellis.

IV. LINEAR TRELLISES AND THE PRODUCT CONSTRUCTION

We henceforth impose an algebraic structure on the trellis alphabet: we assume, unless stated otherwise, that $A = \mathbb{F}_q$ is the finite field with $q$ elements. Thus, all the edge and vertex labels in a labeled trellis are symbols or sequences over $\mathbb{F}_q$. It follows that the edge-label code $\mathbb{C}(T)$ and the label code $\mathbb{S}(T)$ are codes over $\mathbb{F}_q$.

Definition 4.1: A labeled trellis $T = (V, E, \mathbb{F}_q)$ is said to be linear over $\mathbb{F}_q$, or simply linear, if $T$ is reduced and $\mathbb{S}(T)$ is a linear code over $\mathbb{F}_q$. An (unlabeled) trellis $T$ is said to be linear if there exists a vertex labeling of $T$ such that the resulting labeled trellis is linear.

The foregoing definition applies to both conventional and tail-biting trellises. Note that a linear trellis always represents a linear code, although linear codes may also be represented by nonlinear trellises (see Example 4.2 later). The theory of linear trellises is developed in detail in [11]. Some of our results from [11] are briefly reviewed in this section.

Remark: Another important class of trellises corresponds to the case where $\mathbb{S}(T)$ is a group code. A group code of length $n$ over a group $G$ is defined as a subgroup of the direct product group $G^n$. As in Definition 4.1, we say that a trellis $T$ is a group trellis if $T$ is reduced and there exists a labeling of the vertices of $T$ such that the corresponding label code $\mathbb{S}(T)$ is a group code. It is obvious that every linear trellis is also a group trellis over the additive group of $\mathbb{F}_q$, but not vice versa. Group trellises naturally arise in two different ways. It is possible that the trellis alphabet is itself an additive group which does not admit an invertible multiplicative structure. For example, group trellises for codes over $\mathbb{Z}_4$ are of this type; linear trellises for $\mathbb{Z}_4$-linear codes do not exist. It is also possible that vertices of a trellis $T$ for a linear code over $\mathbb{F}_q$ cannot be labeled in such a way that $\mathbb{S}(T)$ is a linear code over $\mathbb{F}_q$, but can be labeled so that $\mathbb{S}(T)$ is a group code over the additive group of $\mathbb{F}_q$. Group trellises of this kind were considered in [3], where it is shown that such trellises can have fewer vertices than any linear trellis for the same code.

Here, we consider group trellises only briefly. For codes over an alphabet of prime size $p$, the notions of group and linear trellises coincide (since $\mathbb{Z}_p$, the only group of order $p$, is isomorphic to the additive group of $\mathbb{F}_p$). For other alphabets, the theory of group trellises is similar to the theory of linear trellises, at least if the underlying group is abelian. In particular, all the results of this paper hold essentially without change for group trellises over an abelian group.

A. Basic Properties of Linear Trellises

Certain properties of linear trellises are apparent directly from the definition. It is obvious that a reduced labeled trellis $T$ is linear if and only if a linear combination of any two label sequences in $T$ is another label sequence in $T$. Further, if a labeled trellis $T$ is linear, then the vertex labels within each vertex class constitute a linear space; this space is just a projection of $\mathbb{S}(T)$ on the corresponding positions. Since the labels within a vertex class are distinct by definition, one can identify the vertices of a labeled trellis with their labels. We henceforth adopt this convention and say that, in a linear trellis, the vertex classes $V_0, V_1, \ldots, V_{n-1}$ are linear spaces. With this convention, the edge classes $E_0, E_1, \ldots, E_{n-1}$ also become linear spaces (note that the assumption that $T$ is reduced is essential here; otherwise, the edge classes of $T$ are not necessarily linear). The fact that $V_0, V_1, \ldots, V_{n-1}$ and $E_0, E_1, \ldots, E_{n-1}$ are linear spaces over $\mathbb{F}_q$ implies, in particular, that the vertex-class profile and the edge-class profile of a linear trellis consist of powers of $q$.

Fig. 6. Linear and nonlinear trellises over the trivial alphabet $A = \{0\}$.

Given an unlabeled trellis over , how can one decide whether is linear? An obvious necessary condition is that the size of each vertex class in is a power of , namely, for all and for some integers . Given that this condition is satisfied, one could simply try all the possible labelings of the vertices of and see whether one of them produces a linear trellis. Since the vertices of can be labeled in different ways, the complexity of this approach is about . In [11], we derive a more efficient algorithm for this purpose: the algorithm decides whether a given trellis is linear (and finds a linear labeling) in time that is only polynomial in the size of the trellis.

Example 4.1: Consider the tail-biting trellises , , and depicted in Fig. 6. All three trellises are reduced. The edge-label alphabet for all three trellises is ; since all the edge labels are zero, they are suppressed in Fig. 6.

In Fig. 6(a), (c), and (e), we apply Algorithm B of [11, p. 337] to determine whether the trellises , , and , respectively, are linear over . The algorithm successfully finds a linear labeling for and , but halts in Fig. 6(a) indicating that is not linear. We also determine linearity of the same trellises over . The overall results are that is nonlinear, is linear over both and , while is linear over but not linear over .

An interesting conclusion from this example is that trellis linearity is inherently a graph-theoretic property: it may be defined and thought of as a characteristic of unlabeled graphs. We are not aware of any prior work on linearity, or similar concepts, in graph theory.

Recall that a subcode is a representation code for if the cycles corresponding to the label sequences in cover all the edges of . If is linear, then it can be specified by a generator matrix. We say that a matrix over is a representation matrix for if it generates a code that represents . The following proposition is proved in [11].

Proposition 4.1: A labeled trellis is linear if and only if there exists a linear representation code for or, equivalently, if can be described by a representation matrix.

The general idea of the proof is to show that if there exists a linear representation code for , then the edge classes are linear spaces over , from which the linearity of easily follows. An alternative proof is obtained by showing that every representation matrix for can be extended to a generator matrix for .

B. Linearity and Minimality

Linear trellises have much useful structure, and it would be nice if minimal tail-biting trellises for linear codes were linear. This is certainly true for conventional trellises [11], [31]. For tail-biting trellises, it was shown in [3] that a minimal trellis for a linear code over may be a group trellis over an additive group of . Such trellises are still essentially linear: they factor into a product of elementary trellises, as shown in the next section. No other types of deviation from linearity have been observed in the literature so far. It is, therefore, surprising that there exist minimal tail-biting trellises for simple binary linear codes that have no linearity properties whatsoever.

Example 4.2: Let be the binary linear code generated by and . A nonlinear tail-biting trellis for is depicted below

It is not difficult to show that this trellis is -minimal. Note that there are exactly three -minimal linear trellises for ; however, they all have four vertices at some position (these are just cyclic shifts of the minimal conventional trellis for ). Thus, up to isomorphism, the trellis above is, in fact, the unique -minimal trellis for .

This example dashes any hope that minimal tail-biting trellises for linear codes would be linear or “close” to linear. Nevertheless, we focus on linear trellises in this paper. Our reasons for doing so are best described by drawing an analogy with codes. Although the best codes are not necessarily linear, the major part of research in coding theory is concerned with linear codes. Indeed, linear codes have a certain amount of structure which makes it possible to study them, and it appears that many good codes are linear. Similarly, linear tail-biting trellises have useful structure which enables us to study them. Furthermore, it appears that most (though not all) good tail-biting trellises for linear codes are found in the class of linear trellises. The latter statement can be made precise: it can be shown that a tail-biting trellis for a linear code that attains the cutset bound [3], [35] at enough positions has to satisfy certain linearity constraints. For more on this, see [12].

C. The Product Construction

The notion of a trellis product and the corresponding product construction were introduced by Kschischang and Sorokine [16] for conventional trellises. See also [11], [14], [31]. The product construction was extended to tail-biting trellises in [3], where it was used to construct minimal tail-biting trellises for several well-known codes.

Definition 4.2: Let and be two trellises of depth , either conventional or tail-biting, over the alphabet , and assume that is endowed with an associative addition operation. Then the product is the trellis whose vertex classes and edge classes are the Cartesian products, defined as follows (primes distinguish the two factors):

$$V_i = V_i' \times V_i'' = \{(v', v'') : v' \in V_i' \text{ and } v'' \in V_i''\} \tag{14}$$

$$E_i = E_i' \times E_i'' = \{(e', e'') : e' \in E_i' \text{ and } e'' \in E_i''\} \tag{15}$$

Specifically, there is an edge from a vertex to a vertex if and only if is an edge in for some , and is an edge in for some . The label of this edge is the sum .
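The product of Definition 4.2 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the alphabet is taken to be binary (so label addition is XOR), a trellis is stored as a list of edge lists `(tail, label, head)` with one list per time index, and the names `trellis_product` and `cycle_labels` are hypothetical.

```python
from itertools import product

def trellis_product(t1, t2):
    """Product of two binary trellises of equal depth (assumed representation).

    Each trellis is a list with one entry per time index; each entry is a
    list of edges (tail, label, head). Product vertices are pairs of
    vertices, and the label of a product edge is the sum (XOR) of labels.
    """
    assert len(t1) == len(t2), "both factors must have the same depth"
    return [[((v1, v2), a1 ^ a2, (w1, w2))
             for (v1, a1, w1), (v2, a2, w2) in product(e1, e2)]
            for e1, e2 in zip(t1, t2)]

def cycle_labels(t):
    """Label sequences of all length-n cycles of a tail-biting trellis."""
    words = set()
    for (v0, a0, w0) in t[0]:
        paths = [(v0, w0, (a0,))]          # (start vertex, current vertex, labels)
        for edges in t[1:]:
            paths = [(s, w, lab + (a,))
                     for (s, cur, lab) in paths
                     for (v, a, w) in edges if v == cur]
        words |= {lab for (s, cur, lab) in paths if cur == s}
    return words

# Hand-built two-cycle trellises for the vectors 110 and 011 (depth 3).
T1 = [[(0, 0, 0), (0, 1, 1)], [(0, 0, 0), (1, 1, 0)], [(0, 0, 0)]]
T2 = [[(0, 0, 0)], [(0, 0, 0), (0, 1, 1)], [(0, 0, 0), (1, 1, 0)]]
T = trellis_product(T1, T2)
```

In this toy case the cycles of `T` spell out the four words of the code spanned by 110 and 011, and each edge class of `T` has size equal to the product of the factor sizes. Edges are kept in lists rather than sets so that the edge classes stay true Cartesian products.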

Kschischang and Sorokine [16] showed (in the context of conventional trellises) that if and , then the product trellis represents the code

$$C' + C'' = \{c' + c'' : c' \in C' \text{ and } c'' \in C''\} \tag{16}$$

It is obvious that the trellis product operation is associative, and hence, an expression of the form is well defined, provided only that the trellises all have the same depth. Now let be a linear code of length and dimension over the field , and let be a basis for . Each generates a one-dimensional subcode of , which we denote by . Thus,

$$C = \langle x_1 \rangle + \langle x_2 \rangle + \cdots + \langle x_k \rangle \tag{17}$$

It follows from (16), (17) that if are trellises for , respectively, then their product represents . This almost completes the description of the product construction. It remains to specify the trellises .

To do so, we need to introduce the notions of span and elementary trellis. Although simple, these concepts have turned out to be fundamental in trellis theory, and special care has to be taken to define them properly. Given a codeword , a span of , denoted , is a semiopen interval such that the corresponding closed interval contains all the nonzero positions of . Although a semiopen interval cannot contain the entire time axis by definition, we override this technicality by explicitly allowing , in addition to the above. The foregoing definition of span holds for both conventional and tail-biting trellises. The only difference is in the interpretation of an interval: cyclic intervals for tail-biting trellises and conventional intervals for conventional trellises.

Given a vector over the field , along with its span , the corresponding labeled elementary trellis can be constructed as follows: has vertices, labeled , at those positions that belong to and a single vertex, labeled , at other positions. There is an edge from a vertex to a vertex if and only if , or , or the two vertices have the same label . The label of this edge is , where is the label of for all and (also) the label of for all . It is easy to see that the elementary trellis is linear, contains exactly cycles, and represents the one-dimensional code over .
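The construction just described can be sketched in Python for the binary case. This is an illustration under stated assumptions, not the paper's own procedure: positions are indexed from 0, the span is passed as its two endpoints `a` and `b` (so the special span covering the whole time axis is not handled), vertices are identified with their labels 0 and 1, and the names `cyclic_range` and `elementary_trellis` are hypothetical.

```python
import math

def cyclic_range(a, b, n):
    """Positions a, a+1, ..., b (mod n) of the closed cyclic interval [a, b]."""
    out = [a % n]
    while out[-1] != b % n:
        out.append((out[-1] + 1) % n)
    return out

def elementary_trellis(x, a, b):
    """Binary elementary trellis for vector x with semiopen span (a, b].

    Returns (vertices, edges): vertices[i] is the set of vertex labels at
    position i, edges[i] the set of edges (tail, label, head) leaving
    position i. Two vertices sit at the positions of (a, b], one elsewhere.
    """
    n = len(x)
    closed = cyclic_range(a, b, n)
    half_open = set(closed[1:])            # (a, b] = [a, b] without a
    closed = set(closed)
    assert all(i in closed for i, c in enumerate(x) if c), "not a valid span"
    vertices = [{0, 1} if i in half_open else {0} for i in range(n)]
    edges = []
    for i in range(n):
        e = set()
        # Inside [a, b] both multiples of x are carried; outside only zero.
        for alpha in ((0, 1) if i in closed else (0,)):
            tail = alpha if i in half_open else 0
            head = alpha if (i + 1) % n in half_open else 0
            e.add((tail, alpha * x[i], head))
        edges.append(e)
    return vertices, edges

V, E = elementary_trellis([0, 1, 0, 1, 1], 1, 4)   # conventional span
V2, E2 = elementary_trellis([0, 1, 0, 1, 1], 3, 1) # cyclic span
```

For both spans the product of the edge-class sizes is twice the product of the vertex-class sizes, which is the binary instance of the counting used later in the proof of Theorem 4.6.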


Fig. 7 (panels (a)-(c)). Different elementary trellises for the vector (01011).

Example 4.3: Let . There are eight possible spans for in a tail-biting trellis, namely, , and . Of these, only , , and are possible spans for in a conventional trellis.

Eight different elementary trellises for , corresponding to the eight possible spans, are depicted in Fig. 7(a) along with the corresponding representation matrices. Note that an elementary trellis depends not only on and , but also on the ambient field . For example, two elementary trellises for are depicted in Fig. 7(a) and (b). While the span is in both cases, in one case we assume and in the other case .
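The span counts in Example 4.3 can be checked by brute force for the vector (01011) named in the caption of Fig. 7. The sketch below is illustrative only: it assumes positions are indexed from 0, represents a span by the endpoint pair `(a, b)` of its closed interval, uses the string `"I"` for the explicitly allowed whole-axis span, and the function names are hypothetical.

```python
def closed_cyclic_interval(a, b, n):
    """Positions a, a+1, ..., b taken modulo n."""
    out = [a]
    while out[-1] != b:
        out.append((out[-1] + 1) % n)
    return out

def tail_biting_spans(x):
    """All spans of x when intervals are read cyclically, plus the whole axis."""
    n = len(x)
    support = {i for i, c in enumerate(x) if c}
    spans = [(a, b) for a in range(n) for b in range(n)
             if support <= set(closed_cyclic_interval(a, b, n))]
    return spans + ["I"]

def conventional_spans(x):
    """All spans of x when intervals are ordinary (a <= b), plus the whole axis."""
    n = len(x)
    support = {i for i, c in enumerate(x) if c}
    spans = [(a, b) for a in range(n) for b in range(a, n)
             if support <= set(range(a, b + 1))]
    return spans + ["I"]

x = [0, 1, 0, 1, 1]
tb = tail_biting_spans(x)
conv = conventional_spans(x)
```

Under these conventions the enumeration yields eight tail-biting spans and three conventional ones, in agreement with the counts stated in the example.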

We say that a trellis may be obtained by a product construction if the trellis factors as , where are elementary trellises for some codewords of , and for some choice of spans for .

Remark: For group trellises, the definition of span remains unchanged and the definition of an elementary trellis should be modified in the following way. First, we let denote the cyclic group generated by , and take as the order of this group. The elementary trellis will again have vertices at those positions that belong to and a single vertex at those positions that do not. The single vertex may be labeled , and the vertices may be labeled by the elements of . Two vertices and are connected by an edge if and only if , or , or the two vertices have the same label . In the latter case, the label of is given by . If , then the label of is equal to the th entry in the label of , while if , the label of is equal to the th entry in the label of .

D. The Product Form and the Factorization Theorem

While it is clear that the product construction yields linear tail-biting trellises, it is by no means clear if there exist linear tail-biting trellises that are not obtained by the product construction. In this subsection, we review the factorization theorem of [11], which shows that the product construction is exhaustive; that is, a tail-biting trellis is linear if and only if it can be constructed as a product of elementary trellises.³

³ Note that this is not true for trellis structures with a more elaborate time axis, such as trellis formations [13].

The only assumption we need is that there exists a labeling of the vertices of a tail-biting trellis such that the resulting label code is -linear. This turns out to be enough to prove that , as an unlabeled graph, may be obtained as a product of unlabeled graphs that correspond to elementary trellises. The reader is referred to [11] for more details.

Suppose that is a set that generates a linear code of length over (that is, contains a basis for ). Let be an arbitrary set of linearly independent vectors. A representation matrix for the trellis that is obtained as a product of the elementary trellises is given by

... (18)

where if and otherwise, for all . Such representation matrices will be useful in the remainder of this paper. We formalize the notion as follows.

Definition 4.3: A representation matrix is said to be in product form if each row of is a representation matrix for an elementary trellis, the nonzero vertex labels in each of the rows of are identical, and the set of these vertex labels is linearly independent.

It is easy to see that any representation matrix in product form represents a trellis that is obtained by the product construction. Indeed, the individual factors can be readily identified with the rows of . Thus, a trellis may be obtained by the product construction if and only if there exists a representation matrix for in product form.

Theorem 4.2: Let be a linear tail-biting trellis. Then there exists a representation matrix for in product form. Equivalently, can be factored as , where are elementary trellises.

Theorem 4.2 is of fundamental importance. In particular, this factorization theorem implies that in constructing minimal linear tail-biting trellises for linear codes, we only need to investigate tail-biting trellises obtained by the product construction (cf. Section V). A detailed proof of Theorem 4.2 is given in [11]. The general idea is to decompose into a product of an elementary trellis and another linear trellis. The somewhat technical proof of [11] consists of a long series of lemmas that are not used otherwise in this paper.

E. More Properties of Linear Tail-Biting Trellises

We conclude this section with a proof of several structural properties of linear tail-biting trellises that follow from the factorization theorem (Theorem 4.2). These properties are formulated as Theorem 4.4, Corollary 4.5, and Theorem 4.6 in what follows. First, however, we need a simple lemma, whose proof is left as an exercise for the reader.

Lemma 4.3: Suppose that is a linear trellis, either conventional or tail-biting. If either or is mergeable, then so is .

This lemma, in conjunction with Theorem 4.2, is used in the proof of the following theorem, which, in turn, was used in the proof of Theorem 3.3 of Section III-B.

Theorem 4.4: A linear tail-biting trellis that is nonmergeable is also biproper.

Proof: Assume to the contrary that is not proper. Then there exist two distinct edges and in that begin at the same vertex and have the same label . We assume without loss of generality (w.l.o.g.) that . Since is linear, it must also contain the edge , where . By Theorem 4.2, there exists a representation matrix for in product form, which generates a representation code . Let be a codeword of such that contains the edge . Then is a linear combination of the rows of . Thus, for some nonzero and some rows of . Since is in product form, the vertex labels of different rows in are linearly independent. It follows that , otherwise the linear combination would be nonzero in the first position. Now let be the product of the corresponding elementary trellises. Since , we have , and can be regarded as a conventional trellis. It is obvious that is linear, and, therefore, it contains the edge . But it also contains the edge by construction. Thus, is not proper. It is known [31], [33] that a conventional trellis which is not proper must be mergeable. Hence is mergeable. Since can be factored as , Lemma 4.3 implies that must also be mergeable. A similar argument shows that must be mergeable if it is not co-proper.

An interesting consequence of Theorem 4.4 is as follows. Let be a set of generators for a linear code , let be a choice of spans for these generators, and let be the corresponding tail-biting trellis for .

Corollary 4.5: If is -minimal, then no two spans in start at the same position and no two spans in end at the same position.

Proof: If there are two spans that start at the same position, then is not proper. If there are two spans that end at the same position, then is not co-proper. Such a trellis cannot be -minimal by Theorems 3.3 and 4.4.

Corollary 4.5 extends to the case of tail-biting trellises a property that is well known [16], [31] for conventional trellises. It has been previously proved for tail-biting trellises in [25] and in [2, Theorem 1], using different methods. In contrast, the following property of linear trellises, either conventional or tail-biting, is established here for the first time.

Theorem 4.6: Let be a linear trellis over such that no elementary factor of has span . Then, for some , the sizes of the vertex and edge classes satisfy

(19)

Proof: Given a trellis of depth , let us define

and

It is easy to see from Definition 4.2 that if factors as , then and . Now, if is an elementary trellis over , then (unless , in which case since consists of disjoint cycles). The theorem then follows from the fact that , where are elementary trellises (cf. Theorem 4.2). Moreover, observe that if and only if is one-to-one.

We observe that if a linear trellis contains elementary factors with span , then (19) still holds, but will be strictly less than for a one-to-one trellis.
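The counting step in the proof above can be spelled out as follows. This is a sketch in notation chosen here for illustration, not a restatement of (19): $V(T)$ and $E(T)$ stand for the products of the vertex-class sizes and of the edge-class sizes, $\ell$ is the number of positions in the semiopen span $(a, b]$ of an elementary trellis, and $r$ is the number of elementary factors. All of these symbol choices are assumptions.

```latex
% Elementary trellis over F_q whose span (a,b] is not the whole time axis
% and covers \ell positions: q vertices at each of the \ell positions of
% (a,b] and one vertex elsewhere; q edges at each of the \ell + 1
% positions of [a,b] and one edge elsewhere.
\[
  V(T) = \prod_{i} |V_i| = q^{\ell}, \qquad
  E(T) = \prod_{i} |E_i| = q^{\ell + 1}, \qquad
  \text{so that } E(T) = q \, V(T).
\]
% Both quantities are multiplicative under the trellis product
% (Definition 4.2), hence for T = T_1 \times \cdots \times T_r with no
% factor spanning the whole time axis:
\[
  E(T) = \prod_{j=1}^{r} E(T_j)
       = \prod_{j=1}^{r} q \, V(T_j)
       = q^{r} \, V(T).
\]
```

A factor whose span is the whole time axis consists of disjoint cycles and contributes $E(T_j) = V(T_j)$ instead, which is why such factors lower the exponent.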

V. CONSTRUCTION OF MINIMAL TAIL-BITING TRELLISES

Let be a linear code of length and dimension over . In this section, we show how the product construction of Section IV-C can be used to construct linear -minimal tail-biting trellises for . First, we need to briefly review the relationship between the product construction and the unique conventional minimal trellis for .

Given a nonzero vector , we let denote the smallest integer such that , and say that starts at position . Similarly, we let denote the largest integer such that , and say that ends at . Clearly, is a valid choice of span for , and we say that is the conventional span of .

Definition 5.1: A basis for is said to be in minimal-span form if no two vectors in start at the same position and no two vectors in end at the same position, that is, if are distinct and are distinct.
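Definition 5.1 translates directly into a short check. The sketch below is illustrative: it assumes vectors are given as lists of field elements with positions indexed from 0, and the helper names `start`, `end`, and `is_minimal_span_form` are hypothetical.

```python
def start(v):
    """Smallest position at which the nonzero vector v is nonzero."""
    return min(i for i, c in enumerate(v) if c)

def end(v):
    """Largest position at which the nonzero vector v is nonzero."""
    return max(i for i, c in enumerate(v) if c)

def is_minimal_span_form(basis):
    """True if all start positions are distinct and all end positions are distinct."""
    starts = {start(v) for v in basis}
    ends = {end(v) for v in basis}
    return len(starts) == len(basis) and len(ends) == len(basis)

good = [[1, 1, 0], [0, 1, 1]]   # starts 0, 1 and ends 1, 2
bad = [[1, 0, 1], [1, 1, 0]]    # both vectors start at position 0
```

The check only inspects start and end positions; it does not by itself verify that the vectors span the intended code.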

It is known [16], [21], [31] that a trellis is the minimal conventional trellis for if and only if is a basis for in minimal-span form and for all . In general, a basis in minimal-span form is not necessarily unique. However, it is known [16], [31] that any two bases in minimal-span form give rise to the same set of conventional spans. These spans, called the atomic spans in [16], are thus uniquely determined by . Furthermore, any set of codewords of with atomic conventional spans is a basis for in minimal-span form.

Lemma 5.1: Let and be two bases for in minimal-span form. Let

and

where the spans determining the elementary trellises are the atomic conventional spans. Then .

Proof: The trellises and are minimal conventional trellises for . The lemma thus follows from the fact that the minimal conventional trellis for is unique.

Lemma 5.1 shows that it does not matter which basis in minimal-span form is chosen for . However, the fact that such a basis is not unique presents a notational difficulty. To overcome this difficulty, we impose a lexicographic order on the set of vectors in , which extends to a lexicographic order on the set of bases for in the obvious way. We can then speak of the lexicographically first basis for in minimal-span form.

A. Minimal Tail-Biting Trellises and the Characteristic Matrix

The factorization theorem (Theorem 4.2) reduces the search for a -minimal linear tail-biting trellis for a linear code to the task of specifying a set of generators for along with their spans. This task is still daunting: for a code of length and dimension over , there are ways to select a set of generators (basis), and at least ways to specify the spans of these generators for each basis. In this subsection, we prove that every -minimal linear tail-biting trellis for arises from a small set of only “characteristic” generators, with the span of each characteristic generator being completely determined. The proof is presented in a series of lemmas, leading to subsequent Theorems 5.5 and 5.6.

Lemma 5.2: Let be a -minimal linear tail-biting trellis for a code of dimension . Then and the codewords form a basis for .

Proof: If , then are linearly dependent. Suppose w.l.o.g. that can be expressed as a linear combination of . Then is a linear trellis such that and . Hence . To complete the proof, observe that if and only if , which happens if and only if and form a basis for .

Lemma 5.3: Let be a -minimal linear tail-biting trellis for a code of dimension . Let be a basis for in minimal-span form. Then for every , either or for some .

Proof: Let , and suppose that . Then is an ordinary (as opposed to cyclic) interval. Hence, and by the definition of span. It is obvious that if is -minimal then and , otherwise, we can replace by a strictly smaller elementary trellis for the same code . It remains to prove that for some . Since is a basis for , the codeword can be expressed as a linear combination of the elements of , that is,

(20)

for some nonzero . Since no two codewords in start at the same position and no two codewords end at the same position, it follows that

(21)

(22)

W.l.o.g. suppose that . If , we are done. Otherwise, , in view of (22). Now, since is a basis for by Lemma 5.2, it follows that can be expressed as the linear combination for some . If , then is also a basis for , and, therefore, the trellis represents . But since and , it follows that and . Therefore, , and is not -minimal. Hence, . Now suppose that

and (23)

where the first nonzero entry is in position . Consider the codeword given by . Clearly, is a basis for , and, therefore, the trellis represents . By (23), we have . Since , we also have . It follows that and . Therefore, and is not -minimal. Thus, the assumption that leads to a contradiction, which proves that . Applying exactly the same argument to completes the proof of the lemma.

We point out that if has a unique basis in minimal-span form, then the fact that for some implies that . Otherwise, the set of codewords such that for some forms what is known [16] as an atomic equivalence class, and choosing any codeword from this equivalence class in the product construction produces the same trellis for (cf. Lemma 5.1).

Next, we generalize the result of Lemma 5.3. To this end, let denote a cyclic shift to the left times, and consider the corresponding cyclic shift of , namely, .

Lemma 5.4: Let be a -minimal linear tail-biting trellis for a code of length and dimension . Let be any integer in the set . Let be a basis in minimal-span form for the code . Then, for each , either or for some .

Proof: This follows immediately by combining Lemma 5.3 and Proposition 3.2. Notice that, in general, the interval is a cyclic interval.

Lemma 5.4 makes it possible to characterize all the -minimal linear tail-biting trellises for in terms of a small set of characteristic generators. Specifically, for each , let denote the lexicographically first basis for in minimal-span form. Let denote a cyclic shift to the right times, with being the identity map. For each , define . Then is a subset of for all . For each , we choose the span of as the (cyclic) interval , where .

Definition 5.2: Let be a linear code of length . A characteristic generator for is a pair consisting of a codeword and a (cyclic) interval such that are nonzero. The set of all the characteristic generators for is given by

(24)

with the understanding that for each , where . The characteristic matrix for is the matrix having the elements of as its rows.

Notice that one can have two distinct characteristic generators that correspond to the same , but different choices of span for (cf. Example 5.1). The set in (24) is still a set, not a multiset, since the elements of should be regarded as pairs . Also, the definition of a characteristic generator as a pair completely specifies the elementary trellis for each . Nevertheless, to simplify notation, we will often refer to characteristic generators as simply vectors, and write , provided no confusion arises.

Theorem 5.5: Every linear tail-biting trellis for a linear code that is minimal under can be constructed as where are some linearly independent characteristic generators for .

Proof: By Lemma 5.2, the trellis can be constructed as , where is a basis for . It is obvious that for all in this construction, since, otherwise, we can replace by a strictly smaller trellis for using the span . Hence, there exists at least one such that . This implies that for some . Thus, and belong to the same atomic equivalence class of . In view of Lemma 5.1, we can replace by the characteristic generator in the construction of , and the theorem follows.

A similar argument makes it possible to characterize minimal linear tail-biting trellises for a linear code with respect to all the minimality orders introduced in Section III-A.

Theorem 5.6: Every linear tail-biting trellis for a linear code that is minimal under any of the minimality orders defined in (6)–(11) can be constructed as , where are some linearly independent characteristic generators for .

Proof: The vertex-product order , the vertex-max order , and the vertex-sum order preserve the order described by . By Proposition 3.1, the set of minimal trellises with respect to these three orders is a subset of the set of -minimal trellises, and Theorem 5.6 follows directly from Theorem 5.5. The other three orders , , and , introduced in (9)–(11), preserve the order defined by the edge-class profile of the trellis. Observe that the proof of Lemma 5.3 (and, hence, also of Lemma 5.4 and Theorem 5.5) holds without change if we replace “ -minimal” with “ -minimal” throughout. Thus, Theorem 5.5 holds for -minimal trellises as well, and the claim with respect to the , , orders follows.

Theorems 5.5 and 5.6 constitute our main result in this section. It follows from these theorems that all the minimal linear trellises for a linear code arise through the product construction from the characteristic matrix for . In the next subsection, we show how to compute this matrix in time from any basis for . In a later subsection, we derive some of the many interesting properties of the characteristic matrix.

B. Computation of the Characteristic Matrix

We first determine the size of the set of characteristic generators. It follows directly from (24) that

However, the size of turns out to be much smaller than . In fact, we shall see in Theorem 5.9 that . This theorem is a direct consequence of the next two lemmas.

Lemma 5.7: Let be the set of characteristic generators for a linear code . Then the spans of any two elements of start at different positions.

Proof: Assume, to the contrary, that there exist two distinct characteristic generators with and with . Notice that Definition 5.2 implies that , , , and are all nonzero. It also follows from Definitions 5.1 and 5.2 that none of the sets can contain both and . Thus, assume that and , where . We distinguish between two cases.

Case 1: . Here . We know that the image of under the cyclic shift is the conventional interval for some vector in , a basis for in minimal-span form. Now consider . Since all the nonzero positions of are in , it follows that all the nonzero positions of are in . Furthermore, and are nonzero. Hence, the vector starts at position and ends at position . It follows that is in the same atomic equivalence class of as the vector . Since is the lexicographically first basis for in minimal-span form, we have . By a similar argument, belongs to , the lexicographically first basis for in minimal-span form, and . The inequalities and contradict each other. This is so because and are both conventional intervals, so that in both cases we are lexicographically comparing the strings and (where the subscripts are modulo ).

Case 2: . Here , so suppose w.l.o.g. that the (cyclic) interval is a proper subset of the (cyclic) interval . As before, let be an element of the basis for in minimal-span form, and consider . By the same argument as in Case 1, we find that while , since . Furthermore, is linearly independent from , since any linear combination of vectors in produces a codeword of that starts at a position other than , by (20) and (21). It follows that is a basis for such that the total span of is less than the total span of , namely, . This is a contradiction, since it is well known [16], [31] that a basis in minimal-span form minimizes the total span among all possible bases for the same code.

The proof of Lemma 5.7 shows that the spans of any two characteristic generators for end at different positions as well. Along with Theorem 5.5, this yields an alternative proof for Corollary 4.5. The next lemma further restricts the spans of characteristic generators.

Lemma 5.8: Let denote the support of a linear code of length , and let . Then if and only if there is a characteristic generator for with span .

Proof: Let be a characteristic generator for with . Then by Definition 5.2. Since , it follows that . Consider , a basis for in minimal-span form. Suppose there is no characteristic generator with span . Then, by Definition 5.2, there is no vector with . But this means that for all in we have , since if then . Therefore, or, equivalently, .

Theorem 5.9: The size of the set of characteristic generators for is equal to .

Proof: The spans of characteristic generators have to start at some position, and no two of them can start at the same position by Lemma 5.7. By Lemma 5.8, the total number of positions such that the span of some characteristic generator starts at , is .

It follows from Theorem 5.9 that the characteristic matrix for is a matrix, and columns of are identically zero. For simplicity, we henceforth assume that (otherwise, we can always puncture out the all-zero columns of ). Then the characteristic matrix is an square matrix, and no column of is identically zero.

We now show how the characteristic matrix can be computed from an arbitrary given basis for . The first step is to convert this basis into the minimal-span form. As is well known [16], [21], [31], this can be easily accomplished by a greedy sequence of elementary row operations, as follows:⁴

(25)

⁴ The pieces of pseudocode in (25) and (26) assume operations over , to simplify notation. It should be clear that the procedures described in (25) and (26) extend in the obvious way to nonbinary codes.

The loop in (25) necessarily terminates after at most iterations, since the total span strictly decreases at each iteration. Thus, the total number of elementary row operations performed in (25) is at most . To ensure that the resulting basis is the lexicographically first basis for in minimal-span form, we need to select the lexicographically first codeword in each atomic equivalence class. Upon the termination of the loop in (25), this can be easily done as follows:

(26)

The complexity of (26) is also at most . This immediately yields an algorithm for the computation of the characteristic matrix. For each of the cyclic shifts of , compute the lexicographically first basis in minimal-span form using (25) and (26), then rotate cyclically to the right and form the set of characteristic generators as in (24).
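The shift-and-reduce procedure described above can be sketched in Python for binary codes. This is not the pseudocode of (25) and (26), which is not reproduced here; it is a hedged illustration under several assumptions: positions are indexed from 0, row addition is XOR, the greedy rule adds a row to any other row that shares its start (or end) and has the longer span, and the lexicographic selection of (26) is skipped, so only the spans of the characteristic generators are returned, not the generators themselves. All function names are hypothetical.

```python
def start(v):
    return min(i for i, c in enumerate(v) if c)

def end(v):
    return max(i for i, c in enumerate(v) if c)

def minimal_span_form(basis):
    """Greedy reduction of a binary basis to minimal-span form.

    Whenever two rows share a start (or an end), the row with the shorter
    span is added to the other one, which strictly shrinks the total span.
    """
    rows = [list(r) for r in basis]
    changed = True
    while changed:
        changed = False
        for i in range(len(rows)):
            for j in range(len(rows)):
                if i == j:
                    continue
                a, b = rows[i], rows[j]
                same_start = start(a) == start(b) and end(a) <= end(b)
                same_end = end(a) == end(b) and start(a) >= start(b)
                if same_start or same_end:
                    rows[j] = [x ^ y for x, y in zip(a, b)]
                    changed = True
    return rows

def characteristic_spans(basis):
    """Spans (a, b) obtained from all n cyclic shifts of the code.

    For each shift j, the code is shifted left j times, reduced to
    minimal-span form, and each conventional span is rotated back right.
    """
    n = len(basis[0])
    spans = set()
    for j in range(n):
        shifted = [list(v[j:]) + list(v[:j]) for v in basis]
        for v in minimal_span_form(shifted):
            spans.add(((start(v) + j) % n, (end(v) + j) % n))
    return spans

# Toy input chosen here: the length-3 binary even-weight code.
spans = characteristic_spans([[1, 1, 0], [0, 1, 1]])
```

For this toy code the sketch finds three spans with pairwise distinct start positions, consistent with Lemma 5.7 and Theorem 5.9. Because the atomic spans do not depend on which minimal-span basis is used, skipping the lexicographic step does not change the set of spans.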

Example 5.1: Let be the binary linear self-dual code generated by and . The matrices given by

are generator matrices in minimal-span form for , , , and , respectively. Rotating to the right, as in (24), produces the characteristic matrix for given by

However, we can do better by observing that for each , the sets and in (24) have at least characteristic generators in common. An efficient algorithm for computing the set of characteristic generators, based upon this observation, is given below as Algorithm A. As in (25) and (26), Algorithm A assumes elementary row operations over to simplify notation.

We now show by induction on that, just before the exe-cution of the cyclic shift step, the set is thelexicographically first basis in minimal-span form for the code

. This is certainly true for . As an induction

hypothesis, assume that this is true for . Then, by Lemma5.8 and Definition 5.1, the fact that guaranteesthat there is a unique such that .After the cyclic shift, the set is clearly a basisfor . Note that for all , the cyclicshift decreases and by one. Thus, all the vectorsin start at different positions and endin different positions. Furthermore, is the unique vector in

with the property that . Thisproperty obviously persists through the execution of the min-imal-span reduction step. In addition, this step makes sure that

for all . Hence, after theminimal-span reduction step, the set is a basisfor in minimal-span form. Moreover, by induction hypoth-esis, each vector in is lexicographicallyfirst in its atomic equivalence class (this property is preservedby the cyclic shift). After the lexicographic ordering step,isalso lexicographically first in its atomic equivalence class. Thus,

is the lexicographically first basis for inminimal-span form. It follows that during the update step, thepair

is indeed a characteristic generator for. It remains to prove thatAlgorithm A eventually terminates with . Notice that

and, therefore, modulo . Thus,after at most iterations, the set contains characteristic gen-erators with spans ending at every position in .Hence, , and we are done.

Note that at each iteration, Algorithm A performs at most elementary row operations. Since the algorithm terminates after at most iterations, its overall complexity is . The complexity of precomputation is also , as can be seen from (25) and (26).

Algorithm A

Input: An arbitrary basis for a linear code of length and dimension .
Output: The set of characteristic generators for .

Pre-computation step: Using (25) and (26), compute , the lexicographically first basis for in minimal-span form. Set . Also set , where for all .
Cyclic shift step: Find the unique , such that . Then do for all .
Minimal-span reduction step: While so that , do (and go back to the minimal-span reduction step).
Lexicographic ordering step: While such that is a subset of and , do (and go back).
Update step: Set . If is not already in , adjoin this pair to . If , return and exit; else go to the cyclic shift step.

Remark: As pointed out by a referee, since the lexicographic order on the atomic classes of was introduced for notational convenience only, in practice there is no reason to insist on computing the lexicographically first set of characteristic generators. This makes it possible to simplify Algorithm A as follows. In the precomputation step, omit (26) and let be any basis for in minimal-span form. Delete the lexicographic ordering step. In the update step, adjoin the pair to only if does not already contain any characteristic generator with span . This simplification, suggested by the referee, potentially eliminates a number of elementary row operations at each iteration, although the overall complexity of the algorithm remains .
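For small binary codes, the set of characteristic generators can also be obtained by brute force, which is convenient for checking examples. The following Python sketch is not Algorithm A: it recomputes a basis in minimal-span form for every cyclic shift of the code and keeps one generator per span, in the spirit of the simplified update step of the Remark. The function names, the representation of a span as a pair (start, end) taken modulo the length, and the test code (the cyclic [7,4] Hamming code with generator polynomial 1 + x + x^3) are assumptions made for illustration only.

```python
def span(v):
    """Return (start, end) of the conventional span of a nonzero vector."""
    nonzero = [i for i, x in enumerate(v) if x]
    return nonzero[0], nonzero[-1]


def minimal_span_form(rows):
    """Greedy reduction over GF(2). On return, all rows start at distinct
    positions and end at distinct positions. Rows must be independent."""
    rows = [list(r) for r in rows]
    changed = True
    while changed:
        changed = False
        for i in range(len(rows)):
            for j in range(len(rows)):
                if i == j:
                    continue
                (ai, bi), (aj, bj) = span(rows[i]), span(rows[j])
                # Add row j to row i only when this shortens the span of row i.
                if (ai == aj and bi >= bj) or (bi == bj and ai <= aj):
                    rows[i] = [x ^ y for x, y in zip(rows[i], rows[j])]
                    changed = True
    return rows


def rotate_left(v, j):
    """Cyclic shift: entry i of the result is entry i + j of v."""
    return list(v[j:]) + list(v[:j])


def characteristic_generators(basis):
    """Map each span (a, b), taken modulo n, to one generator with that span."""
    n = len(basis[0])
    found = {}
    for j in range(n):
        shifted = [rotate_left(r, j) for r in basis]
        for row in minimal_span_form(shifted):
            a, b = span(row)
            key = ((a + j) % n, (b + j) % n)
            # Keep the first generator seen for each span (cf. the Remark).
            found.setdefault(key, tuple(rotate_left(row, n - j)))
    return found


g = [1, 1, 0, 1, 0, 0, 0]                          # 1 + x + x^3
basis = [rotate_left(g, 7 - s) for s in range(4)]  # g, xg, x^2 g, x^3 g
gens = characteristic_generators(basis)
```

For this cyclic code the sketch returns the seven cyclic shifts of the generator vector, each with a span of the form [a, a+3] modulo 7, which agrees with the description of cyclic codes in Example 6.2.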

Example 5.2: Consider the binary Hamming code generated by

and

For this code, Algorithm A terminates in five iterations. The resulting bases in minimal-span form for are given by

(27)

respectively. The vector denoted in Algorithm A is alwaysshown as the top row in (27). Observe that during the fourthiteration, nothing is adjoined to the set since the pair

is alreadythere. The resulting characteristic matrix for, along with thecorresponding spans, is given by

(28)

Note that the minimal conventional trellises for are all minimal under (cf. Fig. 2). It follows that the set of characteristic generators defined in (24) is the smallest set of generators from which all the -minimal tail-biting trellises for can be constructed. On the other hand, not every choice of characteristic generators in produces a -minimal tail-biting trellis for under the product construction. For example, the generators in (28) are linearly dependent and the trellis does not represent . As another example, the generators in (28) are independent and the trellis does represent , but it is not -minimal.

C. Properties and Duality of the Characteristic Matrix

We now derive a number of interesting properties of the characteristic matrix for . As before, we assume that , where is the length of .

Theorem 5.10: Let

be the set of characteristic generators for a linear code of length and dimension . Then

(29)

Proof: Let denote the minimal conventional trellis for . Since is also a linear -minimal tail-biting trellis for , it follows from Theorem 4.2 and Theorem 5.5 that is a product of elementary trellises for some characteristic generators. W.l.o.g., suppose that these generators are . Since in , the spans of do not contain the integer . On the other hand, we will now show that for all . Indeed, assume, to the contrary, that does not contain . Then and is a conventional (as opposed to cyclic) interval. Since is nonzero at positions and by Definition 5.2, it follows that and . Writing as a linear combination of , it is easy to see (cf. [16], [21]) that for some . But this contradicts Lemma 5.7. We have thus shown that is contained in the span of exactly characteristic generators. By a similar argument applied to the cyclic shift , there are exactly characteristic generators whose span contains the integer , for all . Thus, each is counted exactly times in , and (29) follows.

Theorem 5.10, as well as other properties of the characteristic matrix , can be conveniently expressed in terms of the associated span matrix , defined as follows.

Definition 5.3: Let be the set of characteristic generators for a linear code of length . With each characteristic generator , we associate the corresponding span vector defined by if and if . The matrix having the span vectors as its rows is called the span matrix for .

Note that the entries of the span matrix are integers, not elements of a finite field. Thus, the total span of given by

is just the sum (over ) of all the entries in , and Theorem 5.10 can be rendered as follows:

(30)

This property makes it possible to establish an interesting duality result, relating to the span matrix of the dual code . First, however, we need the following lemma.

Lemma 5.11: Let and be a pair of dual codes of length . Let and be the sets of characteristic generators for and , respectively. Then there is a one-to-one correspondence between the elements of and , such that if then

Proof: Fix an integer . Let be the unique element of such that for some . We put in correspondence with the unique such that for some . In other words, for all , the function pairs the characteristic generator for whose span ends at position with the characteristic generator for whose span starts at position . It follows from Lemmas 5.7 and 5.8 that is, indeed, a one-to-one correspondence. Now, by Definition 5.2, the (supports of the) vectors and overlap at position . Since and are orthogonal to each other, they must overlap in at least one more position. If is this position, then . It is easy to see that any two (cyclic) intervals and such that satisfy , and the lemma follows.

Theorem 5.12: Let and be a pair of dual codes of length . Let and be the span matrices for and , respectively. Then is the complement of , namely, , where is the all-one matrix.

Proof: In terms of span matrices, Lemma 5.11 says that there is a one-to-one correspondence between the rows of and such that if then

(31)

where is the all-one vector of length and the inequality in (31) is componentwise. This, in turn, implies that

(32)

But (30) shows that the first sum on the left-hand side of (32) is equal to , while the second sum is . It follows that the lower bound (32) must hold with equality, which is only possible if (31) also holds with equality and .
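Definition 5.3 and Theorem 5.12 are easy to check numerically on a small example. The Python sketch below builds span matrices from lists of spans and compares the rows of one matrix with the complemented rows of the other. It rests on two assumptions that are not spelled out in the surviving text: a span vector is taken to have a 1 exactly at the positions of the half-open cyclic interval (a, b], and the test case is the cyclic [7,4] Hamming code, whose characteristic generators have spans [a, a+3] modulo 7, together with its cyclic [7,3] dual, whose characteristic generators have spans [a, a+4] modulo 7 (cf. Example 6.2). All names are hypothetical.

```python
def span_vector(a, b, n):
    """0/1 vector with a 1 at each position of the cyclic interval (a, b]."""
    length = (b - a) % n
    inside = {(a + t) % n for t in range(1, length + 1)}
    return tuple(1 if j in inside else 0 for j in range(n))


def span_matrix(spans, n):
    """One span vector per characteristic generator."""
    return [span_vector(a, b, n) for a, b in spans]


n, k = 7, 4
spans_code = [(a, (a + 3) % n) for a in range(n)]  # cyclic [7,4] code
spans_dual = [(a, (a + 4) % n) for a in range(n)]  # its cyclic [7,3] dual

L = span_matrix(spans_code, n)
L_dual = span_matrix(spans_dual, n)

# Number of spans that cover each position, and the total of all entries.
column_sums = [sum(row[j] for row in L) for j in range(n)]
total = sum(map(sum, L))

# Rows of the span matrix of the code, complemented entry by entry.
complements = {tuple(1 - x for x in row) for row in L}
```

In this example every position is covered by exactly three spans of the code, and the complemented rows coincide, as a set, with the rows of the span matrix of the dual, which is the row correspondence described in Lemma 5.11.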

Theorem 5.12 shows that there is a strong duality relationship between the rows of , the characteristic matrix for , and the rows of , the characteristic matrix for . It is well known [31] that the minimal conventional trellises for and have the same state-complexity profile. It is, therefore, natural to ask whether a similar relationship holds for tail-biting trellises. The following proposition might be useful in this context.

Proposition 5.13: Let and be a pair of dual codes of length and dimensions and , respectively. Let , where are some characteristic generators for . Then there exists a trellis

(33)

where are some characteristic generators for the dual code , such that and have the same state-complexity profile.

Proof: Let and denote the rows of and , respectively, listed so that for all . It follows from the proof of Theorem 5.10 that

(34)

Now, assume w.l.o.g. that are the first rows of the characteristic matrix . Take as the last rows of . (In general, if are the rows of that are in one-to-one correspondence with the generators for , we always choose the generators for as the remaining rows of .) Then, the state-complexity profiles of and are given by and , respectively. We have

The sum on the left-hand side is precisely , while the sum on the right-hand side is equal to the all-zero vector by (34). It follows that . Thus, the state-complexity profile of is equal, componentwise, to the state-complexity profile of .

Unfortunately, Proposition 5.13 does not prove that given any -minimal tail-biting trellis for , there exists a tail-biting trellis for with the same state-complexity profile. To prove this, one would have to show that if the characteristic generators are linearly independent, then the characteristic generators in (33) are also linearly independent. While we conjecture that this is true, we do not have a proof.

VI. MINIMIZATION WITH RESPECT TO SPECIFIC ORDERS

We now return to the problem of constructing minimal tail-biting trellises. Specifically, given a linear code , we would like to construct a linear tail-biting trellis for that minimizes the average state complexity (a -minimal trellis) and/or the maximum state complexity (a -minimal trellis), among all possible linear trellises for . The factorization theorem of [11] shows that there are roughly different linear tail-biting trellises for . Proposition 3.1 and Theorem 5.5 narrow the search down to at most different trellises, constructed from the rows of the characteristic matrix . However, is still a large number. Thus, efficient procedures are needed to find linear tail-biting trellises that are minimal under , , and other total orders.

A. Minimization With Respect to the Product Order

Minimization with respect to the product order is relatively straightforward. Consider a linear tail-biting trellis that factors as . Then we have

(35)

by (14) and the definition of an elementary trellis. To minimize the right-hand side of (35), we order the characteristic generators in such a way that for all . The complexity of permuting the rows of in this way is at most . We then find a -minimal tail-biting trellis for by taking the first linearly independent rows of this (reordered) characteristic matrix. The following piece of pseudocode performs this simple computation; it starts with and , while assuming that the rows of are ordered as above:

(36)

Upon the termination of the for-loop in (36), a -minimal tail-biting trellis for is given by . The complexity of the for-loop itself is clearly .
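The selection rule just described (sort the characteristic generators by span length, then keep the first linearly independent ones) can be sketched in a few lines of Python for binary codes. This is an illustration under stated assumptions rather than a transcription of (36): the function name, the input format (a list of pairs of a 0/1 vector and its span), and the use of an XOR pivot table to test independence over GF(2) are all choices made here. The input is the cyclic [7,4] Hamming code with generator polynomial 1 + x + x^3, with its seven characteristic generators deliberately listed in an order that places a linearly dependent generator in fourth position.

```python
def select_minimal_product(generators, n, k):
    """Return the first k linearly independent generators (over GF(2)),
    taken in order of nondecreasing span length (b - a) mod n."""
    ordered = sorted(generators, key=lambda g: (g[1][1] - g[1][0]) % n)
    pivots = {}   # highest set bit -> reduced vector, stored as an integer
    chosen = []
    for vec, sp in ordered:
        v = int("".join(map(str, vec)), 2)
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                # v is independent of everything chosen so far.
                pivots[top] = v
                chosen.append((vec, sp))
                break
            v ^= pivots[top]
        if len(chosen) == k:
            break
    return chosen


g = (1, 1, 0, 1, 0, 0, 0)  # 1 + x + x^3


def shift(s):
    """Cyclic shift of g to the right by s positions."""
    return tuple(g[(i - s) % 7] for i in range(7))


# All spans have the same length, so the stable sort keeps this order.
X = [(shift(s), (s, (s + 3) % 7)) for s in (0, 1, 2, 4, 3, 5, 6)]
trellis_rows = select_minimal_product(X, n=7, k=4)
starts = [sp[0] for _, sp in trellis_rows]
```

Here the shifts by 0, 1, 2, and 4 positions sum to zero, so the fourth candidate is skipped and the shift by 3 positions is selected instead.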

Example 6.1: Let be the Hamming code considered in Example 5.2. Arranging the rows of (28) in order of increasing span-length yields the following characteristic matrix:

(37)

for . The first four rows have the shortest spans, but they are linearly dependent. Hence, a -minimal tail-biting trellis for is given by with state-complexity profile . This is just the well-known conventional trellis for the Hamming code (cf. [31, Fig. 3]). But it is certainly not the unique -minimal trellis for . For example, the tail-biting trellis with state-complexity profile achieves the same average state complexity , but it is also -minimal for with maximum state complexity . This tail-biting trellis is depicted below:

In fact, there are exactly four -minimal trellises for , which form a subset of the 16 -minimal trellises for , which in turn form a subset of the 20 -minimal trellises for , which are themselves a subset of the 56 different trellises for that can be constructed from the rows of . Note that is quasi-cyclic with a period of , so all these sets come in multiples of trellises that are cyclic shifts of each other.

An interesting question is under what conditions does a linear code have a unique -minimal trellis. An answer to this question may be derived by slightly modifying the procedure in (36). As before, we assume that the characteristic generators have been ordered so that

Given an ordered sequence of codewords and another codeword , let denote the smallest integer such that

if such an integer exists; otherwise, set

As in (36), start with and . Then

(38)

It can be shown that if the condition in (38) is not satisfied, then the characteristic generator cannot be used in the construction of a -minimal trellis. This leads to the following proposition. We omit the proof.

Proposition 6.1: A linear code of length and dimension has a unique -minimal trellis if and only if the procedure in (38) terminates with exactly codewords . This unique -minimal trellis is then given by .


While the -minimal tail-biting trellis for large random codes is usually unique, this is certainly not true in general. For example, when the procedure of (38) is used in conjunction with the characteristic matrix for the extended Hamming code in (37), it produces . This is, of course, to be expected: since the code is quasi-cyclic, a -minimal tail-biting trellis cannot be unique. In fact, as the following example illustrates, the situation is even more degenerate for cyclic codes (cf. [34]).

Example 6.2: Let be a cyclic linear code of length and dimension over (such that ). Let denote the generator polynomial of . It is easy to see that the characteristic generators for are precisely the cyclic shifts of the coefficient vector of . In this case, clearly

and any set of linearly independent characteristic generators produces a -minimal trellis for .

Finally, we observe that any -minimal trellis for is also minimal under the edge-product order defined in (9), and vice versa. This follows from Theorem 4.6.

B. Minimization With Respect to the Max Order

Unlike the order , the order is much more difficult to handle. For most linear codes, a good estimate of the least possible maximum state complexity is given by a -minimal tail-biting trellis. However, this is not always true (see Examples 3.1, 6.1, and 6.2). In general, minimization with respect to the order is similar to the problem of building the lowest possible circular wall out of bricks of different length, where each brick can be used only at a prescribed position. Moreover, certain combinations of bricks, corresponding to linearly dependent characteristic generators, are not allowed.

We start by formalizing this problem as a min-max linear optimization problem. Let denote the set of binary vectors of length and Hamming weight . Given the characteristic matrix for a code of length and dimension over , let denote the set of all such that the rows of corresponding to the support of are linearly independent. For example, given in (37), the set consists of 56 vectors and their complements, shown at the bottom of the page. It is easy to see that there is a one-to-one correspondence between vectors in and tail-biting trellises for that can be constructed from the rows of . Moreover, given , the state-complexity profile of the corresponding trellis is given by , where is the span matrix for associated with . It follows that the least possible maximum state complexity for is given by

(39)

and finding a -minimal trellis for is equivalent to finding a vector that attains the minimum in (39). We can use (39) to give a lower bound on in terms of a linear-programming problem. Specifically, let us relax the condition in (39) and think of as a vector in , lying on the intersection of the hypercube with the hyperplane . Let , and let be arbitrary real numbers. Consider the following linear-programming problem:

(40)

(41)

(42)

for (43)

Problem is said to be feasible if there is a vector that satisfies all the constraints in (41)–(43). The feasibility of can be determined in polynomial time with interior point methods [9], [29]. Alternatively, we can use the simplex algorithm, which runs in linear time on the average [24, p. 143]. We have thus proved the following proposition.

Proposition 6.2: Let be the characteristic matrix for a code of length and dimension . Let be the smallest integer such that problem is feasible. Then the maximum state complexity of a linear trellis for is lower-bounded by .
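For very short codes, the minimum in (39) can be evaluated directly by exhaustive search, which provides a reference value against which a bound such as that of Proposition 6.2 may be compared. The Python sketch below is such a brute-force evaluation and not the linear-programming approach of this section. It assumes binary codes, takes the state-complexity profile of a choice of generators to be the sum of their span vectors, with a 1 at each position of the half-open cyclic interval (a, b], and uses the cyclic [7,4] Hamming code with generator polynomial 1 + x + x^3 as test input. All function names are hypothetical.

```python
from itertools import combinations


def rank_gf2(vectors):
    """Rank over GF(2) of a list of 0/1 vectors, using an XOR pivot table."""
    pivots = {}
    for vec in vectors:
        v = int("".join(map(str, vec)), 2)
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)


def span_vector(a, b, n):
    """0/1 vector with a 1 at each position of the cyclic interval (a, b]."""
    inside = {(a + t) % n for t in range(1, (b - a) % n + 1)}
    return [1 if j in inside else 0 for j in range(n)]


def min_max_state_complexity(generators, n, k):
    """Search all k-subsets of linearly independent generators and return
    (least maximum of the profile, a best subset, number of valid subsets)."""
    best, best_rows, valid = None, None, 0
    for subset in combinations(generators, k):
        if rank_gf2([vec for vec, _ in subset]) < k:
            continue  # linearly dependent: does not represent the code
        valid += 1
        rows = [span_vector(a, b, n) for _, (a, b) in subset]
        profile = [sum(col) for col in zip(*rows)]
        if best is None or max(profile) < best:
            best, best_rows = max(profile), subset
    return best, best_rows, valid


g = (1, 1, 0, 1, 0, 0, 0)  # 1 + x + x^3
X = [(tuple(g[(i - s) % 7] for i in range(7)), (s, (s + 3) % 7))
     for s in range(7)]
s_max, rows, valid = min_max_state_complexity(X, n=7, k=4)
```

For this toy code, 28 of the 35 four-element subsets are linearly independent, and the least maximum of the profile is 2. The search is exponential in the code length, so it is only useful as a check on small examples.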

To compute the lower bound in Proposition 6.2, we can simply check the feasibility of problem for until it becomes feasible. At this point, we have if and only if the feasible region of contains at least one vector . Such a vector then determines a trellis which is -minimal for , in view of (39). On the other hand, if we find that the intersection of the feasible region of with is empty, we can increase the value of and try again.

To determine whether the intersection of the feasible region of with is nonempty, we will use the branch-and-bound method [24], to exclude those feasible vectors that do not belong to , combined with a cutting-plane technique [24], to exclude those feasible vectors that do not belong to . Specifically, we will adjoin to (41)–(43) two sets of constraints. The first set consists of constraints of the type

or for some


This set will be denoted ; it obviously contains at most different constraints. The second set of constraints is denoted and consists of constraints of the type

where is a binary vector in . The linear-programming problem augmented with these two sets of additional constraints will be denoted . We let denote a list of linear-programming problems of this type. Now consider Algorithm B, given below. It is easy to see that none of the additional constraints introduced in Algorithm B in the form of the sets and can exclude a feasible solution of problem if this solution belongs to . Therefore, the algorithm will find such a solution if (and only if) it exists. It is also easy to see that Algorithm B always terminates, since the number of linear-programming problems on the list cannot exceed .

Although the number of problems on the list may become very large, it is usually quite small. In fact, the largest list size we have ever encountered was . This is so because, as it turns out, the optimal feasible vector for practically always belongs to . Thus, the branch-and-bound step, which is the only step that increases the size of , is rarely executed. Moreover, for small values of , the problem is usually infeasible, so that the algorithm quickly proceeds to the correct value of . Specifically, Algorithm B was implemented in MAPLE and tested on random linear codes with parameters and . On average, for a code of length , we had to check the feasibility of about five problems before arriving at , then solve another 2.8 problems to find a solution in and, thus, a -minimal tail-biting trellis. For a code of length , we had to check the feasibility of about 12 problems to find , followed by solving another 4.4 problems to find a -minimal trellis.

Nevertheless, we are not able to give a precise estimate of the running time of Algorithm B. This is not surprising since, in general, integer programming is known to be NP-hard. We point out, however, that commercially available integer-programming packages, based on algorithms similar to Algorithm B, routinely solve problems with thousands of variables.

Remark: If one nonetheless insists on an algorithm that is provably polynomial-time, then all we can offer is the bound of Proposition 6.2. Note that most other bounds on the maximum state complexity of tail-biting trellises [2], [3], [27], and, in particular, the square-root bound [25], [35], hold for an arbitrary permutation of the time axis, whereas Proposition 6.2 makes sense only when the order of the time axis is fixed. Thus, for a random permutation of the time axis, the bound of Proposition 6.2 is usually much stronger than the square-root bound. For “good” permutations, Proposition 6.2 also produces a tight bound in many cases, but not always. For example, given the time axis for the Golay code found in [3], Proposition 6.2 yields and, indeed, in this case. On the other hand, for random linear codes, we have observed several permutations where Proposition 6.2 alone does not produce the true value of .

Notice that we have not yet specified the value of the objective function in (40). Indeed, since we are only interested in the feasibility of problem , we have neglected the “bound part” of the branch-and-bound method. Thus, the constants in (40) may be chosen arbitrarily. However, there are some natural choices. One good choice is to set for . In this case, Algorithm B produces (at no extra cost) a tail-biting trellis that minimizes the average state complexity among all the linear trellises for that minimize the maximum state complexity . Such a trellis is usually (though not always, cf. Example 3.1) both -minimal and -minimal for .

Algorithm B

Input: The characteristic matrix for , an integer , and an empty list .
Output: An integer and a -minimal trellis for .

Initialization step: If the list is empty, set and .
Linear-programming step: Select (arbitrarily) a problem from and solve it, using, for instance, the simplex algorithm. If is not feasible, delete this problem from and go back to the initialization step. Otherwise, let denote the optimal feasible vector for .
Branch-and-bound step: If , find an integer such that . Replace the problem on the list by two problems and , where and . Go back to the linear-programming step.
Cutting-plane step: Here . Let denote the support of . If , then the characteristic generators are linearly dependent and , for some , not all of them zero. For , set if and otherwise; for set . Let
For all the problems on the list, including , replace by . Go back to the linear-programming step.
Termination step: Here . Free the list , and return . Also return the -minimal trellis .

VII. GENERALIZED PRODUCT CONSTRUCTION AND TRELLIS DUALITY

The generalized trellis product construction was first developed by Kschischang [14] in the context of conventional trellises. In Section VII-A, we use this generalized trellis product to construct tail-biting trellises for certain interesting classes of codes, such as product codes or generalized-concatenated codes. In Section VII-B, we consider a special type of generalized trellis product, called the intersection product. We use the intersection product to construct (dual) tail-biting trellises for directly from a generator matrix for . In particular, we prove that given any -minimal linear trellis for , there exists a corresponding dual linear trellis for , such that is -minimal for and the state-complexity profile of is equal to that of .

A similar result was proved by Forney [6] in the general context of normal factor graphs (also known as Forney graphs). Although a tail-biting trellis can be regarded as a Forney graph, our methods are very different from those of [6].

A. The Generalized Trellis Product Construction

Let and be two trellises of depth , either conventional or tail-biting. The generalized product of and , denoted , is defined in exactly the same way as the trellis product of Definition 4.2, with one difference: the edges of , rather than being labeled by the sum of the corresponding edge labels in and , are labeled by , where is an arbitrary given function with domain . Clearly, if we take and define , then the generalized product reduces to the ordinary trellis product of Definition 4.2.

Observe that the generalized product operation is commutative if and only if and the function is symmetric. In general, the operation is not even associative. Nevertheless, just as the ordinary trellis product, the generalized product can be extended to multiply more than two trellises. Let be trellises of the same depth and over the same alphabet (the latter condition is for notational convenience only). Then the expression , where the order of multiplication is important, is defined as follows. The incidence structure of the edges and vertices of is exactly the same as that of (thus, as an unlabeled graph, does not depend on the order of multiplication). The edges of are labeled by , where now denotes an arbitrary function with domain .

When the underlying alphabet is a field, it makes sense to ask whether the generalized product trellis is linear. We state the following theorem without proof.

Theorem 7.1: Let be trellises of the same depth over a field . Then, the generalized product trellis is linear if and only if is linear.

If we take , we again recover the ordinary trellis product of Definition 4.2. However, other choices for the label function lead to many interesting trellises. Several possibilities are discussed in the following examples, which illustrate the richness of the generalized trellis product construction.

Example 7.1: The Identity Trellis Product. Let be trellises representing, respectively, the codes of length over . In constructing the generalized product trellis , we take to be the identity map from to itself. (This type of generalized trellis product was called the comma-product in [14].) Thus, the edges of are labeled with vectors of length over . We can think of the code represented by as a set of matrices , with each column of corresponding to the label of one edge in . Clearly, an matrix over is a codeword of if and only if for all , the th row of is a codeword of . Note that the identity map is a linear function. Therefore, by Theorem 7.1, if are linear trellises, then so is . If for some , each of the trellises contains exactly one vertex at time , then is (a cyclic shift of) a conventional trellis. Otherwise, is a tail-biting trellis.

The identity product construction has many applications; we will mention just a couple here. If , then the identity product represents the code obtained by interleaving with itself times (or, equivalently, passing through an block interleaver). For example, if we take to be a binary BCH code of length , then represents a subcode of the corresponding Reed–Solomon code over (cf. [32]). As another application, suppose that are the codes used in Constructions A, B, or C of lattices from binary block codes, described in Conway and Sloane [4, Ch. 5]. Then the identity-product trellis represents the corresponding lattice (as a union of cosets of , see [5], [30]). This makes it possible to construct tail-biting trellises for many well-known lattices [4] directly from the tail-biting trellises for the underlying codes.

Example 7.2: Generalized Trellis Product and Product Codes. Let denote binary linear codes of length and dimension , respectively. Let be generator matrices for and , respectively. The product code is defined [20, p. 568] as the set of matrices such that every row of is a codeword of and every column of is a codeword of . Given trellises for and for (either conventional or tail-biting), how can one construct a trellis for the product code ?

The generalized trellis product provides the answer. Let be the generalized product of trellises for . If we take to be the identity map, as in Example 7.1, then represents a set of matrices with each row in . However, the columns of are unrestricted. Instead, we take to be an encoder for , namely, is defined by . It can be readily verified that the resulting trellis is then a trellis for , sectionalized into sections of length bits. If the state-complexity profile of is , then the state-complexity profile of is . Similarly, if is the generalized product of trellises for with defined by , then represents the product code . In this case, the state-complexity profile of is times the state-complexity profile of . The product codes and are obviously equivalent to each other: they are related by a matrix transposition.
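The two descriptions of a product code used in this example, by row and column constraints and by encoding, can be compared by brute force on a very small case. The Python sketch below does so for two toy codes chosen here for illustration: a [3,2] single-parity-check code for the rows and a [3,1] repetition code for the columns. It does not construct any trellis. The encoder form used, in which a codeword matrix is the transposed column generator matrix times an information matrix times the row generator matrix, is the standard one and is assumed rather than taken from the text; all names are hypothetical.

```python
from itertools import product


def codewords(G):
    """All codewords of the binary linear code generated by the rows of G."""
    k, n = len(G), len(G[0])
    return {tuple(sum(u[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
            for u in product((0, 1), repeat=k)}


G1 = [(1, 1, 0), (0, 1, 1)]  # [3,2] single-parity-check code (rows)
G2 = [(1, 1, 1)]             # [3,1] repetition code (columns)
C1, C2 = codewords(G1), codewords(G2)

# Definition: every row is in C1 and every column is in C2.
by_definition = set()
for bits in product((0, 1), repeat=9):
    M = tuple(tuple(bits[3 * r:3 * r + 3]) for r in range(3))
    if all(row in C1 for row in M) and all(col in C2 for col in zip(*M)):
        by_definition.add(M)


def encode(U):
    """Entry (r, c) of G2^T * U * G1 over GF(2); U has size k2 x k1."""
    k2, k1 = len(G2), len(G1)
    n2, n1 = len(G2[0]), len(G1[0])
    return tuple(
        tuple(sum(G2[a][r] * U[a][b] * G1[b][c]
                  for a in range(k2) for b in range(k1)) % 2
              for c in range(n1))
        for r in range(n2))


# U is a 1 x 2 information matrix here, hence the one-element tuple.
by_encoding = {encode((u,)) for u in product((0, 1), repeat=2)}
```

Both constructions give the same four 3-by-3 matrices, each consisting of three identical rows taken from the single-parity-check code.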

Note that if we construct as , where are different trellises for while is given by as before, then still represents the code . This makes it possible to reduce the number of states in by combining appropriate tail-biting trellises for in the generalized product. For example, let be the binary single-parity-check code and let be the extended Hamming code (cf. Examples 5.2 and 6.1). The corresponding product code is a binary linear code. If we take , where and is the -minimal trellis for with state-complexity profile exhibited in Example 6.1, we get a trellis for sectionalized into 5-bit sections, with state-complexity profile . Alternatively, if we take , where have state-complexity profiles

respectively, we get a trellis for with the profile . Yet another option is to reverse the roles of and in the construction. In this case, we obtain the trellis with 8-bit sections and state-complexity profile .

Example 7.3: Turyn Construction of the Golay Code. It is well known [20, p. 588] that the binary Golay code can be constructed as

(44)

where and are equivalent binary Hamming codes such that . Using the generalized trellis product, we can construct a tail-biting trellis for directly from the Turyn construction of (44). We take to be the Hamming code of Example 5.2. Let and be two -minimal tail-biting trellises for with state-complexity profiles and , respectively (cf. Examples 6.2 and 7.1). We take as the Hamming code generated by , given by

(45)

It can be readily verified that is indeed an binary code and . The trellis , where the spans of are as indicated in (45), is a tail-biting trellis for with state-complexity profile . This -minimal trellis for was found using the methods described in Sections V and VI. Now let , where and the order of multiplication is important. It is easy to see that is a tail-biting trellis for , sectionalized into 3-bit sections, with state-complexity profile . We point out that this is not optimal; a tail-biting trellis for sectionalized into 2-bit sections with constant state-complexity profile was constructed in [3]. However, the trellis is optimal for this particular permutation of the Golay code. Finally, we note that the construction above indicates how the generalized trellis product can be used to construct tail-biting trellises for the class of generalized concatenated codes [20, p. 590].

Observe that in all the examples above, the structure of the resulting trellises closely reflects the structure of the underlying codes, which might be of importance in some applications. This is a distinct advantage of the generalized product construction.

B. Intersection Product and Trellis Duality

We now extend the definition of generalized product as follows. Let . We will henceforth allow an extra symbol for the range of . This symbol has the following meaning: whenever , we remove the corresponding edge from . An important application of this extension is the intersection product, defined as follows.

Definition 7.1: Let be trellises of the same depth and over the same alphabet . Then the intersection product is defined as the generalized product with the label function given by

ifotherwise.

(46)

Observe that the label function in (46) is symmetric, and, therefore, the intersection product is both commutative and associative. The key property of the intersection product, which also justifies its name, is summarized in the following proposition.

Proposition 7.2: Let be trellises of the same depth over a field representing the codes , respectively. Let be the intersection product of . Then .

Since Proposition 7.2 was proved by Kschischang [14] in the context of conventional trellises, we omit the proof. If the codes in Proposition 7.2 are linear, then their intersection is obviously a linear code. Can the same be said about trellises? Namely, if are linear trellises, is their intersection product also linear? The answer is: almost. We say that a trellis is quasi-linear over , if there exists a vertex labeling of such that is a linear code over . The difference between linear and quasi-linear trellises is that a quasi-linear trellis is not necessarily reduced (cf. Definition 4.1). Indeed, the intersection product often yields nonreduced trellises (cf. Example 7.4).

Proposition 7.3: Let be linear trellises of the same depth over a field . Then the intersection product trellis is quasi-linear over .

Proof: First consider the trellis , where is the identity product of Example 7.1. Clearly, is a linear trellis. Thus, there exists a vertex labeling of such that is a linear code, which may be specified in terms of its parity-check matrix over . Notice that the vertex classes of and are identical. To obtain a linear labeling of the vertices of , we simply copy the vertex labeling of . It is now easy to see that can be obtained from by enforcing a repetition code of length on each of the edge classes. Explicitly, we append rows of the type to the parity-check matrix in such a way that the code defined by the resulting matrix is constrained to have equal symbols in the positions that correspond to edge labels. Then, the code defined by is precisely , which implies that is quasi-linear.

We now use the intersection product to construct (dual) trellises for duals of linear codes. This application of intersection product is, to the best of our knowledge, new. Let . Recall that denotes the one-dimensional subcode of generated by . Let denote the -dimensional dual code of . The following simple proposition establishes the relation between trellis intersection product and duals of linear codes.

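Lemma 7.4: Let $C$ be a linear code of dimension $k$ over $\mathbb{F}_q$, and let $x_1, x_2, \ldots, x_m$ be a set of generators for $C$. Then $C^\perp = \langle x_1 \rangle^\perp \cap \langle x_2 \rangle^\perp \cap \cdots \cap \langle x_m \rangle^\perp$.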

Proof: Follows directly from (17) and the fact that direct-sum and intersection are dual operations. Alternatively, one can think of $x_1, x_2, \ldots, x_m$ as rows of a parity-check matrix for $C^\perp$. Then it is clear that $c \in C^\perp$ if and only if $c$ is orthogonal to each of $x_1, x_2, \ldots, x_m$.
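The statement of Lemma 7.4 can be confirmed numerically for small parameters. The following Python sketch is only an illustration: it assumes a prime field size q, so that integers modulo q model the field, and it uses hypothetical helper names (`span`, `dual`) and a made-up set of three generators for a two-dimensional ternary code, so that the number of generators exceeds the dimension.

```python
from itertools import product


def span(generators, q):
    """All F_q-linear combinations of the generators (q prime, integers mod q)."""
    n = len(generators[0])
    return {
        tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % q for i in range(n))
        for coeffs in product(range(q), repeat=len(generators))
    }


def dual(code, q):
    """All vectors of F_q^n that are orthogonal to every word of the code."""
    n = len(next(iter(code)))
    return {
        v for v in product(range(q), repeat=n)
        if all(sum(a * b for a, b in zip(v, c)) % q == 0 for c in code)
    }


q = 3
# Hypothetical generators: the third is the sum of the first two modulo 3.
x = [(1, 2, 0, 1), (0, 1, 1, 2), (1, 0, 1, 0)]

C = span(x, q)
lhs = dual(C, q)                                              # dual of C
rhs = set.intersection(*(dual(span([xi], q), q) for xi in x))  # intersection of the <x_i> duals
print(len(C), len(lhs), lhs == rhs)
```

The redundant third generator does not change the intersection, in line with the remark in the proof that the generators play the role of rows of a parity-check matrix for the dual code.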

We now introduce the elementary dual trellis $T_x^\perp$. Given an arbitrary vector $x = (x_0, x_1, \ldots, x_{n-1})$ over the field $\mathbb{F}_q$, along with its span $[a,b]$, the corresponding elementary trellis $T_x$ is completely determined by $x$ and $[a,b]$, and represents the one-dimensional linear code $\langle x \rangle$ over $\mathbb{F}_q$. Analogously, the elementary dual trellis $T_x^\perp$ (cf. Fig. 8) is completely determined by $x$ and $[a,b]$, and represents the $(n-1)$-dimensional dual code $\langle x \rangle^\perp$ over $\mathbb{F}_q$. The trellis $T_x^\perp$ has $q$ vertices, labeled by the elements of $\mathbb{F}_q$, at those positions that belong to $(a,b]$ and a single vertex, labeled $0$, at other positions. For the sake of simplicity, we will describe the edges of $T_x^\perp$ under the assumption that $x_a = 1$ and $x_b = 1$.

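Case 1: Let $i \notin [a,b]$, where $i$ indexes the edges that run from the vertex class $V_i$ to the vertex class $V_{i+1}$. Then there are $q$ edges between the single vertex of $V_i$ and the single vertex of $V_{i+1}$, labeled by the elements of $\mathbb{F}_q$.

Case 2: Let $i = a$. Then the single vertex of $V_a$ is connected by an edge to each of the $q$ vertices $v' \in V_{a+1}$. The label of this edge is $\alpha'$, where $\alpha'$ is the label of $v'$.

Case 3: Let $a < i < b$ and $x_i = 0$. Then the vertices $v \in V_i$ and $v' \in V_{i+1}$ are connected if and only if they have the same label $\alpha \in \mathbb{F}_q$. If they do have the same label, there are $q$ edges between $v$ and $v'$, labeled by the elements of $\mathbb{F}_q$.

Case 4: Let $a < i < b$ and $x_i \neq 0$. Then each vertex $v \in V_i$ is connected by a single edge to each vertex $v' \in V_{i+1}$ (so that the edges between $V_i$ and $V_{i+1}$ form a complete bipartite graph). The label of this edge is $(\alpha' - \alpha)/x_i$, where $\alpha$ and $\alpha'$ are the labels of $v$ and $v'$, respectively.

Case 5: Let $i = b$. Then the single vertex of $V_{b+1}$ is connected by an edge to each of the $q$ vertices $v \in V_b$. The label of this edge is $-\alpha$, where $\alpha$ is the label of $v$.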

Two different elementary dual trellises for the vector $x = (00120110)$, corresponding to two different choices of span, are depicted in Fig. 8. Note that the total number of edges in $T_x^\perp$ is either if or is . Also, note that if $x_a \neq 1$ and/or $x_b \neq 1$, then the edge/vertex incidence structure of $T_x^\perp$ is precisely as described in Cases 1–5 above, but the edges of $T_x^\perp$ are labeled in a slightly different way.

Observe that the elementary trellises $T_x$ and $T_x^\perp$ have exactly the same set of vertices. Therefore, they also have the same state-complexity profile, which is determined by the span vector $\sigma$ (cf. Definition 5.3) associated with $x$.
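The case analysis for the elementary dual trellis can be mimicked in a few lines of Python. This is a hedged sketch rather than the construction of the paper: it assumes a prime q, a conventional (non-circular) span [a, b] with a < b and 0-indexed positions, and it labels each vertex by the partial sum of c_j x_j read so far, which may differ from the labeling used in Fig. 8. The vector is the one from the caption of Fig. 8, while the choices q = 3 and span [2, 6] are assumptions made here for illustration; the function names are hypothetical.

```python
from itertools import product


def elementary_dual_trellis(x, a, b, q):
    """Edge sets of a trellis for {c : <c, x> = 0}; vertices are partial sums of c_j * x_j."""
    n = len(x)
    assert 0 <= a < b < n and x[a] % q and x[b] % q
    assert all(x[i] % q == 0 for i in range(n) if not a <= i <= b)
    T = []
    for i in range(n):
        tails = range(q) if a < i <= b else [0]   # V_i has q vertices iff i is in (a, b]
        head_is_open = a <= i < b                 # V_{i+1} has q vertices iff i+1 is in (a, b]
        T.append({(s, c, (s + c * x[i]) % q)
                  for s in tails for c in range(q)
                  if head_is_open or (s + c * x[i]) % q == 0})
    return T


def codewords(T):
    """Label sequences of all paths from vertex 0 in V_0 back to vertex 0."""
    frontier = {(0, ())}
    for E in T:
        frontier = {(w, word + (c,))
                    for (v, word) in frontier
                    for (u, c, w) in E if u == v}
    return {word for (v, word) in frontier if v == 0}


q = 3                              # assumed field size
x = (0, 0, 1, 2, 0, 1, 1, 0)       # vector from the caption of Fig. 8
T = elementary_dual_trellis(x, 2, 6, q)   # assumed span [2, 6]

expected = {c for c in product(range(q), repeat=len(x))
            if sum(ci * xi for ci, xi in zip(c, x)) % q == 0}
sizes = [len({u for (u, _, _) in E}) for E in T]
print(codewords(T) == expected, sizes)
```

The vertex-class sizes come out as q on the positions in (a, b] and 1 elsewhere, which is the same profile as that of the elementary trellis for x with the same span.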

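Theorem 7.5: Let $T$ be a linear trellis for the code $C$, and suppose that $T$ factors as $T = T_{x_1} \times T_{x_2} \times \cdots \times T_{x_m}$ for some $x_1, x_2, \ldots, x_m \in C$. Then the trellis

$\widetilde{T} = T_{x_1}^\perp \sqcap T_{x_2}^\perp \sqcap \cdots \sqcap T_{x_m}^\perp$ (47)

represents the dual code $C^\perp$ of $C$, and the state-complexity profile of this trellis is equal componentwise to the state-complexity profile of $T$.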

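Proof: Since $T = T_{x_1} \times T_{x_2} \times \cdots \times T_{x_m}$ is a trellis for $C$, the codewords $x_1, x_2, \ldots, x_m$ generate $C$ (note that $m$ can be strictly greater than the dimension $k$ of $C$). It thus follows from Lemma 7.4 that $C^\perp = \langle x_1 \rangle^\perp \cap \langle x_2 \rangle^\perp \cap \cdots \cap \langle x_m \rangle^\perp$. The elementary dual trellises $T_{x_1}^\perp, T_{x_2}^\perp, \ldots, T_{x_m}^\perp$ represent the codes $\langle x_1 \rangle^\perp, \langle x_2 \rangle^\perp, \ldots, \langle x_m \rangle^\perp$ by construction. Hence, the intersection product trellis $T_{x_1}^\perp \sqcap T_{x_2}^\perp \sqcap \cdots \sqcap T_{x_m}^\perp$ represents the dual code of $C$ by Proposition 7.2 and Lemma 7.4. The state-complexity profiles of $T$ and of the trellis in (47) coincide: at each position $i$, both trellises have $q^{(\sigma_1 + \sigma_2 + \cdots + \sigma_m)_i}$ vertices, where $\sigma_1, \sigma_2, \ldots, \sigma_m$ are the span vectors associated with $x_1, x_2, \ldots, x_m$.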
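Theorem 7.5 can be tried out on a toy code. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: q is prime, spans are conventional with a < b and 0-indexed, elementary dual trellises use partial-sum vertex labels, and the intersection product keeps pairs of edges with equal labels. The generators, spans, and function names are hypothetical.

```python
from functools import reduce
from itertools import product


def elementary_dual(x, a, b, q):
    """Partial-sum trellis for {c : <c, x> = 0}, with span [a, b] and a < b."""
    T = []
    for i in range(len(x)):
        tails = range(q) if a < i <= b else [0]
        head_is_open = a <= i < b
        T.append({(s, c, (s + c * x[i]) % q)
                  for s in tails for c in range(q)
                  if head_is_open or (s + c * x[i]) % q == 0})
    return T


def intersect(T1, T2):
    """Keep pairs of edges at the same position that carry the same label."""
    return [{((u1, u2), c, (w1, w2))
             for (u1, c, w1) in E1 for (u2, d, w2) in E2 if c == d}
            for E1, E2 in zip(T1, T2)]


def codewords(T):
    """Label sequences of closed paths of length n."""
    words = set()
    for v0 in {u for (u, _, _) in T[0]}:
        frontier = {(v0, ())}
        for E in T:
            frontier = {(w, word + (c,))
                        for (v, word) in frontier for (u, c, w) in E if u == v}
        words |= {word for (v, word) in frontier if v == v0}
    return words


q, n = 2, 4
# Hypothetical generators of a binary code, each with its span [a, b].
factors = [((1, 1, 0, 0), 0, 1), ((0, 1, 1, 1), 1, 3)]

T_dual = reduce(intersect, [elementary_dual(x, a, b, q) for x, a, b in factors])

dual_code = {c for c in product(range(q), repeat=n)
             if all(sum(ci * xi for ci, xi in zip(c, x)) % q == 0 for x, _, _ in factors)}
# Sum of the span vectors: position i counts the spans with i in (a, b].
log_profile = [sum(a < i <= b for _, a, b in factors) for i in range(n)]
# In this example every product vertex has an outgoing edge, so counting tails
# counts the vertex classes of the product.
sizes = [len({u for (u, _, _) in E}) for E in T_dual]

print(codewords(T_dual) == dual_code)
print(log_profile, sizes)
```

Here the vertex-class sizes of the product equal q raised to the sum of the span vectors, which is also the profile of the product of the corresponding elementary trellises.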

Note that the trellis $\widetilde{T}$ in (47) is not necessarily reduced. This is so because in the intersection product, many of the edges of the identity product trellis get removed, so that certain edges and vertices of $\widetilde{T}$ may not belong to any cycle (see Example 7.4). We, therefore, remove such extraneous edges and vertices from $\widetilde{T}$ until it becomes reduced. Let $T^\perp$ denote this reduced version of $\widetilde{T}$. We say that $T^\perp$ is the dual trellis of $T$.
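The reduction step can be sketched as a reachability computation. The Python code below is a minimal illustration under the assumption that a trellis of depth n is a list of n edge sets with edges (tail, label, head), and that a cycle is a closed path of length n through some vertex of V_0 (for a conventional trellis, the terminal vertex class is identified with V_0). The function name `reduce_trellis` and the toy trellis are hypothetical.

```python
def reduce_trellis(T):
    """Keep exactly the edges that lie on some closed path of length n."""
    n = len(T)
    kept = [set() for _ in range(n)]
    for v0 in {u for (u, _, _) in T[0]}:
        # forward[i]: vertices of V_i reachable from v0 in i steps.
        forward = [{v0}]
        for E in T:
            forward.append({w for (u, _, w) in E if u in forward[-1]})
        # backward[i]: vertices of V_i from which v0 is reachable in n - i steps.
        backward = [set() for _ in range(n + 1)]
        backward[n] = {v0}
        for i in range(n - 1, -1, -1):
            backward[i] = {u for (u, _, w) in T[i] if w in backward[i + 1]}
        for i, E in enumerate(T):
            kept[i] |= {(u, a, w) for (u, a, w) in E
                        if u in forward[i] and w in backward[i + 1]}
    return kept


# Toy conventional trellis of depth 3 with one unreachable vertex (labeled 2).
T = [{(0, 0, 0), (0, 1, 1)},
     {(0, 0, 0), (1, 1, 0), (2, 0, 2)},
     {(0, 0, 0), (2, 1, 0)}]

reduced = reduce_trellis(T)
print(reduced)
```

Vertices that no longer carry any edge disappear implicitly, so the vertex classes of the reduced trellis can only shrink, which is the source of the componentwise inequality in (48).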

Proposition 7.6: The dual trellis $T^\perp$ is a linear trellis.


Fig. 8. Two elementary dual trellises, (a) and (b), for the vector $x = (00120110)$.

Proof: The trellis $\widetilde{T}$ in (47) is quasi-linear by Proposition 7.3. The linearity of $T^\perp$ then follows from the fact that $\widetilde{T}$ and $T^\perp$ have the same set of cycles.

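In reducing the trellis $\widetilde{T}$ in (47) to the dual trellis $T^\perp$, some of the vertices of $\widetilde{T}$ can get removed. Therefore, writing $V_i$ and $V_i^\perp$ for the vertex classes of $T$ and $T^\perp$, respectively, we have

$(|V_0^\perp|, |V_1^\perp|, \ldots, |V_{n-1}^\perp|) \le (|V_0|, |V_1|, \ldots, |V_{n-1}|)$ (48)

where the inequality is componentwise. In some cases, the dual trellis $T^\perp$ may be strictly smaller than $T$. This situation is illustrated in the following example.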

Example 7.4: Consider the binary linear code $C$. Any two nonzero codewords of $C$ form a basis; we can, therefore, construct a linear trellis $T$ for $C$ as $T = T_{x_1} \times T_{x_2}$, where $x_1$ and $x_2$ are two such codewords, each taken with a fixed span. The elementary trellises $T_{x_1}$, $T_{x_2}$, and their product $T$ are as follows:

Note that $T$ is a conventional trellis, but it is not minimal for $C$, since the basis $\{x_1, x_2\}$ is not in minimal-span form. The elementary dual trellises $T_{x_1}^\perp$ and $T_{x_2}^\perp$ for $x_1$ and $x_2$, with the same choice of span for both vectors, are given by

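To construct the dual trellis $T^\perp$, we first form the identity product trellis of $T_{x_1}^\perp$ and $T_{x_2}^\perp$, then remove all the edges that are not labeled $00$ or $11$ to obtain the intersection product trellis $\widetilde{T} = T_{x_1}^\perp \sqcap T_{x_2}^\perp$ in (47), and finally remove the two vertices and four edges of $\widetilde{T}$ that do not belong to any cycle. The identity product trellis, $\widetilde{T}$, and $T^\perp$ are depicted below: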

Note that $T^\perp$ is a linear trellis, although $\widetilde{T}$ is not. The state-complexity profile of $T^\perp$ is strictly less than that of $T$. In fact, $T^\perp$ is the minimal conventional trellis for $C^\perp$, even though $T$ is not minimal for $C$.

The following theorem shows that the situation described in Example 7.4 does not arise if the original trellis $T$ is $\preceq_\Theta$-minimal for $C$. That is, if $T$ is $\preceq_\Theta$-minimal then (48) holds with equality.

Theorem 7.7: Let $T$ be a $\preceq_\Theta$-minimal linear trellis, either conventional or tail-biting, for a linear code $C$. Then the dual trellis $T^\perp$ is a $\preceq_\Theta$-minimal trellis for $C^\perp$ and the state-complexity profile of $T^\perp$ is equal componentwise to the state-complexity profile of $T$.

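Proof: Let us assume to the contrary that the state-complexity profile of $T^\perp$ is strictly less than that of $T$. Since $T^\perp$ is a linear trellis for $C^\perp$ by Proposition 7.6, the factorization theorem (Theorem 4.2) implies that it factors into a product of elementary trellises. Let $T^\perp = T_{y_1} \times T_{y_2} \times \cdots \times T_{y_r}$ be such a factorization of $T^\perp$, where $y_1, y_2, \ldots, y_r \in C^\perp$. Let $(T^\perp)^\perp$ be the dual trellis of $T^\perp$, which results by removing the extraneous edges and vertices from $T_{y_1}^\perp \sqcap T_{y_2}^\perp \sqcap \cdots \sqcap T_{y_r}^\perp$. Then, by Theorem 7.5 and Proposition 7.6, $(T^\perp)^\perp$ is a linear trellis for $C$ and its state-complexity profile is at most that of $T^\perp$, hence strictly less than that of $T$. This is a contradiction, since $T$ is a $\preceq_\Theta$-minimal trellis for $C$ by assumption. Similarly, if $T^\perp$ is not $\preceq_\Theta$-minimal for $C^\perp$, then there exists another linear trellis $T'$ for $C^\perp$ such that the state-complexity profile of $T'$ is strictly less than that of $T^\perp$, which equals that of $T$. Taking the dual trellis of $T'$ produces a linear trellis for $C$, whose state-complexity profile is strictly less than that of $T$, a contradiction.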

The idea in the proof of Theorem 7.7 can be iterated. Starting with any linear trellis $T$ for $C$, one can construct the dual trellis $T^\perp$, then the dual of the dual trellis, and so on, as long as the state-complexity profile strictly decreases. We conjecture that this process always results in a nonmergeable, but not necessarily $\preceq_\Theta$-minimal, trellis for $C$. However, we do not have a proof, and leave this as an open problem for future research.

ACKNOWLEDGMENT

The authors would like to acknowledge several stimulating conversations with David Forney, Frank Kschischang, and Yaron Shany. We also thank the anonymous referees for valuable comments.

REFERENCES

[1] S. Aji, G. Horn, R. J. McEliece, and M. Xu, “Iterative min-sum decoding of tail-biting codes,” in Proc. IEEE Workshop on Information Theory, Killarney, Ireland, June 1998, pp. 68–69.

[2] I. Bocharova, R. Johannesson, B. Kudryashov, and P. Ståhl, “Tailbiting codes: Bounds and search results,” IEEE Trans. Inform. Theory, vol. 48, pp. 137–148, Jan. 2002.

[3] A. R. Calderbank, G. D. Forney, Jr., and A. Vardy, “Minimal tail-biting trellises: The Golay code and more,” IEEE Trans. Inform. Theory, vol. 45, pp. 1435–1455, July 1999.

[4] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups. New York: Springer-Verlag, 1988.

[5] G. D. Forney, Jr., “Dimension/length profiles and trellis complexity of lattices,” IEEE Trans. Inform. Theory, vol. 40, pp. 1753–1772, Nov. 1994.

[6] G. D. Forney, Jr., “Codes on graphs: Normal realizations,” IEEE Trans. Inform. Theory, vol. 47, pp. 520–548, Feb. 2001.

[7] G. D. Forney, Jr., F. R. Kschischang, B. Marcus, and S. Tuncel, “Iterative decoding of tail-biting trellises and connections with symbolic dynamics,” in Codes, Systems, and Graphical Models, IMA Volumes in Mathematics and Its Applications, B. Marcus and J. Rosenthal, Eds. New York: Springer-Verlag, Mar. 2001.

[8] R. Johannesson, P. Ståhl, and E. Wittenmark, “A note on type II convolutional codes,” IEEE Trans. Inform. Theory, vol. 46, pp. 1510–1514, July 2000.

[9] N. Karmarkar, “A new polynomial-time algorithm for linear programming,” Combinatorica, vol. 4, pp. 373–396, 1984.

[10] R. Koetter and A. Vardy, “Construction of minimal tail-biting trellises,” in Proc. IEEE Int. Workshop on Information Theory, Killarney, Ireland, June 1998, pp. 72–74.

[11] R. Koetter and A. Vardy, “On the theory of linear trellises,” in Information, Coding and Mathematics, M. Blaum, P. G. Farrell, and H. C. A. van Tilborg, Eds. Boston, MA: Kluwer, May 2002, pp. 323–354.

[12] R. Koetter and A. Vardy, “The structure of tail-biting trellises: Bounds and applications,” manuscript in preparation, May 2003.

[13] R. Koetter, “On the representation of codes in Forney graphs,” in Codes, Graphs, and Systems, R. E. Blahut and R. Koetter, Eds. Boston, MA: Kluwer, Feb. 2002, pp. 425–450.

[14] F. R. Kschischang, “The trellis structure of maximal fixed-cost codes,” IEEE Trans. Inform. Theory, vol. 42, pp. 1828–1838, Nov. 1996.

[15] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, pp. 498–519, Feb. 2001.

[16] F. R. Kschischang and V. Sorokine, “On the trellis structure of block codes,” IEEE Trans. Inform. Theory, vol. 41, pp. 1924–1937, Nov. 1995.

[17] A. Lafourcade and A. Vardy, “Lower bounds on trellis complexity of block codes,” IEEE Trans. Inform. Theory, vol. 41, pp. 1938–1954, Nov. 1995.

[18] S. Lin and R. Y. Shao, “General structure and construction of tail-biting trellises for linear block codes,” in Proc. IEEE Int. Symp. Information Theory, Sorrento, Italy, June 2000, p. 117.

[19] J. H. Ma and J. K. Wolf, “On tail-biting convolutional codes,” IEEE Trans. Commun., vol. COM-34, pp. 104–111, Feb. 1986.

[20] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error Correcting Codes. New York: North-Holland, 1977.

[21] R. J. McEliece, “On the BCJR trellis for linear block codes,” IEEE Trans. Inform. Theory, vol. 42, pp. 1072–1092, July 1996.

[22] D. J. Muder, “Minimal trellises for block codes,” IEEE Trans. Inform. Theory, vol. 34, pp. 1049–1053, Sept. 1988.

[23] S. Riedel and C. Weiß, “The Golay convolutional code: Some application aspects,” IEEE Trans. Inform. Theory, vol. 45, pp. 2191–2199, Sept. 1999.

[24] A. Schrijver, Theory of Linear and Integer Programming. New York: Wiley, 1986.

[25] Y. Shany and Y. Be’ery, “Linear tail-biting trellises, the square root bound, and applications for Reed–Muller codes,” IEEE Trans. Inform. Theory, vol. 46, pp. 1514–1523, July 2000.

[26] R. Y. Shao, S. Lin, and M. P. C. Fossorier, “Decoding of codes based on their tail biting trellises,” in Proc. IEEE Int. Symp. Information Theory, Sorrento, Italy, June 2000, p. 342.

[27] P. Ståhl, “On tail-biting codes from convolutional codes,” Ph.D. dissertation, Tech. Univ. Lund, Lund, Sweden, Dec. 2001.

[28] P. Ståhl, J. B. Anderson, and R. Johannesson, “Optimal and near-optimal encoders for short and moderate-length tailbiting trellises,” IEEE Trans. Inform. Theory, vol. 45, pp. 2562–2571, Nov. 1999.

[29] É. Tardos, “A strongly polynomial algorithm to solve combinatorial linear programs,” Oper. Res., vol. 34, pp. 250–256, Apr. 1986.

[30] V. Tarokh and A. Vardy, “Upper bounds on trellis complexity of lattices,” IEEE Trans. Inform. Theory, vol. 43, pp. 1294–1300, July 1997.

[31] A. Vardy, “Trellis structure of codes,” in Handbook of Coding Theory, V. S. Pless and W. C. Huffman, Eds. Amsterdam, The Netherlands: Elsevier, 1998, pp. 1989–2118.

[32] A. Vardy and Y. Be’ery, “Bit-level soft-decision decoding of Reed–Solomon codes,” IEEE Trans. Commun., vol. 39, pp. 440–445, Mar. 1991.

[33] A. Vardy and F. R. Kschischang, “Proof of a conjecture of McEliece regarding the expansion index of the minimal trellis,” IEEE Trans. Inform. Theory, vol. 42, pp. 2027–2034, Nov. 1996.

[34] G. Viswanath and B. S. Rajan, “Minimal tail-biting trellises for linear MDS codes over ,” in Proc. IEEE Int. Symp. Information Theory, Sorrento, Italy, June 2000.

[35] N. Wiberg, H.-A. Loeliger, and R. Kötter, “Codes and iterative decoding on general graphs,” Euro. Trans. Telecommun., vol. 6, pp. 513–526, May 1995.
