rna_intro

Upload: jarsen21

Post on 03-Jun-2018


  • 8/12/2019 RNA_intro


    RNA-type matchings

    16 2014 .

    1 Colored matching

    A word in the alphabet $A$ is a finite sequence of symbols $a_i \in A$. For instance, if $A = \{0, 1\}$, the words are finite binary sequences. In particular, $w = 101101$ is a word, and we denote $|w| = 6$ (the length of the word). We will also say that $w$ has 6 positions numbered $1, 2, \dots, 6$. The symbol at position $i$ is referred to as $w_i$. In our case, say, $w_2 = 0$.

    A matching on $w$ is a set of pairs of positions in $w$ such that all these positions are different. Say, $M = \{(1, 4), (3, 5)\}$ is a matching. The order of positions within a pair does not make any difference, that is, $\{(1, 4), (5, 3)\}$ is the same matching $M$. For convenience, we will use the ascending order, that is, we will always write $(3, 5)$, not $(5, 3)$.

    Let us interpret the positions of the word as the set of points $(i, 0)$, $i = 1, 2, \dots, |w|$, in the plane $\mathbb{R}^2$. They are positive integers on the x-axis. A matching can be visualized as a set of arcs that connect the corresponding pairs of positions. We assume that these arcs are semicircles above the x-axis. In our case (matching $M$) the two arcs have an intersection point. We say that the matching is crossless if its arcs are all disjoint, that is, if they have no intersections with each other.

    Let us say that the pair $(i, j)$ of positions in $w$ is admissible if $w_i = w_j$. The matching $M$ will be called admissible if it is crossless and all its pairs are admissible. For instance, the matching $M_a = \{(1, 6), (2, 5), (3, 4)\}$ is admissible and $M_b = \{(1, 3), (2, 5)\}$ is not (there is a crossing).
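The definitions above (matching, crossless, admissible) can be checked mechanically. The following is a minimal sketch, not taken from the source: the function names `crosses`, `is_matching` and `is_admissible` are hypothetical, positions are 1-based as in the text, and the admissibility rule is the same-symbol rule of this section.

```python
from itertools import combinations


def crosses(p, q):
    """Return True if the semicircular arcs p and q intersect.

    Each arc is a pair (i, j) with i < j; the two arcs are assumed
    to have four distinct endpoints.
    """
    (i, j), (k, l) = sorted([p, q])
    # Arcs that are nested or side by side do not intersect;
    # they intersect exactly when the endpoints interleave.
    return i < k < j < l


def is_matching(pairs):
    """Return True if no position is used more than once."""
    positions = [x for pair in pairs for x in pair]
    return len(positions) == len(set(positions))


def is_admissible(word, pairs):
    """Return True if `pairs` is a crossless same-symbol matching on `word`."""
    pairs = [tuple(sorted(p)) for p in pairs]  # ascending order, as in the text
    if not is_matching(pairs):
        return False
    if any(word[i - 1] != word[j - 1] for i, j in pairs):
        return False
    return not any(crosses(p, q) for p, q in combinations(pairs, 2))


w = "101101"
M_a = [(1, 6), (2, 5), (3, 4)]
M_b = [(1, 3), (2, 5)]
print(is_admissible(w, M_a), is_admissible(w, M_b))
```

For the word 101101 this reports the first matching as admissible and the second as not, in agreement with the example in the text.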

    We are looking for a maximal admissible matching on a given word $w$, that is, for a matching that contains a maximal number of pairs. In our example, $M_a$ is the unique maximal matching, and it is also complete, that is, all the positions of $w$ are used.


    Denote by $|M|$ the number of positions that are matched by $M$. For instance, $|M_b| = 4$. Denote also $d(M) = |w| - |M|$ (the defect number of $M$). By $d(w)$ we denote the minimal value of $d(M)$ over all admissible matchings $M$ of the word $w$.
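For the one-sided (crossless) case, the minimal defect of a short word can be computed with a standard interval dynamic program of the kind used for RNA secondary-structure prediction (Nussinov-style). The sketch below is not taken from the source: the names `max_pairs` and `defect` are hypothetical, positions are 0-based internally, and the defect is taken to be the word length minus the number of matched positions.

```python
from functools import lru_cache


def max_pairs(word, admissible=lambda a, b: a == b):
    """Maximal number of pairs in a crossless admissible matching on `word`.

    `admissible(a, b)` decides whether two symbols may be paired; the
    default is the same-symbol rule of the colored matching.
    """
    n = len(word)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Best number of pairs using only positions i..j (inclusive).
        if i >= j:
            return 0
        result = best(i + 1, j)  # position i stays unmatched
        for k in range(i + 1, j + 1):
            if admissible(word[i], word[k]):
                # Arc (i, k) splits the interval into an inside part and
                # an outside part; no arc may connect the two parts.
                result = max(result, 1 + best(i + 1, k - 1) + best(k + 1, j))
        return result

    return best(0, n - 1)


def defect(word, admissible=lambda a, b: a == b):
    """Minimal defect: positions left unmatched by a maximal matching."""
    return len(word) - 2 * max_pairs(word, admissible)


print(defect("101101"))  # the complete matching M_a gives defect 0
```

The recursion visits O(n^2) intervals with O(n) work each, so it is only meant for experiments with short words; the recursion depth grows with the word length.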

    Question 1.1. Let $w$ be a binary word of length $|w|$ and let $M$ be a maximal admissible matching for $w$. Which values of $|M|$ are possible?

    2 RNA-type matchings

    We may change the definition of an admissible pair of positions as follows: $(i, j)$ is admissible if $w_i \neq w_j$. Then we ask the same question:

    Question 2.1. Let $w$ be a binary word of length $|w|$ and let $M$ be a maximal admissible matching for $w$. Which values of $|M|$ are possible?

    The first model, which was motivated by research on the secondary structure of RNA chains, is the following one. Let $A = \{1, 2, 3, 4\}$ and let the admissible pairs of colors be $(1, 2)$ and $(3, 4)$.

    Question 2.2. Suppose the word $w$ in the alphabet $A$ is generated at random (Bernoulli scheme, where each symbol of $A$ has equal probability at each position of $w$ independently of the rest of the word). What is the probability that the pair $(i, j)$ is admissible? The same question for the four-color case in the "colored matching" scheme.
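One way to approach Question 2.2 is to enumerate all equally likely symbol pairs for two distinct positions and count the admissible ones. The sketch below does this with exact fractions. It is an illustration under stated assumptions: the helper `pair_probability` is hypothetical, the color pairs (1, 2) and (3, 4) are treated as unordered, and the binary case uses the rule "different symbols" from this section.

```python
from fractions import Fraction
from itertools import product


def pair_probability(alphabet, admissible):
    """Probability that two independent uniform symbols form an admissible pair."""
    outcomes = list(product(alphabet, repeat=2))  # all (w_i, w_j), equally likely
    good = sum(1 for a, b in outcomes if admissible(a, b))
    return Fraction(good, len(outcomes))


# Assumption: the admissible color pairs are unordered.
rna_pairs = {frozenset({1, 2}), frozenset({3, 4})}

p_binary = pair_probability([0, 1], lambda a, b: a != b)
p_rna = pair_probability([1, 2, 3, 4], lambda a, b: frozenset({a, b}) in rna_pairs)
p_colored = pair_probability([1, 2, 3, 4], lambda a, b: a == b)

print(p_binary, p_rna, p_colored)
```

Under these assumptions the enumeration gives 1/2 for the binary case and 1/4 for both four-letter cases (4 admissible ordered pairs out of 16).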

    3 General matching pattern

    Here we do not consider words any longer! Let us choose an integer interval $1, \dots, N$. Some pairs $(i, j)$ of positions within this interval will be declared admissible (or matchable). This can be done in different ways.

    For instance, given a parameter $p \in [0, 1]$, we may declare each pair $(i, j)$ matchable with probability $p$ (and unmatchable otherwise). In words, we are tossing a biased coin for each pair $(i, j)$, independently. If we draw an edge between $i$ and $j$ for each matchable pair $(i, j)$, we get an Erdős-Rényi random graph.

    The problem is, again, to count $|M|$ for a maximal crossless matching $M$.
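The random pattern described above is easy to simulate for small $N$. The following sketch is not from the source: `random_pattern` and `max_crossless_pairs` are hypothetical names, positions are 1-based, the optional `seed` parameter is added only for reproducibility, and the maximal crossless matching is found with a standard interval dynamic program.

```python
import random
from functools import lru_cache


def random_pattern(n, p, seed=None):
    """Declare each pair (i, j), 1 <= i < j <= n, matchable with probability p."""
    rng = random.Random(seed)
    return {
        (i, j)
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
        if rng.random() < p  # independent biased coin toss per pair
    }


def max_crossless_pairs(n, matchable):
    """Maximal number of pairs in a crossless matching that uses only `matchable` pairs."""

    @lru_cache(maxsize=None)
    def best(i, j):
        # Best number of pairs using only positions i..j (inclusive).
        if i >= j:
            return 0
        result = best(i + 1, j)  # position i stays unmatched
        for k in range(i + 1, j + 1):
            if (i, k) in matchable:
                result = max(result, 1 + best(i + 1, k - 1) + best(k + 1, j))
        return result

    return best(1, n)


pattern = random_pattern(20, 0.3, seed=0)
print(2 * max_crossless_pairs(20, pattern))  # number of matched positions |M|
```

Repeating the experiment over many seeds and several values of $p$ gives an empirical picture of how $|M|$ depends on $p$ for a fixed $N$.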


    4 String matching

    Let us begin with the following problem. We have a string with a finite number of "beads" on it. The beads have no size, they are just points on a line, but they may be of different "color" (there is a finite set of colors). Then we lay the string on a plane in such a way that some pairs of beads of the same color touch each other (the points coincide) but the string has no self-intersections. What is the maximal number of such pairs?

    Another problem is an extension of the colored matching and other crossless matchings. The difference is that now we allow the matching arcs to be drawn either above or below the x-axis. We will use the name "string matchings" for admissible matchings in this sense.

    Question 4.1. Is it the same as the "beads-on-a-string" model? What if the string is closed (like a circle)?

    Question 4.2. Suppose $A = \{1, 2, 3\}$ and there is an even number of each color in the word $w$. Does a complete "string" matching always exist in this case?
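The two-sided ("string") matchings defined above can be explored by exhaustive search on short words. The sketch below is a brute-force illustration only, with exponential running time, and is not taken from the source: the names `crosses` and `max_string_matching` are hypothetical, positions are 0-based, and the default admissibility rule is "same color".

```python
def crosses(p, q):
    """Return True if arcs p and q, drawn on the same side, intersect."""
    (i, j), (k, l) = sorted([p, q])
    return i < k < j < l


def max_string_matching(word, admissible=lambda a, b: a == b):
    """Maximal number of pairs when each arc may go above or below the axis.

    Arcs on the same side must not cross; arcs on different sides never meet.
    """
    n = len(word)
    best = 0

    def extend(pos, used, above, below):
        nonlocal best
        best = max(best, len(above) + len(below))
        # Add arcs in increasing order of their left endpoint.
        for i in range(pos, n):
            if i in used:
                continue
            for j in range(i + 1, n):
                if j in used or not admissible(word[i], word[j]):
                    continue
                arc = (i, j)
                for side in (above, below):
                    if not any(crosses(arc, other) for other in side):
                        side.append(arc)
                        extend(i + 1, used | {i, j}, above, below)
                        side.pop()  # backtrack

    extend(0, frozenset(), [], [])
    return best


print(max_string_matching("0101"))  # pairs (0, 2) and (1, 3) on opposite sides
```

For the word 0101 the one-sided maximum is a single pair, because the two same-color pairs cross, while the two-sided search finds both pairs by drawing one arc above and one below the axis. Trying short words over three colors in the same way is a cheap way to experiment with Question 4.2.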

    5 Asymptotic properties of string matching

    Clearly, the two-sided version of the matching problem admits matchings at least as large as the one-sided version. The question now is whether the relative defect of a long word in the alphabet $\{1, 2, 3\}$ vanishes as $|w| \to \infty$.

    I am not aware of an algorithm that could compute the minimal defect in a reasonable time.
