self power map

Upload: peter-newton

Post on 03-Jun-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/12/2019 Self Power Map

    1/82

    The self-power map and its image modulo a prime

    by

    Catalina V. Anghel

    Thesis submitted in conformity with the requirementsfor the degree of Doctor of PhilosophyGraduate Department of Mathematics

    University of Toronto

    Copyright c2012 by Catalina V. Anghel

  • 8/12/2019 Self Power Map

    2/82

    Abstract

    The self-power map and its image modulo a prime

    Catalina V. Anghel

    Doctor of Philosophy

    Graduate Department of Mathematics

    University of Toronto

    2012

    The self-power map is the function from the set of natural numbers to itself which sends

    the number n to nn. Motivated by applications to cryptography, we consider the image

    of this map modulo a prime p. We study the question of how large x must be so that

    nn amod p has a solution with 1 n x, for every residue class a modulo p.While nn mod p is not uniformly distributed, it does appear to behave in certain ways

    as a random function. We give a heuristic argument to show that the expected x is

    approximately (p 1)2 log (p 1)/(p 1), using the coupon collector problem as a

    model. Rigorously, we prove the bound x < p

    2

    for sufficiently large p and a fixedconstant >0 independent ofp, using a counting argument and exponential sum bounds.

    Additionally, we prove nontrivial bounds on the number of solutions ofnn amod p fora fixed residue classawhen 1 n x, extending the known bounds when 1 n p1.

    ii

  • 8/12/2019 Self Power Map

    3/82

    Dedication

    For my parents, Mirela and Vinicius Anghel.

    Acknowledgements

    Academic Acknowledgments. First and foremost, I would like to thank my supervi-

    sor,Prof. Kumar Murty, who first suggested the study of the self-power map. His help,

    encouragement, and optimism have been invaluable over the past five years. I thank him

    for creating a wonderful sense of community (including a viewing of the Kung Fu Panda

    movie!). Most of all, I am grateful to him for taking me seriously as a researcher.

    I am grateful to Prof. Balasubramanian (Prof. Balu) for two very useful discussions

    at the start of my study. Dr. Venkatesan (Venkie) introduced me to an algorithmic

    approach to thinking about the self-power map, and always was cheerfully ready to talk

    with me about my work, even from Seattle and India. I thankProf. Repkafor his advice

    and reassurance. It was wonderful to have a faculty member I felt completely comfortable

    approaching when I was struggling.Prof. Ram Murty forwarded me his article and suggestions at a key moment in my

    research. His ideas were different from the path I ultimately took, but he was correct in

    thinking that my problem was similar. His article helped me get unstuck when finding

    the deterministic bound.

    I thank the members of my supervisory committee,Prof. MurnaghanandProf. Kam-

    nitzer, for being a kind audience during our meetings. They also accepted receiving my

    many email updates or helped with (very, very) last-minute reference letters.

    As every mathematics graduate student before me, I am incredibly grateful to Ida

    Bulatfor her warmth, kindness and dedication. From reminders about forgotten dead-

    lines to meeting for coffee to catch up, I am so glad that she is always there to ensure

    that everything goes smoothly.

    iii

  • 8/12/2019 Self Power Map

    4/82

    A huge thank you goes out to my Ganita lab-mates: Robby Burko,Aaron Chow,Pay-

    man Eskandari,William George,Nataliya Laptyeva,Mariam Mourtada,Meng Fai Lim,

    andYing Zongas well as visitors Brendon HansonandAnastasia Zaytseva. Thank you

    for your questions, advice, and support, and for providing such a welcoming environment

    for me to share my ideas. I am so glad to also call you friends. Thank you for the musical

    shows, cozy potlucks, French movies, dinners, wedding, introductions to new babies, and

    ceilidh dances over the past years.

    I would also like to thank Aude Plateaux, for her warm welcome at ENSICAEN this

    summer. In her expert care, I had a quiet place to start writing my thesis, as well as a

    wonderful travel experience.

    Personal Acknowledgments. While I would have never finished my thesis with-

    out the mathematical support of my professors and colleagues, I also would have never

    finished without the encouragement of my friends and family.

    Thank you to Charlotte Ochanine, the kindest person I know. You were my biggest

    cheerleader during my first year and your encouragement and practical advice helped me

    survive the comprehensive exams. Your visit to Deep River over the winter holidays is

    one of my fondest memories of my time as a PhD. Thank you for being so patient with

    me even during my most stressful times.

    Thank you to Peter Leefor making me more conscious about trying new things and

    getting out in nature, both of which restore me when things get tough.

    Hanae Hanzawa, you are my role model for being a kind and confident woman. You

    are so much fun to be around, and so open about everything (including trying strange

    Romanian food).

    Thank you to Caroline Schmeing, who is always ready for new adventures, whether

    it is attending a baseball game, or cooking exotic dishes, or saving me from a mouse

    two days before my supervisory committee meeting. Terry (Theresa) Lim is not only

    one of my closest friends, but has also been my roommate. We definitely bonded over

    iv

  • 8/12/2019 Self Power Map

    5/82

    shared experiences, like the plumbing mishap at 1 am! Thank you for your practical,

    tell-it-like-it-is advice to my fanciful musings, and for all the shared activities, from spin

    class to kayaking.

    Thank you so much to Tom Kuwahara, my best friend and confidant in these past

    five and a half years. Thank you for your wisdom, your sense of humour and your

    understanding during the many times when I have been stressed and irritable. So many

    of my best memories during the past years are with you: the early, long study stretches

    for analysis and topology, breaks for tea and Gregs ice cream, hikes through the Toronto

    ravines and the Newfoundland coast, and all the laughter in our weekly chats. Its with

    you that I feel most me.

    There are two people whom I cannot even begin to thank, my parents Mira and

    Vinicius Anghel. My father has been the sounding board for my ideas and the first

    reader of my work. My mother patiently listened to all my worries and struggles (and

    I am not one to suffer in silence!). I am so grateful that their confidence in me never

    wavered. Their love and support got me through the painful years when nothing worked.

    Its been a gift to finally be able to do mathematics, to find a pattern when at first there

    is none. I could have never gotten there without them.

    v

  • 8/12/2019 Self Power Map

    6/82

    Contents

    1 Introduction 1

    1.1 The self-power map and the discrete logarithm problem . . . . . . . . . . 1

    1.1.1 ElGamal signature scheme . . . . . . . . . . . . . . . . . . . . . . 2

    1.1.2 Two-cycles of the discrete logarithm . . . . . . . . . . . . . . . . 4

    1.2 Past work on the self-power map . . . . . . . . . . . . . . . . . . . . . . 6

    1.3 Extension to larger values ofn . . . . . . . . . . . . . . . . . . . . . . . . 8

    2 Bounds onN(x; a) 12

    2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

    2.2 Estimates for N(x; a) for a of small order . . . . . . . . . . . . . . . . . . 13

    3 A preliminary probability model 20

    3.1 Equidistribution and exponential sums . . . . . . . . . . . . . . . . . . . 20

    3.2 The classical coupon collector problem . . . . . . . . . . . . . . . . . . . 23

    3.3 Powers of primitive roots and a first estimate ofx . . . . . . . . . . . . . 24

    4 Expectation ofx and its cumulative distribution 31

    4.1 Coupon collector problem with blank coupons . . . . . . . . . . . . . . . 31

    4.2 Upper and lower bounds for expectation . . . . . . . . . . . . . . . . . . 37

    4.2.1 Expectation estimate for the casep = 2q+ 1 . . . . . . . . . . . . 39

    4.3 Cumulative distribution function . . . . . . . . . . . . . . . . . . . . . . 43

    vi

  • 8/12/2019 Self Power Map

    7/82

    4.4 Computational results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

    5 A deterministic bound 54

    5.1 Elementsaof small order . . . . . . . . . . . . . . . . . . . . . . . . . . 54

    5.2 Elementsaof large order . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

    5.3 Proof of Theorem5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

    5.4 A numerical value of . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    6 Conclusion and future work 67

    A Formal Languages and Probabilities 69

    Bibliography 72

    vii

  • 8/12/2019 Self Power Map

    8/82

    Chapter 1

    Introduction

    Definition 1.1. The self-power map is the functionf from the set of natural numbers

    to itself, such that

    f(n) =nn.

    Our definition differs from Holdens definition in [Hol02b, HR11], where n N ismapped tonn mod p (or pe). Since some authors also consider nn mod b for composite b

    [GOM12], we give this more general definition. However, from now on we will consider

    the image of the self-power map modulo a prime p.

    1.1 The self-power map and the discrete logarithm

    problem

    We begin by describing two applications to cryptography that motivate the study of the

    self-power map. The self-power map and its image modulo a prime p is related to the

    discrete logarithm problem in (Z/pZ).

    Definition 1.2. The discrete logarithm problem in (Z/pZ) is the following: Given

    g (Z/pZ), andh an element in the subgroup generated byg, find an integera suchthatga hmod p.

    1

  • 8/12/2019 Self Power Map

    9/82

    Chapter 1. Introduction 2

    While it is computationally easy to raise an element to a power, the discrete logarithm

    problem, which is the inverse operation, is assumed to be difficult in general. Thus, the

    security of many cryptographic applications is based on this one-way property, and the

    discrete logarithm has been widely studied.

    1.1.1 ElGamal signature scheme

    One of the applications of the discrete logarithm is the ElGamal signature scheme. How-

    ever, the security of two variations of the ElGamal signature scheme also rely on the

    difficulty of inverting the self-power map.

    Algorithms1and2describe the ElGamal signature algorithm, taken from [MvOV96],

    section 11.5.2, and [FLM10]. We assume that Alice is sending a message m to Bob, and

    she would like to sign it so that Bob can verify her identity.

    Algorithm 1 Generation of Alices public and secret key

    1: Choose a large random prime p and a generator g of (Z/pZ).

    2: Select a random integer a such that 1 a p 1 and compute y = ga

    .

    3: Alices public key is (p, g, y) and her private key is a.

    Algorithm 2 ElGamal signature generation and verification

    1: Alices signature generation:

    (a) Select a random integer k such that 1 kp 2 and gcd(k, p 1) = 1.

    (b) Compute r = gk mod p.

    (c) Computes = k1(h(m) ar) where the inverse ofk is calculated modulo (p 1)and h(m) is the hash of the message m.

    (d) Alices signature for the message ofm is the pair (r, s).

  • 8/12/2019 Self Power Map

    10/82

    Chapter 1. Introduction 3

    Algorithm 2 ElGamal signature generation and verification (continued)

    2: Verification of the signature by Bob:

    (a) Verify that 1 r p 1, else reject.

    (b) Compute v1 = yrrs mod p.

    (c) Compute v2 = gh(m) mod p, after calculating h(m).

    (d) Accept the signature ifv1 = v2.

    The signature verification in Algorithm2works because

    v1 = y

    r

    r

    s

    mod p

    (ga)r gkk1(h(m)ar) mod p gargh(m)ar mod p

    gh(m) mod p

    = v2.

    Variations of the ElGamal signature scheme may be obtained by changing the value

    ofs in step 1(c) of the signing algorithm, Algorithm2. Notice that the signing equation

    in terms of the exponents can be also expressed as

    u= av+kw mod (p 1)

    where

    u= h(m), v= r, w= s.

    Assigning tou,v and w the valuess,r, andh(m) in different orders defines a new signing

    equation, which produces a different expression for s in terms ofa,k,r, andh(m). Table

    1.1from[MvOV96, Table 11.5] summarizes the possibilities.

    The verification equations for variations 2 and 4 involve the expression rr. Thus, an

    adversary who can solve the self-power problem may forge Alices signature by choosing

  • 8/12/2019 Self Power Map

    11/82

    Chapter 1. Introduction 4

    u v w Signing equation Verification

    1 h(m) r s h(m) =ar+ks gh(m) = (ga)rrs

    2 h(m) s r h(m) =as+kr gh(m)

    = (ga

    )s

    rr

    3 s r h(m) s= ar+kh(m) gs = (ga)rrh(m)

    4 s h(m) r s= ah(m) +kr gs = (ga)h(m)rr

    5 r s h(m) r= as+kh(m) gr = (ga)srh(m)

    6 r h(m) s r= ah(m) +ks gr = (ga)h(m)rs

    Table 1.1: Variations of the ElGamal signing equation. Signing equations are computed

    modulo (p 1); verification is done modulo p.

    s, then solving

    gh(m) [(ga)s]1 =rr or gs

    (ga)h(m)1

    =rr,

    forr.

    1.1.2 Two-cycles of the discrete logarithm

    Another application of the discrete logarithm related to the self-power map is a pseudo-

    random bit generator of Blum and Micali [BM84].

    Letg0 be a fixed generator of (Z/pZ). Associating the residue classes modulo p 1

    with integers between 1 and p, we can define the following function

    f(x) =gx

    0 mod p

    from the multiplicative group (Z/pZ) to itself. Blum and Micali [BM84] describe a

    pseudo-random bit generator which depends on an iteration

    x, f(x), f2(x), . . . , f N(x)

    .

    The pseudo-randomness of the sequence is compromised if the iteration falls into a fixed

    point or a short cycle.

  • 8/12/2019 Self Power Map

    12/82

    Chapter 1. Introduction 5

    Cycles of length two of the discrete logarithm, of the form

    gh =a modp and ga =h modp, (1.1)

    are related to collisions of the self-power map

    hh aa mod p. (1.2)

    Given a solution (g,h,a) to equation (1.1), the values h and a also satisfy equation

    (1.2). In certain cases, solutions to (1.2) also produce solutions to (1.1). Holden and

    his co-authors study the relationship between two-cycles of the discrete logarithm and

    collisions of the self-power map modulop in [Hol02b, Hol02a,HM06]. We reproduce the

    following proposition from[HM06].

    Proposition 1.3. If gcd(h,a,p1) = 1, then there is a one-to-one correspondencebetween triples (g,h,a) which satisfy (1.1) and pairs (h, a) which satisfy (1.2), and the

    value ofg is unique givenh anda. In particular, this is true whenh ora is relatively

    prime to p 1.

    Proof. Fix an arbitrary primitive root b mod p, and denote by ind x the discrete loga-

    rithm modulop ofx with respect to the base b. That is,

    ind x= y by x (mod p).

    The system of equations (1.1) is then equivalent to the following system

    h ind g ind a mod (p 1)a ind g ind h mod (p 1).

    (1.3)

    If gcd(h,a,p 1) = d, there exist u and v such that uh+ va dmodp 1. Usingelementary operations on systems of equations, we can show with a bit of work that ( 1.3)

    is equivalent to the following system:

    0 adind a h

    dind h mod p 1

    gd hvau mod p.(1.4)

  • 8/12/2019 Self Power Map

    13/82

    Chapter 1. Introduction 6

    In our case gcd(h,a,p 1) = 1, and we obtain the equivalent system:

    aa hh mod pg

    hvau mod p.

    (1.5)

    Many heuristics for the number of solutions of two-cycles of the discrete logarithm

    and collisions of the self-power map modulo a prime are given in Holdens work.

    1.2 Past work on the self-power map

    While we have described the cryptographic applications related to the self power map

    and its image modulo a prime, the behaviour of the self-power map is an interesting

    problem in its own right.

    Past work on the self power map was limited to the case 1 n p 1. We beginour study with the simple question:

    Question 1.4. Given any residue classa modulo p, does there existn such thatnn

    amodp and1 n p 1?

    The answer is no, not necessarily. When restricted to 1np 1 the self powermap is not injective, and thus not surjective, since the residue 1 appears at least twice:

    11 (p 1)(p1) 1 modp.

    The earliest results on the self-power map are due to Crocker [ Cro69], who gives thelower bound on the number of distinct residue classes that are represented by {nn : 1 n p 1}.

    Theorem 1.5. Letp be an odd prime. The number of distinct residues modulo p ofnn,

    where1 n p 1, is at least

    (p 1)/2

    .

  • 8/12/2019 Self Power Map

    14/82

    Chapter 1. Introduction 7

    Somer [Som81] gives an upper bound.

    Theorem 1.6. Letp be an odd prime. The number of distinct residues modulo p ofnn,

    where1

    n

    p

    1, is at most 34p + O p

    12

    +, whereis any positive real number andthe impicit constant depends only on.

    The upper bound is proved by bounding the number quadratic non-residues which do

    not appear among the residues ofnn modp when 1 n p 1. The constant is relatedto the class number ofQ(

    p).Somer finds lower bounds for the number of primitive roots modulo p which do not

    appear as residues ofnn mod p. Given any M

    N, he shows that the residue 1 (or

    1)

    appears at leastMtimes asnn modp for some large prime p. Thus a related question is

    the following:

    Question 1.7. What is the number of times that a residue class may be represented in

    the formnn modp with1 n p 1?

    More recently, Balog, Broughan, and Shparlinski [BBS11] have given bounds on the

    number of times a residue class appears, depending on its multiplicative order in (Z/pZ).Let us introduce the following notation.

    Definition 1.8. Denote by

    N(x; a) = |{n x: nn a (modp)}| ,

    the number of solutions to nn =a mod p with1 n x.

    Balog et al.[BBS11] give the following bounds.

    Theorem 1.9. Uniformly over t|(p 1) and all integersa withgcd(a, p) = 1 of multi-plicative ordert, the following bounds hold asp :

    N(p; a) max

    t, p12 t

    14

    po(1),

  • 8/12/2019 Self Power Map

    15/82

    Chapter 1. Introduction 8

    N(p; a) p 13+o(1)t2/3,

    N(p; a) p1+o(1)t1/12.

    The first two bounds are most useful when the order oft is small, while the last is

    useful when a has large order. In Chapter 2 our proofs will follow those of Balog et

    al. very closely to extend these estimates to N(x; a), for p < x p(p 1).

    1.3 Extension to larger values ofn

    As a variation of the Question1.4,we may ask:

    Question 1.10. If we do not restrict n to 1 n p 1 can we find a solution tonn amod p, for anya?

    The answer is yes, if we allow 1 n p(p 1). Given anya such that 1 a p 1,thena may be expressed as nn by taking

    n= a+p(p a) = 1 + (p 1)(p+ 1 a)

    because

    nn = (a+p(p a))1+(p1)(p+1a) amodp.

    The sequence{nn} is periodic with period p(p 1).Given nn mod p, the base is determined modulo p and the exponent is determined

    modulo p 1. Thus, we may organize the values ofnn modp where gcd(n, p) = 1 as inTable1.2, by multiples ofp. Reducing the base modulop and the exponent modulo p 1gives us Table1.3. An example for p= 11 is given in Table1.4.

    The number of times a residue a (Z/pZ) of order t appears in Table1.3 isk|p1

    t

    (kt) p 1kt

    , (1.6)

    from Theorem 1 of [Som81]. If p 1 is square free, it can also be estimated in thefollowing way, independently found by R. Balasubramanian, [Bal11].

  • 8/12/2019 Self Power Map

    16/82

    Chapter 1. Introduction 9

    Values ofnn, wheren= kp+b

    b 1 2 . . . (p 1)k= 0 11 22 . . . (p 1)(p1)

    k= 1 (p+ 1)(p+1) (p+ 2)(p+2) . . . (2p 1)(2p1)

    k= 2 (2p+ 1)(2p+1) (2p+ 2)(2p+2) . . . (3p 1)(3p1)

    k= 3 (3p+ 1)(3p+1) (3p+ 2)(3p+2) . . . (4p 1)(4p1)...

    ...

    k= (p 2) ((p 2)p+ 1)((p

    2)p+1)

    ((p 2)p+ 2)((p

    2)p+2)

    . . . ((p 1)p 1)((p

    1)p

    1)

    Table 1.2: The table of values ofnn, arranged by groups ofp, omitting multiples ofp.

    Values ofnn mod p, where n = kp +b

    b 1 2 3 . . . (p 2) (p 1)k= 0 11 22 33 . . . (p 2)p2 (p 1)p1

    k= 1 12 23 34 . . . (p 2)p1 (p 1)1

    k= 2 13 24 35 . . . (p 2)1 (p 1)2...

    ...

    k= (p 2) 1p

    1

    21

    32

    . . . (p 2)p

    3

    (p 1)p

    2

    Table 1.3: The values ofnn from Table1.2considered modulop, where the base is reduced

    modulo p, and the exponent modulo p 1.

  • 8/12/2019 Self Power Map

    17/82

  • 8/12/2019 Self Power Map

    18/82

    Chapter 1. Introduction 11

    The residue class a will appear in any column where the order ofb is a multiple of the

    order ofa. Letting ord a= t and ord b= kt, then a will occur (p 1)/kt times in thiscolumn. Hence, the total number of times a appears in the table is

    k|p1

    t

    (kt) p 1kt

    = (p 1) (t)t

    k|p1

    t

    (k)k

    = (p 1) (t)t g

    p1t

    p1t

    =(t)g p 1t

    The residuesnn modp in Table1.3 are not equidistributed, but depend on the order

    of n in (Z/pZ). In particular, elements of lower order appear more frequently thanthose of higher order. Holden [HR11]conjectured that

    N(p; a) k|p1

    t

    (kt)

    kt

    which is 1/(p 1)-th the number of times the element a appears in the entire table, overone period of the sequence{nn}.

    The question we study in the second part of the thesis is the following:

    Question 1.12. How large should x be so that nn amod p has a solution with 1n x, for any given residue classamodulo p?

    The trivial bounds for x are p < x p(p 1). A probabilistic model, described inChapters3and4provides a heuristic argument to show that the value ofx is approxi-

    mately (p 1)2 log (p 1)/(p 1). In Chapter5, we rigorously prove that x < p2

    for large enoughp, for an explicit value >0, which is independent ofp.

  • 8/12/2019 Self Power Map

    19/82

    Chapter 2

    Bounds on N(x; a)

    2.1 Introduction

    Recall that N(x; a) denotes the number of solutions to the congruence nn = a mod p

    with 1 n xfor a fixed residue class a modulo p. Balog et al. [BBS11]give estimatesfor N(p; a) and we extend those estimates to N(x; a) for x > p. The sequence nn has

    periodp(p

    1). Thus, we may consider onlyp < x < p(p

    1) since ifx is larger than

    p(p 1) we can add an appropriate factor ofN(p(p 1); a), evaluated using equation(1.6) or Prop.1.11.

    The method of proof is completely analogous to that of Balog et al. [BBS11]. The

    value N(p; a) is the number of occurrences of the residue class amod p in the first row

    of Table1.3. In one case, we obtain a bound forN(x; a) which is approximately (x/p)

    times the bound for N(p; a), as expected; the residue class a mod p should appear about

    equally often in each row. Thus, ifa = 1, for instance,

    N(x; 1) x1+p2/3

    from Proposition2.3, with t = 1.

    12

  • 8/12/2019 Self Power Map

    20/82

    Chapter 2. Bounds on N(x; a) 13

    2.2 Estimates forN(x; a) for a of small order

    This section describes two estimates for N(x; a) that are non-trivial whenahas relatively

    small multiplicative order. For any element a

    (Z/pZ), and given a fixed primitive

    rootg , let ind adenote the discrete logarithm modulo p ofa with respect to the base g .

    That is,

    ind a= b gb amod p.

    Aside from elementary number arguments, Proposition2.2also requires the following

    sum-product result of Garaev ([Gar08], Theorem 1). LetA be a subset of (Z/pZ).

    Define the sets

    A + A = {a1+a2: a1, a2 A}

    A A = {a1a2 : a1, a2 A}.

    Lemma 2.1. For any setA (Z/pZ)

    |A+

    A| |A A |minp |A| ,

    |A|4

    p .

    Proposition 2.2. Uniformly int|(p 1), we have forx p1+

    a(Z/pZ)

    ord a|t

    N(x; a) xp1+t1 +x 14p 14+t 14 +xp 12+t 14 (2.1)

    Proof. Fix a primitive root g modp. We have

    a(Z/pZ)

    ord a|t

    N(x; a) =

    a(Z/pZ)ord a|t

    nx

    nnamod p

    1 =nx

    ntn1 modp

    1.

  • 8/12/2019 Self Power Map

    21/82

    Chapter 2. Bounds on N(x; a) 14

    We use g as the base of the discrete logarithm, so that we can express n = g indn. Then

    starting from the equivalence ntn 1 modp we obtain:

    ntn 1 (mod p), g(ind n)tn 1 (mod p) (ind n)tn 0 (mod p 1)

    SetT = p1t

    . Then the last condition implies

    n(ind n) 0 (modT).

    Fixd|Tand set Td= Td . If gcd(n, T) =d, then write n= dy and

    y(ind dy) 0 (mod Td) ind dy 0 (mod Td)

    since

    gcd(y, Td) = 1, 1 y D= xd

    .

    Set

    Xd = {y Z, 1 y D : ind dy 0 (mod Td), gcd(y, Td) = 1}

    Wd = {y (mod p) :y Xd} .

    Our goal is to estimate the sum

    a(Z/pZ)

    ord a|t

    N(x; a) =d|T

    |Xd|.

    Since

    |Xd

    | D

    p + 1 |Wd| =

    x

    dp+ 1 |Wd|,

    we will bound|Xd|, and hence the sum, by first bounding|Wd|.The trivial upper bound for|Wd| is dt. This would give the bound

    a(Z/pZ)

    ord a|t

    N(x; a) =d|T

    |Xd| xtp

    d|T1

    +t

    d|Td

    .

  • 8/12/2019 Self Power Map

    22/82

    Chapter 2. Bounds on N(x; a) 15

    The sums

    d|T1 and

    d|Td will occur several times in the following proofs. Sometimes

    we will denote the sum representing the number of divisors by d(T) =

    d|T1. We will

    use the bounds

    d|T

    1 T and d|T

    d T1+. (2.2)Thus, we have

    a(Z/pZ)ord a|t

    N(x; a) xp1+t1 +p1+t. (2.3)

    We obtain a stronger bound using Lemma 2.1, for certain t and x. Ify1, y2 Wd,then

    dy1 gTd1 (mod p),dy2 gTd2 (mod p)

    so that

    d2y1y2 gTd(1+2) (modp).

    There are (p 1)/Td= dt possibilities for the sum (1+ 2), hence also for the producty1y2. Thus

    |Wd

    Wd

    | dt.

    Furthermore,

    |Wd+Wd| |Xd+Xd| 2xd

    and

    |Wd+Wd| p,

    since the sum is defined in (Z/pZ). Hence |Wd+Wd| min{p, (2x)/d}. We havep

  • 8/12/2019 Self Power Map

    23/82

    Chapter 2. Bounds on N(x; a) 16

    The substitution of the bounds for |Wd+Wd| and |Wd Wd| results in four estimates for|Wd|,

    p|Wd|

    pdt if d < 2xp,|Wd| p 23

    2xt if d 2xp,|Wd| p23

    |Wd|4p

    pdt if d < 2xp

    ,|Wd| < p 23

    2xt if d 2xp,|Wd| < p23 .

    Simplifying, we obtain,

    |Wd|

    dt if d < 2xp

    ,|Wd| p 23xtp

    if d 2xp

    ,|Wd| p 23

    p12 d

    14 t

    14 if d < 2x

    p,|Wd| < p 23

    x14p

    14 t

    14 if d 2x

    p,|Wd| < p 23 .

    Notice that ifd > 2xp, then xdp 0 implies that b Wd Wd. Hence

    |Xd|2 bZ/pZ

    s(b) |Wd Wd| max s(b).

    For a given b, since y1y2 D2 =x2/d2 and y1y2 =b+kp, the product y1y2 can take atmost

    x2

    pd2+ 1

    values. Oncezsuch that y1y1= zis fixed, there are

    d(z) =zo(1)

    xd

    2o(1)

    xd

    for the pair (y1, y2). Hence

    s(b)

    x2

    pd2+ 1

    x

    p

    As in the proof of Proposition2.2, we have|Wd Wd| dt.

  • 8/12/2019 Self Power Map

    26/82

    Chapter 2. Bounds on N(x; a) 19

    Therefore

    |Xd|2 dt

    x2

    pd2+ 1

    x

    p

    |Xd| x

    px

    t

    pd+

    td

    .

    On the other hand, from the condition ind (dy) 0 (mod Td) we have

    |Xd|

    x

    p+ 1

    dt.

    From the condition y D, we have also

    |Xd| D= xd

    .

    We choose the intervals d p 13 t 13 , p 13 t 13 d p 23 t 13 , and d p 23 t 13 as in theproof of Balog et al. These intervals are chosen such that the bounds all have similar

    (though not equal) orders of magnitude. We conclude that

    |Xd|

    xp

    + 1

    dt ifd p 13 t 13xp

    dt

    xt

    pd+

    td

    ifp13 t

    13 d p 23 t 13

    x

    d ifd p2

    3 t1

    3

    which simplifies to

    |Xd|

    xp23 t

    23 ifd p 13 t 13

    x1+p23 t

    23 ifp

    13 t

    13 d p 23 t 13

    xp23 t

    13 ifd p 23 t 13 .

    We will use the largest bound above,|

    Xd|

    x1+p

    23 t

    23 in the sum. Again, we apply

    the number of divisors bound (2.2). Thus, for x p,

    a(Z/pZ)ord a|t

    N(x; a) =d|T

    |Xd| Tx1+p 23 t 23 x1+p 23 t 23 .

  • 8/12/2019 Self Power Map

    27/82

  • 8/12/2019 Self Power Map

    28/82

  • 8/12/2019 Self Power Map

    29/82

    Chapter 3. A preliminary probability model 22

    The discrepancy of a sequence gives a quantitative and more general measure of

    uniform distribution.

    Definition 3.3. With notation as above, the discrepancy of the sequence{xn}Nn=1 in[0, 1)k is defined as

    = supB[0,1)k

    N(B)N |B|

    where the supremum runs over all boxesB inside thek-dimensional unit cube.

    Thus, the discrepancy measures the largest deviation ofN(B) from its expected value.The trivial bound for discrepancy is 1. We may rescale a sequence so that its terms fall

    in the unit cube, and a uniformly distributed sequence will have discrepancy tending to

    zero as the number of terms Ntends to infinity. However, the bound of discrepancy for

    finite sequences is also a useful, quantitative indicator of how scattered the sequence is

    within a region.

    Again exponential sums are key, since a bound onN

    n=1 e(h xn) gives a bound forthe discrepancy of a sequence, from the following important theorem.

    Theorem 3.4(Erdos-Turan Inequality). Let{xn}Nn=1be points in thek-dimensional unitcube andM an arbitrary positive integer. Then

    3

    2k

    2M+ 1

    + 0

  • 8/12/2019 Self Power Map

    30/82

    Chapter 3. A preliminary probability model 23

    3.2 The classical coupon collector problem

    The coupon collector problem asks the following question: if m coupons are collected

    with replacement, how many trials are needed in order to collect all m coupons? For

    example, if cereal boxes randomly contain one ofm coupons, how many boxes must be

    bought in order to collect at least one of every coupon?

    The motivation for considering the coupon collector problem is simply the ideal sce-

    nario. If elements were randomly distributed in Table1.3, then the expectedx in Question

    1.12would be the same as the expected number of random draws required to obtain all

    residue classes. In our case the cereal boxes correspond to the ns (or a subsequence

    thereof) and the coupons to the residue classesamodp.

    In Table1.3the residue classes occur with different frequencies, depending on their

    order. However, we begin by describing the classical coupon collector model which as-

    sumes that every coupon occurs with equal probability. The following calculation gives

    the expected time, or number of draws to obtain a full collection. [Isa95, p. 80-82].

    Proposition 3.5. Considermdifferent objects that are drawn with replacement. Suppose

    each object occurs with equal probability1/m. The expected number of draws required to

    collect allm objects is

    E(T) =m log m+m+1

    2+ o(1).

    Proof. LetTbe the time (or number of trials) required to collect allm coupons, and let

    tibe the time to collect thei-th new coupon given thati1 coupons have been collected.The T and ti are random variables. The probability of collecting a new coupon given

    i 1 collected coupons ispi =

    m i+ 1m

    .

    Theti has geometric distribution with expectation

    E(ti) = 1

    pi.

  • 8/12/2019 Self Power Map

    31/82

    Chapter 3. A preliminary probability model 24

    (For instance if the probability of an event is 13

    , then the expected number of trials to

    obtain a success is 3.) We have

    E(T) = E(t1) +E(t2) +. . .+E(tm)

    = 1

    p1+

    1

    p2+. . .+

    1

    pm

    = m

    m+

    m

    m 1+ . . .+ 1

    m

    = m

    1 +

    1

    2+ . . .+

    1

    m

    = mHm

    = m log m+m +1

    2+ o(1)

    In the last two lines we used the notation Hm for the m-th Harmonic number, the

    sum of the reciprocals of the first m natural numbers, and the bound

    1

    2m 1

    8m2 < Hm log m < 1

    2m (3.1)

    where is the Euler-Mascheroni constant [PS72,Ex. 18].

    The following limit theorem for the probability distribution of time Tuntil a complete

    collection is due to Erdos and Renyi. [ER61]

    Theorem 3.6. For anyt R , we have that:

    P(T m log m+tm) eet as m .

    3.3 Powers of primitive roots and a first estimate ofx

    The first, simple model considers a subsequence of{nn (modp)}nthat is equidistributedas p tends to infinity. The congruence nn a (mod p) seems most difficult to solve ifa is a primitive root, as n must be primitive as well. Thus, starting from Table1.3,we

  • 8/12/2019 Self Power Map

    32/82

    Chapter 3. A preliminary probability model 25

    consider only the columns which correspond to primitive elements, and only the first R

    rows corresponding tok= 0, . . . , R 1.Let g1, g2, . . . , g(p1) be all the primitive roots modulo p. We will show that the

    elements

    ggi+kik,i, listed in Table3.1are equidistributed modulopasptends to infinity,

    when R is greater thanp.

    Columns corresponding to primitive roots

    Primitive roots g1 g2 g3 . . . g(p1)

    k= 0 gg11 gg22 g

    g33 . . . g

    g(p1)(p1)

    k= 1 gg1+11 gg2+12 g

    g3+13 . . . g

    g(p1)+1(p1)

    k= 2 gg1+21 gg2+2

    2 gg3+2

    3 . . . gg(p1)+2

    (p1)

    k= 3 gg1+31 gg2+32 g

    g3+33 . . . g

    g(p1)+3(p1)

    ...

    k= R 1 gg1+R11 gg2+R12 gg3+R13 . . . gg(p1)+R1(p1)

    Table 3.1: The residues modulo p of the self-power map nn from Table 1.3, where we

    consider only the columns corresponding to primitive roots, and only the first R rows.

    The following is Corollary 1 in [BGK06].

    Theorem 3.7. There exist positive constantsC1 andC2 such that for any > 0, and

    positive integer tp, and any elementg(Z/pZ) such that the multiplicative orderofg is greater thant, we have

    maxa(Z/pZ)

    t

    1

    k=0

    exp

    2iagk

    p

    tp, = exp C1/C2 .

    Proposition 3.8. Given >0 andR= p, then elements =

    ggi+ki

    k,i

    in Table3.1,

    form an equidistributed sequence asp tends to infinity and the discrepancy of the sequence

  • 8/12/2019 Self Power Map

    33/82

    Chapter 3. A preliminary probability model 26

    is bounded by

    Cp logp+O

    1

    p

    Proof. According to Weyls criterion3.2, we bound the following exponential sum where

    his nonzero modulo p.

    (p1)i=1

    R1k=0

    exp

    2ihggi+ki

    p

    =R1k=0

    exp

    2ihgg11 g

    k1

    p

    +. . .+

    R1k=0

    exp

    2ihgg(p1)(p1) gk(p1)

    p

    =R1

    k=0 exp2ih1g

    k1

    p +. . .+R1

    k=0 exp

    2ih(p1)gk(p1)p

    In the third line, we have set hggii =hi, as they are independent ofk. Eachgi has order

    (p1) p1, thus the order ofgiis clearly greater thanR. The conditions of Theorem3.7are satisfied and we have

    (p1)i=1

    R1k=0

    exp

    2ihggi+ki

    p

    Rp +. . .+Rp =(p 1)Rp

    This sum is of the order o((p 1)R), where(p 1)Ris the total number of terms,thus the sequence

    ggi+ki

    k,i

    is equidistributed by Weyls criterion, Theorem3.2.

    We may use the Erdos Turan Inequality, Theorem3.4, with M = p 1 to estimatethe discrepancy:

    C2

    p+

    0

  • 8/12/2019 Self Power Map

    34/82

    Chapter 3. A preliminary probability model 27

    as some element of the sequence. We are taking (p 1)R terms of the sequence, andusing Theorem3.6,we obtain

    P((p

    1)R

  • 8/12/2019 Self Power Map

    35/82

  • 8/12/2019 Self Power Map

    36/82

    Chapter 3. A preliminary probability model 29

    Figure 3.2: Comparison between the cumulative distribution function derived from the

    coupon collector problem and the empirical distribution function for the variables Xp

    from equation (3.3). On the right is the graph considering only primes of the from

    p = 2q+ 1, for some prime q; in this case, there is a translation of approximately 0.6

    units between the curves.

    In Chapter 4, we will instead refine the model to incorporate the non-uniform distri-

    bution of elements in the entire Table1.3. The new model will correct the left-shift in

    the cumulative distribution function.

    Before continuing, it is worthwhile to give a final motivation for the preliminary model.

    The elements in columns corresponding to primitive roots follow a simple pattern. Ifg

    is a primitive root, the elements in the column corresponding to g are

    gg, gg+1, gg+2, . . . gg1

    .

    Given another primitive root, g1 = gr, the elements in the column corresponding to gr

    are

    grg

    r

    , gr(gr+1), gr(g

    r+2), . . . , gr(gr1)

    =

    grgr

    , grgr+r, grg

    r+2r, . . . , gr(gr1)

    =

    gs, gs+r, gs+2r, . . . , gs+(p1)r

    ,

    where in the last line we let s rgr (modp 1). Thus, the elements in the column

  • 8/12/2019 Self Power Map

    37/82

    Chapter 3. A preliminary probability model 30

    corresponding to gr are everyrth element from the column for g, starting from gs.

    We have an isomorphism between (Z/pZ) and p1, the group of (p 1)st roots ofunity, where we map generator g to e

    2ip1 . The problem of obtaining all the residue classes

    transforms into one of covering all the elements ofp1 on the unit circle. The numbers

    in each column corresponding a primitive root gr where gcd(r, p 1) = 1 are mapped toroots of unity in hops of length r about the unit circlep1.

    Figure 3.3: An illustration of the isomorphism between (Z/23Z) and 22 where 5

    (mod 23) is mapped to e2i22. The first elements 55, 56, 57, and 58 (mod 23) of the column

    corresponding to 5 are indicated. The diagram on the left also includes the first elements1010, 1011, 1012, 1013 (mod 23), of the column corresponding to 10 53 (mod 23) andwhich appear in hops of length 3 on the circle.

    This is reminiscent of the equidistribution of the fractional part of n in the unit

    interval when is an irrational number. While the sequence =

    ggi+ki

    k,i

    is equidis-

    tributed only as p tends to infinity, in both cases the initial elements of the sequence

    follow a pattern, yet the sequence fills or appears to fill the interval of interest.

  • 8/12/2019 Self Power Map

    38/82

    Chapter 4

    Expectation ofx and its cumulative

    distribution

    In this chapter we give a detailed heuristic argument to find the expected x such that

    nn a mod p has a solution 1 n x for any given residue class amod p. Thisexpected value ofx has main term (p 1)2 log (p 1)/(p 1).

    The argument consists of two parts. First, we give a variation of the coupon collector

    problem, including blank coupons which need not be collected. This model provides a

    better estimate for the expected x. Secondly, using this value for the expectation, we

    calculate the probability distribution for the number of draws until a complete collection.

    After a suitable normalization, we obtain the double exponential distribution eet

    , as

    in the classical coupon collector case. The saddle point method is used in the calculation

    of the probability distribution.

    These heuristics are well-supported by the computational results.

    4.1 Coupon collector problem with blank coupons

    In order to calculate the expected x in Question1.12, we will use a variant of the coupon

    collector problem. As before, we assume that the elements of Table 1.3 are randomly

    31

  • 8/12/2019 Self Power Map

    39/82

    Chapter 4. Expectation of x and its cumulative distribution 32

    distributed. Thus, our expectedx should be equal to the number of draws required to

    obtain a full collection of residue classes from a random sampling of entries in Table

    1.3. The different residue classes a congruent to 1, 2, . . . , (p 1) modp correspond to the

    coupons to be collected.

    A part of Theorem 4.1 of [FGT92] gives the expectation of the time necessary to

    obtain a full collection from a random, but non-uniform probability distribution.

    Proposition 4.1. The expectation E(T) of the time necessary to gather a complete

    collection ofm objects from a general probability distribution{pi}mi=1, sampling with re-placement, is given by

    E(T) =

    0

    1

    mi=1

    1 epit dt.

    Since higher order elements occur least often in Table1.3,we set the residue classes

    amodpwhich are primitive roots modulopto be the coupons of value. All other coupons

    will be blank coupons.

    In this section, we will calculate the expected number of trials to obtain a full collec-

    tion of useful coupons. We will use the termsusefulandnon-blankinterchangeably. This

    problem is a variation on the coupon collector problem under non-uniform distribution.

    Consider a collection ofm coupons and let mB coupons, which appear with probabil-

    itiesp1, p2, . . . , pmB , be blank. We do not keep track of how many of these coupons are

    collected, if any. They are omitted from the coupon collector problem in the sense that

    we want to collect the m mB useful coupons, not all m coupons. Let

    p1+p2+. . .+pmB =.

    Our goal in this section is to prove the following proposition.

    Proposition 4.2. With notation as above, the expectationE(Tj)of the time necessary to

    gather a collection ofj useful (non-blank) coupons under a general probability distribution

  • 8/12/2019 Self Power Map

    40/82

    Chapter 4. Expectation of x and its cumulative distribution 33

    withmB blank coupons is

    E(Tj) =

    j1q=0

    0

    [uq]

    m

    i=mB+1

    1 +u

    epit 1

    e(1)tdt, (4.1)

    where [uq]f(u) denotes theq-th coefficient in the Taylor expansion off(u). For the full

    collection of them mB useful coupons the expectation is

    E(T) =

    0

    1

    mi=mB+1

    (1 epit)

    dt. (4.2)

    The proof of Proposition4.2 follows the proofs of Theorems 3.1 and 4.1 in [FGT92]

    very closely. Using regular languages extended by the shuffle product, Flajolet et al. pro-

    duce generating functions to derive integral representations for expectations and prob-

    ability distributions. Appendix A gives a summary of the background required. We

    begin by rewriting Theorem 3.1 of [FGT92] to obtain Proposition4.4 below, of which

    Proposition4.2 is a corollary.

    The alphabet A consists ofm different coupons, and pi is the probability of choosingcouponifrom A. Consider the following generalization of the coupon-collector problem:

    Question 4.3. Determine the expectation of the numberBj of elements that need to be

    drawn fromA (with replacement) until we first obtain j distinct coupons that are eachrepeated at leastk times.

    The case k = 2 and j = 1 is the classical birthday problem of determining the

    expected number of random people needed such that two will share a birthday. The case

    k= 1 andj =m is the coupon collector problem where a full collection is desired. Below

    we determine E(Bj) for the case when the alphabetAcontains blank coupons.

    Proposition 4.4. Consider a set of m coupons under a general probability distribu-

    tion{pi}mi=1, which are drawn with replacement. Let mB coupons, which appear with

  • 8/12/2019 Self Power Map

    41/82

    Chapter 4. Expectation of x and its cumulative distribution 34

    probabilitiesp1, p2, . . . , pmB , be blanks. The expectationE(Bj) of the time for obtaining

    j m mB distinct useful coupons, each appearingk times, is given by

    E(Bj) =

    j1

    q=0

    0

    [uq] m

    i=mB+1

    ek1(pit) +u epit ek1(pit) e(1)tdt (4.3)

    whereek1(t) represents the truncated exponential

    ek1(t) = 1 + t

    1!+

    t2

    2!+ . . .+

    tk

    k!.

    Proof. Let the random variable Bj be the number of draws for first obtaining at least k

    copies ofj useful coupons in an infinite sequence of trials.

    LetYnbe the random variable defined on An representing the number ofk-occurrencesof different useful coupons in a sequence ofn trials. The two probability distributions

    are related by

    Pr{Ynj} = Pr{Bj n}.

    From this, the expectation ofBj is found as follows:

    E(Bj) =n1 n Pr{Bj =n}

    =n0

    Pr{Bj > n}

    =n0

    Pr{Yn< j}

    =

    j1q=0

    n0

    Pr{Yn= q}

    It remains to estimate the sumn0Pr{Yn= q

    }.

    LetHq be the language consisting of words with exactlyquseful letters that occur at

    least k times. Label the letters corresponding to the blank coupons as 1, 2, . . . , mB ,

    and these may occur any number of times. All other letters occur at most k 1 times.Let us set

  • 8/12/2019 Self Power Map

    42/82

  • 8/12/2019 Self Power Map

    43/82

    Chapter 4. Expectation of x and its cumulative distribution 36

    Therefore, n0

    Pr{Yn= q} =hq(1) =

    0

    [uq](t, u)e(1)tdt,

    and summing over qranging from 0 to j 1, we have the statement of the theorem.

    We can now prove Proposition4.2.

    Proof. The equation (4.1) is a specialization of equation (4.3) of Proposition4.4to the

    casek = 1.

    To obtain equation (4.2) from (4.1), we introduce the function

    (t, u) =m

    i=mB+1 1 +u

    epit 1

    (4.5)

    which is the function in equation (4.4) in the case k= 1. Expand (t, u) as follows:

    (t, u) =

    mmBq=0

    q(t)uq.

    We have:

    E(T) =E(TmmB) =mmB1

    q=0

    0

    [uq](t, u)|u=1 e(1)tdt

    =

    0

    mmB1q=0

    [uq

    ](t, u)|u=1 e(

    1)t

    dt

    =

    0

    (0(t) +1(t) +. . .+mmB1(t)) e(1)tdt

    =

    0

    ((t, 1) mmB(t)) e(1)tdt.

    We have that mmB(t) =m

    i=mB+1(epit 1) and, using(4.5),

    (t, 1) =m

    i=mB+1epit =

    m

    i=1epit

    mB

    i=1epit

    = e(p1+p2+...+pm)te(p1+p2+...+pmB )t

    = e(1)t

    Therefore, we obtain:

    E(T) =

    0

    ((t, 1) mmB(t)) e(1)tdt

  • 8/12/2019 Self Power Map

    44/82

    Chapter 4. Expectation of x and its cumulative distribution 37

    =

    0

    e(1)t

    mi=mB+1

    epit 1

    e(1)tdt

    =

    0

    1 e(1)t

    m

    i=mB+1epit

    m

    i=mB+1 1 epit

    dt

    =

    0

    1

    mi=mB+1

    1 epit

    dt

    4.2 Upper and lower bounds for expectation

    Assume that the elements in Table 1.3are randomly selected with replacement. In this

    section we give upper and lower bounds for the expectation of the number of draws

    required to obtain a full collection of all residue classes 1, 2, . . . , (p1) modp. Thelower bound of the expectation is the same as the expectation for the coupon collector

    with blank coupons, where all coupons except those corresponding to primitive roots are

    blank.

    This lower bound is the one we will use in the calculation of the cumulative distri-

    bution in the following section. It is more accurate in the case of primes of the form

    p= 2q+ 1 where qis also prime.

    Proposition 4.5. Assume that the elements in Table1.3randomly selected with replace-

    ment. Then the expected time required to obtain a full collection is bounded as follows:

    (p 1)2(p 1)(log (p 1) ++o(1))< E(T) (p 1)2

    (p 1)

    log (p 1) ++ 12(p 1)

    1

    8(p 1)2

    ,

  • 8/12/2019 Self Power Map

    46/82

    Chapter 4. Expectation of x and its cumulative distribution 39

    again using inequality (3.1). Note that by Proposition4.2,0

    1

    1 e

    (p1)

    (p1)2t(p1)

    dt

    is equal to the expected time if we assume the elements in Table 1.3 are randomlydistributed but all elements other than the primitive roots are blank.

    4.2.1 Expectation estimate for the casep= 2q+ 1

    Letp = 2q+1, whereqis also a prime. We let the residue classes 1 andp1 1 modpcorrespond to the blank coupons since they are easy to collect; they will appear in the

    first two rows of Table1.3. We need only consider the number of draws required to collectall other residue classes, which have large order. There are q 1 elements of order q in(Z/pZ), each of which appear 3(q 1) times in Table1.3,and q 1 elements of order2q, each of which appear (q 1) times in Table1.3.

    Proposition 4.6. Given2qdifferent coupons of which two are blank, q 1 each appearwith probability (q1)

    4q2 , and q 1 each appear with probability 3(q1)

    4q2 , then the expected

    number of draws to obtain all2q

    2 useful coupons is bounded by

    E(T) =

    0

    1

    1 e

    (q1)t

    4q2

    (q1) 1 e

    3(q1)t

    4q2

    (q1)dt

    (p 1)2

    (p 1)

    log(p 1) + log12

    + +O

    1

    q1/4

    +

    1

    4(p 1)

    .

    Proof. By Proposition4.2, the expected time to collect the 2q2 useful coupons is givenby

    E(T) =

    0

    1 1 e (q1)t

    4q2 (q1) 1 e

    3(q1)t4q2 (q1)

    dt.

    Letu= e (q1)t

    4q2 . Thendt= 4q2(q1) duu and

    E(T) = 4q2

    (q 1) 0

    1

    1

    u

    1 (1 u)(q1)(1 u3)(q1) du

    = 4q2

    (q 1) 1

    0

    1

    u

    1 (1 u)(q1)(1 u)(q1)(1 +u+u2)(q1) du

  • 8/12/2019 Self Power Map

    47/82

    Chapter 4. Expectation of x and its cumulative distribution 40

    = 4q2

    (q 1) 1

    0

    1

    u

    1 (1 u)2(q1) (1 u)2 + 3u(q1) du

    = 4q2

    (q 1) 1

    0

    1

    u

    1 (1 u)2(q1)

    (1 u)2(q1) +

    q 1

    1

    (1 u)2(q2)3u+

    . . .+ 3

    (q

    1)

    u

    (q

    1)

    du

    = 4q2

    (q 1) 1

    0

    1

    u

    1 (1 u)4(q1) . . .

    q 1

    k

    (1 u)2(q1)+2(qk)3kuk

    . . . 3(q1)u(q1)(1 u)2(q1) duThe first term is evaluated as in Proposition4.5:

    4q2

    (q 1) 1

    0

    1

    u

    1 (1 u)4(q1) du = 4q2

    (q 1) H4(q1)

    =

    4q2

    (q 1) log 4(q 1) ++ 1

    2(p 1)=

    (p 1)2(p 1)

    log(p 1) + log 2 ++ 1

    4(p 1)

    The other terms in the binomial expansion can be approximated as follows. We recall

    the definition of the Euler Beta function: 10

    (1 t)a1tb1 =B(a, b) =(a 1)!(b 1)!(a+b 1)! .

    Then

    3k 1

    0

    q 1

    k

    (1 u)2(q1)+2(qk)uk1du

    = 3k

    q 1k

    B (2(q 1) + 2(q k) + 1, k)

    = 3k

    (q 1)!k!(q 1 k)!

    (2(q 1) + 2(q k))!(k 1)!

    (2(q 1) + 2(q k) +k)!=

    3k

    k

    (q 1)(4q

    k

    2)

    (q 2)(4q

    k

    2

    1)

    (q 1 l)(4q

    k

    2

    l)

    (q 1 k+ 1)(4q

    k

    2

    k+ 1)

    In general:

    (q 1 l)(4q k 2 l) =

    1

    4+

    2 3l+k4(4q k 2 l) .

    Since 0 l < k and k ranges from 1 toq1, we can bound the left hand side from above1

    4+

    2 3l+k4(4q k 2 l)

    1

    4+

    2 +k4(4q k 2 l)

  • 8/12/2019 Self Power Map

    48/82

    Chapter 4. Expectation of x and its cumulative distribution 41

    14

    +2 +q

    16q

    14

    + 1

    16=

    5

    16

    Let K be a number much less than q

    1. We will take K = q1/4 for simplicity,

    although anyK=o(q1/2) is sufficient. Summing over k we have

    q1k=1

    3k

    k

    k1l=0

    (q 1 l)(4q k 2 l)

    =

    q1k=1

    3k

    k

    k1l=0

    1

    4+

    2 3l+k4(4q k 2 l)

    K

    k=13k

    k

    k1

    l=0 1

    4+

    2 +k4(4q

    k

    2

    l) +

    q1

    k=K3k

    k

    k1

    l=15

    16.

    The second term can be approximated by a geometric sum:

    q1k=K

    3k

    k

    k1l=1

    5

    16 =

    q1k=K

    3k

    k

    5

    16

    k1

    165

    k=K

    1

    k

    15

    16

    k

    = 16

    5

    1

    K

    15

    16

    K1 15

    16

    1

    = (16)2

    5 1K

    1516K

    .

    The first term will be approximated with the aid of several inequalities obtained from

    Taylor series. Since ex >1 + x for all real x, we have,

    Kk=1

    3k

    k

    k1l=0

    1

    4+

    2 +k4(4q k 2 l)

    =K

    k=1

    1

    k 3

    4kk1

    l=0

    1 +

    2q+ k

    q

    4

    1 k

    16q 1

    8q l

    16q

    Kk=1

    1

    k

    3

    4

    ke1q

    k1l=0

    2+k

    4(1 k16q 18q l16q) .

    Using

    1

    q

    k1l=0

    2 +k4

    1 k16q 18q l16q 1

    q

    k1l=0

    2 +k3

    1q

    2

    3k+

    k

    6(k 1)

    k

    2

    q

  • 8/12/2019 Self Power Map

    49/82

    Chapter 4. Expectation of x and its cumulative distribution 42

    and the inequalityex 0, we haveKk=1

    3k

    k

    k1l=0

    1

    4+

    2 +k4(4q k 2 l)

    Kk=1

    1

    k

    3

    4

    kek2

    q

    Kk=1

    1k

    34

    k1

    1 k2q

    Kk=1

    1

    k

    3

    4

    kq

    q k2

    =Kk=1

    1

    k

    3

    4

    k1 +

    k2

    q k2

    =K

    k=11

    k

    3

    4

    k+

    K

    k=11

    k

    3

    4

    k k2

    q k2

    .

    Finally, we use the Taylor expansion for log(1 x) to obtainKk=1

    3k

    k

    k1l=0

    1

    4+

    2 +k4(4q k 2 l)

    log

    1

    4

    +O

    K2

    q

    ln

    3

    4

    = log 4 O

    K2

    q

    .

    Putting it all together, we conclude

    E(T) (p 1)2(p 1)

    log(p 1) + log 2 log 4 ++O

    1q1/4

    + 1

    4(p 1)

    = (p 1)2

    (p 1)

    log(p 1) + log12

    + +O

    1

    q1/4

    +

    1

    4(p 1)

    .

    Thus the expected number of draws required to collect all residue classes differs from

    the classical coupon collector model by log(1/2)

    0.693. It may be reasonable to

    assume that the entire probability distribution function has shifted by this amount, since

    in Figure 3.2, the empirical distribution function differs from the theoretical one by a

    shift to the left of approximately 0.6 units, for primes of the form p = 2q+ 1.

    We note that logp+ log(1/2) is approximately log (p 1). Although only -1 and 1were considered as blank coupons, the upper bound of the expectation approaches the

  • 8/12/2019 Self Power Map

    50/82

    Chapter 4. Expectation of x and its cumulative distribution 43

    lower bound in Proposition4.5, where all non-primitive residue classes were considered

    blanks. In the next section, we will use the lower bound in Proposition 4.5in determining

    the cumulative distribution function.

    4.3 Cumulative distribution function

    Recall that for the classical coupon collector problem with m types of coupons which are

    uniformly distributed, we have

    E(T) =m log m+m+1

    2+ o(1)

    andlimm

    P(T m log m+tm) =eet.

    Similarly, given a coupon collector problem with probability distribution matching the

    frequencies of the residue classes in Table 1.3, if all but the primitive element residue

    classes are considered to be blank we have

    E(x) = (p 1)2(p 1)log (p 1) +

    (p 1)2(p 1)+

    1

    2

    p 1(p 1)

    2+o(1)

    from the proof of Proposition4.5. Using this estimate for the expectation, we show that

    limp

    P

    x (p 1)

    2

    (p 1)log (p 1) +t(p 1)2(p 1)

    =ee

    t

    .

    in the coupon collector problem considering the entire table with no blank coupons, if

    p 1 is square-free.To calculate the cumulative distribution function, we return to generating functions.

    In[FS09, p. 192], we have the following result.

    Proposition 4.7. Consider a set ofm coupons under a general probability distribution

    {pi}mi=1, which are drawn with replacement. Then the probability that we will have a fullcollection by timeT less than or equaln is given by

    P(T n) =n![zn]m

    j=1

    (epiz 1) .

  • 8/12/2019 Self Power Map

    51/82

  • 8/12/2019 Self Power Map

    52/82

    Chapter 4. Expectation of x and its cumulative distribution 45

    where C is a contour that encircles the origin. For certain functionsG(z), the integral canbe evaluated using the saddle point method. Roughly speaking, we can choose a contour

    Csuch that the function is large only on a small part of the contour and small everywhere

    else; the integral may be approximated by a Gaussian integral which captures the area

    under the peak. The path of integration passes near to the saddle point of the function,

    hence the name.

    We can use the saddle point method if G(z) is Hayman-admissible. The following

    definition is from[Hay56, p. 68], with notation from [FS09, Definition VIII.1., p. 565].

    Lettingz= rei, we have

    log G

    rei

    = log G(r) +

    =1

    (r) (i)

    !

    and define

    h(z) = log G(z)

    and

    a(r) = 1(r) =rh(r)

    b(r) = 2(r) =r2h(r) +rh(r).

    Definition 4.8 (Hayman-admissibility). LetG(z) have a radius of convergenceR with

    0< R and be always positive on some subinterval(R0, R) of(0, R). The functionG(z) is said to be H-admissible (or Hayman-admissible) if, witha(r) andb(r) defined as

    above, it satisfies the following three conditions:

    1. (Capture condition) limrR a(r) =

    and limr

    R b(r) =

    .

    2. (Locality condition) For some function 0(r), defined over (R0, R) and satisfying

    0< 0< , one has

    G

    rei G(r)eia(r)2b(r)/2

    asr R, uniformly in|| 0(r).

  • 8/12/2019 Self Power Map

    53/82

    Chapter 4. Expectation of x and its cumulative distribution 46

    3. (Decay condition) Uniformly in0(r) || ,

    G

    rei

    =o

    G(r)

    b(r)

    .

    One of the convenient closure properties of Hayman-admissibility functions is the

    following,[Hay56,Theorem VII.].

    Proposition 4.9. IfG1(z) andG2(z) are H-admissible functions with the same radius

    of convergenceR, thenG1(z)G2(z) is also H-admissible.

    Thus for G(z) to be H-admissible it is sufficient to check that a function of the form

    H(z) = (ez

    1), where >0, is H-admissible.

    Proposition 4.10. H(z) = (ez 1) where >0 is H-admissible.

    Proof. H(z) is analytic and positive on (0, ), as required. We have

    a(z) = z

    ez

    ez 1

    b(z) = z

    ez

    ez 1

    2z2ez

    (ez 1)2

    and

    limr

    a(r) = , limr

    b(r) = ,

    satisfying the first condition.

    The value of 0 is found using the coefficients in the expansion of log G(rei), by

    requiring

    2(r)20

    3(r)30 0.

    Since the coefficients j(r) tend to r, we can take 0= (r) 2

    5 .

    Thus,

    log G(r)eia(r)2b(r)/2 = log (er 1) +ia(r) 2 b(r)

    2

  • 8/12/2019 Self Power Map

    54/82

    Chapter 4. Expectation of x and its cumulative distribution 47

    r+ir 2 r2

    = r

    1 +i 1

    22

    asr tends to infinity. On the other hand,

    log G

    rei

    = log

    erei 1

    = log ere

    i

    1 erei

    = rei + log

    1 erei

    rei

    = r 1 + i1!

    +(i)2

    2! +. . . .

    For n 3, the terms r (i)nn! = 1n! 12n/5r12n/5

    tend to zero as r tends to infinity. Thus G

    rei G(r)eia(r)2b(r)/2, satisfying the

    locality condition.

    Finally,

    G rei = erei 1 er cos er

    1 22!

    + 4

    4!

    er 12 (r)1/5+ 124 (r)3/5 e

    r

    e12

    (r)1/5

    = o

    er

    r

    =o

    G(r)

    b(r)

    asr tends to infinity. We have used the bound

  • 8/12/2019 Self Power Map

    55/82

  • 8/12/2019 Self Power Map

    56/82

    Chapter 4. Expectation of x and its cumulative distribution 49

    from Theorem1.11, we have

    limr

    h(r) =

    d|(p1)(d)2g

    p 1

    d

    = (p 1)2

    limrh(r) = 0.

    In addition,

    a(z) = zh(z)

    d|(p1)(d)2g

    p 1

    d

    z= (p 1)2z

    b(z) = z2h(z) +zh(z)

    d|(p1)(d)2g

    p 1

    d

    z= (p 1)2z.

    The radius of integration should be a good approximation for the solution to the

    equationG()

    G() =n, thus

    = n

    d|(p1) (d)2gp1d

    = n(p 1)2 .

    We use an estimate of the expectation ofxin our expression for n,

    n=(p 1)2 log (p 1) +t(p 1)2

    (p 1) ,

    and we will find how the cumulative distribution function varies with t.

    Our goal is to evaluate

    n!

    p2n

    1

    n

    2b()

    G().

    Using the Stirling approximation for n!, the factor n!p2n

    1

    n

    2b()

    tends to 1en . We

    have,

    1

    enG() = 1

    en

    d|(p1)

    e(d)g

    (

    p1

    d )

    n

    (p1)2 1(d)

    = en

    d|(p1)en(d)2g(p1d )

    (p1)2

    1 e

    n(d)g(p1d )(p1)2

    (d)

    = en+n

    d|p1(d)

    2g(p1d )(p1)2

    d|(p1)

    1 e

    n(d)g( p1d )(p1)2

    (d)

  • 8/12/2019 Self Power Map

    57/82

    Chapter 4. Expectation of x and its cumulative distribution 50

    =

    d|(p1)

    1 en

    (d)g(p1d )(p1)2

    (d)

    = d|(p1)1

    e (p1)2(log(p1)+t)

    (p1)

    (d)g(p1d )(p1)2

    (d)

    =

    d|(p1)

    1 e(log (p1)+t)

    (d)g(p1d )(p1)

    (d)

    =

    1 e(log (p1)+t)(p1)d| (p1)

    2

    1 e(log (p1)+t)

    (d)g(p1d )(p1)

    (d)

    =

    1 1

    (p 1) et(p1)

    d| (p1)2

    1 e(log (p1)+t)(d)g(p1d )

    (p1)

    (d)

    Since

    limp

    1 1

    (p 1) et(p1)

    =eet

    ,

    it remains to show that the product over divisors of (p 1)/2 tends to 1. Let

    =

    d| (p1)2

    1 e(log (p1)+t)

    (d)g(p1d )(p1)

    (d)

    log =

    d| (p1)2

    (d)log

    1 et(p 1)

    g( p1d )(p1d )

    .We must have t (p 1), since we expect np2/(p 1) by Theorem4.5on

    the lower bound on expectation. Thus,

    et

    (p1)

    1. Using the Taylor expansion for

    log(1 x) for smallx, we have

    log =

    d| (p1)2

    (d) et

    (p 1) g( p1d )

    ( p1d ) +O

    et

    (p 1)

    2

    g(p1d )(p1d )

    If (p 1)/d= q1q2 . . . q r then

    g

    p 1

    d

    = (2q1 1)(2q1 1) (2qr 1)

  • 8/12/2019 Self Power Map

    58/82

    Chapter 4. Expectation of x and its cumulative distribution 51

    p 1

    d

    = (q1 1)(q1 1) (qr 1)

    so that

    infg

    p1d

    p1d

    = 2.Notice that (p 1)/d= qand as qgets very large, then (2q 1)/(q 1) approaches 2.We will use the notation d(x) for the number of divisors ofx. We have

    |log |

    n asn . The Kolmogorov-Smirnov test rejects or accepts the null hypothesis based on athreshold value ofDn, which depends on the level of significance.

  • 8/12/2019 Self Power Map

    60/82

    Chapter 4. Expectation of x and its cumulative distribution 53

    For odd primes less than 55000, the Kolmogorov-Smirnov test accepts the null hy-

    pothesis of the double exponential distribution eet

    , at a significance value 0.05, with

    p-value of 0.7266. The p-value is the probability of getting a result at least as extreme as

    the empirical value, assuming that the null hypothesis is true. A p-value of 0.7622 means

    that there is little or no evidence against the null hypothesis.

    Figure 4.1: Comparison between the cumulative distribution function derived from the

    coupon collector problem and the empirical distribution function for the variables Xp

    from equation (4.6).

  • 8/12/2019 Self Power Map

    61/82

    Chapter 5

    A deterministic bound

    In this chapter we will prove the following theorem.

    Theorem 5.1. For a sufficiently large prime p, and any fixed residue class a mod p,

    there exists ann such that

    nn a (modp)

    andn < p2, for some >0 independent ofp.

    For instance, for any large p, the value 1103 is sufficient, as shown inSection5.4.

    5.1 Elements a of small order

    If a 0 (modp), we have pp 0 (modp), and so we may consider only nonzeroelements a in Z/pZ. Table1.3, reproduced below for convenience, will again be useful.

    Given a, finding a solution n < p2 to the equation nn a (modp) is equivalentto the element a appearing in the first p1 rows of the table. Using the notation from

    the table, this means that we have nn a (mod p) withn = kp + bwherek < p1 and1 b p 1.

    54

  • 8/12/2019 Self Power Map

    62/82

    Chapter 5. A deterministic bound 55

    Values ofnn (modp), where n = kp+b

    b 1 2 3 . . . (p 2) (p 1)k= 0 11 22 33 . . . (p 2)p2 (p 1)p1

    k= 1 12 23 34 . . . (p 2)p1 (p 1)1

    k= 2 13 24 35 . . . (p 2)1 (p 1)2...

    ...

    k= (p 2) 1p1 21 32 . . . (p 2)p3 (p 1)p2

    Recall that ifb (mod p) has low multiplicative order ord b= m, then all the elements

    of order dividingm will appear in the first m rows of thebth column in Table1.3. Thusif a column is generated by an element bof low order, then b itself will appear as

    nn = (kp+b)kp+b b (mod p)

    for somek < m.

    Therefore if

    nn

    a (modp)

    where a has small order, sayp, we may find solutions with n p1+ simply by consideringthe elements in the column generated by a. Therefore, we separate the elements of

    (Z/pZ) into those of small multiplicative order (less than or equal to p) and those of

    large order (greater than p). It remains to prove the theorem for a of large order.

    To obtain an numerical value for in Theorem5.1 using a deterministic bound from

    Shparlinski [Shp11], we must take >1/2.

    5.2 Elements a of large order

    The idea for the proof for elements of large order is as follows: we would like the position

    of an element ato be scattered enough in Table1.3,so that awill appear at least once

  • 8/12/2019 Self Power Map

    63/82

    Chapter 5. A deterministic bound 56

    in the top rows. We will consider the appearance ofa only in the columns of elements of

    the same order. To simplify further, it is sufficient for a to be scattered in the columns

    of Table5.1.

    Values ofnj (modp)

    n 1 2 3 . . . (p 2) (p 1)j = 1 11 21 31 . . . (p 2)1 (p 1)1

    j = 2 12 22 32 . . . (p 2)2 (p 1)2

    j = 3 13 23 33 . . . (p 2)3 (p 1)3...

    ...

    j = (p 1) 1p1 2p1 3p1 . . . (p 2)p1 (p 1)p1

    Table 5.1: We consider the tablenj (modp), from which we may obtain the table for

    nn (modp) by a shift along the diagonal. Thus if elements in this table are scattered,

    so are those in the nn table.

    The proof follows the strategy for Theorem 8 in [Mur10]. We relabel aas g, and we

    consider it to be a fixed element in (Z/pZ) of multiplicative order t > p. On a first

    reading of the proof, it may be useful to think of the simplest case, when g is a primitive

    root; hence the change in notation.

    Denote the representatives mod p of all the elements of order tas

    1< g1< g2< . . . < gN< p

    whereN =(t). Note that g is equal to one of the gis. Let 1 ai < t be the numbersrelatively prime to t such that

    gaii g mod p.

    Then,

    gi ga1i mod p,

  • 8/12/2019 Self Power Map

    64/82

    Chapter 5. A deterministic bound 57

    where the inverse ofai is taken modulo t.

    To show that nn g mod p has a solution in n with n < p2, it will be enough toshow that the sequence{(ai, gi)}Ni=1 is sufficiently scattered in the t p rectangle. This

    is equivalent to the element g being scattered in Table5.1. First normalize to the unit

    square [0, 1)2 and recall that gi ga1i mod p, to obtain the sequence

    ait

    ,gip

    Ni=1

    =

    ait

    ,ga

    1i

    p

    Ni=1

    .

    We begin by showing that the discrepancy of this sequence is small.

    Figure 5.1: Ifg is primitive, the rectangle of interest has dimensions (p 1) p. Thefigure shows the positions of{(ai, gi)}Ni=1 corresponding to the smallest primitive rootsg= 6 and g = 2 for the primes 151 and 491, respectively.

    Proposition 5.2. Letp be a prime. The discrepancy of the sequence

    ait

    ,gi

    p

    N

    i=1

    is bounded byO

    (log2 t)N1t1

    , for some >0, whereN=(t).

    Proof. Consider the sum

    Sm,n =Ni=1

    e

    m

    ga1i

    p +n

    ait

  • 8/12/2019 Self Power Map

    65/82

    Chapter 5. A deterministic bound 58

    =Ni=1

    ep

    mga1i

    et(nai)

    =

    (x,t)=1

    ep

    mgx1

    et(nx).

    We will restrict m, n to be integers less than t in magnitude. Theorem 4 in [BS08]

    gives the following bound, which applies for the case m = 0 andt < m, n < t.

    Theorem 5.3. For any > 0 there exists > 0 such that for t p, uniformly overm (Z/pZ) andn Z/tZ, we have the bound

    Sm,n t1.

    Ifn = 0 andm= 0 the sum reduces to the Ramanujan sum:

    S0,n=

    (x,t)=1

    et(nx).

    We have

    S0,n= S0,n,

    so that|S0,n| =|S0,n|. Therefore, we need only consider the cases whenm, n are non-negative. The Ramanujan sum can be evaluated using von Sternecks formula, where

    k= gcd(n, t):

    S0,n=

    t

    k

    (t)

    tk

    .By the Erdos-Turan inequality, Theorem3.4for two dimensions, we have

    C 2M+ 1+ 1N (m,n)=(0,0)0m,nM

    1

    1 +m

    1

    1 +n

    |Sm,n| . (5.1)

    Takingmand n non-negative above only changes the constant in Erdos-Turan inequality

    by a multiple of 4, so that

    C= 4

    3

    2

    2.

  • 8/12/2019 Self Power Map

    66/82

    Chapter 5. A deterministic bound 59

    We can separate the sum above into two parts, depending on whether|Sm,n| < t1

    or|Sm,n| > t1. By the result of [BS08] quoted above, we only have to consider sums ofthe form|S0,n|. Let A be the set

    A=

    1 n < t such that|S0,n| > t1

    .

    Taking M=t 1 in equation (5.1), we have

    C

    2t + 1N

    (m,n)=(0,0),n/A

    0m,n x

    e log log x+ 3log logx

    [BS66, p. 320] and log log(t/k)>1 for (t/k)> ee, we have

    t >

    t

    k

    >

    tk

    e log log tk + 3

  • 8/12/2019 Self Power Map

    67/82

    Chapter 5. A deterministic bound 60

    provided (t/k)> ee. Thus

    t >tk

    e log log t+ 3

    k >

    t1

    e log log t+ 3 . (5.3)

    We remark that in the case log log(t/k) < 1, we have k > (t/ee). Thus in either case,

    n > k > t12 for t large enough.

    Lemma 5.5. Forp large enough,|A| t2.

    Proof. We have from[Bac97] the identity

    tn=1

    |S0,n| =(t)2(t)

    where(t) is the number of distinct prime divisors oft. From the inequality[NZM91,p.

    394]

    2(t) < t(1+)log2loglog t

    we obtain 2(t) < t ift is large enough.

    Thus, there are at most2(t)N

    t1 2(t)t t2

    elements ofA, where we have N=(t).

    We can now return to the proof of Theorem5.1. Using the trivial estimate |S0,n| < N,we have

    1

    NnA

    1n

  • 8/12/2019 Self Power Map

    68/82

    Chapter 5. A deterministic bound 61

    Proposition 5.6. Every elementg (Z/pZ) of order t > p1/2 can be represented asnn g mod p for some natural numbern p2 for some >0.

    Proof. For the equivalence nn g mod p to have a solution with n < p2, we mustshow that

    ggi+kii g mod p

    for ki < p1, for at least one value i = i0. Then the required n is equal to pki0 +gi0.

    We find such a solution as follows.

    Recall that the discrepancy is defined by

    = supB[0,1)2

    N(B)N |

    B

    |whereN(B) is the number of elements in the sequence that fall into the box B . RecallthatN=(t) and using the bound for discrepancy as in [Mur10], we obtain

    N(B) > N(|B| )

    > (t)|B| C(log2 t)t1.

    Therefore, if

    |B| =(t) , where <

    we haveN(B)> 0 forp large enough; this is becauset1 =o (t)1. In other wordsB contains at least one point,

    ait

    ,gip

    Ni=1

    from the sequence.

    When we re-normalize, we note that the length and width of the box were stretched

    by different factors, so that we must multiply the volume by pt. Thus, in the t prectangle every box of volume

    |Bre-normalized| =pt(t) =pt1 , where < < . (5.4)

    contains a point of the sequence{(ai, gi)}Ni=1.

  • 8/12/2019 Self Power Map

    69/82

  • 8/12/2019 Self Power Map

    70/82

    Chapter 5. A deterministic bound 63

    Figure 5.2: Every box of height t1/2 and length p1

    /4 contains one element of the

    sequence{(ai, gi)}N

    j=1. Considering a box positioned under the diagonal will give an

    element (ai0, gi0) with the required properties.

  • 8/12/2019 Self Power Map

    71/82

    Chapter 5. A deterministic bound 64

    5.4 A numerical value of

    To find a finite value of in Theorem5.1that is large enough, it is sufficient to maximize

    , hence (see equation (5.4)). Let t= p where 12 <

    1. We will find a value of

    in terms of that is maximal according to the following explicit bound for Sm,n given by

    Shparlinski in[Shp11].

    Theorem 5.7. Letg be of multiplicative order t. Then for any fixed integersx, y 2uniformly overm (Z/pZ) andn Z/tZ, we have the bound

    Sm,n t1rx,ypsx,y+o(1)

    asp where

    rx,y = 1

    2(2x+y) 1

    4xy and sx,y =

    1

    4(2x+y).

    Sincerx,y < sx,y/2, we may assume that

    t

    p1/2(logp)2

    as otherwise the bound is trivial. Let t= p, where >1/2. Substituting p= t1/ into

    the inequality for Sm,n,

    Sm,n t1+1

    14(2x+y)

    12(2x+y)

    + 14xy

    + o(1) =t1

    where

    = x,y = 1

    2(2x+y) 1

    4xy 1

    1

    4(2x+y)o(1)

    . (5.5)

    We will find the appropriate values ofx, y >2 that maximizefor a given according

    to the formula above. This is a small calculus exercise.

    Claim. The valueis maximized in the equation (5.5) when

    x = 4

    (2 1)

  • 8/12/2019 Self Power Map

    72/82

    Chapter 5. A deterministic bound 65

    y = 8

    (2 1)

    and

    =

    1

    27

    (2

    1)2

    2 o(1)

    .

    Proof. Letz= 2x+y. Theny= z 2xand

    =

    1

    2 1

    4

    z1 1

    4x1(z 2x)1.

    x = 14

    (1)x2(z 2z)1 14

    x1(1)(z 2x)2(2)

    z =1

    2 1

    4

    (1)z2

    1

    4x1

    (1)(z 2x)2

    .

    For x to be zero,

    1

    x21

    (z 2x)2

    x

    1

    (z 2x)2 = 01

    x21

    (z 2x) = 2

    x

    1

    (z 2x)2(z 2x) = 2x

    z = 4x

    y = 2x.

    For y to be zero as well,

    1

    2 1

    4

    1

    z2 =

    1

    41

    x 1

    (z 2x)2 .

    Substitutingz= 4x,

    1

    2 1

    4a

    1

    16x2 =

    1

    41

    x 1

    (2x)2

    x = 1

    12 1

    4

    x = =

    4

    (2 1) .

  • 8/12/2019 Self Power Map

    73/82

    Chapter 5. A deterministic bound 66

    Substituting the above expression ofx in terms of into z= 4x and then into x,z,

    yields the required values. It remains now to show by the second-derivative test that

    these values are maximums. Since

    xx|x= 4(21)

    = 14

    12 1

    4

    42.

    We have < < and < /4. Therefore if= 9/10, for instance,

    = 0.005878894767784 910

    o(1)

    and we may take

  • 8/12/2019 Self Power Map

    74/82

    Chapter 6

    Conclusion and future work

    In the first part of the thesis, we extended bounds for the number of solutions to

    nn amod p for a fixed residue class a. Previous bounds from Balog et al. [BBS11]considered 1 n p 1, and we extended those bounds for the case when n > p. Ex-isting bounds in both cases are far from the conjectured values [Hol02b]. Using further

    tools from additive combinatorics and exponential sums, we may perhaps obtain stronger

    results.

    We have shown that statistical models are useful in generating heuristic estimates

    for number theory problems. To our knowledge, the coupon collector problem has not

    previously appeared in a number theory context. We anticipate modifying the problem

    to obtain other estimates. For instance, the version with blank coupons may be used to

    obtain estimates for a full collection of residue classes with a desired property, such as

    kth powers. In this case, all residue classes not possessing the property would be blank

    coupons. Also, we may be able to give a heuristic argument for the number of different

    residue classes that appear asnn mod p where 1 n p 1, to determine the sharpnessof the bounds given by Crocker [Cro69] or Somer[Som81].

    We also established the existence of small solutions for nn amod p, althoughnn mod p is not uniformly distributed. The method of proof was first applied in [ Mur10]

    67

  • 8/12/2019 Self Power Map

    75/82

    Chapter 6. Conclusion and future work 68

    to find small solutions for xq amodp, where the order ofa is a multiple of (p 1)/q.Thus, the same method may be used to find small solutions of other functions, such as

    polynomials or repeated powers modulo a prime.

    We expect that nn amod p has a solution with n x, for any given residue classa, ifx is approximatelyp2 log (p1)/(p1). We have proved the deterministic boundx < p2 where >0 is small. Given the difference between these two estimates, it may

    be possible to strengthen the deterministic bound.

  • 8/12/2019 Self Power Map

    76/82

    Appendix A

    Formal Languages and Probabilities

    The following background on formal languages, generating functions, and probabilities is

    taken from Section 2 of [FGT92], which summarizes the methods used to relate formal

    descriptions of regular languages to generating functions and probability estimates.

    Definition A.1. LetA ={a1, a2, . . . , am} be a fixed set called the alphabet whose ele-ments are the letters. ThenA denotes the set of all finite sequences, called the wordsor strings of

    A. A language is any subset of

    A.

    Given languagesL1,L2, andL, we define the following operations:

    unionofL1 andL2: L1+L2= {w|w L1 or w L2}

    product ofL1 and L2: L1L2 = {w1w2|w1 L1 and w L2}

    star L: the language formed from all possible sequences from elements ofL,

    L= {} +L+ (L L) + (L L L) +. . .

    where denotes the empty word.

    Definition A.2. The class of regular languages is the smallest class of languages con-

    taining the finite sets and closed under the three operations of union, product and star.

    69

  • 8/12/2019 Self Power Map

    77/82

  • 8/12/2019 Self Power Map

    78/82

    Appendix A. Formal Languages and Probabilities 71

    is called the ordinary generating function (OGF) of the languageL with respect to the

    weight distributionp= (p1, . . . , pm) and is denoted by l(z). The exponential generating

    function (EGF) is defined withzn/n! replacingzn,

    l(z) =

    n1,...,nm

    ln1,...nmpn11 pnmm z

    n1+...+nm

    (n1+. . .+nm)!=

    wA

    [w] z|w|

    |w|! .

    The value [zn]l(z) is the probability that a random word inAn is inL.The ordinary and exponential generating functions are related by the Laplace-Borel

    transform

    l(z) =

    0

    l(zt)etdt

    following from the classical relation0

    tnetdt= n!

    An operation on languages is unambiguous if every word of the resulting language is

    obtained only once. The following is Theorem 2.1 in [FGT92]. Along with the Laplace

    transform integral, it is the key to determining the generating function of a language

    defined by a combination of the four basic operations.

    Theorem A.4. When they operate unambiguously on their arguments, the operations of

    union, product, star, and shuffle product translate into generating functions:

    (a) L= L1+L2 l(z) =l1(z) +l2(z).(b) L= L1 L2 l(z) =l1(z)l2(z).(c) L= L1 l(z) = (1 l1(z))1.

    (d) L= L1 L2 l(z) = l1(z)l2(z).

  • 8/12/2019 Self Power Map

    79/82

    Bibliography

    [Bac97] Gennady Bachman. On an optimality property of Ramanujan sums. Proc.

    Amer. Math. Soc., 125:10011003, 1997.

    [Bal11] R. Balasubramanian. Private communication, 2011.

    [BBS11] Antal Balog, Kevin Broughan, and Igor Shparlinski. On the number of solu-

    tions of exponential congruences. Acta Arith., 148(1):93103, 2011.

    [BGK06] Jean Bourgain, Alexey A. Glibichuck, and Sergei V. Konyagin. Estimates for

    the number of sums and products and for exponential sums in fields of prime

    order. J. London Math. Soc., 73(2):380398, 2006.

    [BM84] Manuel Blum and Silvio Micali. How to generate cryptographically strong

    sequences of pseudorandom bits. SIAM J. Comput., 13(4):850864, 1984.

    [BS66] Eric Bach and Jeffrey Shallit. Algorithmic Number Theory, Vol I: Efficient

    Algorithms. The MIT Press, Cambridge, 1966.

    [BS08] Jean Bourgain and Igor E. Shparlinski. Distribution of consecutive modular

    roots of an integer. Acta Arith., 134(1):8391, 2008.

    [Cro69] Roger Crocker. On residues ofnn. Amer. Math. Monthly, 76(4):10281029,

    1969.

    [DT97] Michael Drmota and Robert F. Tichy. Sequences, discrepancies and applica-

    tions. Lecture Notes in Mathematics, volume 1651. Springer, 1997.

    72

  • 8/12/2019 Self Power Map

    80/82

    BIBLIOGRAPHY 73

    [ER61] Paul Erdos and Alfred Renyi. On a classical problem of probability

    theory. Magyar Tudomanyos Akademia Matematikai Kutato Intezetenek

    Kozlemenyei, 6:215220, 1961.

    [FGT92] Philippe Flajolet, Daniele Gardy, and Loys Thimonier. Birthday paradox,

    coupon collectors, caching algorithms and self-organizing search. Discr. Appl.

    Math., 39:207229, 1992.

    [FLM10] Matthew Friedrichsen, Brian Larson, and Emily McDowell. Structure and

    statistics of the self-power map. RHIT Undergrad. Math. J., 11(2):107139,

    2010.

    [Fri07] John Friedlander. Uniform distribution, exponential sums and cryptography.

    In Andrew Granville and Zeev Rudnick, editors, Equidistribution in Number

    Theory, An Introduction, pages 2957. Springer, 2007.

    [FS09] Philippe Flajolet and Robert Sedgewick.Analytic Combinatorics. Cambridge

    University Press, 2009.

    [Gar08] M. Z. Garaev. The sum-product estimate for large subsets of prime fields.

    Proc. Amer. Math. Soc., 136(8):27352739, 2008.

    [GOM12] Jose Mara Grau and Antonio M. Oller-Marcen. On the last digit and the last

    non-zero digit ofnn in base b. Unpublished, available at arXiv:1203.4066v1,

    2012.

    [Hay56] W.K. Hayman. A generalisation of stirlings formula. J. Reine Angew. Math.,

    196:6595, 1956.

    [HM06] Joshua Holden and Pierre Moree. Some heuristics and results for small cycles

    of the discrete logarithm. Mathematics of Computation, 75(253):419449,

    2006.

  • 8/12/2019 Self Power Map

    81/82

    BIBLIOGRAPHY 74

    [Hol02a] Joshua Holden. Addenda/corrigenda: Fixed points and two-

    cycles of the discrete logarithm. Unpublished, available at

    http://xxx.lanl.gov/abs/math.NT/0208028, 2002.

    [Hol02b] Joshua Holden. Fixed points and two-cycles of the discrete logarithm. Algo-

    rithmic number theory (ANTS 2002), pages 405415, 2002.

    [HR11] Joshua Holden and Margaret M. Robinson. Counting fixed points, two-cycles,

    and collisions of the discrete exponential function usingp-adic methods. Un-

    published, available at arxiv.org/pdf/1105.5346, 2011.

    [Isa95] Richard Isaac.The Pleasures of Probability, University Texts in Mathematics.

    Springer-Verlag, 1995.

    [KN06] L. Kuipers and H. Niederreiter. Uniform Distribution of Sequences. Dover

    Publications, Inc, 2006.

    [Mur08] M. Ram Murty. Problems in Analytic Number Theory, 2nd edition. Springer-

    Velag, 2008.

    [Mur10] M. Ram Murty. Small solutions of polynomial congruences. Ind. J. Pure

    Appl. Math., 41(1):1523, 2010.

    [MvOV96] A. Menezes, P. van Oorschot, and S. Vanstone. Handbook of Applied Cryp-

    tography. CRC Press, 1996.

    [NZM91] Ivan Niven, Herbert S. Zuckerman, and Hugh L. Montogomery. An Intro-

    duction to the Theory of Numbers, 5th edition. John Wiley & Sons, Inc.,

    1991.

    [Pre98] Wiebe R. Prestman. Mathematical Statistics. Walter de Gruyter, 1998.

    [PS72] George Polya and Gabor Szego. Problems and Theorems in Analysis, Vol. I.

    Springer-Verlag, 1972.

  • 8/12/2019 Self Power Map

    82/82

    BIBLIOGRAPHY 75

    [Shp11] Igor E. Shparlinski. Exponential sums with consecutive modular roots of an

    integer. Quart. J. Math., 62:207213, 2011.

    [Som81] Lawrence Somer. The residues of nn modulo p. The Fibonacci Quarterly,

    19(2):110117, 1981.