8/12/2019 Self Power Map
1/82
The self-power map and its image modulo a prime
by
Catalina V. Anghel
Thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Mathematics
University of Toronto
Copyright © 2012 by Catalina V. Anghel
Abstract
The self-power map and its image modulo a prime
Catalina V. Anghel
Doctor of Philosophy
Graduate Department of Mathematics
University of Toronto
2012
The self-power map is the function from the set of natural numbers to itself which sends
the number $n$ to $n^n$. Motivated by applications to cryptography, we consider the image
of this map modulo a prime $p$. We study the question of how large $x$ must be so that
$n^n \equiv a \bmod p$ has a solution with $1 \le n \le x$, for every residue class $a$ modulo $p$.

While $n^n \bmod p$ is not uniformly distributed, it does appear to behave in certain ways
as a random function. We give a heuristic argument to show that the expected $x$ is
approximately $(p-1)^2 \log \varphi(p-1)/\varphi(p-1)$, using the coupon collector problem as a
model. Rigorously, we prove the bound $x < p^{2-\alpha}$ for sufficiently large $p$ and a fixed
constant $\alpha > 0$ independent of $p$, using a counting argument and exponential sum bounds.
Additionally, we prove nontrivial bounds on the number of solutions of $n^n \equiv a \bmod p$ for
a fixed residue class $a$ when $1 \le n \le x$, extending the known bounds when $1 \le n \le p-1$.
Dedication
For my parents, Mirela and Vinicius Anghel.
Acknowledgements
Academic Acknowledgments. First and foremost, I would like to thank my supervisor,
Prof. Kumar Murty, who first suggested the study of the self-power map. His help,
encouragement, and optimism have been invaluable over the past five years. I thank him
for creating a wonderful sense of community (including a viewing of the Kung Fu Panda
movie!). Most of all, I am grateful to him for taking me seriously as a researcher.
I am grateful to Prof. Balasubramanian (Prof. Balu) for two very useful discussions
at the start of my study. Dr. Venkatesan (Venkie) introduced me to an algorithmic
approach to thinking about the self-power map, and always was cheerfully ready to talk
with me about my work, even from Seattle and India. I thank Prof. Repka for his advice
and reassurance. It was wonderful to have a faculty member I felt completely comfortable
approaching when I was struggling.
Prof. Ram Murty forwarded me his article and suggestions at a key moment in my
research. His ideas were different from the path I ultimately took, but he was correct in
thinking that my problem was similar. His article helped me get unstuck when finding
the deterministic bound.
I thank the members of my supervisory committee, Prof. Murnaghan and Prof. Kamnitzer,
for being a kind audience during our meetings. They also accepted receiving my
many email updates or helped with (very, very) last-minute reference letters.
As every mathematics graduate student before me, I am incredibly grateful to Ida
Bulat for her warmth, kindness and dedication. From reminders about forgotten deadlines
to meeting for coffee to catch up, I am so glad that she is always there to ensure
that everything goes smoothly.
A huge thank you goes out to my Ganita lab-mates: Robby Burko, Aaron Chow, Payman
Eskandari, William George, Nataliya Laptyeva, Mariam Mourtada, Meng Fai Lim,
and Ying Zong, as well as visitors Brendon Hanson and Anastasia Zaytseva. Thank you
for your questions, advice, and support, and for providing such a welcoming environment
for me to share my ideas. I am so glad to also call you friends. Thank you for the musical
shows, cozy potlucks, French movies, dinners, wedding, introductions to new babies, and
ceilidh dances over the past years.
I would also like to thank Aude Plateaux, for her warm welcome at ENSICAEN this
summer. In her expert care, I had a quiet place to start writing my thesis, as well as a
wonderful travel experience.
Personal Acknowledgments. While I would have never finished my thesis with-
out the mathematical support of my professors and colleagues, I also would have never
finished without the encouragement of my friends and family.
Thank you to Charlotte Ochanine, the kindest person I know. You were my biggest
cheerleader during my first year and your encouragement and practical advice helped me
survive the comprehensive exams. Your visit to Deep River over the winter holidays is
one of my fondest memories of my time as a PhD. Thank you for being so patient with
me even during my most stressful times.
Thank you to Peter Lee for making me more conscious about trying new things and
getting out in nature, both of which restore me when things get tough.
Hanae Hanzawa, you are my role model for being a kind and confident woman. You
are so much fun to be around, and so open about everything (including trying strange
Romanian food).
Thank you to Caroline Schmeing, who is always ready for new adventures, whether
it is attending a baseball game, or cooking exotic dishes, or saving me from a mouse
two days before my supervisory committee meeting. Terry (Theresa) Lim is not only
one of my closest friends, but has also been my roommate. We definitely bonded over
shared experiences, like the plumbing mishap at 1 am! Thank you for your practical,
tell-it-like-it-is advice to my fanciful musings, and for all the shared activities, from spin
class to kayaking.
Thank you so much to Tom Kuwahara, my best friend and confidant in these past
five and a half years. Thank you for your wisdom, your sense of humour and your
understanding during the many times when I have been stressed and irritable. So many
of my best memories during the past years are with you: the early, long study stretches
for analysis and topology, breaks for tea and Greg's ice cream, hikes through the Toronto
ravines and the Newfoundland coast, and all the laughter in our weekly chats. It's with
you that I feel most me.
There are two people whom I cannot even begin to thank, my parents Mira and
Vinicius Anghel. My father has been the sounding board for my ideas and the first
reader of my work. My mother patiently listened to all my worries and struggles (and
I am not one to suffer in silence!). I am so grateful that their confidence in me never
wavered. Their love and support got me through the painful years when nothing worked.
It's been a gift to finally be able to do mathematics, to find a pattern when at first there
is none. I could have never gotten there without them.
Contents
1 Introduction
1.1 The self-power map and the discrete logarithm problem
1.1.1 ElGamal signature scheme
1.1.2 Two-cycles of the discrete logarithm
1.2 Past work on the self-power map
1.3 Extension to larger values of n
2 Bounds on N(x; a)
2.1 Introduction
2.2 Estimates for N(x; a) for a of small order
3 A preliminary probability model
3.1 Equidistribution and exponential sums
3.2 The classical coupon collector problem
3.3 Powers of primitive roots and a first estimate of x
4 Expectation of x and its cumulative distribution
4.1 Coupon collector problem with blank coupons
4.2 Upper and lower bounds for expectation
4.2.1 Expectation estimate for the case p = 2q + 1
4.3 Cumulative distribution function
4.4 Computational results
5 A deterministic bound
5.1 Elements a of small order
5.2 Elements a of large order
5.3 Proof of Theorem 5.1
5.4 A numerical value of α
6 Conclusion and future work
A Formal Languages and Probabilities
Bibliography
Chapter 1
Introduction
Definition 1.1. The self-power map is the function $f$ from the set of natural numbers
to itself, such that
$$f(n) = n^n.$$
Our definition differs from Holden's definition in [Hol02b, HR11], where $n \in \mathbb{N}$ is
mapped to $n^n \bmod p$ (or $p^e$). Since some authors also consider $n^n \bmod b$ for composite $b$
[GOM12], we give this more general definition. However, from now on we will consider
the image of the self-power map modulo a prime $p$.
1.1 The self-power map and the discrete logarithm
problem
We begin by describing two applications to cryptography that motivate the study of the
self-power map. The self-power map and its image modulo a prime $p$ are related to the
discrete logarithm problem in $(\mathbb{Z}/p\mathbb{Z})^*$.
Definition 1.2. The discrete logarithm problem in $(\mathbb{Z}/p\mathbb{Z})^*$ is the following: Given
$g \in (\mathbb{Z}/p\mathbb{Z})^*$, and $h$ an element in the subgroup generated by $g$, find an integer $a$ such
that $g^a \equiv h \bmod p$.
While it is computationally easy to raise an element to a power, the discrete logarithm
problem, which is the inverse operation, is assumed to be difficult in general. Thus, the
security of many cryptographic applications is based on this one-way property, and the
discrete logarithm has been widely studied.
1.1.1 ElGamal signature scheme
One of the applications of the discrete logarithm is the ElGamal signature scheme. However,
the security of two variations of the ElGamal signature scheme also relies on the
difficulty of inverting the self-power map.
Algorithms 1 and 2 describe the ElGamal signature algorithm, taken from [MvOV96],
Section 11.5.2, and [FLM10]. We assume that Alice is sending a message $m$ to Bob, and
she would like to sign it so that Bob can verify her identity.
Algorithm 1 Generation of Alice's public and secret key
1: Choose a large random prime $p$ and a generator $g$ of $(\mathbb{Z}/p\mathbb{Z})^*$.
2: Select a random integer $a$ such that $1 \le a \le p-1$ and compute $y = g^a$.
3: Alice's public key is $(p, g, y)$ and her private key is $a$.
Algorithm 2 ElGamal signature generation and verification
1: Alice's signature generation:
(a) Select a random integer $k$ such that $1 \le k \le p-2$ and $\gcd(k, p-1) = 1$.
(b) Compute $r = g^k \bmod p$.
(c) Compute $s = k^{-1}(h(m) - ar)$ where the inverse of $k$ is calculated modulo $(p-1)$
and $h(m)$ is the hash of the message $m$.
(d) Alice's signature for the message $m$ is the pair $(r, s)$.
2: Verification of the signature by Bob:
(a) Verify that $1 \le r \le p-1$, else reject.
(b) Compute $v_1 = y^r r^s \bmod p$.
(c) Compute $v_2 = g^{h(m)} \bmod p$, after calculating $h(m)$.
(d) Accept the signature if $v_1 = v_2$.
The signature verification in Algorithm 2 works because
$$\begin{aligned}
v_1 &= y^r r^s \bmod p \\
    &\equiv (g^a)^r g^{k k^{-1}(h(m) - ar)} \bmod p \\
    &\equiv g^{ar} g^{h(m) - ar} \bmod p \\
    &\equiv g^{h(m)} \bmod p \\
    &= v_2.
\end{aligned}$$
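The steps of Algorithms 1 and 2 can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the function names are hypothetical, the hash $h(m)$ is replaced by an integer digest `hm` supplied by the caller, the nonce `k` is passed in rather than drawn at random, and the parameters in the usage note are small illustrative values rather than secure ones. The modular inverse via `pow(k, -1, p - 1)` needs Python 3.8 or later.

```python
from math import gcd


def public_key(p, g, a):
    """Algorithm 1, step 2: y = g^a mod p."""
    return pow(g, a, p)


def sign(p, g, a, hm, k):
    """Algorithm 2, step 1: return the pair (r, s) for the digest hm."""
    if gcd(k, p - 1) != 1:
        raise ValueError("k must be invertible modulo p - 1")
    r = pow(g, k, p)
    # s = k^{-1} (h(m) - a r), computed modulo p - 1
    s = (pow(k, -1, p - 1) * (hm - a * r)) % (p - 1)
    return r, s


def verify(p, g, y, hm, r, s):
    """Algorithm 2, step 2: accept iff y^r r^s = g^{h(m)} (mod p)."""
    if not 1 <= r <= p - 1:
        return False
    v1 = (pow(y, r, p) * pow(r, s, p)) % p
    v2 = pow(g, hm, p)
    return v1 == v2
```

For instance, with the illustrative values `p, g, a = 467, 2, 127`, the call `verify(p, g, public_key(p, g, a), 100, *sign(p, g, a, 100, 213))` accepts, while the same signature checked against the digest 101 is rejected.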
Variations of the ElGamal signature scheme may be obtained by changing the value
of $s$ in step 1(c) of the signing algorithm, Algorithm 2. Notice that the signing equation
in terms of the exponents can be also expressed as
$$u = av + kw \bmod (p-1)$$
where
$$u = h(m), \quad v = r, \quad w = s.$$
Assigning to $u$, $v$ and $w$ the values $s$, $r$, and $h(m)$ in different orders defines a new signing
equation, which produces a different expression for $s$ in terms of $a$, $k$, $r$, and $h(m)$. Table
1.1 from [MvOV96, Table 11.5] summarizes the possibilities.
The verification equations for variations 2 and 4 involve the expression $r^r$. Thus, an
adversary who can solve the self-power problem may forge Alice's signature by choosing
$s$, then solving
$$g^{h(m)} \left[(g^a)^s\right]^{-1} = r^r \quad \text{or} \quad g^{s} \left[(g^a)^{h(m)}\right]^{-1} = r^r$$
for $r$.
$$\begin{array}{c|ccc|l|l}
  & u & v & w & \text{Signing equation} & \text{Verification} \\ \hline
1 & h(m) & r & s & h(m) = ar + ks & g^{h(m)} = (g^a)^r r^s \\
2 & h(m) & s & r & h(m) = as + kr & g^{h(m)} = (g^a)^s r^r \\
3 & s & r & h(m) & s = ar + kh(m) & g^{s} = (g^a)^r r^{h(m)} \\
4 & s & h(m) & r & s = ah(m) + kr & g^{s} = (g^a)^{h(m)} r^r \\
5 & r & s & h(m) & r = as + kh(m) & g^{r} = (g^a)^s r^{h(m)} \\
6 & r & h(m) & s & r = ah(m) + ks & g^{r} = (g^a)^{h(m)} r^s
\end{array}$$
Table 1.1: Variations of the ElGamal signing equation. Signing equations are computed
modulo $(p-1)$; verification is done modulo $p$.
1.1.2 Two-cycles of the discrete logarithm
Another application of the discrete logarithm related to the self-power map is a pseudo-
random bit generator of Blum and Micali [BM84].
Let $g_0$ be a fixed generator of $(\mathbb{Z}/p\mathbb{Z})^*$. Associating the residue classes modulo $p-1$
with integers between 1 and $p$, we can define the following function
$$f(x) = g_0^x \bmod p$$
from the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^*$ to itself. Blum and Micali [BM84] describe a
pseudo-random bit generator which depends on an iteration
$$x,\ f(x),\ f^2(x),\ \ldots,\ f^N(x).$$
The pseudo-randomness of the sequence is compromised if the iteration falls into a fixed
point or a short cycle.
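To make the remark about fixed points and short cycles concrete, the following sketch iterates $f(x) = g_0^x \bmod p$ from a starting value and reports where the orbit begins to repeat. The function name `orbit` and the small prime used in the usage note are illustrative assumptions; this is not the Blum-Micali generator itself, only the underlying iteration.

```python
def orbit(g0, p, x):
    """Iterate f(x) = g0^x mod p starting from x until a value repeats.

    Returns (tail_length, cycle_length): the number of steps before the
    orbit enters its cycle, and the length of that cycle.
    """
    seen = {}          # value -> step at which it was first visited
    step = 0
    while x not in seen:
        seen[x] = step
        x = pow(g0, x, p)
        step += 1
    return seen[x], step - seen[x]
```

With $p = 11$ and $g_0 = 2$, the start value 7 is a fixed point (`orbit(2, 11, 7)` returns `(0, 1)`), and the start value 3 lies on a two-cycle, since $2^3 \equiv 8$ and $2^8 \equiv 3 \pmod{11}$.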
Cycles of length two of the discrete logarithm, of the form
$$g^h = a \bmod p \quad \text{and} \quad g^a = h \bmod p, \tag{1.1}$$
are related to collisions of the self-power map
$$h^h \equiv a^a \bmod p. \tag{1.2}$$
Given a solution $(g, h, a)$ to equation (1.1), the values $h$ and $a$ also satisfy equation
(1.2). In certain cases, solutions to (1.2) also produce solutions to (1.1). Holden and
his co-authors study the relationship between two-cycles of the discrete logarithm and
collisions of the self-power map modulo $p$ in [Hol02b, Hol02a, HM06]. We reproduce the
following proposition from [HM06].
Proposition 1.3. If $\gcd(h, a, p-1) = 1$, then there is a one-to-one correspondence
between triples $(g, h, a)$ which satisfy (1.1) and pairs $(h, a)$ which satisfy (1.2), and the
value of $g$ is unique given $h$ and $a$. In particular, this is true when $h$ or $a$ is relatively
prime to $p-1$.
Proof. Fix an arbitrary primitive root $b \bmod p$, and denote by $\operatorname{ind} x$ the discrete logarithm
modulo $p$ of $x$ with respect to the base $b$. That is,
$$\operatorname{ind} x = y \iff b^y \equiv x \pmod p.$$
The system of equations (1.1) is then equivalent to the following system
$$\begin{cases}
h \operatorname{ind} g \equiv \operatorname{ind} a \bmod (p-1) \\
a \operatorname{ind} g \equiv \operatorname{ind} h \bmod (p-1).
\end{cases} \tag{1.3}$$
If $\gcd(h, a, p-1) = d$, there exist $u$ and $v$ such that $uh + va \equiv d \bmod (p-1)$. Using
elementary operations on systems of equations, we can show with a bit of work that (1.3)
is equivalent to the following system:
$$\begin{cases}
0 \equiv \frac{a}{d}\operatorname{ind} a - \frac{h}{d}\operatorname{ind} h \bmod (p-1) \\
g^d \equiv h^v a^u \bmod p.
\end{cases} \tag{1.4}$$
In our case $\gcd(h, a, p-1) = 1$, and we obtain the equivalent system:
$$\begin{cases}
a^a \equiv h^h \bmod p \\
g \equiv h^v a^u \bmod p.
\end{cases} \tag{1.5}$$
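The relationship between two-cycles (1.1) and collisions (1.2) can be checked by brute force for a small prime. The sketch below uses hypothetical helper names and represents $h$ and $a$ by integers in $[1, p-1]$; it enumerates all triples satisfying (1.1) and tests the collision condition (1.2).

```python
from math import gcd


def two_cycles(p):
    """All triples (g, h, a), entries in 1..p-1, with g^h = a and g^a = h (mod p)."""
    triples = []
    for g in range(1, p):
        for h in range(1, p):
            a = pow(g, h, p)
            if pow(g, a, p) == h:
                triples.append((g, h, a))
    return triples


def is_collision(h, a, p):
    """Check the self-power collision h^h = a^a (mod p)."""
    return pow(h, h, p) == pow(a, a, p)


def coprime_triple(h, a, p):
    """The hypothesis gcd(h, a, p - 1) = 1 of Proposition 1.3."""
    return gcd(gcd(h, a), p - 1) == 1
```

For $p = 11$ the triple $(g, h, a) = (2, 3, 8)$ appears in `two_cycles(11)`, and $3^3 \equiv 8^8 \equiv 5 \pmod{11}$ is the corresponding collision.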
Many heuristics for the number of solutions of two-cycles of the discrete logarithm
and collisions of the self-power map modulo a prime are given in Holden's work.
1.2 Past work on the self-power map
While we have described the cryptographic applications related to the self power map
and its image modulo a prime, the behaviour of the self-power map is an interesting
problem in its own right.
Past work on the self-power map was limited to the case $1 \le n \le p-1$. We begin
our study with the simple question:
Question 1.4. Given any residue class $a$ modulo $p$, does there exist $n$ such that
$n^n \equiv a \bmod p$ and $1 \le n \le p-1$?
The answer is no, not necessarily. When restricted to $1 \le n \le p-1$ the self-power
map is not injective, and thus not surjective, since the residue 1 appears at least twice:
$$1^1 \equiv (p-1)^{(p-1)} \equiv 1 \bmod p.$$
The earliest results on the self-power map are due to Crocker [Cro69], who gives the
lower bound on the number of distinct residue classes that are represented by
$\{n^n : 1 \le n \le p-1\}$.
Theorem 1.5. Let $p$ be an odd prime. The number of distinct residues modulo $p$ of $n^n$,
where $1 \le n \le p-1$, is at least
$$\left\lfloor \sqrt{(p-1)/2} \right\rfloor.$$
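As a quick numerical illustration, the number of distinct residues of $n^n \bmod p$ for $1 \le n \le p-1$ can be counted directly and compared with a lower bound. The sketch assumes that the bound in Theorem 1.5 reads $\lfloor \sqrt{(p-1)/2} \rfloor$; the function names are hypothetical.

```python
from math import isqrt


def distinct_self_powers(p):
    """Number of distinct residues n^n mod p for 1 <= n <= p - 1."""
    return len({pow(n, n, p) for n in range(1, p)})


def crocker_lower_bound(p):
    """floor(sqrt((p - 1) / 2)) for an odd prime p (assumed form of the bound)."""
    return isqrt((p - 1) // 2)
```

For $p = 11$ the residues are $1, 4, 5, 3, 1, 5, 6, 5, 5, 1$, so only five classes are represented, well above the lower bound of 2.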
Somer [Som81] gives an upper bound.
Theorem 1.6. Let $p$ be an odd prime. The number of distinct residues modulo $p$ of $n^n$,
where $1 \le n \le p-1$, is at most $\frac{3}{4}p + O\left(p^{\frac{1}{2}+\varepsilon}\right)$, where $\varepsilon$ is any positive real number and
the implicit constant depends only on $\varepsilon$.
The upper bound is proved by bounding the number of quadratic non-residues which do
not appear among the residues of $n^n \bmod p$ when $1 \le n \le p-1$. The constant is related
to the class number of $\mathbb{Q}(\sqrt{-p})$.
Somer finds lower bounds for the number of primitive roots modulo $p$ which do not
appear as residues of $n^n \bmod p$. Given any $M \in \mathbb{N}$, he shows that the residue 1 (or $-1$)
appears at least $M$ times as $n^n \bmod p$ for some large prime $p$. Thus a related question is
the following:
Question 1.7. What is the number of times that a residue class may be represented in
the form $n^n \bmod p$ with $1 \le n \le p-1$?
More recently, Balog, Broughan, and Shparlinski [BBS11] have given bounds on the
number of times a residue class appears, depending on its multiplicative order in $(\mathbb{Z}/p\mathbb{Z})^*$.
Let us introduce the following notation.
Definition 1.8. Denote by
$$N(x; a) = |\{n \le x : n^n \equiv a \pmod p\}|,$$
the number of solutions to $n^n = a \bmod p$ with $1 \le n \le x$.
Balog et al. [BBS11] give the following bounds.
Theorem 1.9. Uniformly over $t \mid (p-1)$ and all integers $a$ with $\gcd(a, p) = 1$ of
multiplicative order $t$, the following bounds hold as $p \to \infty$:
$$N(p; a) \le \max\left\{t,\ p^{1/2} t^{1/4}\right\} p^{o(1)},$$
$$N(p; a) \le p^{1/3+o(1)} t^{2/3},$$
$$N(p; a) \le p^{1+o(1)} t^{-1/12}.$$
The first two bounds are most useful when the order $t$ is small, while the last is
useful when $a$ has large order. In Chapter 2 our proofs will follow those of Balog et
al. very closely to extend these estimates to $N(x; a)$, for $p < x \le p(p-1)$.
1.3 Extension to larger values of n
As a variation of Question 1.4, we may ask:
Question 1.10. If we do not restrict $n$ to $1 \le n \le p-1$, can we find a solution to
$n^n \equiv a \bmod p$, for any $a$?
The answer is yes, if we allow $1 \le n \le p(p-1)$. Given any $a$ such that $1 \le a \le p-1$,
then $a$ may be expressed as $n^n$ by taking
$$n = a + p(p-a) = 1 + (p-1)(p+1-a)$$
because
$$n^n = (a + p(p-a))^{1+(p-1)(p+1-a)} \equiv a \bmod p.$$
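The explicit choice of $n$ above is easy to check numerically. The following sketch (the name `witness` is hypothetical) builds $n = a + p(p-a)$ and confirms both the alternative expression $1 + (p-1)(p+1-a)$ and the congruence $n^n \equiv a \bmod p$, which holds because $n \equiv a \pmod p$ and $n \equiv 1 \pmod{p-1}$.

```python
def witness(a, p):
    """Return n = a + p(p - a), which satisfies n^n = a (mod p) for 1 <= a <= p - 1."""
    return a + p * (p - a)


def check_witness(a, p):
    """Verify both forms of n and the congruence n^n = a (mod p)."""
    n = witness(a, p)
    same_form = n == 1 + (p - 1) * (p + 1 - a)
    return same_form and pow(n, n, p) == a
```

For example, `witness(2, 11)` is 101, and $101^{101} \equiv 2^{1} \equiv 2 \pmod{11}$.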
The sequence $\{n^n\}$ is periodic with period $p(p-1)$.
Given $n^n \bmod p$, the base is determined modulo $p$ and the exponent is determined
modulo $p-1$. Thus, we may organize the values of $n^n \bmod p$ where $\gcd(n, p) = 1$ as in
Table 1.2, by multiples of $p$. Reducing the base modulo $p$ and the exponent modulo $p-1$
gives us Table 1.3. An example for $p = 11$ is given in Table 1.4.
The number of times a residue $a \in (\mathbb{Z}/p\mathbb{Z})^*$ of order $t$ appears in Table 1.3 is
$$\sum_{k \mid \frac{p-1}{t}} \varphi(kt)\, \frac{p-1}{kt}, \tag{1.6}$$
from Theorem 1 of [Som81]. If $p-1$ is square free, it can also be estimated in the
following way, independently found by R. Balasubramanian [Bal11].
8/12/2019 Self Power Map
16/82
Chapter 1. Introduction 9
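Values of $n^n$, where $n = kp + b$
$$\begin{array}{l|cccc}
b & 1 & 2 & \cdots & p-1 \\ \hline
k = 0 & 1^1 & 2^2 & \cdots & (p-1)^{(p-1)} \\
k = 1 & (p+1)^{(p+1)} & (p+2)^{(p+2)} & \cdots & (2p-1)^{(2p-1)} \\
k = 2 & (2p+1)^{(2p+1)} & (2p+2)^{(2p+2)} & \cdots & (3p-1)^{(3p-1)} \\
k = 3 & (3p+1)^{(3p+1)} & (3p+2)^{(3p+2)} & \cdots & (4p-1)^{(4p-1)} \\
\vdots & & & & \vdots \\
k = p-2 & ((p-2)p+1)^{((p-2)p+1)} & ((p-2)p+2)^{((p-2)p+2)} & \cdots & ((p-1)p-1)^{((p-1)p-1)}
\end{array}$$
Table 1.2: The table of values of $n^n$, arranged by groups of $p$, omitting multiples of $p$.
Values of $n^n \bmod p$, where $n = kp + b$
$$\begin{array}{l|cccccc}
b & 1 & 2 & 3 & \cdots & p-2 & p-1 \\ \hline
k = 0 & 1^1 & 2^2 & 3^3 & \cdots & (p-2)^{p-2} & (p-1)^{p-1} \\
k = 1 & 1^2 & 2^3 & 3^4 & \cdots & (p-2)^{p-1} & (p-1)^{1} \\
k = 2 & 1^3 & 2^4 & 3^5 & \cdots & (p-2)^{1} & (p-1)^{2} \\
\vdots & & & & & & \vdots \\
k = p-2 & 1^{p-1} & 2^{1} & 3^{2} & \cdots & (p-2)^{p-3} & (p-1)^{p-2}
\end{array}$$
Table 1.3: The values of $n^n$ from Table 1.2 considered modulo $p$, where the base is reduced
modulo $p$, and the exponent modulo $p-1$.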
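The residue class $a$ will appear in any column where the order of $b$ is a multiple of the
order of $a$. Letting $\operatorname{ord} a = t$ and $\operatorname{ord} b = kt$, then $a$ will occur $(p-1)/kt$ times in this
column. Hence, the total number of times $a$ appears in the table is
$$\sum_{k \mid \frac{p-1}{t}} \varphi(kt)\, \frac{p-1}{kt}
= (p-1)\, \frac{\varphi(t)}{t} \sum_{k \mid \frac{p-1}{t}} \frac{\varphi(k)}{k}
= (p-1)\, \frac{\varphi(t)}{t} \cdot \frac{g\left(\frac{p-1}{t}\right)}{\frac{p-1}{t}}
= \varphi(t)\, g\left(\frac{p-1}{t}\right).$$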
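The column-counting argument can be verified directly for small primes. The sketch below builds the reduced table of values $b^{\,b+k} \bmod p$ (base $1 \le b \le p-1$, row $0 \le k \le p-2$), counts how often a residue $a$ occurs, and compares this with the sum in (1.6). All helper names are hypothetical, and the naive totient and order computations are only suitable for small $p$.

```python
from math import gcd


def euler_phi(m):
    """Euler's totient function, computed naively."""
    return sum(1 for j in range(1, m + 1) if gcd(j, m) == 1)


def mult_order(a, p):
    """Multiplicative order of a modulo the prime p (a not divisible by p)."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k


def table_count(a, p):
    """Occurrences of a among b^(b+k) mod p, exponent reduced mod p - 1."""
    return sum(
        1
        for b in range(1, p)
        for k in range(p - 1)
        if pow(b, (b + k) % (p - 1), p) == a
    )


def formula_count(a, p):
    """The sum over k | (p-1)/t of phi(kt) (p-1)/(kt), where t = ord a."""
    t = mult_order(a, p)
    m = (p - 1) // t
    return sum(
        euler_phi(k * t) * (p - 1) // (k * t)
        for k in range(1, m + 1)
        if m % k == 0
    )
```

For $p = 11$ and $a = 1$ both counts equal $10 + 5 + 8 + 4 = 27$, coming from the columns whose bases have order 1, 2, 5 and 10 respectively.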
The residues $n^n \bmod p$ in Table 1.3 are not equidistributed, but depend on the order
of $n$ in $(\mathbb{Z}/p\mathbb{Z})^*$. In particular, elements of lower order appear more frequently than
those of higher order. Holden [HR11] conjectured that
$$N(p; a) \approx \sum_{k \mid \frac{p-1}{t}} \frac{\varphi(kt)}{kt}$$
which is $1/(p-1)$-th the number of times the element $a$ appears in the entire table, over
one period of the sequence $\{n^n\}$.
The question we study in the second part of the thesis is the following:
Question 1.12. How large should $x$ be so that $n^n \equiv a \bmod p$ has a solution with
$1 \le n \le x$, for any given residue class $a$ modulo $p$?
The trivial bounds for $x$ are $p < x \le p(p-1)$. A probabilistic model, described in
Chapters 3 and 4, provides a heuristic argument to show that the value of $x$ is approximately
$(p-1)^2 \log \varphi(p-1)/\varphi(p-1)$. In Chapter 5, we rigorously prove that $x < p^{2-\alpha}$
for large enough $p$, for an explicit value $\alpha > 0$, which is independent of $p$.
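For small primes the value of $x$ in Question 1.12 can simply be computed. The sketch below (the name `minimal_x` is hypothetical) returns the least $x$ such that every nonzero residue class occurs as $n^n \bmod p$ for some $1 \le n \le x$; restricting to nonzero classes is an assumption made here, since the class 0 is reached at $n = p$.

```python
def minimal_x(p):
    """Least x such that {n^n mod p : 1 <= n <= x} contains every class 1..p-1."""
    missing = set(range(1, p))
    n = 0
    while missing:
        n += 1
        missing.discard(pow(n, n, p))
    return n
```

For $p = 11$ the last class to appear is $a = 2$, first reached at $n = 39$ because $39 \equiv 6 \pmod{11}$, $39 \equiv 9 \pmod{10}$ and $6^9 \equiv 2 \pmod{11}$; this lies within the trivial bounds $11 < x \le 110$.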
Chapter 2
Bounds on N(x; a)
2.1 Introduction
Recall that $N(x; a)$ denotes the number of solutions to the congruence $n^n = a \bmod p$
with $1 \le n \le x$ for a fixed residue class $a$ modulo $p$. Balog et al. [BBS11] give estimates
for $N(p; a)$ and we extend those estimates to $N(x; a)$ for $x > p$. The sequence $n^n$ has
period $p(p-1)$. Thus, we may consider only $p < x < p(p-1)$ since if $x$ is larger than
$p(p-1)$ we can add an appropriate factor of $N(p(p-1); a)$, evaluated using equation
(1.6) or Prop. 1.11.
The method of proof is completely analogous to that of Balog et al. [BBS11]. The
value $N(p; a)$ is the number of occurrences of the residue class $a \bmod p$ in the first row
of Table 1.3. In one case, we obtain a bound for $N(x; a)$ which is approximately $(x/p)$
times the bound for $N(p; a)$, as expected; the residue class $a \bmod p$ should appear about
equally often in each row. Thus, if $a = 1$, for instance,
$$N(x; 1) \ll x^{1+\varepsilon} p^{-2/3}$$
from Proposition 2.3, with $t = 1$.
2.2 Estimates for N(x; a) for a of small order
This section describes two estimates for $N(x; a)$ that are non-trivial when $a$ has relatively
small multiplicative order. For any element $a \in (\mathbb{Z}/p\mathbb{Z})^*$, and given a fixed primitive
root $g$, let $\operatorname{ind} a$ denote the discrete logarithm modulo $p$ of $a$ with respect to the base $g$.
That is,
$$\operatorname{ind} a = b \iff g^b \equiv a \bmod p.$$
Aside from elementary number arguments, Proposition 2.2 also requires the following
sum-product result of Garaev ([Gar08], Theorem 1). Let $A$ be a subset of $(\mathbb{Z}/p\mathbb{Z})^*$.
Define the sets
$$A + A = \{a_1 + a_2 : a_1, a_2 \in A\}$$
$$A \cdot A = \{a_1 a_2 : a_1, a_2 \in A\}.$$
Lemma 2.1. For any set $A \subseteq (\mathbb{Z}/p\mathbb{Z})^*$,
$$|A + A|\,|A \cdot A| \gg \min\left\{p|A|,\ \frac{|A|^4}{p}\right\}.$$
Proposition 2.2. Uniformly int|(p 1), we have forx p1+
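$$\sum_{\substack{a \in (\mathbb{Z}/p\mathbb{Z})^* \\ \operatorname{ord} a \mid t}} N(x; a) \ll x p^{-1+\varepsilon} t^{1-\varepsilon} + x^{1/4} p^{1/4+\varepsilon} t^{1/4-\varepsilon} + x p^{-1/2+\varepsilon} t^{1/4-\varepsilon}. \tag{2.1}$$
Proof. Fix a primitive root $g \bmod p$. We have
$$\sum_{\substack{a \in (\mathbb{Z}/p\mathbb{Z})^* \\ \operatorname{ord} a \mid t}} N(x; a)
= \sum_{\substack{a \in (\mathbb{Z}/p\mathbb{Z})^* \\ \operatorname{ord} a \mid t}} \ \sum_{\substack{n \le x \\ n^n \equiv a \bmod p}} 1
= \sum_{\substack{n \le x \\ n^{tn} \equiv 1 \bmod p}} 1.$$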
8/12/2019 Self Power Map
21/82
Chapter 2. Bounds on N(x; a) 14
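We use $g$ as the base of the discrete logarithm, so that we can express $n = g^{\operatorname{ind} n}$. Then
starting from the equivalence $n^{tn} \equiv 1 \bmod p$ we obtain:
$$n^{tn} \equiv 1 \pmod p \iff g^{(\operatorname{ind} n)tn} \equiv 1 \pmod p \iff (\operatorname{ind} n)\,tn \equiv 0 \pmod{p-1}.$$
Set $T = \frac{p-1}{t}$. Then the last condition implies
$$n(\operatorname{ind} n) \equiv 0 \pmod T.$$
Fix $d \mid T$ and set $T_d = \frac{T}{d}$. If $\gcd(n, T) = d$, then write $n = dy$ and
$$y(\operatorname{ind} dy) \equiv 0 \pmod{T_d} \iff \operatorname{ind} dy \equiv 0 \pmod{T_d}$$
since
$$\gcd(y, T_d) = 1, \quad 1 \le y \le D = \frac{x}{d}.$$
Set
$$X_d = \{y \in \mathbb{Z},\ 1 \le y \le D : \operatorname{ind} dy \equiv 0 \pmod{T_d},\ \gcd(y, T_d) = 1\}$$
$$W_d = \{y \pmod p : y \in X_d\}.$$
Our goal is to estimate the sum
$$\sum_{\substack{a \in (\mathbb{Z}/p\mathbb{Z})^* \\ \operatorname{ord} a \mid t}} N(x; a) = \sum_{d \mid T} |X_d|.$$
Since
$$|X_d| \le \left(\frac{D}{p} + 1\right)|W_d| = \left(\frac{x}{dp} + 1\right)|W_d|,$$
we will bound $|X_d|$, and hence the sum, by first bounding $|W_d|$.
The trivial upper bound for $|W_d|$ is $dt$. This would give the bound
$$\sum_{\substack{a \in (\mathbb{Z}/p\mathbb{Z})^* \\ \operatorname{ord} a \mid t}} N(x; a) = \sum_{d \mid T} |X_d| \le \frac{xt}{p} \sum_{d \mid T} 1 + t \sum_{d \mid T} d.$$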
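The sums $\sum_{d \mid T} 1$ and $\sum_{d \mid T} d$ will occur several times in the following proofs. Sometimes
we will denote the sum representing the number of divisors by $d(T) = \sum_{d \mid T} 1$. We will
use the bounds
$$\sum_{d \mid T} 1 \ll T^{\varepsilon} \quad \text{and} \quad \sum_{d \mid T} d \ll T^{1+\varepsilon}. \tag{2.2}$$
Thus, we have
$$\sum_{\substack{a \in (\mathbb{Z}/p\mathbb{Z})^* \\ \operatorname{ord} a \mid t}} N(x; a) \ll x p^{-1+\varepsilon} t^{1-\varepsilon} + p^{1+\varepsilon} t^{-\varepsilon}. \tag{2.3}$$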
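We obtain a stronger bound using Lemma 2.1, for certain $t$ and $x$. If $y_1, y_2 \in W_d$,
then
$$dy_1 \equiv g^{T_d \lambda_1} \pmod p, \qquad dy_2 \equiv g^{T_d \lambda_2} \pmod p$$
so that
$$d^2 y_1 y_2 \equiv g^{T_d(\lambda_1 + \lambda_2)} \pmod p.$$
There are $(p-1)/T_d = dt$ possibilities for the sum $(\lambda_1 + \lambda_2)$, hence also for the product
$y_1 y_2$. Thus
$$|W_d \cdot W_d| \le dt.$$
Furthermore,
$$|W_d + W_d| \le |X_d + X_d| \le \frac{2x}{d}$$
and
$$|W_d + W_d| \le p,$$
since the sum is defined in $\mathbb{Z}/p\mathbb{Z}$. Hence $|W_d + W_d| \le \min\{p, (2x)/d\}$.
Substituting the bounds for $|W_d + W_d|$ and $|W_d \cdot W_d|$ into Lemma 2.1 results in four
estimates for $|W_d|$,
$$p|W_d| \ll \begin{cases}
pdt & \text{if } d < \frac{2x}{p},\ |W_d| \ge p^{2/3} \\
2xt & \text{if } d \ge \frac{2x}{p},\ |W_d| \ge p^{2/3}
\end{cases}
\qquad
\frac{|W_d|^4}{p} \ll \begin{cases}
pdt & \text{if } d < \frac{2x}{p},\ |W_d| < p^{2/3} \\
2xt & \text{if } d \ge \frac{2x}{p},\ |W_d| < p^{2/3}.
\end{cases}$$
Simplifying, we obtain
$$|W_d| \ll \begin{cases}
dt & \text{if } d < \frac{2x}{p},\ |W_d| \ge p^{2/3} \\
\frac{xt}{p} & \text{if } d \ge \frac{2x}{p},\ |W_d| \ge p^{2/3} \\
p^{1/2} d^{1/4} t^{1/4} & \text{if } d < \frac{2x}{p},\ |W_d| < p^{2/3} \\
x^{1/4} p^{1/4} t^{1/4} & \text{if } d \ge \frac{2x}{p},\ |W_d| < p^{2/3}.
\end{cases}$$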
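Notice that if $d > \frac{2x}{p}$, then $\frac{x}{dp}$ [...] $0$ implies that $b \in W_d \cdot W_d$. Hence
$$|X_d|^2 \le \sum_{b \in \mathbb{Z}/p\mathbb{Z}} s(b) \le |W_d \cdot W_d| \max s(b).$$
For a given $b$, since $y_1 y_2 \le D^2 = x^2/d^2$ and $y_1 y_2 = b + kp$, the product $y_1 y_2$ can take at
most
$$\frac{x^2}{pd^2} + 1$$
values. Once $z$ such that $y_1 y_2 = z$ is fixed, there are
$$d(z) = z^{o(1)} \le \left(\frac{x}{d}\right)^{2 o(1)} \ll x^{\varepsilon}$$
possibilities for the pair $(y_1, y_2)$. Hence
$$s(b) \ll \left(\frac{x^2}{pd^2} + 1\right) x^{\varepsilon}.$$
As in the proof of Proposition 2.2, we have $|W_d \cdot W_d| \le dt$.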
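Therefore
$$|X_d|^2 \ll dt \left(\frac{x^2}{pd^2} + 1\right) x^{\varepsilon}, \qquad
|X_d| \ll x^{\varepsilon} \left(\frac{x\sqrt{t}}{\sqrt{pd}} + \sqrt{td}\right).$$
On the other hand, from the condition $\operatorname{ind}(dy) \equiv 0 \pmod{T_d}$ we have
$$|X_d| \le \left(\frac{x}{p} + 1\right) dt.$$
From the condition $y \le D$, we have also
$$|X_d| \le D = \frac{x}{d}.$$
We choose the intervals $d \le p^{1/3} t^{-1/3}$, $p^{1/3} t^{-1/3} \le d \le p^{2/3} t^{-1/3}$, and $d \ge p^{2/3} t^{-1/3}$ as in the
proof of Balog et al. These intervals are chosen such that the bounds all have similar
(though not equal) orders of magnitude. We conclude that
$$|X_d| \ll \begin{cases}
\left(\frac{x}{p} + 1\right) dt & \text{if } d \le p^{1/3} t^{-1/3} \\
x^{\varepsilon} \left(\frac{x\sqrt{t}}{\sqrt{pd}} + \sqrt{td}\right) & \text{if } p^{1/3} t^{-1/3} \le d \le p^{2/3} t^{-1/3} \\
\frac{x}{d} & \text{if } d \ge p^{2/3} t^{-1/3}
\end{cases}$$
which simplifies to
$$|X_d| \ll \begin{cases}
x p^{-2/3} t^{2/3} & \text{if } d \le p^{1/3} t^{-1/3} \\
x^{1+\varepsilon} p^{-2/3} t^{2/3} & \text{if } p^{1/3} t^{-1/3} \le d \le p^{2/3} t^{-1/3} \\
x p^{-2/3} t^{1/3} & \text{if } d \ge p^{2/3} t^{-1/3}.
\end{cases}$$
We will use the largest bound above, $|X_d| \ll x^{1+\varepsilon} p^{-2/3} t^{2/3}$, in the sum. Again, we apply
the number of divisors bound (2.2). Thus, for $x \ge p$,
$$\sum_{\substack{a \in (\mathbb{Z}/p\mathbb{Z})^* \\ \operatorname{ord} a \mid t}} N(x; a) = \sum_{d \mid T} |X_d| \ll T^{\varepsilon} x^{1+\varepsilon} p^{-2/3} t^{2/3} \ll x^{1+\varepsilon} p^{-2/3} t^{2/3}.$$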
The discrepancy of a sequence gives a quantitative and more general measure of
uniform distribution.
Definition 3.3. With notation as above, the discrepancy of the sequence $\{x_n\}_{n=1}^{N}$ in
$[0, 1)^k$ is defined as
$$\Delta = \sup_{B \subseteq [0,1)^k} \left|\frac{N(B)}{N} - |B|\right|$$
where the supremum runs over all boxes $B$ inside the $k$-dimensional unit cube.
Thus, the discrepancy measures the largest deviation of $N(B)$ from its expected value.
The trivial bound for discrepancy is 1. We may rescale a sequence so that its terms fall
in the unit cube, and a uniformly distributed sequence will have discrepancy tending to
zero as the number of terms $N$ tends to infinity. However, the bound of discrepancy for
finite sequences is also a useful, quantitative indicator of how scattered the sequence is
within a region.
Again exponential sums are key, since a bound on $\sum_{n=1}^{N} e(h \cdot x_n)$ gives a bound for
the discrepancy of a sequence, from the following important theorem.
Theorem 3.4 (Erdős-Turán Inequality). Let $\{x_n\}_{n=1}^{N}$ be points in the $k$-dimensional unit
cube and $M$ an arbitrary positive integer. Then
$$\Delta \le \left(\frac{3}{2}\right)^{k} \left(\frac{2}{M+1} + \sum_{0 < \cdots} \cdots\right)$$
3.2 The classical coupon collector problem
The coupon collector problem asks the following question: if m coupons are collected
with replacement, how many trials are needed in order to collect all m coupons? For
example, if cereal boxes randomly contain one ofm coupons, how many boxes must be
bought in order to collect at least one of every coupon?
The motivation for considering the coupon collector problem is simply the ideal scenario.
If elements were randomly distributed in Table 1.3, then the expected $x$ in Question
1.12 would be the same as the expected number of random draws required to obtain all
residue classes. In our case the cereal boxes correspond to the $n$'s (or a subsequence
thereof) and the coupons to the residue classes $a \bmod p$.
In Table 1.3 the residue classes occur with different frequencies, depending on their
order. However, we begin by describing the classical coupon collector model which assumes
that every coupon occurs with equal probability. The following calculation gives
the expected time, or number of draws, to obtain a full collection [Isa95, p. 80-82].
Proposition 3.5. Consider $m$ different objects that are drawn with replacement. Suppose
each object occurs with equal probability $1/m$. The expected number of draws required to
collect all $m$ objects is
$$E(T) = m \log m + \gamma m + \frac{1}{2} + o(1).$$
Proof. Let $T$ be the time (or number of trials) required to collect all $m$ coupons, and let
$t_i$ be the time to collect the $i$-th new coupon given that $i-1$ coupons have been collected.
The $T$ and $t_i$ are random variables. The probability of collecting a new coupon given
$i-1$ collected coupons is
$$p_i = \frac{m - i + 1}{m}.$$
The $t_i$ has geometric distribution with expectation
$$E(t_i) = \frac{1}{p_i}.$$
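(For instance if the probability of an event is $\frac{1}{3}$, then the expected number of trials to
obtain a success is 3.) We have
$$\begin{aligned}
E(T) &= E(t_1) + E(t_2) + \ldots + E(t_m) \\
     &= \frac{1}{p_1} + \frac{1}{p_2} + \ldots + \frac{1}{p_m} \\
     &= \frac{m}{m} + \frac{m}{m-1} + \ldots + \frac{m}{1} \\
     &= m \left(1 + \frac{1}{2} + \ldots + \frac{1}{m}\right) \\
     &= m H_m \\
     &= m \log m + \gamma m + \frac{1}{2} + o(1).
\end{aligned}$$
In the last two lines we used the notation $H_m$ for the $m$-th harmonic number, the
sum of the reciprocals of the first $m$ natural numbers, and the bound
$$\frac{1}{2m} - \frac{1}{8m^2} < H_m - \log m - \gamma < \frac{1}{2m} \tag{3.1}$$
where $\gamma$ is the Euler-Mascheroni constant [PS72, Ex. 18].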
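The expectation $E(T) = m H_m$ and the bound (3.1) can be checked numerically. The sketch below computes $H_m$ exactly with rational arithmetic and compares $m H_m$ with the asymptotic expression $m \log m + \gamma m + \frac{1}{2}$; the function names and the hard-coded floating-point value of $\gamma$ are choices made for this illustration.

```python
from fractions import Fraction
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, double precision


def harmonic(m):
    """Exact m-th harmonic number H_m = 1 + 1/2 + ... + 1/m."""
    return sum(Fraction(1, i) for i in range(1, m + 1))


def expected_draws(m):
    """Exact expected number of draws m * H_m to collect all m coupons."""
    return m * harmonic(m)


def asymptotic_draws(m):
    """The approximation m log m + gamma m + 1/2."""
    return m * log(m) + EULER_GAMMA * m + 0.5
```

For example, with $m = 3$ coupons the exact expectation is $3(1 + \frac{1}{2} + \frac{1}{3}) = \frac{11}{2}$ draws, and for $m = 100$ the exact value and the asymptotic expression differ by less than $0.01$.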
The following limit theorem for the probability distribution of time $T$ until a complete
collection is due to Erdős and Rényi [ER61].
Theorem 3.6. For any $t \in \mathbb{R}$, we have that:
$$P(T \le m \log m + tm) \to e^{-e^{-t}} \quad \text{as } m \to \infty.$$
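The limit law can be compared with the exact distribution for a moderate number of coupons. The sketch below computes $P(T \le n)$ by inclusion-exclusion over the set of coupons still missing after $n$ draws, a standard identity that is not taken from the text, and evaluates the limiting value $e^{-e^{-t}}$. Function names are hypothetical and floating-point arithmetic is used throughout.

```python
from math import comb, exp, floor, log


def prob_complete_by(n, m):
    """Exact P(T <= n): all m equally likely coupons seen within n draws."""
    return sum(
        (-1) ** j * comb(m, j) * (1 - j / m) ** n
        for j in range(m + 1)
    )


def limit_cdf(t):
    """The limiting distribution function exp(-exp(-t)) of Theorem 3.6."""
    return exp(-exp(-t))


def draws_for(t, m):
    """The integer number of draws floor(m log m + t m)."""
    return floor(m * log(m) + t * m)
```

With $m = 200$ coupons, `prob_complete_by(draws_for(0.0, 200), 200)` is already close to `limit_cdf(0.0)`, which equals $e^{-1}$.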
3.3 Powers of primitive roots and a first estimate of x
The first, simple model considers a subsequence of $\{n^n \pmod p\}_n$ that is equidistributed
as $p$ tends to infinity. The congruence $n^n \equiv a \pmod p$ seems most difficult to solve if
$a$ is a primitive root, as $n$ must be primitive as well. Thus, starting from Table 1.3, we
consider only the columns which correspond to primitive elements, and only the first $R$
rows corresponding to $k = 0, \ldots, R-1$.
Let $g_1, g_2, \ldots, g_{\varphi(p-1)}$ be all the primitive roots modulo $p$. We will show that the
elements $\left\{g_i^{g_i+k}\right\}_{k,i}$, listed in Table 3.1, are equidistributed modulo $p$ as $p$ tends to infinity,
when $R$ is greater than $p^{\delta}$.
Columns corresponding to primitive roots
$$\begin{array}{l|ccccc}
\text{Primitive roots} & g_1 & g_2 & g_3 & \cdots & g_{\varphi(p-1)} \\ \hline
k = 0 & g_1^{g_1} & g_2^{g_2} & g_3^{g_3} & \cdots & g_{\varphi(p-1)}^{g_{\varphi(p-1)}} \\
k = 1 & g_1^{g_1+1} & g_2^{g_2+1} & g_3^{g_3+1} & \cdots & g_{\varphi(p-1)}^{g_{\varphi(p-1)}+1} \\
k = 2 & g_1^{g_1+2} & g_2^{g_2+2} & g_3^{g_3+2} & \cdots & g_{\varphi(p-1)}^{g_{\varphi(p-1)}+2} \\
k = 3 & g_1^{g_1+3} & g_2^{g_2+3} & g_3^{g_3+3} & \cdots & g_{\varphi(p-1)}^{g_{\varphi(p-1)}+3} \\
\vdots & & & & & \vdots \\
k = R-1 & g_1^{g_1+R-1} & g_2^{g_2+R-1} & g_3^{g_3+R-1} & \cdots & g_{\varphi(p-1)}^{g_{\varphi(p-1)}+R-1}
\end{array}$$
Table 3.1: The residues modulo $p$ of the self-power map $n^n$ from Table 1.3, where we
consider only the columns corresponding to primitive roots, and only the first $R$ rows.
The following is Corollary 1 in [BGK06].
Theorem 3.7. There exist positive constants $C_1$ and $C_2$ such that for any $\delta > 0$, and
positive integer $t \ge p^{\delta}$, and any element $g \in (\mathbb{Z}/p\mathbb{Z})^*$ such that the multiplicative order
of $g$ is greater than $t$, we have
$$\max_{a \in (\mathbb{Z}/p\mathbb{Z})^*} \left|\sum_{k=0}^{t-1} \exp\left(\frac{2\pi i a g^k}{p}\right)\right| \le t p^{-\gamma}, \qquad \gamma = \exp\left(-C_1/\delta^{C_2}\right).$$
Proposition 3.8. Given $\delta > 0$ and $R = p^{\delta}$, then elements $\left\{g_i^{g_i+k}\right\}_{k,i}$ in Table 3.1
form an equidistributed sequence as $p$ tends to infinity and the discrepancy of the sequence
is bounded by
$$\Delta \le C p^{-\gamma} \log p + O\left(\frac{1}{p}\right).$$
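Proof. According to Weyl's criterion, Theorem 3.2, we bound the following exponential sum where
$h$ is nonzero modulo $p$.
$$\begin{aligned}
\sum_{i=1}^{\varphi(p-1)} \sum_{k=0}^{R-1} \exp\left(\frac{2\pi i h g_i^{g_i+k}}{p}\right)
&= \sum_{k=0}^{R-1} \exp\left(\frac{2\pi i h g_1^{g_1} g_1^{k}}{p}\right) + \ldots + \sum_{k=0}^{R-1} \exp\left(\frac{2\pi i h g_{\varphi(p-1)}^{g_{\varphi(p-1)}} g_{\varphi(p-1)}^{k}}{p}\right) \\
&= \sum_{k=0}^{R-1} \exp\left(\frac{2\pi i h_1 g_1^{k}}{p}\right) + \ldots + \sum_{k=0}^{R-1} \exp\left(\frac{2\pi i h_{\varphi(p-1)} g_{\varphi(p-1)}^{k}}{p}\right)
\end{aligned}$$
In the third line, we have set $h g_i^{g_i} = h_i$, as they are independent of $k$. Each $g_i$ has order
$p-1$, thus the order of $g_i$ is clearly greater than $R$. The conditions of Theorem 3.7
are satisfied and we have
$$\left|\sum_{i=1}^{\varphi(p-1)} \sum_{k=0}^{R-1} \exp\left(\frac{2\pi i h g_i^{g_i+k}}{p}\right)\right| \le R p^{-\gamma} + \ldots + R p^{-\gamma} = \varphi(p-1) R p^{-\gamma}.$$
This sum is of the order $o(\varphi(p-1)R)$, where $\varphi(p-1)R$ is the total number of terms,
thus the sequence $\left\{g_i^{g_i+k}\right\}_{k,i}$ is equidistributed by Weyl's criterion, Theorem 3.2.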
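The double exponential sum in this proof can be evaluated numerically for a small prime. The sketch below (hypothetical function names, brute-force search for primitive roots) computes the sum over the first $R$ rows of the primitive-root columns. One exact check is available: when $R = p-1$, every column runs through all nonzero residues once, so each inner sum equals $-1$ and the total is $-\varphi(p-1)$.

```python
import cmath


def primitive_roots(p):
    """All primitive roots modulo the prime p, found by computing orders."""
    roots = []
    for g in range(1, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            roots.append(g)
    return roots


def weyl_sum(h, p, R):
    """Sum of exp(2 pi i h g^(g+k) / p) over primitive roots g and 0 <= k < R."""
    total = 0
    for g in primitive_roots(p):
        for k in range(R):
            total += cmath.exp(2j * cmath.pi * h * pow(g, g + k, p) / p)
    return total
```

For $p = 23$ there are $\varphi(22) = 10$ primitive roots, and `weyl_sum(1, 23, 22)` equals $-10$ up to rounding error.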
We may use the Erdős-Turán Inequality, Theorem 3.4, with $M = p-1$ to estimate
the discrepancy:
$$\Delta \le C \left(\frac{2}{p} + \sum_{0 < \cdots} \cdots\right)$$
[...] as some element of the sequence. We are taking $\varphi(p-1)R$ terms of the sequence, and
using Theorem 3.6, we obtain
$$P(\varphi(p-1)R \cdots$$
Figure 3.2: Comparison between the cumulative distribution function derived from the
coupon collector problem and the empirical distribution function for the variables $X_p$
from equation (3.3). On the right is the graph considering only primes of the form
$p = 2q + 1$, for some prime $q$; in this case, there is a translation of approximately 0.6
units between the curves.
In Chapter 4, we will instead refine the model to incorporate the non-uniform distri-
bution of elements in the entire Table1.3. The new model will correct the left-shift in
the cumulative distribution function.
Before continuing, it is worthwhile to give a final motivation for the preliminary model.
The elements in columns corresponding to primitive roots follow a simple pattern. If $g$
is a primitive root, the elements in the column corresponding to $g$ are
\[
g^{g},\ g^{g+1},\ g^{g+2},\ \dots,\ g^{g-1}.
\]
Given another primitive root, $g_1 = g^r$, the elements in the column corresponding to $g^r$
are
\begin{align*}
& g^{r g^r},\ g^{r(g^r+1)},\ g^{r(g^r+2)},\ \dots,\ g^{r(g^r-1)} \\
={}& g^{r g^r},\ g^{r g^r + r},\ g^{r g^r + 2r},\ \dots,\ g^{r(g^r-1)} \\
={}& g^{s},\ g^{s+r},\ g^{s+2r},\ \dots,\ g^{s+(p-2)r},
\end{align*}
where in the last line we let $s \equiv r g^r \pmod{p-1}$. Thus, the elements in the column
corresponding to $g^r$ are every $r$th element from the column for $g$, starting from $g^s$.
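The "every $r$th element" pattern can be checked directly for a small prime. The following is a minimal sketch, assuming $p = 23$, $g = 5$ and $r = 3$ (the values used in Figure 3.3); the helper names `column` and `every_rth` are introduced here for illustration only.

```python
def column(b, p):
    """Column of the n^n table for base b: b^(b+k) mod p for k = 0, ..., p-2."""
    return [pow(b, b + k, p) for k in range(p - 1)]


def every_rth(col_g, g, r, p):
    """Read every r-th entry (cyclically) of the column for g, starting from g^s,
    where s = r * g^r mod (p - 1)."""
    g_r = pow(g, r, p)                # integer representative of g^r
    s = (r * g_r) % (p - 1)
    start = (s - g) % (p - 1)         # row k of the g-column with g^(g+k) = g^s
    return [col_g[(start + j * r) % (p - 1)] for j in range(p - 1)]


p, g, r = 23, 5, 3                    # assumed example values
g_r = pow(g, r, p)                    # 5^3 = 125 = 10 (mod 23)
same = column(g_r, p) == every_rth(column(g, p), g, r, p)
print(g_r, same)
```

The comparison uses only the fact that exponents of $g$ may be reduced modulo $p - 1$.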
We have an isomorphism between $(\mathbb{Z}/p\mathbb{Z})^*$ and $\mu_{p-1}$, the group of $(p-1)$st roots of unity, where we map the generator $g$ to $e^{2\pi i/(p-1)}$. The problem of obtaining all the residue classes
transforms into one of covering all the elements of $\mu_{p-1}$ on the unit circle. The numbers
in each column corresponding to a primitive root $g^r$, where $\gcd(r, p-1) = 1$, are mapped to roots of unity in hops of length $r$ about the unit circle $\mu_{p-1}$.
Figure 3.3: An illustration of the isomorphism between $(\mathbb{Z}/23\mathbb{Z})^*$ and $\mu_{22}$ where $5 \pmod{23}$ is mapped to $e^{2\pi i/22}$. The first elements $5^5$, $5^6$, $5^7$, and $5^8 \pmod{23}$ of the column
corresponding to 5 are indicated. The diagram on the left also includes the first elements $10^{10}$, $10^{11}$, $10^{12}$, $10^{13} \pmod{23}$ of the column corresponding to $10 \equiv 5^3 \pmod{23}$, which appear in hops of length 3 on the circle.
This is reminiscent of the equidistribution of the fractional part of $n\alpha$ in the unit
interval when $\alpha$ is an irrational number. While the sequence $\omega = \left( g_i^{g_i+k} \right)_{k,i}$ is equidistributed
only as $p$ tends to infinity, in both cases the initial elements of the sequence
follow a pattern, yet the sequence fills or appears to fill the interval of interest.
Chapter 4
Expectation of x and its cumulative distribution
In this chapter we give a detailed heuristic argument to find the expected $x$ such that
$n^n \equiv a \bmod p$ has a solution $1 \le n \le x$ for any given residue class $a \bmod p$. This expected value of $x$ has main term $(p-1)^2 \log \varphi(p-1)/\varphi(p-1)$.
The argument consists of two parts. First, we give a variation of the coupon collector
problem, including blank coupons which need not be collected. This model provides a
better estimate for the expected x. Secondly, using this value for the expectation, we
calculate the probability distribution for the number of draws until a complete collection.
After a suitable normalization, we obtain the double exponential distribution $e^{-e^{-t}}$, as
in the classical coupon collector case. The saddle point method is used in the calculation
of the probability distribution.
These heuristics are well-supported by the computational results.
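For small primes the quantity $x$ can be computed by brute force. The sketch below is illustrative only: the function name `smallest_full_image` is hypothetical, and it restricts to nonzero residue classes (the class 0 is first attained at $n = p$).

```python
def smallest_full_image(p):
    """Smallest x such that {n^n mod p : 1 <= n <= x} contains every
    nonzero residue class modulo the prime p."""
    seen = set()
    n = 0
    while len(seen) < p - 1:
        n += 1
        residue = pow(n, n, p)        # modular exponentiation, no big integers
        if residue != 0:
            seen.add(residue)
    return n


for p in (3, 5, 7):
    print(p, smallest_full_image(p))
```

For $p = 7$, for example, the class 5 is the last to appear, as $17^{17} \equiv 3^5 \equiv 5 \pmod 7$.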
4.1 Coupon collector problem with blank coupons
In order to calculate the expected $x$ in Question 1.12, we will use a variant of the coupon
collector problem. As before, we assume that the elements of Table 1.3 are randomly
distributed. Thus, our expected $x$ should be equal to the number of draws required to
obtain a full collection of residue classes from a random sampling of entries in Table
1.3. The different residue classes $a$ congruent to $1, 2, \dots, (p-1) \bmod p$ correspond to the
coupons to be collected.
A part of Theorem 4.1 of [FGT92] gives the expectation of the time necessary to
obtain a full collection from a random, but non-uniform probability distribution.
Proposition 4.1. The expectation $E(T)$ of the time necessary to gather a complete
collection of $m$ objects from a general probability distribution $\{p_i\}_{i=1}^m$, sampling with replacement, is given by
\[
E(T) = \int_0^\infty \left( 1 - \prod_{i=1}^m \left( 1 - e^{-p_i t} \right) \right) dt.
\]
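As a sanity check on the integral in Proposition 4.1, it can be evaluated numerically and compared with the classical value $m H_m$ for the uniform distribution. This is a rough sketch using a plain trapezoid rule; the step size and stopping tolerance are arbitrary choices made here, not part of the source.

```python
import math


def expected_time_numeric(probs, step=0.01, tol=1e-12):
    """Trapezoid-rule approximation of
    E(T) = integral over t >= 0 of 1 - prod_i (1 - exp(-p_i t))."""
    def integrand(t):
        return 1.0 - math.prod(1.0 - math.exp(-p * t) for p in probs)

    total, t = 0.0, 0.0
    previous = integrand(0.0)         # equals 1 at t = 0
    while True:
        t += step
        current = integrand(t)
        total += 0.5 * (previous + current) * step
        previous = current
        if current < tol:             # the tail is negligible from here on
            break
    return total


m = 3
uniform = [1.0 / m] * m
harmonic = sum(1.0 / k for k in range(1, m + 1))
print(expected_time_numeric(uniform), m * harmonic)   # both close to 5.5
```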
Since higher order elements occur least often in Table 1.3, we set the residue classes
$a \bmod p$ which are primitive roots modulo $p$ to be the coupons of value. All other coupons
will be blank coupons.
In this section, we will calculate the expected number of trials to obtain a full collection
of useful coupons. We will use the terms useful and non-blank interchangeably. This
problem is a variation on the coupon collector problem under non-uniform distribution.
Consider a collection of $m$ coupons and let $m_B$ coupons, which appear with probabilities
$p_1, p_2, \dots, p_{m_B}$, be blank. We do not keep track of how many of these coupons are
collected, if any. They are omitted from the coupon collector problem in the sense that
we want to collect the $m - m_B$ useful coupons, not all $m$ coupons. Let
\[
p_1 + p_2 + \dots + p_{m_B} = \beta.
\]
Our goal in this section is to prove the following proposition.
Proposition 4.2. With notation as above, so that $\beta = p_1 + \dots + p_{m_B}$ is the total probability of the blank coupons, the expectation $E(T_j)$ of the time necessary to
gather a collection of $j$ useful (non-blank) coupons under a general probability distribution
with $m_B$ blank coupons is
\[
E(T_j) = \sum_{q=0}^{j-1} \int_0^\infty [u^q] \prod_{i=m_B+1}^{m} \left( 1 + u\left( e^{p_i t} - 1 \right) \right) e^{-(1-\beta)t}\, dt, \tag{4.1}
\]
where $[u^q] f(u)$ denotes the $q$-th coefficient in the Taylor expansion of $f(u)$. For the full
collection of the $m - m_B$ useful coupons the expectation is
\[
E(T) = \int_0^\infty \left( 1 - \prod_{i=m_B+1}^{m} \left( 1 - e^{-p_i t} \right) \right) dt. \tag{4.2}
\]
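Equation (4.2) can be evaluated exactly for small examples: expanding the product and integrating term by term gives an inclusion-exclusion sum over nonempty sets $S$ of useful coupons, $\sum_{S} (-1)^{|S|+1} / \sum_{i \in S} p_i$. The sketch below implements that sum with exact rationals; the function name and the example probabilities are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def expected_useful_collection(useful_probs):
    """Exact value of the integral in (4.2), via inclusion-exclusion over
    nonempty subsets of the useful coupons. Blank coupons do not appear:
    only the probabilities of the useful coupons matter."""
    total = Fraction(0)
    for size in range(1, len(useful_probs) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(useful_probs, size):
            total += sign / sum(subset)
    return total


# Two useful coupons of probability 1/4 each; the remaining mass 1/2 is blank.
print(expected_useful_collection([Fraction(1, 4), Fraction(1, 4)]))   # 6
```

The value 6 agrees with a direct argument: the first useful coupon takes 2 draws on average (success probability 1/2), and the remaining one takes 4 more (success probability 1/4).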
The proof of Proposition4.2 follows the proofs of Theorems 3.1 and 4.1 in [FGT92]
very closely. Using regular languages extended by the shuffle product, Flajolet et al. pro-
duce generating functions to derive integral representations for expectations and prob-
ability distributions. Appendix A gives a summary of the background required. We
begin by rewriting Theorem 3.1 of [FGT92] to obtain Proposition4.4 below, of which
Proposition4.2 is a corollary.
The alphabet $A$ consists of $m$ different coupons, and $p_i$ is the probability of choosing coupon $i$ from $A$. Consider the following generalization of the coupon collector problem:
Question 4.3. Determine the expectation of the number $B_j$ of elements that need to be
drawn from $A$ (with replacement) until we first obtain $j$ distinct coupons that are each repeated at least $k$ times.
The case $k = 2$ and $j = 1$ is the classical birthday problem of determining the
expected number of random people needed such that two will share a birthday. The case
$k = 1$ and $j = m$ is the coupon collector problem where a full collection is desired. Below
we determine $E(B_j)$ for the case when the alphabet $A$ contains blank coupons.
Proposition 4.4. Consider a set of $m$ coupons under a general probability distribution
$\{p_i\}_{i=1}^m$, which are drawn with replacement. Let $m_B$ coupons, which appear with
probabilities $p_1, p_2, \dots, p_{m_B}$, be blanks, and let $\beta = p_1 + \dots + p_{m_B}$. The expectation $E(B_j)$ of the time for obtaining
$j \le m - m_B$ distinct useful coupons, each appearing at least $k$ times, is given by
\[
E(B_j) = \sum_{q=0}^{j-1} \int_0^\infty [u^q] \prod_{i=m_B+1}^{m} \left( e_{k-1}(p_i t) + u\left( e^{p_i t} - e_{k-1}(p_i t) \right) \right) e^{-(1-\beta)t}\, dt \tag{4.3}
\]
where $e_{k-1}(t)$ represents the truncated exponential
\[
e_{k-1}(t) = 1 + \frac{t}{1!} + \frac{t^2}{2!} + \dots + \frac{t^{k-1}}{(k-1)!}.
\]
Proof. Let the random variable $B_j$ be the number of draws for first obtaining at least $k$
copies of $j$ useful coupons in an infinite sequence of trials.
Let $Y_n$ be the random variable defined on $A^n$ representing the number of $k$-occurrences of different useful coupons in a sequence of $n$ trials. The two probability distributions
are related by
\[
\Pr\{Y_n \ge j\} = \Pr\{B_j \le n\}.
\]
From this, the expectation of $B_j$ is found as follows:
\begin{align*}
E(B_j) &= \sum_{n \ge 1} n \Pr\{B_j = n\} \\
&= \sum_{n \ge 0} \Pr\{B_j > n\} \\
&= \sum_{n \ge 0} \Pr\{Y_n < j\} \\
&= \sum_{q=0}^{j-1} \sum_{n \ge 0} \Pr\{Y_n = q\}.
\end{align*}
It remains to estimate the sum $\sum_{n \ge 0} \Pr\{Y_n = q\}$.
Let $H_q$ be the language consisting of words with exactly $q$ useful letters that occur at
least $k$ times. Label the letters corresponding to the blank coupons with the indices $1, 2, \dots, m_B$;
these may occur any number of times. All other letters occur at most $k - 1$ times.
Let us set
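Therefore,
\[
\sum_{n \ge 0} \Pr\{Y_n = q\} = h_q(1) = \int_0^\infty [u^q] \Phi(t, u)\, e^{-(1-\beta)t}\, dt,
\]
and summing over $q$ ranging from 0 to $j - 1$, we have the statement of the theorem.
We can now prove Proposition 4.2.
Proof. The equation (4.1) is a specialization of equation (4.3) of Proposition 4.4 to the
case $k = 1$.
To obtain equation (4.2) from (4.1), we introduce the function
\[
\Phi(t, u) = \prod_{i=m_B+1}^{m} \left( 1 + u\left( e^{p_i t} - 1 \right) \right) \tag{4.5}
\]
which is the function in equation (4.4) in the case $k = 1$. Expand $\Phi(t, u)$ as follows:
\[
\Phi(t, u) = \sum_{q=0}^{m - m_B} \phi_q(t) u^q.
\]
We have:
\begin{align*}
E(T) = E(T_{m - m_B}) &= \sum_{q=0}^{m - m_B - 1} \int_0^\infty [u^q] \Phi(t, u)\big|_{u=1}\, e^{-(1-\beta)t}\, dt \\
&= \int_0^\infty \sum_{q=0}^{m - m_B - 1} [u^q] \Phi(t, u)\big|_{u=1}\, e^{-(1-\beta)t}\, dt \\
&= \int_0^\infty \left( \phi_0(t) + \phi_1(t) + \dots + \phi_{m - m_B - 1}(t) \right) e^{-(1-\beta)t}\, dt \\
&= \int_0^\infty \left( \Phi(t, 1) - \phi_{m - m_B}(t) \right) e^{-(1-\beta)t}\, dt.
\end{align*}
We have that $\phi_{m - m_B}(t) = \prod_{i=m_B+1}^{m} \left( e^{p_i t} - 1 \right)$ and, using (4.5),
\begin{align*}
\Phi(t, 1) = \prod_{i=m_B+1}^{m} e^{p_i t} &= \frac{\prod_{i=1}^{m} e^{p_i t}}{\prod_{i=1}^{m_B} e^{p_i t}} \\
&= e^{(p_1 + p_2 + \dots + p_m)t}\, e^{-(p_1 + p_2 + \dots + p_{m_B})t} \\
&= e^{(1-\beta)t},
\end{align*}
since $p_1 + \dots + p_m = 1$ and $p_1 + \dots + p_{m_B} = \beta$. Therefore, we obtain:
\begin{align*}
E(T) &= \int_0^\infty \left( \Phi(t, 1) - \phi_{m - m_B}(t) \right) e^{-(1-\beta)t}\, dt \\
&= \int_0^\infty \left( e^{(1-\beta)t} - \prod_{i=m_B+1}^{m} \left( e^{p_i t} - 1 \right) \right) e^{-(1-\beta)t}\, dt \\
&= \int_0^\infty \left( 1 - e^{-(1-\beta)t} \prod_{i=m_B+1}^{m} e^{p_i t} \prod_{i=m_B+1}^{m} \left( 1 - e^{-p_i t} \right) \right) dt \\
&= \int_0^\infty \left( 1 - \prod_{i=m_B+1}^{m} \left( 1 - e^{-p_i t} \right) \right) dt.
\end{align*}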
4.2 Upper and lower bounds for expectation
Assume that the elements in Table 1.3 are randomly selected with replacement. In this
section we give upper and lower bounds for the expectation of the number of draws
required to obtain a full collection of all residue classes $1, 2, \dots, (p-1) \bmod p$. The lower bound of the expectation is the same as the expectation for the coupon collector
with blank coupons, where all coupons except those corresponding to primitive roots are
blank.
This lower bound is the one we will use in the calculation of the cumulative distribution
in the following section. It is more accurate in the case of primes of the form
$p = 2q + 1$ where $q$ is also prime.
Proposition 4.5. Assume that the elements in Table 1.3 are randomly selected with replacement.
Then the expected time required to obtain a full collection is bounded as follows:
(p 1)2(p 1)(log (p 1) ++o(1))< E(T) (p 1)2
(p 1)
log (p 1) ++ 12(p 1)
1
8(p 1)2
,
again using inequality (3.1). Note that by Proposition 4.2,
\[
\int_0^\infty \left( 1 - \left( 1 - e^{-\frac{\varphi(p-1)}{(p-1)^2} t} \right)^{\varphi(p-1)} \right) dt
\]
is equal to the expected time if we assume the elements in Table 1.3 are randomly distributed but all elements other than the primitive roots are blank.
4.2.1 Expectation estimate for the case p = 2q + 1
Let $p = 2q + 1$, where $q$ is also a prime. We let the residue classes $1$ and $p - 1 \equiv -1 \bmod p$ correspond to the blank coupons since they are easy to collect; they will appear in the
first two rows of Table 1.3. We need only consider the number of draws required to collect all other residue classes, which have large order. There are $q - 1$ elements of order $q$ in $(\mathbb{Z}/p\mathbb{Z})^*$, each of which appears $3(q-1)$ times in Table 1.3, and $q - 1$ elements of order $2q$, each of which appears $(q-1)$ times in Table 1.3.
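These frequencies are easy to confirm by enumerating the table for a small prime of this form. The sketch below assumes $p = 23$, $q = 11$ and uses the fact that the entry in row $k$ and column $b$ is $b^{b+k} \bmod p$; the helper names are illustrative.

```python
from collections import Counter


def multiplicative_order(a, p):
    """Order of a in (Z/pZ)^*, for a not divisible by the prime p."""
    order, x = 1, a % p
    while x != 1:
        x = x * a % p
        order += 1
    return order


def counts_by_order(p):
    """Map each multiplicative order to the set of frequencies with which
    elements of that order occur among the (p-1)^2 table entries."""
    counts = Counter()
    for b in range(1, p):             # column index
        for k in range(p - 1):        # row index
            counts[pow(b, b + k, p)] += 1
    grouped = {}
    for a in range(1, p):
        grouped.setdefault(multiplicative_order(a, p), set()).add(counts[a])
    return grouped


p, q = 23, 11                         # assumed example with p = 2q + 1
grouped = counts_by_order(p)
print(grouped[q], grouped[2 * q])     # frequencies 3(q-1) = 30 and q-1 = 10
```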
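Proposition 4.6. Given $2q$ different coupons of which two are blank, $q - 1$ each appear with probability $\frac{q-1}{4q^2}$, and $q - 1$ each appear with probability $\frac{3(q-1)}{4q^2}$, then the expected
number of draws to obtain all $2q - 2$ useful coupons is bounded by
\begin{align*}
E(T) &= \int_0^\infty \left( 1 - \left( 1 - e^{-\frac{(q-1)t}{4q^2}} \right)^{q-1} \left( 1 - e^{-\frac{3(q-1)t}{4q^2}} \right)^{q-1} \right) dt \\
&\le \frac{(p-1)^2}{\varphi(p-1)} \left( \log(p-1) + \log\frac{1}{2} + \gamma + O\left( \frac{1}{q^{1/4}} \right) + \frac{1}{4(p-1)} \right).
\end{align*}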
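Proof. By Proposition 4.2, the expected time to collect the $2q - 2$ useful coupons is given by
\[
E(T) = \int_0^\infty \left( 1 - \left( 1 - e^{-\frac{(q-1)t}{4q^2}} \right)^{q-1} \left( 1 - e^{-\frac{3(q-1)t}{4q^2}} \right)^{q-1} \right) dt.
\]
Let $u = e^{-\frac{(q-1)t}{4q^2}}$. Then $dt = -\frac{4q^2}{q-1} \frac{du}{u}$ and
\begin{align*}
E(T) &= -\frac{4q^2}{q-1} \int_1^0 \frac{1}{u} \left( 1 - (1-u)^{q-1} (1-u^3)^{q-1} \right) du \\
&= \frac{4q^2}{q-1} \int_0^1 \frac{1}{u} \left( 1 - (1-u)^{q-1} (1-u)^{q-1} (1+u+u^2)^{q-1} \right) du \\
&= \frac{4q^2}{q-1} \int_0^1 \frac{1}{u} \left( 1 - (1-u)^{2(q-1)} \left( (1-u)^2 + 3u \right)^{q-1} \right) du \\
&= \frac{4q^2}{q-1} \int_0^1 \frac{1}{u} \left( 1 - (1-u)^{2(q-1)} \left[ (1-u)^{2(q-1)} + \binom{q-1}{1} (1-u)^{2(q-2)}\, 3u + \dots + 3^{q-1} u^{q-1} \right] \right) du \\
&= \frac{4q^2}{q-1} \int_0^1 \frac{1}{u} \left( 1 - (1-u)^{4(q-1)} - \dots - \binom{q-1}{k} (1-u)^{2(q-1)+2(q-k)}\, 3^k u^k - \dots - 3^{q-1} u^{q-1} (1-u)^{2(q-1)} \right) du.
\end{align*}
The first term is evaluated as in Proposition 4.5: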
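\begin{align*}
\frac{4q^2}{q-1} \int_0^1 \frac{1}{u} \left( 1 - (1-u)^{4(q-1)} \right) du &= \frac{4q^2}{q-1} H_{4(q-1)} \\
&\le \frac{4q^2}{q-1} \left( \log 4(q-1) + \gamma + \frac{1}{8(q-1)} \right) \\
&\approx \frac{(p-1)^2}{\varphi(p-1)} \left( \log(p-1) + \log 2 + \gamma + \frac{1}{4(p-1)} \right),
\end{align*}
where $H_n$ is the $n$-th harmonic number and we have used $4(q-1) \approx 2(p-1)$.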
The other terms in the binomial expansion can be approximated as follows. We recall
the definition of the Euler Beta function:
\[
\int_0^1 (1-t)^{a-1} t^{b-1}\, dt = B(a, b) = \frac{(a-1)!\,(b-1)!}{(a+b-1)!}.
\]
Then
\begin{align*}
3^k \int_0^1 \binom{q-1}{k} (1-u)^{2(q-1)+2(q-k)} u^{k-1}\, du
&= 3^k \binom{q-1}{k} B\left( 2(q-1) + 2(q-k) + 1,\ k \right) \\
&= 3^k\, \frac{(q-1)!}{k!\,(q-1-k)!} \cdot \frac{\left( 2(q-1) + 2(q-k) \right)!\,(k-1)!}{\left( 2(q-1) + 2(q-k) + k \right)!} \\
&= \frac{3^k}{k} \cdot \frac{q-1}{4q-k-2} \cdot \frac{q-2}{4q-k-2-1} \cdots \frac{q-1-l}{4q-k-2-l} \cdots \frac{q-1-k+1}{4q-k-2-k+1}.
\end{align*}
In general:
\[
\frac{q-1-l}{4q-k-2-l} = \frac{1}{4} + \frac{-2-3l+k}{4(4q-k-2-l)}.
\]
Since $0 \le l < k$ and $k$ ranges from 1 to $q - 1$, we can bound the left hand side from above:
\begin{align*}
\frac{1}{4} + \frac{-2-3l+k}{4(4q-k-2-l)} &\le \frac{1}{4} + \frac{-2+k}{4(4q-k-2-l)} \\
&\le \frac{1}{4} + \frac{-2+q}{16q} \\
&\le \frac{1}{4} + \frac{1}{16} = \frac{5}{16}.
\end{align*}
Let K be a number much less than q
1. We will take K = q1/4 for simplicity,
although anyK=o(q1/2) is sufficient. Summing over k we have
q1k=1
3k
k
k1l=0
(q 1 l)(4q k 2 l)
=
q1k=1
3k
k
k1l=0
1
4+
2 3l+k4(4q k 2 l)
K
k=13k
k
k1
l=0 1
4+
2 +k4(4q
k
2
l) +
q1
k=K3k
k
k1
l=15
16.
The second term can be approximated by a geometric sum:
q1k=K
3k
k
k1l=1
5
16 =
q1k=K
3k
k
5
16
k1
165
k=K
1
k
15
16
k
= 16
5
1
K
15
16
K1 15
16
1
= (16)2
5 1K
1516K
.
The first term will be approximated with the aid of several inequalities obtained from
Taylor series. Since ex >1 + x for all real x, we have,
Kk=1
3k
k
k1l=0
1
4+
2 +k4(4q k 2 l)
=K
k=1
1
k 3
4kk1
l=0
1 +
2q+ k
q
4
1 k
16q 1
8q l
16q
Kk=1
1
k
3
4
ke1q
k1l=0
2+k
4(1 k16q 18q l16q) .
Using
1
q
k1l=0
2 +k4
1 k16q 18q l16q 1
q
k1l=0
2 +k3
1q
2
3k+
k
6(k 1)
k
2
q
and the inequalityex 0, we haveKk=1
3k
k
k1l=0
1
4+
2 +k4(4q k 2 l)
Kk=1
1
k
3
4
kek2
q
Kk=1
1k
34
k1
1 k2q
Kk=1
1
k
3
4
kq
q k2
=Kk=1
1
k
3
4
k1 +
k2
q k2
=K
k=11
k
3
4
k+
K
k=11
k
3
4
k k2
q k2
.
Finally, we use the Taylor expansion for log(1 x) to obtainKk=1
3k
k
k1l=0
1
4+
2 +k4(4q k 2 l)
log
1
4
+O
K2
q
ln
3
4
= log 4 O
K2
q
.
Putting it all together, we conclude
\begin{align*}
E(T) &\le \frac{(p-1)^2}{\varphi(p-1)} \left( \log(p-1) + \log 2 - \log 4 + \gamma + O\left( \frac{1}{q^{1/4}} \right) + \frac{1}{4(p-1)} \right) \\
&= \frac{(p-1)^2}{\varphi(p-1)} \left( \log(p-1) + \log\frac{1}{2} + \gamma + O\left( \frac{1}{q^{1/4}} \right) + \frac{1}{4(p-1)} \right).
\end{align*}
Thus the expected number of draws required to collect all residue classes differs from
the classical coupon collector model by $\log(1/2) \approx -0.693$. It may be reasonable to
assume that the entire probability distribution function has shifted by this amount, since
in Figure 3.2, the empirical distribution function differs from the theoretical one by a
shift to the left of approximately 0.6 units, for primes of the form $p = 2q + 1$.
We note that $\log p + \log(1/2)$ is approximately $\log \varphi(p-1)$. Although only $-1$ and $1$ were considered as blank coupons, the upper bound of the expectation approaches the
lower bound in Proposition 4.5, where all non-primitive residue classes were considered
blanks. In the next section, we will use the lower bound in Proposition 4.5 in determining
the cumulative distribution function.
4.3 Cumulative distribution function
Recall that for the classical coupon collector problem with $m$ types of coupons which are
uniformly distributed, we have
\[
E(T) = m \log m + \gamma m + \frac{1}{2} + o(1)
\]
and
\[
\lim_{m \to \infty} P(T \le m \log m + tm) = e^{-e^{-t}}.
\]
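The quality of the asymptotic expansion for $E(T)$ can be seen numerically by comparing it with the exact value $m H_m$, where $H_m$ is the $m$-th harmonic number. This is a small illustrative sketch; the function names are not from the source.

```python
import math

EULER_GAMMA = 0.5772156649015329      # Euler-Mascheroni constant


def exact_expectation(m):
    """Exact expected time m * H_m for m equally likely coupons."""
    return m * sum(1.0 / k for k in range(1, m + 1))


def asymptotic_expectation(m):
    """The expansion m log m + gamma m + 1/2 (without the o(1) term)."""
    return m * math.log(m) + EULER_GAMMA * m + 0.5


def double_exponential_cdf(t):
    """Limiting distribution function exp(-exp(-t))."""
    return math.exp(-math.exp(-t))


for m in (10, 100, 1000):
    print(m, exact_expectation(m) - asymptotic_expectation(m))
```

The printed differences shrink roughly like $-1/(12m)$, which is the next term of the harmonic number expansion.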
Similarly, given a coupon collector problem with probability distribution matching the
frequencies of the residue classes in Table 1.3, if all but the primitive element residue
classes are considered to be blank we have
\[
E(x) = \frac{(p-1)^2}{\varphi(p-1)} \log \varphi(p-1) + \gamma \frac{(p-1)^2}{\varphi(p-1)} + \frac{1}{2} \left( \frac{p-1}{\varphi(p-1)} \right)^2 + o(1)
\]
from the proof of Proposition 4.5. Using this estimate for the expectation, we show that
\[
\lim_{p \to \infty} P\left( x \le \frac{(p-1)^2}{\varphi(p-1)} \log \varphi(p-1) + t \frac{(p-1)^2}{\varphi(p-1)} \right) = e^{-e^{-t}}
\]
in the coupon collector problem considering the entire table with no blank coupons, if
$p - 1$ is square-free.
To calculate the cumulative distribution function, we return to generating functions.
In [FS09, p. 192], we have the following result.
Proposition 4.7. Consider a set of $m$ coupons under a general probability distribution
$\{p_i\}_{i=1}^m$, which are drawn with replacement. Then the probability that we will have a full collection by time $T$ less than or equal to $n$ is given by
\[
P(T \le n) = n!\,[z^n] \prod_{i=1}^{m} \left( e^{p_i z} - 1 \right).
\]
where $C$ is a contour that encircles the origin. For certain functions $G(z)$, the integral can be evaluated using the saddle point method. Roughly speaking, we can choose a contour
$C$ such that the function is large only on a small part of the contour and small everywhere
else; the integral may be approximated by a Gaussian integral which captures the area
under the peak. The path of integration passes near the saddle point of the function,
hence the name.
We can use the saddle point method if $G(z)$ is Hayman-admissible. The following
definition is from [Hay56, p. 68], with notation from [FS09, Definition VIII.1, p. 565].
Letting $z = re^{i\theta}$, we have
\[
\log G\left( re^{i\theta} \right) = \log G(r) + \sum_{\nu=1}^{\infty} \alpha_\nu(r) \frac{(i\theta)^\nu}{\nu!}
\]
and define
\[
h(z) = \log G(z)
\]
and
\begin{align*}
a(r) &= \alpha_1(r) = r h'(r) \\
b(r) &= \alpha_2(r) = r^2 h''(r) + r h'(r).
\end{align*}
Definition 4.8 (Hayman-admissibility). Let $G(z)$ have a radius of convergence $R$ with
$0 < R \le \infty$ and be always positive on some subinterval $(R_0, R)$ of $(0, R)$. The function $G(z)$ is said to be H-admissible (or Hayman-admissible) if, with $a(r)$ and $b(r)$ defined as
above, it satisfies the following three conditions:
1. (Capture condition) $\lim_{r \to R} a(r) = \infty$ and $\lim_{r \to R} b(r) = \infty$.
2. (Locality condition) For some function $\theta_0(r)$, defined over $(R_0, R)$ and satisfying
$0 < \theta_0 < \pi$, one has
\[
G\left( re^{i\theta} \right) \sim G(r)\, e^{i\theta a(r) - \theta^2 b(r)/2}
\]
as $r \to R$, uniformly in $|\theta| \le \theta_0(r)$.
3. (Decay condition) Uniformly in $\theta_0(r) \le |\theta| < \pi$,
\[
G\left( re^{i\theta} \right) = o\left( \frac{G(r)}{\sqrt{b(r)}} \right).
\]
One of the convenient closure properties of Hayman-admissible functions is the
following [Hay56, Theorem VII].
Proposition 4.9. If $G_1(z)$ and $G_2(z)$ are H-admissible functions with the same radius
of convergence $R$, then $G_1(z) G_2(z)$ is also H-admissible.
Thus for G(z) to be H-admissible it is sufficient to check that a function of the form
H(z) = (ez
1), where >0, is H-admissible.
Proposition 4.10. H(z) = (ez 1) where >0 is H-admissible.
Proof. H(z) is analytic and positive on (0, ), as required. We have
a(z) = z
ez
ez 1
b(z) = z
ez
ez 1
2z2ez
(ez 1)2
and
limr
a(r) = , limr
b(r) = ,
satisfying the first condition.
The value of 0 is found using the coefficients in the expansion of log G(rei), by
requiring
2(r)20
3(r)30 0.
Since the coefficients j(r) tend to r, we can take 0= (r) 2
5 .
Thus,
log G(r)eia(r)2b(r)/2 = log (er 1) +ia(r) 2 b(r)
2
r+ir 2 r2
= r
1 +i 1
22
asr tends to infinity. On the other hand,
log G
rei
= log
erei 1
= log ere
i
1 erei
= rei + log
1 erei
rei
= r 1 + i1!
+(i)2
2! +. . . .
For n 3, the terms r (i)nn! = 1n! 12n/5r12n/5
tend to zero as r tends to infinity. Thus G
rei G(r)eia(r)2b(r)/2, satisfying the
locality condition.
Finally,
G rei = erei 1 er cos er
1 22!
+ 4
4!
er 12 (r)1/5+ 124 (r)3/5 e
r
e12
(r)1/5
= o
er
r
=o
G(r)
b(r)
asr tends to infinity. We have used the bound
from Theorem1.11, we have
limr
h(r) =
d|(p1)(d)2g
p 1
d
= (p 1)2
limrh(r) = 0.
In addition,
a(z) = zh(z)
d|(p1)(d)2g
p 1
d
z= (p 1)2z
b(z) = z2h(z) +zh(z)
d|(p1)(d)2g
p 1
d
z= (p 1)2z.
The radius of integration should be a good approximation for the solution to the
equationG()
G() =n, thus
= n
d|(p1) (d)2gp1d
= n(p 1)2 .
We use an estimate of the expectation ofxin our expression for n,
n=(p 1)2 log (p 1) +t(p 1)2
(p 1) ,
and we will find how the cumulative distribution function varies with t.
Our goal is to evaluate
n!
p2n
1
n
2b()
G().
Using the Stirling approximation for n!, the factor n!p2n
1
n
2b()
tends to 1en . We
have,
1
enG() = 1
en
d|(p1)
e(d)g
(
p1
d )
n
(p1)2 1(d)
= en
d|(p1)en(d)2g(p1d )
(p1)2
1 e
n(d)g(p1d )(p1)2
(d)
= en+n
d|p1(d)
2g(p1d )(p1)2
d|(p1)
1 e
n(d)g( p1d )(p1)2
(d)
=
d|(p1)
1 en
(d)g(p1d )(p1)2
(d)
= d|(p1)1
e (p1)2(log(p1)+t)
(p1)
(d)g(p1d )(p1)2
(d)
=
d|(p1)
1 e(log (p1)+t)
(d)g(p1d )(p1)
(d)
=
1 e(log (p1)+t)(p1)d| (p1)
2
1 e(log (p1)+t)
(d)g(p1d )(p1)
(d)
=
1 1
(p 1) et(p1)
d| (p1)2
1 e(log (p1)+t)(d)g(p1d )
(p1)
(d)
Since
limp
1 1
(p 1) et(p1)
=eet
,
it remains to show that the product over divisors of (p 1)/2 tends to 1. Let
=
d| (p1)2
1 e(log (p1)+t)
(d)g(p1d )(p1)
(d)
log =
d| (p1)2
(d)log
1 et(p 1)
g( p1d )(p1d )
.We must have t (p 1), since we expect np2/(p 1) by Theorem4.5on
the lower bound on expectation. Thus,
et
(p1)
1. Using the Taylor expansion for
log(1 x) for smallx, we have
log =
d| (p1)2
(d) et
(p 1) g( p1d )
( p1d ) +O
et
(p 1)
2
g(p1d )(p1d )
If (p 1)/d= q1q2 . . . q r then
g
p 1
d
= (2q1 1)(2q1 1) (2qr 1)
-
8/12/2019 Self Power Map
58/82
Chapter 4. Expectation of x and its cumulative distribution 51
p 1
d
= (q1 1)(q1 1) (qr 1)
so that
infg
p1d
p1d
= 2.Notice that (p 1)/d= qand as qgets very large, then (2q 1)/(q 1) approaches 2.We will use the notation d(x) for the number of divisors ofx. We have
|log |
n asn . The Kolmogorov-Smirnov test rejects or accepts the null hypothesis based on athreshold value ofDn, which depends on the level of significance.
For odd primes less than 55000, the Kolmogorov-Smirnov test accepts the null hypothesis
of the double exponential distribution $e^{-e^{-t}}$ at significance level 0.05, with a
p-value of 0.7266. The p-value is the probability of getting a result at least as extreme as
the empirical value, assuming that the null hypothesis is true. A p-value this large means
that there is little or no evidence against the null hypothesis.
Figure 4.1: Comparison between the cumulative distribution function derived from the
coupon collector problem and the empirical distribution function for the variables Xp
from equation (4.6).
Chapter 5
A deterministic bound
In this chapter we will prove the following theorem.
Theorem 5.1. For a sufficiently large prime $p$, and any fixed residue class $a \bmod p$,
there exists an $n$ such that
\[
n^n \equiv a \pmod{p}
\]
and $n < p^{2-\alpha}$, for some $\alpha > 0$ independent of $p$.
For instance, for any large p, the value 1103 is sufficient, as shown inSection5.4.
5.1 Elements a of small order
If $a \equiv 0 \pmod{p}$, we have $p^p \equiv 0 \pmod{p}$, and so we may consider only nonzero elements $a$ in $\mathbb{Z}/p\mathbb{Z}$. Table 1.3, reproduced below for convenience, will again be useful.
Given $a$, finding a solution $n < p^{2-\alpha}$ to the equation $n^n \equiv a \pmod{p}$ (with $\alpha$ the constant of Theorem 5.1) is equivalent to the element $a$ appearing in the first $p^{1-\alpha}$ rows of the table. Using the notation from
the table, this means that we have $n^n \equiv a \pmod{p}$ with $n = kp + b$ where $k < p^{1-\alpha}$ and $1 \le b \le p - 1$.
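Values of $n^n \pmod{p}$, where $n = kp + b$
\[
\begin{array}{l|cccccc}
b & 1 & 2 & 3 & \cdots & (p-2) & (p-1) \\ \hline
k = 0 & 1^1 & 2^2 & 3^3 & \cdots & (p-2)^{p-2} & (p-1)^{p-1} \\
k = 1 & 1^2 & 2^3 & 3^4 & \cdots & (p-2)^{p-1} & (p-1)^{1} \\
k = 2 & 1^3 & 2^4 & 3^5 & \cdots & (p-2)^{1} & (p-1)^{2} \\
\vdots & & & & & & \\
k = (p-2) & 1^{p-1} & 2^{1} & 3^{2} & \cdots & (p-2)^{p-3} & (p-1)^{p-2}
\end{array}
\]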
Recall that if $b \pmod{p}$ has low multiplicative order $\operatorname{ord} b = m$, then all the elements
of order dividing $m$ will appear in the first $m$ rows of the $b$th column in Table 1.3. Thus if a column is generated by an element $b$ of low order, then $b$ itself will appear as
\[
n^n = (kp + b)^{kp + b} \equiv b \pmod{p}
\]
for some $k < m$.
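The row $k$ can be written down explicitly: since $p \equiv 1 \pmod{m}$, the exponent $kp + b$ is congruent to $k + b$ modulo $m$, so $k \equiv 1 - b \pmod{m}$ works. The sketch below checks this for every column of a small prime; the choice $p = 23$ and the helper names are assumptions made for illustration.

```python
def multiplicative_order(b, p):
    """Order of b in (Z/pZ)^*, for b not divisible by the prime p."""
    order, x = 1, b % p
    while x != 1:
        x = x * b % p
        order += 1
    return order


def row_where_b_appears(b, p):
    """Return (k, n) with n = k*p + b, 0 <= k < ord(b), and n^n = b (mod p)."""
    m = multiplicative_order(b, p)
    k = (1 - b) % m                   # makes the exponent k + b = 1 (mod m)
    return k, k * p + b


p = 23                                # assumed small example
for b in range(1, p):
    k, n = row_where_b_appears(b, p)
    assert k < multiplicative_order(b, p) and pow(n, n, p) == b
print(row_where_b_appears(22, p))     # b = -1 has order 2
```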
Therefore if
nn
a (modp)
where a has small order, sayp, we may find solutions with n p1+ simply by consideringthe elements in the column generated by a. Therefore, we separate the elements of
(Z/pZ) into those of small multiplicative order (less than or equal to p) and those of
large order (greater than p). It remains to prove the theorem for a of large order.
To obtain an numerical value for in Theorem5.1 using a deterministic bound from
Shparlinski [Shp11], we must take >1/2.
5.2 Elements a of large order
The idea for the proof for elements of large order is as follows: we would like the position
of an element ato be scattered enough in Table1.3,so that awill appear at least once
in the top rows. We will consider the appearance ofa only in the columns of elements of
the same order. To simplify further, it is sufficient for a to be scattered in the columns
of Table5.1.
Values of $n^j \pmod{p}$
\[
\begin{array}{l|cccccc}
n & 1 & 2 & 3 & \cdots & (p-2) & (p-1) \\ \hline
j = 1 & 1^1 & 2^1 & 3^1 & \cdots & (p-2)^1 & (p-1)^1 \\
j = 2 & 1^2 & 2^2 & 3^2 & \cdots & (p-2)^2 & (p-1)^2 \\
j = 3 & 1^3 & 2^3 & 3^3 & \cdots & (p-2)^3 & (p-1)^3 \\
\vdots & & & & & & \\
j = (p-1) & 1^{p-1} & 2^{p-1} & 3^{p-1} & \cdots & (p-2)^{p-1} & (p-1)^{p-1}
\end{array}
\]
Table 5.1: We consider the table $n^j \pmod{p}$, from which we may obtain the table for
$n^n \pmod{p}$ by a shift along the diagonal. Thus if elements in this table are scattered,
so are those in the $n^n$ table.
The proof follows the strategy for Theorem 8 in [Mur10]. We relabel aas g, and we
consider it to be a fixed element in (Z/pZ) of multiplicative order t > p. On a first
reading of the proof, it may be useful to think of the simplest case, when g is a primitive
root; hence the change in notation.
Denote the representatives mod $p$ of all the elements of order $t$ as
\[
1 < g_1 < g_2 < \dots < g_N < p
\]
where $N = \varphi(t)$. Note that $g$ is equal to one of the $g_i$'s. Let $1 \le a_i < t$ be the numbers relatively prime to $t$ such that
\[
g_i^{a_i} \equiv g \bmod p.
\]
Then,
\[
g_i \equiv g^{a_i^{-1}} \bmod p,
\]
where the inverse of $a_i$ is taken modulo $t$.
To show that $n^n \equiv g \bmod p$ has a solution in $n$ with $n < p^{2-\alpha}$ for some $\alpha > 0$, it will be enough to show that the sequence $\{(a_i, g_i)\}_{i=1}^N$ is sufficiently scattered in the $t \times p$ rectangle. This
is equivalent to the element $g$ being scattered in Table 5.1. First normalize to the unit
square $[0, 1)^2$ and recall that $g_i \equiv g^{a_i^{-1}} \bmod p$, to obtain the sequence
\[
\left\{ \left( \frac{a_i}{t}, \frac{g_i}{p} \right) \right\}_{i=1}^N = \left\{ \left( \frac{a_i}{t}, \frac{g^{a_i^{-1}}}{p} \right) \right\}_{i=1}^N.
\]
We begin by showing that the discrepancy of this sequence is small.
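The sequence of pairs $(a_i, g_i)$ is straightforward to generate for a concrete prime. The following sketch assumes $p = 23$ and the primitive root $g = 5$ (so $t = 22$); the function name is hypothetical, and the three-argument `pow` with exponent $-1$ (Python 3.8 or later) is used for the inverse modulo $t$.

```python
from math import gcd


def scattered_sequence(g, t, p):
    """Pairs (a_i, g_i) with gcd(a_i, t) = 1 and g_i^(a_i) = g (mod p),
    where g has multiplicative order t modulo p. Sorted by g_i."""
    pairs = []
    for a in range(1, t):
        if gcd(a, t) == 1:
            a_inverse = pow(a, -1, t)         # inverse of a modulo t
            g_i = pow(g, a_inverse, p)        # g_i = g^(a^-1) mod p
            pairs.append((a, g_i))
    return sorted(pairs, key=lambda pair: pair[1])


p, g, t = 23, 5, 22                           # assumed example values
pairs = scattered_sequence(g, t, p)
normalized = [(a / t, g_i / p) for a, g_i in pairs]   # points of [0, 1)^2
print(pairs)
```

Each $g_i$ produced this way again has order $t$, so for a primitive root $g$ the second coordinates run over all $\varphi(p-1)$ primitive roots.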
Figure 5.1: If $g$ is primitive, the rectangle of interest has dimensions $(p-1) \times p$. The figure shows the positions of $\{(a_i, g_i)\}_{i=1}^N$ corresponding to the smallest primitive roots $g = 6$ and $g = 2$ for the primes 151 and 491, respectively.
Proposition 5.2. Letp be a prime. The discrepancy of the sequence
ait
,gi
p
N
i=1
is bounded byO
(log2 t)N1t1
, for some >0, whereN=(t).
Proof. Consider the sum
Sm,n =Ni=1
e
m
ga1i
p +n
ait
=Ni=1
ep
mga1i
et(nai)
=
(x,t)=1
ep
mgx1
et(nx).
We will restrict m, n to be integers less than t in magnitude. Theorem 4 in [BS08]
gives the following bound, which applies for the case m = 0 andt < m, n < t.
Theorem 5.3. For any > 0 there exists > 0 such that for t p, uniformly overm (Z/pZ) andn Z/tZ, we have the bound
Sm,n t1.
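If $n \ne 0$ and $m = 0$ the sum reduces to the Ramanujan sum:
\[
S_{0,n} = \sum_{(x,t)=1} e_t(nx).
\]
We have
\[
S_{0,n} = \overline{S_{0,-n}},
\]
so that $|S_{0,n}| = |S_{0,-n}|$. Therefore, we need only consider the cases when $m, n$ are non-negative. The Ramanujan sum can be evaluated using von Sterneck's formula, where
$k = \gcd(n, t)$:
\[
S_{0,n} = \mu\left( \frac{t}{k} \right) \frac{\varphi(t)}{\varphi\left( \frac{t}{k} \right)}.
\]
By the Erdős-Turán inequality, Theorem 3.4 for two dimensions, the discrepancy $D_N$ satisfies
\[
D_N \le C \left( \frac{2}{M+1} + \frac{1}{N} \sum_{\substack{(m,n) \ne (0,0) \\ 0 \le m, n \le M}} \frac{1}{1+m} \cdot \frac{1}{1+n}\, |S_{m,n}| \right). \tag{5.1}
\]
Taking $m$ and $n$ non-negative above only changes the constant in the Erdős-Turán inequality
by a multiple of 4, so that
\[
C = 4 \left( \frac{3}{2} \right)^2.
\]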
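Von Sterneck's formula for the Ramanujan sum can be checked against the defining exponential sum for small moduli. This is an illustrative sketch with naive helper implementations of the Euler and Möbius functions; none of the function names come from the source.

```python
import cmath
from math import gcd


def euler_phi(n):
    """Euler's totient by direct counting (adequate for small n)."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def mobius(n):
    """Moebius function by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:            # square factor
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def ramanujan_direct(n, t):
    """Sum of exp(2 pi i n x / t) over 1 <= x <= t with gcd(x, t) = 1."""
    total = sum(cmath.exp(2j * cmath.pi * n * x / t)
                for x in range(1, t + 1) if gcd(x, t) == 1)
    return round(total.real)


def ramanujan_von_sterneck(n, t):
    """mu(t/k) * phi(t) / phi(t/k) with k = gcd(n, t)."""
    d = t // gcd(n, t)
    return mobius(d) * euler_phi(t) // euler_phi(d)


t = 12
print([ramanujan_von_sterneck(n, t) for n in range(1, t + 1)])
print(sum(abs(ramanujan_von_sterneck(n, t)) for n in range(1, t + 1)))  # 16
```

The final sum equals $\varphi(12) \cdot 2^{2} = 16$, since 12 has two distinct prime divisors.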
We can separate the sum above into two parts, depending on whether|Sm,n| < t1
or|Sm,n| > t1. By the result of [BS08] quoted above, we only have to consider sums ofthe form|S0,n|. Let A be the set
A=
1 n < t such that|S0,n| > t1
.
Taking M=t 1 in equation (5.1), we have
C
2t + 1N
(m,n)=(0,0),n/A
0m,n x
e log log x+ 3log logx
[BS66, p. 320] and log log(t/k)>1 for (t/k)> ee, we have
t >
t
k
>
tk
e log log tk + 3
provided (t/k)> ee. Thus
t >tk
e log log t+ 3
k >
t1
e log log t+ 3 . (5.3)
We remark that in the case log log(t/k) < 1, we have k > (t/ee). Thus in either case,
n > k > t12 for t large enough.
Lemma 5.5. Forp large enough,|A| t2.
Proof. We have from [Bac97] the identity
\[
\sum_{n=1}^{t} |S_{0,n}| = \varphi(t)\, 2^{\omega(t)}
\]
where $\omega(t)$ is the number of distinct prime divisors of $t$. From the inequality [NZM91, p.
394]
\[
2^{\omega(t)} < t^{(1+\epsilon) \log 2 / \log\log t}
\]
we obtain 2(t) < t ift is large enough.
Thus, there are at most2(t)N
t1 2(t)t t2
elements ofA, where we have N=(t).
We can now return to the proof of Theorem5.1. Using the trivial estimate |S0,n| < N,we have
1
NnA
1n
Proposition 5.6. Every elementg (Z/pZ) of order t > p1/2 can be represented asnn g mod p for some natural numbern p2 for some >0.
Proof. For the equivalence nn g mod p to have a solution with n < p2, we mustshow that
ggi+kii g mod p
for ki < p1, for at least one value i = i0. Then the required n is equal to pki0 +gi0.
We find such a solution as follows.
Recall that the discrepancy is defined by
= supB[0,1)2
N(B)N |
B
|whereN(B) is the number of elements in the sequence that fall into the box B . RecallthatN=(t) and using the bound for discrepancy as in [Mur10], we obtain
N(B) > N(|B| )
> (t)|B| C(log2 t)t1.
Therefore, if
|B| =(t) , where <
we haveN(B)> 0 forp large enough; this is becauset1 =o (t)1. In other wordsB contains at least one point,
ait
,gip
Ni=1
from the sequence.
When we re-normalize, we note that the length and width of the box were stretched
by different factors, so that we must multiply the volume by pt. Thus, in the t prectangle every box of volume
|Bre-normalized| =pt(t) =pt1 , where < < . (5.4)
contains a point of the sequence{(ai, gi)}Ni=1.
Figure 5.2: Every box of height t1/2 and length p1
/4 contains one element of the
sequence{(ai, gi)}N
j=1. Considering a box positioned under the diagonal will give an
element (ai0, gi0) with the required properties.
5.4 A numerical value of
To find a finite value of in Theorem5.1that is large enough, it is sufficient to maximize
, hence (see equation (5.4)). Let t= p where 12 <
1. We will find a value of
in terms of that is maximal according to the following explicit bound for Sm,n given by
Shparlinski in[Shp11].
Theorem 5.7. Letg be of multiplicative order t. Then for any fixed integersx, y 2uniformly overm (Z/pZ) andn Z/tZ, we have the bound
Sm,n t1rx,ypsx,y+o(1)
asp where
rx,y = 1
2(2x+y) 1
4xy and sx,y =
1
4(2x+y).
Sincerx,y < sx,y/2, we may assume that
t
p1/2(logp)2
as otherwise the bound is trivial. Let t= p, where >1/2. Substituting p= t1/ into
the inequality for Sm,n,
Sm,n t1+1
14(2x+y)
12(2x+y)
+ 14xy
+ o(1) =t1
where
= x,y = 1
2(2x+y) 1
4xy 1
1
4(2x+y)o(1)
. (5.5)
We will find the appropriate values ofx, y >2 that maximizefor a given according
to the formula above. This is a small calculus exercise.
Claim. The valueis maximized in the equation (5.5) when
x = 4
(2 1)
y = 8
(2 1)
and
=
1
27
(2
1)2
2 o(1)
.
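The critical point in the claim can be checked with exact rational arithmetic. The sketch below makes several assumptions that should be read as such: it writes $\rho$ for the exponent with $t = p^{\rho}$ and $\theta$ for the saving in equation (5.5) (the symbol names are chosen here), it drops the $o(1)$ term, and it treats $x$ and $y$ as real parameters.

```python
from fractions import Fraction


def theta(x, y, rho):
    """theta(x, y) = (1/2 - 1/(4 rho)) / (2x + y) - 1/(4xy), o(1) term dropped.
    Symbol names rho and theta are assumptions made for this sketch."""
    z = 2 * x + y
    return (Fraction(1, 2) - 1 / (4 * rho)) / z - 1 / (4 * x * y)


def critical_point(rho):
    """Critical point x = 4 rho / (2 rho - 1), y = 2x from the claim."""
    x = 4 * rho / (2 * rho - 1)
    return x, 2 * x


rho = Fraction(9, 10)                 # example value, requires rho > 1/2
x, y = critical_point(rho)
best = theta(x, y, rho)
closed_form = (2 * rho - 1) ** 2 / (128 * rho ** 2)
print(x, y, best, closed_form)
```

At $\rho = 9/10$ this gives $x = 9/2$, $y = 9$, and both expressions evaluate to $1/162$; nearby integer choices such as $(x, y) = (4, 9)$ or $(5, 10)$ give slightly smaller values.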
Proof. Letz= 2x+y. Theny= z 2xand
=
1
2 1
4
z1 1
4x1(z 2x)1.
x = 14
(1)x2(z 2z)1 14
x1(1)(z 2x)2(2)
z =1
2 1
4
(1)z2
1
4x1
(1)(z 2x)2
.
For x to be zero,
1
x21
(z 2x)2
x
1
(z 2x)2 = 01
x21
(z 2x) = 2
x
1
(z 2x)2(z 2x) = 2x
z = 4x
y = 2x.
For y to be zero as well,
1
2 1
4
1
z2 =
1
41
x 1
(z 2x)2 .
Substitutingz= 4x,
1
2 1
4a
1
16x2 =
1
41
x 1
(2x)2
x = 1
12 1
4
x = =
4
(2 1) .
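Substituting the above expression for $x$ in terms of $\beta$ into $z = 4x$, and then into the expression for $\gamma$, yields the required values. It remains now to show by the second-derivative test that these values give a maximum. Since
$$\gamma_{xx}\Big|_{x = \frac{4\beta}{2\beta - 1}} = -\frac{1}{4}\left(\frac{1}{2} - \frac{1}{4\beta}\right)^{4}$$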
We have < < and < /4. Therefore if= 9/10, for instance,
= 0.005878894767784 910
o(1)
and we may take
Chapter 6
Conclusion and future work
In the first part of the thesis, we extended bounds for the number of solutions to $n^n \equiv a \bmod p$ for a fixed residue class $a$. Previous bounds from Balog et al. [BBS11] considered $1 \le n \le p - 1$, and we extended those bounds to the case when $n > p$. Existing bounds in both cases are far from the conjectured values [Hol02b]. Using further tools from additive combinatorics and exponential sums, we may perhaps obtain stronger results.
We have shown that statistical models are useful in generating heuristic estimates
for number theory problems. To our knowledge, the coupon collector problem has not
previously appeared in a number theory context. We anticipate modifying the problem
to obtain other estimates. For instance, the version with blank coupons may be used to
obtain estimates for a full collection of residue classes with a desired property, such as
kth powers. In this case, all residue classes not possessing the property would be blank
coupons. Also, we may be able to give a heuristic argument for the number of different
residue classes that appear as $n^n \bmod p$ where $1 \le n \le p - 1$, to determine the sharpness of the bounds given by Crocker [Cro69] or Somer [Som81].
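To make the blank-coupon variant concrete, here is a minimal sketch under a simplifying assumption that is not taken from the thesis: every draw is uniform over `total` equally likely coupons, of which only `wanted` count toward the collection and the rest are blanks. Under that assumption the expected number of draws is `total` times the harmonic number $H_{\text{wanted}}$, since with $k$ wanted coupons still missing each draw succeeds with probability $k/\text{total}$. The function names are hypothetical.

```python
import random
from fractions import Fraction


def expected_draws(total, wanted):
    """Exact expectation: sum over k of total / k for k = 1..wanted."""
    return sum(Fraction(total, k) for k in range(1, wanted + 1))


def simulate(total, wanted, trials=2000, seed=0):
    """Monte Carlo estimate of the same expectation (blanks are ignored)."""
    rng = random.Random(seed)
    draws = 0
    for _ in range(trials):
        seen = set()
        while len(seen) < wanted:
            draws += 1
            coupon = rng.randrange(total)
            if coupon < wanted:  # coupons 0..wanted-1 count, the rest are blank
                seen.add(coupon)
    return draws / trials


if __name__ == "__main__":
    print(float(expected_draws(6, 6)), simulate(6, 6))
    print(float(expected_draws(100, 10)), simulate(100, 10))
```

With no blanks (`wanted == total`) this reduces to the classical coupon collector problem.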
We also established the existence of small solutions for $n^n \equiv a \bmod p$, although $n^n \bmod p$ is not uniformly distributed. The method of proof was first applied in [Mur10] to find small solutions for $x^q \equiv a \bmod p$, where the order of $a$ is a multiple of $(p - 1)/q$. Thus, the same method may be used to find small solutions of other functions, such as polynomials or repeated powers modulo a prime.
We expect that $n^n \equiv a \bmod p$ has a solution with $n \le x$, for any given residue class $a$, if $x$ is approximately $p^2 \log \varphi(p - 1)/\varphi(p - 1)$. We have proved the deterministic bound $x < p^{2 - \alpha}$ where $\alpha > 0$ is small. Given the difference between these two estimates, it may be possible to strengthen the deterministic bound.
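For small primes the quantity discussed here can be computed by brute force. The sketch below is only an illustration: the name `smallest_x` and the `include_zero` switch are hypothetical, and whether the residue class $0$ should be counted is an assumption left to the caller. The search always terminates with $x \le p(p-1) < p^2$, because $n^n \bmod p$ depends only on $n$ modulo $p$ and modulo $p - 1$, and every nonzero class $a$ is hit by the $n$ with $n \equiv a \bmod p$ and $n \equiv 1 \bmod (p-1)$.

```python
def smallest_x(p, include_zero=False):
    """Least x such that n^n mod p, 1 <= n <= x, covers every target class."""
    targets = set(range(0 if include_zero else 1, p))
    seen = set()
    n = 0
    while not targets <= seen:
        n += 1
        seen.add(pow(n, n, p))  # modular exponentiation, never forms n^n itself
    return n


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 101, 211):
        x = smallest_x(p)
        print(p, x, x / p**2)
```

Printing the ratio `x / p**2` gives a rough feel for how far the true value sits below the trivial bound for each prime.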
Appendix A
Formal Languages and Probabilities
The following background on formal languages, generating functions, and probabilities is
taken from Section 2 of [FGT92], which summarizes the methods used to relate formal
descriptions of regular languages to generating functions and probability estimates.
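Definition A.1. Let $\mathcal{A} = \{a_1, a_2, \ldots, a_m\}$ be a fixed set called the alphabet, whose elements are the letters. Then $\mathcal{A}^{*}$ denotes the set of all finite sequences of elements of $\mathcal{A}$, called the words or strings. A language is any subset of $\mathcal{A}^{*}$.
Given languages $\mathcal{L}_1$, $\mathcal{L}_2$, and $\mathcal{L}$, we define the following operations:
union of $\mathcal{L}_1$ and $\mathcal{L}_2$: $\mathcal{L}_1 + \mathcal{L}_2 = \{w \mid w \in \mathcal{L}_1 \text{ or } w \in \mathcal{L}_2\}$
product of $\mathcal{L}_1$ and $\mathcal{L}_2$: $\mathcal{L}_1 \cdot \mathcal{L}_2 = \{w_1 w_2 \mid w_1 \in \mathcal{L}_1 \text{ and } w_2 \in \mathcal{L}_2\}$
star of $\mathcal{L}$: the language formed from all possible sequences of elements of $\mathcal{L}$,
$$\mathcal{L}^{*} = \{\epsilon\} + \mathcal{L} + (\mathcal{L} \cdot \mathcal{L}) + (\mathcal{L} \cdot \mathcal{L} \cdot \mathcal{L}) + \ldots$$
where $\epsilon$ denotes the empty word.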
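The three operations can be tried out on small finite languages. This is a minimal sketch, not taken from [FGT92]: languages are modelled as Python sets of strings, the function names are hypothetical, and since the star of a nonempty language is infinite it is truncated at a maximum word length `max_len`.

```python
def union(lang1, lang2):
    """L1 + L2: words belonging to either language."""
    return lang1 | lang2


def product(lang1, lang2):
    """L1 . L2: every concatenation w1 w2 with w1 in L1 and w2 in L2."""
    return {w1 + w2 for w1 in lang1 for w2 in lang2}


def star(lang, max_len):
    """L*: all concatenations of words of L, truncated at length max_len."""
    result = {""}      # the empty word is always in L*
    frontier = {""}
    while frontier:
        frontier = {
            w + u for w in frontier for u in lang if len(w + u) <= max_len
        } - result
        result |= frontier
    return result


if __name__ == "__main__":
    print(sorted(union({"a"}, {"b", "ab"})))
    print(sorted(product({"a", "ab"}, {"b", "c"})))
    print(sorted(star({"ab", "c"}, 3)))
```

Sets discard duplicates, so this sketch does not detect ambiguity: a word that arises from two different decompositions is simply stored once.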
Definition A.2. The class of regular languages is the smallest class of languages containing the finite sets and closed under the three operations of union, product and star.
is called the ordinary generating function (OGF) of the language $\mathcal{L}$ with respect to the weight distribution $p = (p_1, \ldots, p_m)$ and is denoted by $l(z)$. The exponential generating function (EGF) is defined with $z^n/n!$ replacing $z^n$,
$$\hat{l}(z) = \sum_{n_1, \ldots, n_m} l_{n_1, \ldots, n_m}\, p_1^{n_1} \cdots p_m^{n_m}\, \frac{z^{n_1 + \ldots + n_m}}{(n_1 + \ldots + n_m)!} = \sum_{w \in \mathcal{L}} \pi[w]\, \frac{z^{|w|}}{|w|!}.$$
The value $[z^n]\, l(z)$ is the probability that a random word in $\mathcal{A}^n$ is in $\mathcal{L}$.
The ordinary and exponential generating functions are related by the Laplace-Borel transform
$$l(z) = \int_0^{\infty} \hat{l}(zt)\, e^{-t}\, dt,$$
following from the classical relation
$$\int_0^{\infty} t^n e^{-t}\, dt = n!.$$
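On the level of coefficients, the transform simply multiplies the $n$-th EGF coefficient by $n!$. The sketch below is an illustration with hypothetical names, assuming series are stored as truncated coefficient lists. As an example it uses the language of all words over the alphabet, whose EGF is $e^z$ when the letter weights sum to 1, and recovers the OGF $1/(1 - z)$ with all coefficients equal to 1.

```python
from fractions import Fraction
from math import factorial, gamma, isclose


def egf_to_ogf(egf_coeffs):
    """Coefficientwise Laplace-Borel transform: multiply [z^n] by n!."""
    return [c * factorial(n) for n, c in enumerate(egf_coeffs)]


# EGF of the language of all words when the letter weights sum to 1: e^z.
egf_all_words = [Fraction(1, factorial(n)) for n in range(6)]

if __name__ == "__main__":
    print(egf_to_ogf(egf_all_words))
    # The classical relation: the integral of t^n e^{-t} over [0, inf)
    # is Gamma(n + 1) = n!.
    print(all(isclose(gamma(n + 1), factorial(n)) for n in range(10)))
```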
An operation on languages is unambiguous if every word of the resulting language is
obtained only once. The following is Theorem 2.1 in [FGT92]. Along with the Laplace
transform integral, it is the key to determining the generating function of a language
defined by a combination of the four basic operations.
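Theorem A.4. When they operate unambiguously on their arguments, the operations of union, product, star, and shuffle product translate into generating functions:
(a) $\mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2 \Rightarrow l(z) = l_1(z) + l_2(z)$.
(b) $\mathcal{L} = \mathcal{L}_1 \cdot \mathcal{L}_2 \Rightarrow l(z) = l_1(z)\, l_2(z)$.
(c) $\mathcal{L} = \mathcal{L}_1^{*} \Rightarrow l(z) = (1 - l_1(z))^{-1}$.
(d) $\mathcal{L} = \mathcal{L}_1 \sqcup\!\sqcup\, \mathcal{L}_2 \Rightarrow \hat{l}(z) = \hat{l}_1(z)\, \hat{l}_2(z)$.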
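Parts (b) and (c) can be checked on a small example with truncated power series. The sketch below rests on assumptions that are not from [FGT92]: a two-letter alphabet with weights $1/3$ and $2/3$, the language $a^{*} \cdot b$ (words of the form $a \cdots a b$), truncation at order `N`, and hypothetical helper names. It compares the coefficients of $l(z) = p_b z / (1 - p_a z)$ with a brute-force sum of word weights.

```python
from fractions import Fraction
from itertools import product as all_words

N = 6  # truncation order of every series
P = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # letter weights, sum to 1


def letter(weight):
    """OGF of a one-letter language: weight * z."""
    series = [Fraction(0)] * (N + 1)
    series[1] = Fraction(weight)
    return series


def concat(l1, l2):
    """OGF of an unambiguous product: Cauchy product of the two series."""
    out = [Fraction(0)] * (N + 1)
    for i, a in enumerate(l1):
        for j, b in enumerate(l2):
            if i + j <= N:
                out[i + j] += a * b
    return out


def star(l1):
    """OGF of an unambiguous star: 1 / (1 - l1), assuming l1 has no constant term."""
    assert l1[0] == 0
    out = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        out[n] = sum(l1[k] * out[n - k] for k in range(1, n + 1))
    return out


def brute_force(n):
    """Total weight of the words of length n that lie in a* . b."""
    total = Fraction(0)
    for w in all_words("ab", repeat=n):
        if n >= 1 and w == ("a",) * (n - 1) + ("b",):
            weight = Fraction(1)
            for ch in w:
                weight *= P[ch]
            total += weight
    return total


ogf = concat(star(letter(P["a"])), letter(P["b"]))

if __name__ == "__main__":
    for n in range(N + 1):
        print(n, ogf[n], brute_force(n))
```

The coefficient of $z^n$ is the probability that a random word of length $n$ lies in the language, which for this example is $p_a^{\,n-1} p_b$.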
Bibliography
[Bac97] Gennady Bachman. On an optimality property of Ramanujan sums. Proc. Amer. Math. Soc., 125:1001-1003, 1997.
[Bal11] R. Balasubramanian. Private communication, 2011.
[BBS11] Antal Balog, Kevin Broughan, and Igor Shparlinski. On the number of solutions of exponential congruences. Acta Arith., 148(1):93-103, 2011.
[BGK06] Jean Bourgain, Alexey A. Glibichuk, and Sergei V. Konyagin. Estimates for the number of sums and products and for exponential sums in fields of prime order. J. London Math. Soc., 73(2):380-398, 2006.
[BM84] Manuel Blum and Silvio Micali. How to generate cryptographically strong sequences of pseudorandom bits. SIAM J. Comput., 13(4):850-864, 1984.
[BS66] Eric Bach and Jeffrey Shallit. Algorithmic Number Theory, Vol. I: Efficient Algorithms. The MIT Press, Cambridge, 1996.
[BS08] Jean Bourgain and Igor E. Shparlinski. Distribution of consecutive modular roots of an integer. Acta Arith., 134(1):83-91, 2008.
[Cro69] Roger Crocker. On residues of $n^n$. Amer. Math. Monthly, 76(4):1028-1029, 1969.
[DT97] Michael Drmota and Robert F. Tichy. Sequences, discrepancies and applications. Lecture Notes in Mathematics, volume 1651. Springer, 1997.
[ER61] Paul Erdős and Alfréd Rényi. On a classical problem of probability theory. Magyar Tudományos Akadémia Matematikai Kutató Intézetének Közleményei, 6:215-220, 1961.
[FGT92] Philippe Flajolet, Danièle Gardy, and Loÿs Thimonier. Birthday paradox, coupon collectors, caching algorithms and self-organizing search. Discr. Appl. Math., 39:207-229, 1992.
[FLM10] Matthew Friedrichsen, Brian Larson, and Emily McDowell. Structure and statistics of the self-power map. RHIT Undergrad. Math. J., 11(2):107-139, 2010.
[Fri07] John Friedlander. Uniform distribution, exponential sums and cryptography. In Andrew Granville and Zeev Rudnick, editors, Equidistribution in Number Theory, An Introduction, pages 29-57. Springer, 2007.
[FS09] Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, 2009.
[Gar08] M. Z. Garaev. The sum-product estimate for large subsets of prime fields. Proc. Amer. Math. Soc., 136(8):2735-2739, 2008.
[GOM12] José María Grau and Antonio M. Oller-Marcén. On the last digit and the last non-zero digit of $n^n$ in base $b$. Unpublished, available at arXiv:1203.4066v1, 2012.
[Hay56] W. K. Hayman. A generalisation of Stirling's formula. J. Reine Angew. Math., 196:65-95, 1956.
[HM06] Joshua Holden and Pierre Moree. Some heuristics and results for small cycles of the discrete logarithm. Mathematics of Computation, 75(253):419-449, 2006.
[Hol02a] Joshua Holden. Addenda/corrigenda: Fixed points and two-cycles of the discrete logarithm. Unpublished, available at http://xxx.lanl.gov/abs/math.NT/0208028, 2002.
[Hol02b] Joshua Holden. Fixed points and two-cycles of the discrete logarithm. Algorithmic number theory (ANTS 2002), pages 405-415, 2002.
[HR11] Joshua Holden and Margaret M. Robinson. Counting fixed points, two-cycles, and collisions of the discrete exponential function using $p$-adic methods. Unpublished, available at arxiv.org/pdf/1105.5346, 2011.
[Isa95] Richard Isaac. The Pleasures of Probability, University Texts in Mathematics. Springer-Verlag, 1995.
[KN06] L. Kuipers and H. Niederreiter. Uniform Distribution of Sequences. Dover Publications, Inc., 2006.
[Mur08] M. Ram Murty. Problems in Analytic Number Theory, 2nd edition. Springer-Verlag, 2008.
[Mur10] M. Ram Murty. Small solutions of polynomial congruences. Ind. J. Pure Appl. Math., 41(1):15-23, 2010.
[MvOV96] A. Menezes, P. van Oorschot, and S. Vanstone. Handbook of Applied Cryptography. CRC Press, 1996.
[NZM91] Ivan Niven, Herbert S. Zuckerman, and Hugh L. Montgomery. An Introduction to the Theory of Numbers, 5th edition. John Wiley & Sons, Inc., 1991.
[Pre98] Wiebe R. Pestman. Mathematical Statistics. Walter de Gruyter, 1998.
[PS72] George Pólya and Gábor Szegő. Problems and Theorems in Analysis, Vol. I. Springer-Verlag, 1972.
[Shp11] Igor E. Shparlinski. Exponential sums with consecutive modular roots of an integer. Quart. J. Math., 62:207-213, 2011.
[Som81] Lawrence Somer. The residues of $n^n$ modulo $p$. The Fibonacci Quarterly, 19(2):110-117, 1981.