
RSA was among the first public-key cryptosystems developed. It was first described in 1978, and is named after its creators, Ron Rivest, Adi Shamir, and Len Adleman.^{1} Although “textbook” RSA by itself is not a secure encryption scheme, it is a fundamental ingredient for public-key cryptography.

^{1}Clifford Cocks developed an equivalent scheme in 1973, but it was classified since he was working for British intelligence.

**13.1: Modular Arithmetic & Number Theory**

In general, public-key cryptography relies on computational problems from abstract algebra. Of the techniques currently known for public-key crypto, RSA uses some of the simplest mathematical ideas, so it’s an ideal place to start.

We will be working with modular arithmetic, so please review the section on modular arithmetic from the first lecture! We need to understand the behavior of the four basic arithmetic operations in the set ℤ_{n} = {0, . . . , *n* − 1}.

Every element *x* ∈ ℤ_{n} has an inverse with respect to addition mod *n*: namely −*x* % *n*. For example, the additive inverse of 11 mod 14 is −11 ≡_{14} 3. However, multiplicative inverses are not so straightforward.
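The additive-inverse computation above is easy to check directly; note that Python’s `%` operator (with a positive modulus) already returns a representative in {0, . . . , *n* − 1}:

```python
# Additive inverse mod n: the inverse of x is (-x) % n.
n, x = 14, 11
inv = (-x) % n      # Python's % returns a representative in {0, ..., n-1}
print(inv)          # 3, matching -11 ≡ 3 (mod 14)
assert (x + inv) % n == 0
```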

#### Greatest Common Divisors

If *d | x* and *d | y*, then *d* is a **common divisor** of *x* and *y*. The largest possible such *d* is called the **greatest common divisor (GCD)**, denoted gcd(*x,y*). If gcd(*x,y*) = 1, then we say that *x* and *y* are **relatively prime**. The oldest “algorithm” ever documented is the one Euclid described for computing GCDs (ca. 300 BCE).
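Euclid’s algorithm can be sketched in a few lines of Python (a minimal iterative version; the original is usually stated recursively):

```python
def gcd(x, y):
    # Euclid's rule: gcd(x, y) = gcd(y, x % y), with gcd(x, 0) = x.
    while y != 0:
        x, y = y, x % y
    return x

print(gcd(35, 144))   # 1, so 35 and 144 are relatively prime
```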

#### Multiplicative Inverses

We let ℤ^{∗}_{n} denote the set {*x* ∈ ℤ_{n} | gcd(*x*, *n*) = 1}, the **multiplicative group modulo** *n*. This group is *closed under multiplication mod n*, which just means that if *x*, *y* ∈ ℤ^{∗}_{n} then *xy* ∈ ℤ^{∗}_{n}, where *xy* denotes multiplication mod *n*. Indeed, if gcd(*x*, *n*) = gcd(*y*, *n*) = 1, then gcd(*xy*, *n*) = 1 and thus gcd(*xy* % *n*, *n*) = 1 by Euclid’s algorithm.

In abstract algebra, a *group* is a set that is closed under its operation (in this case multiplication mod *n*), and is also closed under inverses. So if ℤ^{∗}_{n} is really a group under multiplication mod *n*, then for every *x* ∈ ℤ^{∗}_{n} there must be a *y* ∈ ℤ^{∗}_{n} so that *xy* ≡_{n} 1. In other words, *y* is the **multiplicative inverse** of *x* (and we would give it the name *x*^{−1}).

The fact that we can always find a multiplicative inverse for elements of ℤ^{∗}_{n} is due to the following theorem:

### Theorem 13.1: Bezout’s Theorem

*For all integers x and y, there exist integers a and b such that ax* + *by* = gcd(*x,y*). *In fact,* gcd(*x,y*) *is the smallest positive integer that can be written as an integral linear combination of x and y.*

What does this have to do with multiplicative inverses? Take any *x* ∈ ℤ^{∗}_{n}; we will show how to find its multiplicative inverse. Since *x* ∈ ℤ^{∗}_{n}, we have gcd(*x,n*) = 1. From Bezout’s theorem, there exist integers *a,b* satisfying *ax* + *bn* = 1. By reducing both sides of this equation modulo *n*, we have

1 = *ax* + *bn* ≡_{n} *ax* + 0 ≡_{n} *ax*

(since *bn* ≡_{n} 0). Thus the integer *a* guaranteed by Bezout’s theorem is the multiplicative inverse of *x* modulo *n*.

We have shown that every *x* ∈ ℤ^{∗}_{n} has a multiplicative inverse mod *n*. That is, if gcd(*x,n*) = 1, then *x* has a multiplicative inverse. But might it be possible for x to have a multiplicative inverse mod *n* even if gcd(*x,n*) ≠ 1?

Suppose that we have an element *x* with a multiplicative inverse; that is, *xx*^{−1} ≡_{n} 1. Then *n* divides *xx*^{−1} − 1, so we can write *xx*^{−1} − 1 = *kn* (as an expression over the integers) for some integer *k*. Rearranging, we have *xx*^{−1} − *kn* = 1. That is to say, we have a way to write 1 as an integral linear combination of *x* and *n*. From Bezout’s theorem, this must mean that gcd(*x*, *n*) = 1. Hence, *x* ∈ ℤ^{∗}_{n}. We conclude that:

The elements of ℤ^{∗}_{n} are exactly those elements with a multiplicative inverse mod *n*.

Furthermore, multiplicative inverses can be computed efficiently using an extended version of Euclid’s GCD algorithm, EXTGCD. While we are computing the GCD, we can also keep track of integers *a* and *b* from Bezout’s theorem at every step of the recursion, as the following example shows.

#### Example

*Below is a table showing the computation of EXTGCD*(35,144). *Note that the columns x, y are computed from the top down (as recursive calls to EXTGCD are made), while the columns d, a, and b are computed from the bottom up (as recursive calls return). Also note that in each row, we indeed have d* = *ax* + *by.*

| *x* | *y* | *d* | *a* | *b* |
|----:|----:|----:|----:|----:|
| 35 | 144 | 1 | −37 | 9 |
| 144 | 35 | 1 | 9 | −37 |
| 35 | 4 | 1 | −1 | 9 |
| 4 | 3 | 1 | 1 | −1 |
| 3 | 1 | 1 | 0 | 1 |
| 1 | 0 | 1 | 1 | 0 |

The final result demonstrates that 35^{−1} ≡_{144} −37 ≡_{144} 107.
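The same recursion can be sketched in Python (a hypothetical helper, not the book’s pseudocode), and it reproduces the result above:

```python
def extgcd(x, y):
    """Return (d, a, b) with d = gcd(x, y) and d = a*x + b*y (Bezout)."""
    if y == 0:
        return (x, 1, 0)
    # gcd(x, y) = gcd(y, x % y); rewrite the returned combination
    # in terms of x and y, using x % y = x - (x // y) * y.
    d, a, b = extgcd(y, x % y)
    return (d, b, a - (x // y) * b)

d, a, b = extgcd(35, 144)
assert (d, a, b) == (1, -37, 9)   # 1 = -37*35 + 9*144
print(a % 144)                    # 107, the inverse of 35 mod 144
```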

#### The Totient Function

Euler’s **totient** function is defined as *ϕ*(*n*) ≝ |ℤ^{∗}_{n}|; in other words, the number of elements of ℤ_{n} which are relatively prime to *n*.

As an example, if *p* is a prime, then ℤ^{∗}_{p} = ℤ_{p} \ {0}, because every integer in ℤ_{p} apart from zero is relatively prime to *p*. Therefore, *ϕ*(*p*) = *p* − 1.

We will frequently work modulo *n* where *n* is the product of two distinct primes, *n* = *pq*. In that case, *ϕ*(*n*) = (*p* − 1)(*q* − 1). To see why, let’s count how many elements of ℤ_{pq} share a common divisor with *pq* (i.e., are not in ℤ^{∗}_{pq}):

- The multiples of *p* share a common divisor with *pq*. These include 0, *p*, 2*p*, 3*p*, . . . , (*q* − 1)*p*. There are *q* elements in this list.
- The multiples of *q* share a common divisor with *pq*. These include 0, *q*, 2*q*, 3*q*, . . . , (*p* − 1)*q*. There are *p* elements in this list.

We have clearly double-counted element 0 in these lists. But no other element is double counted. Any item that occurs in both lists would be a common multiple of both *p* and *q*, but the least common multiple of *p* and *q* is *pq* since *p* and *q* are relatively prime. But *pq* is larger than any item in these lists.

We count *p* + *q* − 1 elements in ℤ_{pq} which share a common divisor with *pq*. That leaves the rest to reside in ℤ^{∗}_{pq}, and there are *pq* − (*p* + *q* − 1) = (*p* − 1)(*q* − 1) of them. Hence *ϕ*(*pq*) = (*p* − 1)(*q* − 1).
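The counting argument above is easy to sanity-check by brute force for small primes:

```python
from math import gcd

def phi(n):
    # Brute-force totient: count the elements of Z_n relatively prime to n.
    return sum(1 for x in range(n) if gcd(x, n) == 1)

p, q = 5, 7
assert phi(p * q) == (p - 1) * (q - 1) == 24
assert phi(p) == p - 1            # the prime case discussed above
print(phi(35))                    # 24
```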

General formulas for *ϕ*(*n*) exist, but they typically rely on knowing the prime factorization of *n*. We will see more connections between the difficulty of computing *ϕ*(*n*) and the difficulty of factoring *n* later in this part of the course.

Here’s an important theorem from abstract algebra:

### Theorem 13.2: Euler’s Theorem

*If x* ∈ ℤ^{∗}_{n} *then x*^{ϕ(n)} ≡_{n} 1.
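Euler’s theorem can be verified exhaustively for a small modulus; a quick Python check:

```python
from math import gcd

n = 10
phi_n = sum(1 for x in range(n) if gcd(x, n) == 1)   # phi(10) = 4
# Every element of Z*_10 = {1, 3, 7, 9} raised to the phi(10) power is 1 mod 10.
for x in range(n):
    if gcd(x, n) == 1:
        assert pow(x, phi_n, n) == 1
print("Euler's theorem holds for n =", n)
```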

As a final corollary, we can deduce Fermat’s “little theorem”: *x*^{p} ≡_{p} *x* for all *x*, when *p* is prime.^{2}

^{2}You have to handle the case of *x* ≡_{p} 0 separately, since 0 ∉ ℤ^{∗}_{p}, so Euler’s theorem doesn’t apply to it.

**13.2: The RSA Function**

The RSA function is defined as follows:

- Let *p* and *q* be distinct primes (later we will say more about how they are chosen), and let *N* = *pq*. *N* is called the **RSA modulus**.
- Let *e* and *d* be integers such that *ed* ≡_{ϕ(N)} 1. That is, *e* and *d* are multiplicative inverses mod *ϕ*(*N*) — not mod *N*! *e* is called the **encryption exponent**, and *d* is called the **decryption exponent**. These names are historical, but not entirely precise, since RSA by itself does not achieve CPA security.
- The RSA function is: *m* ↦ *m*^{e} % *N*, where *m* ∈ ℤ_{N}.
- The inverse RSA function is: *c* ↦ *c*^{d} % *N*, where *c* ∈ ℤ_{N}.

Essentially, the RSA function (and its inverse) is a simple modular exponentiation. The most confusing thing to remember about RSA is that *e* and *d* “live” in ℤ^{∗}_{ϕ(N)}, while *m* and *c* “live” in ℤ_{N}.
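A toy-sized Python walkthrough of these definitions (the parameters are far too small to be secure; this is only to show where each value lives):

```python
from math import gcd

p, q = 17, 19
N = p * q                      # RSA modulus, N = 323
phi = (p - 1) * (q - 1)        # phi(N) = 288
e = 5                          # encryption exponent; needs gcd(e, phi) = 1
assert gcd(e, phi) == 1
d = pow(e, -1, phi)            # decryption exponent: inverse of e mod phi(N)

m = 42                         # m and c live in Z_N; e and d mod phi(N)
c = pow(m, e, N)               # the RSA function
assert pow(c, d, N) == m       # the inverse RSA function recovers m
print(N, e, d)
```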

Let’s make sure that the function we called the “inverse RSA function” is actually an inverse of the RSA function. The RSA function raises its input to the *e* power, and the inverse RSA function raises its input to the *d* power. So it suffices to show that raising to the *ed* power has no effect modulo *N*.

Since *ed* ≡_{ϕ(N)} 1, we can write *ed* = *tϕ*(*N*) + 1 for some integer *t*. Then:

(*m*^{e})^{d} = *m*^{ed} = *m*^{tϕ(N)+1} = (*m*^{ϕ(N)})^{t} · *m* ≡_{N} 1^{t} · *m* = *m*

Note that we have used the fact that *m*^{ϕ(N)} ≡_{N} 1 from Euler’s theorem.

#### Security Properties

In these notes we will not formally define a desired security property for RSA. Roughly speaking, the idea is that even when *N* and *e* are made public, it should be hard to compute the operation *c* ↦ *c*^{d} % *N*. In other words, the RSA function *m* ↦ *m*^{e} % *N* is:

- easy to compute given *N* and *e*
- hard to invert given *N* and *e* but not *d*
- easy to invert given *d*

**13.3: Chinese Remainder Theorem**

The multiplicative group ℤ^{∗}_{N} has some interesting structure when *N* is the product of distinct primes. We can use this structure to optimize some algorithms related to RSA.

**History.** Some time around the 4th century CE, the Chinese mathematician Sun Tzu, in the *Sun Tzu Suan Ching*, discussed problems relating to simultaneous equations of modular arithmetic:

We have a number of things, but we do not know exactly how many. If we count them by threes we have two left over. If we count them by fives we have three left over. If we count them by sevens we have two left over. How many things are there?^{3}

^{3}Translation due to Joseph Needham, *Science and Civilisation in China, vol. 3: Mathematics and Sciences
of the Heavens and Earth*, 1959.

In our notation, he is asking for a solution *x* to the following system of equations:

*x* ≡_{3} 2; *x* ≡_{5} 3; *x* ≡_{7} 2

A generalized way to solve equations of this kind was later given by mathematician Qin Jiushao in 1247 CE. For our eventual application to RSA, we will only need to consider the case of two simultaneous equations.

### Theorem 13.3: CRT

Suppose gcd(*r*, *s*) = 1. Then for all integers *u* and *v*, there is a solution for *x* in the following system of equations:

*x* ≡_{r} *u*; *x* ≡_{s} *v*

Furthermore, this solution is *unique* modulo *rs*.

#### Proof

Since gcd(*r*, *s*) = 1, we have by Bezout’s theorem that 1 = *ar* + *bs* for some integers *a* and *b*. Furthermore, *b* and *s* are multiplicative inverses modulo *r*. Now choose *x* = *var* + *ubs*. Then,

*x* = *var* + *ubs* ≡_{r} 0 + *u*(*bs*) ≡_{r} *u* · 1 = *u*

So *x* ≡_{r} *u*, as desired. By a symmetric argument, we can see that *x* ≡_{s} *v*, so *x* is a solution to the system of equations.

Now we argue why the solution is *unique* modulo *rs*. Suppose *x* and *x’* are two solutions to the system of equations, so we have *x* ≡_{r} *u* ≡_{r} *x’* and *x* ≡_{s} *v* ≡_{s} *x’*.

Since *x* ≡_{r}*x’* and *x* ≡_{s}*x’*, it must be that *x* − *x’* is a multiple of *r* and a multiple of *s*. Since *r* and *s* are relatively prime, their least common multiple is *rs*, so *x* − *x’* must be a multiple of *rs*. Hence, *x* ≡_{rs}*x’*. So any two solutions to this system of equations are congruent mod *rs*.
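The constructive proof translates directly into code. A Python sketch (hypothetical helpers) that solves a two-equation system using the Bezout coefficients:

```python
def extgcd(x, y):
    # Extended Euclid: returns (d, a, b) with d = gcd(x, y) = a*x + b*y.
    if y == 0:
        return (x, 1, 0)
    d, a, b = extgcd(y, x % y)
    return (d, b, a - (x // y) * b)

def crt(u, v, r, s):
    """Solve x ≡ u (mod r) and x ≡ v (mod s), assuming gcd(r, s) = 1."""
    d, a, b = extgcd(r, s)        # Bezout: 1 = a*r + b*s
    assert d == 1
    return (v * a * r + u * b * s) % (r * s)   # the x from the proof

x = crt(2, 3, 3, 5)
print(x)          # 8: indeed 8 ≡ 2 (mod 3) and 8 ≡ 3 (mod 5)
assert x % 3 == 2 and x % 5 == 3
```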

We can associate every pair (*u*, *v*) ∈ ℤ_{r} × ℤ_{s} with its corresponding system of equations of the above form (with *u* and *v* as the right-hand sides). The CRT suggests a relationship between these pairs (*u*, *v*) ∈ ℤ_{r} × ℤ_{s} and elements of ℤ_{rs}.

For *x* ∈ ℤ_{rs} and (*u*, *v*) ∈ ℤ_{r} × ℤ_{s}, let us write

*x* ^{crt}←→ (*u*, *v*)

to mean that *x* is a solution to *x* ≡_{r} *u* and *x* ≡_{s} *v*. The CRT says that the ^{crt}←→ relation is a *bijection* (1-to-1 correspondence) between elements of ℤ_{rs} and elements of ℤ_{r} × ℤ_{s}.

In fact, the relationship is even deeper than that. Consider the following observations:

- If *x* ^{crt}←→ (*u*, *v*) and *x’* ^{crt}←→ (*u’*, *v’*), then *x* + *x’* ^{crt}←→ (*u* + *u’*, *v* + *v’*). You can see this by adding relevant equations together from the system of equations. Note here that the addition *x* + *x’* is done mod *rs*; the addition *u* + *u’* is done mod *r*; and the addition *v* + *v’* is done mod *s*.
- If *x* ^{crt}←→ (*u*, *v*) and *x’* ^{crt}←→ (*u’*, *v’*), then *xx’* ^{crt}←→ (*uu’*, *vv’*). You can see this by multiplying relevant equations together from the system of equations. As above, the multiplication *xx’* is done mod *rs*; *uu’* is done mod *r*; *vv’* is done mod *s*.
- Suppose *x* ^{crt}←→ (*u*, *v*). Then gcd(*x*, *rs*) = 1 if and only if gcd(*u*, *r*) = gcd(*v*, *s*) = 1. In other words, the ^{crt}←→ relation is also a 1-to-1 correspondence between elements of ℤ^{∗}_{rs} and elements of ℤ^{∗}_{r} × ℤ^{∗}_{s}.^{4}

^{4}Fun fact: this yields an alternative proof that *ϕ*(*pq*) = (*p* − 1)(*q* − 1) when *p* and *q* are prime. That is, *ϕ*(*pq*) = |ℤ^{∗}_{pq}| = |ℤ^{∗}_{p} × ℤ^{∗}_{q}| = (*p* − 1)(*q* − 1).

The bottom line is that the CRT demonstrates that ℤ_{rs} and ℤ_{r} × ℤ_{s} **are essentially the same mathematical object**. In the terminology of abstract algebra, the two structures are *isomorphic*.

Think of ℤ_{rs} and ℤ_{r} × ℤ_{s} as being two different kinds of *names* or *encodings* for the same set of items. If we know the “ℤ_{rs}-names” of two items, we can add them (mod *rs*) to get the ℤ_{rs}-name of the result. If we know the “ℤ_{r} × ℤ_{s}-names” of two items, we can add them (first components mod *r* and second components mod *s*) to get the ℤ_{r} × ℤ_{s}-name of the result. The CRT says that both of these ways of adding give the same results.

Additionally, the proof of the CRT shows us how to convert between these styles of names for a given object. Given *x* ∈ ℤ_{rs}, we can compute (*x* % *r*, *x* % *s*), which is the corresponding element/name in ℤ_{r} × ℤ_{s}. Given (*u*, *v*) ∈ ℤ_{r} × ℤ_{s}, we can compute *x* = (*var* + *ubs*) % *rs* (where *a* and *b* are computed from the extended Euclidean algorithm) to obtain the corresponding element/name *x* ∈ ℤ_{rs}.

From a **mathematical** perspective, ℤ_{rs} and ℤ_{r} × ℤ_{s} are the same object. However, from a **computational** perspective, there might be reason to favor one over the other. In fact, it turns out that doing computations in the ℤ_{r} × ℤ_{s} realm is significantly cheaper.

#### Application to RSA

In the context of RSA decryption, we are interested in taking *c* ∈ ℤ_{pq} and computing *c*^{d} ∈ ℤ_{pq}. Since *p* and *q* are distinct primes, gcd(*p*, *q*) = 1 and the CRT is in effect.

Thinking in terms of ℤ_{pq}-arithmetic, raising *c* to the *d* power is rather straightforward. However, the CRT suggests that another approach is possible: we could convert *c* into its ℤ_{p} × ℤ_{q} representation, do the exponentiation under that representation, and then convert back into the ℤ_{pq} representation. This approach corresponds to the bold arrows in Figure 13.1, and the CRT guarantees that the result will be the same either way.

Now why would we ever want to compute things this way? Performing an exponentiation modulo an *n*-bit number requires about *n*^{3} steps. Let’s suppose that *p* and *q* are each *n* bits long, so that the RSA modulus *N* is 2*n* bits long. Performing *c* ↦ *c*^{d} modulo *N* therefore costs about (2*n*)^{3} = 8*n*^{3} total.

The CRT approach involves two modular exponentiations — one mod *p* and one mod *q*. Each of these moduli are only *n* bits long, so the total cost is *n*^{3} + *n*^{3} = 2*n*^{3}. **The CRT approach is 4 times faster!** Of course, we are neglecting the cost of converting between representations, but that cost is very small in comparison to the cost of exponentiation.
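The speedup can be demonstrated end-to-end with toy parameters. This Python sketch also uses a standard extra trick not discussed above: by Fermat’s little theorem, the exponent *d* can itself be reduced mod *p* − 1 and *q* − 1, so real implementations see even better savings than the 4× estimate:

```python
p, q, e = 17, 19, 5
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
c = pow(42, e, N)                        # ciphertext for m = 42

m_direct = pow(c, d, N)                  # one big exponentiation mod N

# CRT approach: exponentiate in Z_p and Z_q separately, then recombine.
mp = pow(c % p, d % (p - 1), p)
mq = pow(c % q, d % (q - 1), q)
a = pow(p, -1, q)                        # p^{-1} mod q
b = pow(q, -1, p)                        # q^{-1} mod p
m_crt = (mp * b * q + mq * a * p) % N    # CRT recombination

assert m_crt == m_direct == 42
print(m_crt)
```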

It’s worth pointing out that this speedup can only be done for the RSA *inverse* function. One must know *p* and *q* in order to exploit the Chinese Remainder Theorem, and only the party performing the RSA inverse function typically knows this.

**13.4: The Hardness of Factoring *N***

Clearly the hardness of RSA is related to the hardness of factoring the modulus *N*. Indeed, if you can factor *N*, then you can compute *ϕ*(*N*), solve for *d*, and easily invert RSA. So factoring must be *at least as hard* as inverting RSA.

Factoring integers (or, more specifically, factoring RSA moduli) is believed to be a hard problem for classical computers.^{5} In this section we show that some other problems related to RSA are “as hard as factoring.” What does it mean for a computational problem to be “as hard as factoring?” More formally, in this section we will show the following:

### Theorem 13.4:

*Either all of the following problems can be solved in polynomial time, or none of them can:*

1. *Given an RSA modulus N* = *pq, compute its factors p and q.*
2. *Given an RSA modulus N* = *pq, compute ϕ*(*N*) = (*p* − 1)(*q* − 1).
3. *Given an RSA modulus N* = *pq and a value e, compute its inverse d, where ed* ≡_{ϕ(N)} 1.
4. *Given an RSA modulus N* = *pq, find any x* ≢_{N} ±1 *such that x*^{2} ≡_{N} 1.

To prove the theorem, we will show:

- *If* there is an efficient algorithm for (1), *then* we can use it as a subroutine to construct an efficient algorithm for (2). This is straightforward: if you have a subroutine factoring *N* into *p* and *q*, then you can call the subroutine and then compute (*p* − 1)(*q* − 1).
- *If* there is an efficient algorithm for (2), *then* we can use it as a subroutine to construct an efficient algorithm for (3). This is also straightforward: if you have a subroutine computing *ϕ*(*N*) given *N*, then you can compute the multiplicative inverse of *e* mod *ϕ*(*N*) using the extended Euclidean algorithm.
- *If* there is an efficient algorithm for (3), *then* we can use it as a subroutine to construct an efficient algorithm for (4).
- *If* there is an efficient algorithm for (4), *then* we can use it as a subroutine to construct an efficient algorithm for (1).

Below we focus on the final two implications.

#### Using square roots of unity to factor *N*

Problem (4) of Theorem 13.4 concerns a new concept known as square roots of unity:

### Definition 13.5: sqrt of unity

*x is a* **square root of unity** *modulo N if x*^{2} ≡_{N} 1. *If x* ≢_{N} 1 *and x* ≢_{N} −1, *then we say that x is a* **non-trivial** *square root of unity.*

Note that ±1 are always square roots of unity modulo *N*, for any *N* ((±1)^{2} = 1 over the integers, so it is also true mod *N*). But if *N* is the product of distinct odd primes, then *N* has 4 square roots of unity: two trivial and two non-trivial ones (see the exercises in this chapter).
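For a concrete example, the four square roots of unity modulo 35 = 5 · 7 can be found by brute force:

```python
N = 35   # product of two distinct odd primes
roots = [x for x in range(N) if (x * x) % N == 1]
print(roots)    # [1, 6, 29, 34]: the trivial pair ±1 and the nontrivial pair ±6
assert roots == [1, 6, 29, 34]
```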

### Claim 13.6

*Suppose there is an efficient algorithm for computing nontrivial square roots of unity modulo N. Then there is an efficient algorithm for factoring N. (This is the (4) ⇒ (1) step in Theorem 13.4.)*

#### Proof

The reduction is rather simple. Suppose NTSRU is an algorithm that on input *N* returns a non-trivial square root of unity modulo *N*. Then we can factor *N* with the following algorithm: call NTSRU(*N*) to obtain *x*, then output gcd(*N*, *x* − 1) and gcd(*N*, *x* + 1).

The algorithm is simple, but we must argue that it is correct. When *x* is a nontrivial square root of unity modulo *N*, we have *x*^{2} ≡_{N} 1, so *N* = *pq* divides *x*^{2} − 1 = (*x* + 1)(*x* − 1).

So the prime factorization of (*x* + 1)(*x* − 1) contains a factor of *p* and a factor of *q*. But neither *x* + 1 nor *x* − 1 contains factors of *both p* and *q* (if *pq* divided *x* − 1, we would have *x* ≡_{N} 1; similarly for *x* + 1). Hence *x* + 1 and *x* − 1 must each contain factors of exactly one of {*p*, *q*}, and {gcd(*pq*, *x* − 1), gcd(*pq*, *x* + 1)} = {*p*, *q*}.
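The gcd computation in the proof can be illustrated with the modulus 35 and the nontrivial root 6 (a sketch, not the book’s pseudocode):

```python
from math import gcd

N = 35
x = 6                          # nontrivial: x^2 ≡ 1 (mod 35), x ≢ ±1
assert (x * x) % N == 1 and x not in (1, N - 1)

p = gcd(N, x - 1)              # each gcd grabs exactly one prime factor
q = gcd(N, x + 1)
print(p, q)                    # 5 7
assert p * q == N and 1 < p < N
```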

#### Finding square roots of unity

### Claim 13.7

*If there is an efficient algorithm for computing d* ≡_{ϕ(N)} *e*^{−1} *given N and e, then there is an efficient algorithm for computing nontrivial square roots of unity modulo N. (This is the (3) ⇒ (4) step in Theorem 13.4.)*

#### Proof

Suppose we have an algorithm FIND_D that on input (*N,e*) returns the corresponding exponent *d*. Then consider the following algorithm which uses FIND_D as a subroutine:

There are several return statements in this algorithm, and it should be clear that all of them indeed return a square root of unity. Furthermore, the algorithm does eventually return within the main for-loop: writing *ed* − 1 = 2^{s}*t* with *t* odd, the variable *x* takes on the sequence of values

*w*^{t}, *w*^{2t}, *w*^{4t}, . . . , *w*^{2^{s}t},

and the final value of that sequence satisfies *w*^{2^{s}t} = *w*^{ed−1} ≡_{N} 1, since *ed* − 1 is a multiple of *ϕ*(*N*) and *w* ∈ ℤ^{∗}_{N}.

Conditioned on *w* ∈ ℤ^{∗}_{N}, it is possible to show that SqrtUnity(*N*, *e*, *d*) returns a square root of unity chosen *uniformly at random* from among the four possible square roots of unity. So with probability 1/2 the output is a nontrivial square root. We can repeat this basic process *n* times, and eventually encounter a nontrivial square root of unity with probability 1 − 2^{−n}.
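A Python sketch of this idea (hypothetical names; `sqrt_unity` follows the description above, with a known *d* standing in for the output of FIND_D):

```python
import random
from math import gcd

def sqrt_unity(N, e, d):
    # ed ≡ 1 (mod phi(N)), so w^(ed - 1) ≡ 1 for every w in Z*_N.
    # Write ed - 1 = 2^s * t with t odd, then walk w^t, w^(2t), w^(4t), ...
    t = e * d - 1
    while t % 2 == 0:
        t //= 2
    while True:
        w = random.randrange(2, N)
        if gcd(w, N) != 1:
            continue            # (finding such a w would factor N outright)
        x = pow(w, t, N)
        while x != 1:
            prev = x
            x = (x * x) % N
            if x == 1:
                return prev     # the last value before 1: prev^2 ≡ 1 (mod N)
        # w^t was already 1; retry with a fresh w

root = sqrt_unity(323, 5, 173)  # toy modulus 323 = 17 * 19
print(root)
assert (root * root) % 323 == 1
```

The returned root may still be the trivial −1; as the text notes, repeating the process quickly yields a nontrivial one.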

**13.5: Malleability of RSA, and Applications**

We now discuss several surprising problems that turn out to be equivalent to the problem of inverting RSA. The results in this section rely on the following *malleability* property of RSA: suppose you are given *c* = *m*^{e} for an unknown message *m*. Assuming *e* is public, you can easily compute *c* · *x*^{e} = (*mx*)^{e}. In other words, given the RSA function applied to *m*, it is possible to obtain the RSA function applied to a related message *mx*.

#### Inverting RSA on a small subset

Suppose you had a subroutine INVERT(*N*, *e*, *c*) that inverted RSA (*i.e.*, returned *c*^{d} mod *N*), but only for, say, 1% of all possible *c*’s. That is, there exists some subset *G* ⊆ ℤ_{N} with |*G*| ⩾ *N*/100, such that for all *m* ∈ *G* we have *m* = INVERT(*N*, *e*, *m*^{e}).

If you happen to have a value *c* = *m*^{e} for *m* ∉ *G*, then it’s not so clear how useful such a subroutine INVERT could be to you. However, it turns out that the subroutine can be used to invert RSA on *any input whatsoever*. Informally, if inverting RSA is easy on 1% of inputs, then inverting RSA is easy *everywhere*.

Assuming that we have such an algorithm INVERT, then this is how we can use it to invert RSA on any input *c*: repeatedly choose a uniform *r* ∈ ℤ^{∗}_{N} and call INVERT(*N*, *e*, *c* · *r*^{e} % *N*); when a call succeeds with some output *m’*, return *m’* · *r*^{−1} % *N*. Call this algorithm REALLYINVERT.

Suppose the input to REALLYINVERT involves *c* = (*m*^{∗})^{e} for some unknown *m*^{∗}. The goal is to output *m*^{∗}.

In the main loop, *c’* is constructed to be an RSA encoding of *m*^{∗}·*r*. Since *r* is uniformly distributed in ℤ_{N} , so is *m*^{∗} · *r*. So the probability of *m*^{∗} · *r* being in the “good set” *G* is 1%. Furthermore, when it is in the good set, INVERT correctly returns *m*^{∗} · *r*. And in that case, REALLYINVERT outputs the correct answer *m*^{∗}.

Each time through the main loop incurs a 1% chance of successfully inverting the given *c*. Therefore the expected running time of REALLYINVERT is 1/0.01 = 100 times through the main loop.
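A runnable sketch of this random self-reduction, with a mocked-up partial oracle standing in for INVERT (the “good set” here is an arbitrary assumption for illustration):

```python
import random
from math import gcd

p, q, e = 17, 19, 5
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
G = set(range(0, N, 3))             # pretend ~1/3 of Z_N is the good set

def invert(N, e, c):
    # Mock partial oracle: inverts RSA, but refuses outside G.
    m = pow(c, d, N)
    return m if m in G else None

def really_invert(N, e, c):
    while True:
        r = random.randrange(1, N)
        if gcd(r, N) != 1:
            continue
        mr = invert(N, e, (c * pow(r, e, N)) % N)   # c' encodes m * r mod N
        if mr is not None:
            return (mr * pow(r, -1, N)) % N         # strip off r

m_star = 42
assert really_invert(N, e, pow(m_star, e, N)) == m_star
print("recovered", m_star)
```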

#### Determining high-order bits of *m*

Consider the following problem: given *c* = *m*^{e} mod *N* for an unknown *m*, determine whether *m* > *N*/2 or *m* < *N*/2. That is, does *m* live in the top half or bottom half of ℤ_{N}?

We show a surprising result that even this limited amount of information is enough to completely invert RSA. Equivalently, if inverting RSA is hard, then it is not possible to tell whether *m* is in the top half or bottom half of ℤ_{N} given *m ^{e} % N*.

The main idea is that we can do a kind of binary search in ℤ_{N}. Suppose TOPHALF(*N*, *e*, *c*) is a subroutine that can tell whether *c*^{d} mod *N* is in {0, . . . , (*N* − 1)/2} or in {(*N* + 1)/2, . . . , *N* − 1}. Given a candidate *c*, we can call TOPHALF to reduce the possible range of *m* from ℤ_{N} to either the top or bottom half. Now consider the ciphertext *c’* = *c* · 2^{e}, which encodes 2*m*. We can use TOPHALF to determine whether 2*m* is in the top half of ℤ_{N}. If 2*m* is in the top half of ℤ_{N}, then *m* is in the top half of its current range. Using this approach, we can repeatedly query TOPHALF to reduce the search space for *m* by half each time. In only log *N* queries we can uniquely identify *m*.
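The binary search can be made concrete with a mocked-up TOPHALF oracle (it uses *d* internally, but the recovery code only ever sees its one-bit answers; exact interval arithmetic is done with `Fraction` to avoid rounding issues):

```python
import math
from fractions import Fraction

p, q, e = 17, 19, 5
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def tophalf(c):
    return pow(c, d, N) > N // 2        # is the plaintext in the top half?

def recover(c):
    lo, hi = Fraction(0), Fraction(N)   # invariant: lo <= m < hi
    for _ in range(N.bit_length()):     # one oracle query per halving
        mid = (lo + hi) / 2
        if tophalf(c):
            lo = mid                    # plaintext is in top half of range
        else:
            hi = mid
        c = (c * pow(2, e, N)) % N      # plaintext becomes 2*(old) mod N
    return math.ceil(lo)                # interval now pins down one integer

m = 42
assert recover(pow(m, e, N)) == m
print("recovered", recover(pow(m, e, N)))
```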

**Exercises**

13.1: Prove by induction the correctness of the EXTGCD algorithm. That is, whenever (*d,a,b*) = EXTGCD(*x,y*), we have gcd(*x,y*) = *d* = *ax* + *by*. You may use the fact that the original Euclidean algorithm correctly computes the GCD.

13.2: Prove that if *g*^{a} ≡_{n} 1 and *g*^{b} ≡_{n} 1, then *g*^{gcd(a,b)} ≡_{n} 1.

13.3: Prove that gcd(2^{a} − 1,2^{b} − 1) = 2^{gcd(a,b)} − 1.

13.4: Prove that *x*^{a} % *n* = *x*^{a % ϕ(n)} % *n* for all *x* ∈ ℤ^{∗}_{n}. In other words, when working modulo *n*, you can reduce exponents modulo *ϕ*(*n*).

13.5: In this problem we determine the efficiency of Euclid’s GCD algorithm. Since its input is a pair of numbers (*x*, *y*), let’s call *x* + *y* the size of the input. Let *F*_{k} denote the *k*th Fibonacci number, using the indexing convention *F*_{0} = 1; *F*_{1} = 2. Prove that (*F*_{k}, *F*_{k−1}) is the smallest-*size* input on which Euclid’s algorithm makes *k* recursive calls. *Hint:* use induction on *k*.

Note that the *size* of input (*F*_{k}, *F*_{k−1}) is *F*_{k+1}, and recall that *F*_{k+1} ≈ *ϕ*^{k+1}, where *ϕ* ≈ 1.618 . . . is the golden ratio. Thus, for any input of size *N* ∈ [*F*_{k}, *F*_{k+1}), Euclid’s algorithm will make fewer than *k* ⩽ log_{ϕ} *N* recursive calls. In other words, the worst-case number of recursive calls made by Euclid’s algorithm on an input of size *N* is *O*(log *N*), which is linear in the number of bits needed to write such an input.^{6}

^{6}A more involved calculation that incorporates the cost of each division (modulus) operation shows the worst-case overall efficiency of the algorithm to be *O*(log^{2} *N*) — quadratic in the number of bits needed to write the input.

13.6: Consider the following **symmetric-key** encryption scheme with plaintext space ℳ = {0,1}^{λ}. To encrypt a message *m*, we “pad” *m* into a prime number by appending a zero and then random non-zero bytes. We then multiply by the secret key. To decrypt, we divide off the key and then strip away the “padding.”

The idea is that decrypting a ciphertext without knowledge of the secret key requires factoring the product of two large primes, which is a hard problem.

Show an attack breaking CPA-security of the scheme. That is, describe a distinguisher and compute its bias. *Hint:* ask for any two ciphertexts.

13.7: Explain why the RSA encryption exponent *e* must always be an odd number.

13.8: The Chinese Remainder Theorem states that there is always a solution for *x* in the following system of equations, when gcd(*r*, *s*) = 1:

*x* ≡_{r} *u*; *x* ≡_{s} *v*

Give an example *u*, *v*, *r*, *s*, with gcd(*r*, *s*) ≠ 1, for which the equations have no solution. Explain why there is no solution.

13.9: Bob chooses an RSA plaintext *m* ∈ ℤ_{N} and encrypts it under Alice’s public key as *c* ≡_{N} *m*^{e}. To decrypt, Alice first computes *m*_{p} ≡_{p} *c*^{d} and *m*_{q} ≡_{q} *c*^{d}, then uses the CRT conversion to obtain *m* ∈ ℤ_{N}, just as expected. But suppose Alice is using faulty hardware, so that she computes a **wrong value** for *m*_{q}. The rest of the computation happens correctly, and Alice computes the (wrong) result *m̂*. Show that, no matter what *m* is, and no matter what Alice’s computational error was, Bob can factor *N* if he learns *m̂*.

*Hint:* Bob knows *m* and *m̂* satisfying *m* ≡_{p} *m̂* and *m* ≢_{q} *m̂*.

13.10: (a). Show that given an RSA modulus *N* and *ϕ*(*N*), it is possible to factor *N* easily.

*Hint:* you have two equations (involving *ϕ*(*N*) and *N*) and two unknowns (*p* and *q*).

(b). Write a pari function that takes as input an RSA modulus *N* and *ϕ*(*N*) and factors *N*. Use it to factor the following 2048-bit RSA modulus. *Note:* take care that there are no precision issues in how you solve the problem; double-check your factorization!

13.11: True or false: if *x*^{2} ≡_{N} 1 then *x* ∈ ℤ^{∗}_{N}. Prove or give a counterexample.

13.12: Discuss the computational difficulty of the following problem: *Given an integer N, find an element of* ℤ_{N} \ ℤ^{∗}_{N}.

If you can, relate its difficulty to that of other problems we’ve discussed (factoring *N* or inverting RSA).

13.13: (a). Show that it is possible to efficiently compute all four square roots of unity modulo *pq*, given *p* and *q*. *Hint:* CRT!

(b). Implement a pari function that takes distinct primes *p* and *q* as input and returns the four square roots of unity modulo *pq*. Use it to compute the four square roots of unity modulo

13.14: Show that, conditioned on *w* ∈ ℤ^{∗}* _{N}*, the SqrtUnity subroutine outputs a square root of unity chosen uniformly at random from the 4 possible square roots of unity.

*Hint:* use the Chinese Remainder Theorem.

13.15: Suppose *N* is an RSA modulus, and *x*^{2} ≡_{N} *y*^{2}, but *x* ≢_{N} ±*y*. Show that *N* can be efficiently factored if such a pair *x* and *y* are known.

13.16: Why are ± 1 the only square roots of unity modulo *p*, when *p* is an odd prime?

13.17: When *N* is an RSA modulus, why is squaring modulo *N* a 4-to-1 function, but raising to the *e*^{th} power modulo *N* is 1-to-1?

13.18: Implement a pari function that efficiently factors an RSA modulus *N*, given only *N*, *e*, and *d*. Use your function to factor the following 2048-bit RSA modulus. *Note:* the pari function valuation(n,p) returns the largest number *d* such that *p*^{d} | *n*.

13.19: In this problem we’ll see that it’s bad to choose RSA prime factors *p* and *q* too close together.

(a). Let *s* = (*p* − *q*)/2 and *t* = (*p* + *q*)/2. Then *t* is an integer greater than √*N* such that *t*^{2} − *N* is a perfect square. When *p* and *q* are close, *t* is not much larger than √*N*, so by testing successive integers, it is possible to find *t* and *s*, and hence *p* and *q*. Describe the details of this attack and how it works.

(b). Implement a pari function that factors RSA moduli using this approach. Use it to factor the following 2048-bit number (whose two prime factors are guaranteed to be close enough for the factoring approach to work in a reasonable amount of time, but far enough apart that you can’t do the trial-and-error part by hand). What qualifies as “close prime factors” in this problem? How close was *t* to √*N*?

*Hint:* pari has an issquare function. Also, be sure to do exact square roots over the integers, not the reals.