RA: RSA and Modular arithmetic

PDF of Eric’s handwritten notes are here.

This lecture will explain the mathematics behind the RSA cryptosystem.  Let’s begin with a brief review of the definition of modular arithmetic.

Modular Arithmetic:
Here is a simple example to illustrate modular arithmetic:
For x \in \mathbb{Z},  x mod 2 =  least significant bit of x  =

  • 1 if x is odd
  • 0 if x is even

The general definition is the following:
Definition: For integer N \geq 1,x mod N = remainder when divide x by N= r, where

x = qN + r and 0 \leq r < N for integers q, r.

Example: mod 3 has 3 equivalence classes:

  • \dots, -9, -6, -3, 0, 3, 6, 9, \dots
  • \dots, -8, -5, -2, 1, 4, 7, 10, \dots
  • \dots, -7, -4, -1, 2, 5, 8, 11, \dots

Properties: Given x \equiv x' mod N and y \equiv y' mod N, then:

  • x + y \equiv x' + y' \mod N
  • xy \equiv x'y' \mod N

Example: Find 2^{345} mod 31, given 345 = 5 x 69.

2^{345} \equiv (2^5)^{69} \equiv (32)^{69} \equiv 1 \mod 31

Example: input: x, y, N, where x and y are n-digit numbers, and x, y \leq N.
Complexity for the following computations:

  • x + y \mod N:
    • Addition: O(n)
    • Module: if x+y \geq N: output (x+y-N); else: output x + y. O(n)
    • Total time: O(n)
  • xy \mod N:
    • Multiplication: O(n^2)
    • Module: compute xy divided by N and output the remainder.
    • Total time: O(n^2)
  • x^y \mod N:
    • Easy Idea: Compute x^1 \mod N, x^2 \mod N, \dots, x^y \mod N. y rounds in total and y = 2^n, so the running time is O(n^22^n), which is exponential time.
    • Improvement: Compute all x^k \mod N, where k is power of 2 and k \leq y. And then combine the results to get the final answer. \log_2{y} = n rounds of computation in total, and O(n^2) for each round,  and the same for the combining rounds. Thus the total time is O(n^2 \log y) = O(n^3).
    • Example: y = 25 = (11001)_2, then x^y \equiv x^{25} \equiv x^{16}x^8x^2 \mod N. And we need to compute y^k \equiv (y^{\frac{k}{2}})^2 \mod N, where k = 1, 2, 4, 8, 16. We need to do at most n multiplications to get these values and another n multiplications to get the final result. Thus the total time is O(n^3).

Multiplicative Inverse:
Definition: Z is a multiplicative inverse of A \mod N if aZ \equiv 1 \mod N, we can also write it as Z \equiv A^{-1} \mod N.
Theorem: Z exists iff gcd(A, N) = 1, (or in other words, there exists \alpha, \beta \in \mathbb{Z} such that \alpha Z + \beta A = 1).
Example: N = 14, find the inverse of A = 1, 2, 3, \dots, 13

A 1 2 3 4 5 6 7 8 9 10 11 12 13
A^{-1} 1 X 5 X 3 X X X 11 X 9 X 13

To find the inverse of A \mod N, we can first compute gcd(A, N) to verify that gcd(A, N) = 1 .  In addition we can find integers \alpha, \beta such that \alpha A + \beta N = gcd(A,N). The number \alpha here is the inverse of A when gcd(A,N)=1.We can use the Extended Euclid’s Algorithm to find d=gcd(A,B) and also these numbers \alpha, \beta\in Z that \alpha A + \beta N = d. Let’s first review the basic Euclid’s algorithm for finding the gcd(A,B).
Euclid’s Rule and Euclidean Algorithm:
Euclid’s Rule: For x, y \in \mathbb{Z} such that x \geq y > 0, gcd(x, y) = gcd(x \mod y, y).
This follows from the fact that gcd(x,y) = gcd(x-y,y).  Euclid’s rule yields a simple divide and conquer algorithm for computing the gcd.
Pseudocode:
Euclid(x, y): x \geq y \geq 0
Output: (d = gcd(x,y))

  • if y = 0:
    • Return x
  • else:
    • Return Euclid(y, x \mod y)

Running Time:
Claim: If y \leq x, then x \mod y < \frac{x}{2}.Proof:

  • y \leq \frac{x}{2}: x \mod y < y \leq \frac{x}{2}
  • y > \frac{x}{2}: x \mod y = x - y < \frac{x}{2}

Hence every other round the first coordinate decreases by at least half.  Therefore, the running time satisfies: T(n) = T(n-1) + O(n^2), here n is the number of digit of x. Thus Euclid’s Algorithm takes O(n^3) time to compute the gcd(x, y).
We can now look at the extended version of the algorithm which also computes the numbers \alpha,\beta mentioned earlier.
Pseudocode:
Extended-Euclid(x, y): x \geq y \geq 0
Output: (d = gcd(x,y), \alpha, \beta)

  • if y = 0:
    • Return (x, 1, 0)
  • else:
    • (d, \alpha', \beta') = Extended-Euclid(y, x \mod y)
    • Return (d, \beta', \alpha' -  \lfloor \frac{x}{y} \rfloor \beta')

Here if d = gcd(x, y) = 1, then \alpha is the inverse of x \mod y because \alpha x + \beta y = 1 and mod each side by y gives us \alpha \equiv x^{-1} \mod y
Fermat’s Little Theorem and Euler’s Theorem:
Fermat’s Little Theorem:
For prime p, for all a relatively prime to p (so all a where a \neq 0 \mod p), then a^{p-1} \equiv 1 \mod p
Proof:
Let p be a prime and a \neq 0 \mod p.
Let S = \{1, 2, \dots, p-1\}.
Define S' = aS = \{a \mod p, 2a \mod p, \dots, a(p-1) \mod p\}.
E.g., p = 7, a = 3, S = \{1, 2, 3, 4, 5, 6\}, S' = aS = \{3, 6, 2, 5, 1, 4\}.
It turns out that in general the sets S and S' are identical sets, just possibly have the elements in different order.
Claim: S = S'
Proof of Claim: We’ll show that S' has p-1 distinct and non-zero elements, and therefore it contains all elements of S exactly once, so we’ll have that S=S'.
Let’s first show that the elements of S' are distinct.  Assume  i,j \in S such that i \neq j and ai \equiv aj \mod p.
Then since a \neq 0 \mod p, a^{-1} mod p exists.
a^{-1}ai \equiv a^{-1}aj \mod p \quad \Rightarrow \quad i \equiv j \mod p
And this implies i = j, which is a contradiction.
Similarly, if ai \equiv 0 \mod p, then i \equiv 0 \mod p.
Thus S' has the same p-1 elements as S, which gives us S = S'. \square
Using this claim, we can multiply together the elements of S and we’ll get the same result as when we multiply the elements of S', hence:
(1)(2)(3)\dots(p-1) \equiv (a1)(a2)(a3)\dots(a(p-1))
Since p is prime, so gcd(i, p) = 1 for i = 1, 2, \dots, p-1.
Thus 1 \equiv a^{p-1} \mod p so we’re left with 1\equiv a^{p-1} \bmod p, and this completes the proof of Fermat’s Little Theorem. \square
We’ll use the following generalization of Fermat’s little theorem:
Euler’s Theorem (generalization of Fermat’s Little Theorem):
For any N, given any a relatively prime to N (So gcd(a,N)=1)
a^{\phi(N)} = 1 \mod N where \phi(N)  = # integers b \in {1,...,N} that are relatively prime to N  =  |\{b:b \in {1,...,N}\mid gcd(b, N)=1\}|
For prime p, \phi(p) = p-1 and hence we obtain Fermat’s little theorem in this case.
For primes p,q, let N = pq, then \phi(N) = (p-1)(q-1).  Why?  Well let’s look at the N numbers between 1 and N and count how many are not relatively prime to N.  In particular, which of these N numbers have a common factor with N=pq?  There are q multiples of p, namely p, 2p, \dots, qp, and for each of these q numbers p is a common factor of this number and also of N.  Similarly there are p multiples of q, namely q, 2q, \dots, pq.  And we counted pq=qp twice.  Thus there are q+p-1 numbers between 1 and N which are not relatively prime to N and the others are relatively prime.  Therefore, \phi(N) = pq - p - q + 1 = (p-1)(q-1).
So for a where gcd(a, N) = 1, then a^{(p-1)(q-1)} = 1.  This is the key result for the RSA protocol.  Let us explore how we use it.
Back to Fermat’s Little Theorem
Take b, c where bc \equiv 1 \mod p-1.
So b \equiv c^{-1} \mod (p-1), which means bc = 1+k(p-1) for some integer k.   Therefore, a^{bc} \equiv (a)(a^{(p-1)})^k \equiv a \mod p
Looking at Euler’s Theorem:
For primes p, q, let N = pq.
Take d,e where de \equiv  1 \mod (p-1)(q-1).  Thus, de = 1+k(p-1)(q-1) for some integer k.  Therefore, for a relatively prime to N, a^{de} \equiv (a) (a^{(p-1)(q-1)})^k \equiv a \mod N.
This important property is used in RSA. Someone can find large primes p, q and compute d,e. They keep d but give out the public N, e. Anyone who wants to send a secret message can encrypt the original message m by m^e \mod N to get the encrypted message y. When receiving that y, they  can recover the original message m by decrypting using y^e \equiv m^{de} \equiv m \mod N. We’ll talk about the details of the protocol in a moment.
Cryptography
Alice has a message m that she wants to send to Bob, but Eve can see the message sent.
m \rightarrow encoder \rightarrow   (e(m)) \rightarrow decoder \rightarrow  m=d(e(m))
Eve sees e(m)
Private-key scheme (One-time pad):
Alice & Bob meet beforehand and decide on a n-bit string r.
Then Alice sends:
e(m) = m \oplus r. Here \oplus is Exclusive OR.
Bob decrypts:
d(e(m)) = e(m) \oplus r = m
How should they pick r?
Choose a random string, then e(m) is also a random string.
Problem:
1) need to meet or have a secure, private computation beforehand
2) can only use it once.
E.g., suppose Alice sends m_1 and m_2 using same r, then Eve can see
(m_1  \oplus r) \oplus (m_2  \oplus r) = m_1  \oplus m_2
which might have some useful info, e.g. if one of the has a string of 0’s then you see the other one.
Public-key ‘crypotgraphy’:
Bob publishes a public key: anyone can send a message to Bob uses his public key to encrypt, while Bob uses his private, secret key to decrypt the received message.
Protocol:
1) Bob picks 2 n-bit random primes p, q
-How?
Choose random n-bit numbers and check if they are prime (we’ll see how to do that next class). It has prob \approx \frac{1}{n} of being prime.
2) Bob chooses an e relatively prime to (p-1)(q-1)
He does this by trying e=3,5,7,11. And he also computes N = pq.
3) Bob publishes his public key as (N,e) where N = pq. (he doesn’t reveal p or q, just pq)
4) Bob computes his private key
d \equiv e^{-1} \mod (p-1)(q-1)
He does this using the extended Euclid algorithm.
Alice:
to send the message m to Bob
1) She looks up his public key (N, e)
2) She computes y = m^e \mod N using fast modular exponentiation algorithm that we saw.
3) She sends y to Bob
Bob:
1) He receives y and decrypts using m = y^d  \mod N using fast modular exponentiation.
Key assumption:
Given N, e, y, it is computationally difficult to determine m.
Natural way is to factor N to get p, q then one can get (p-1)(q-1).
We saw before that:
Since de \equiv 1 \mod (p-1)(q-1) we have that de = 1 + k(p-1)(q-1) for some integer k.   Moreover, if m is relatively prime to N then by Euler’s theorem m^{(p-1)(q-1)} \equiv 1 \mod N.  Therefore we have the following:
y^d \equiv (m^e)^d \equiv m^{ed} \equiv m^{1+k(p-1)(q-1)} \equiv m \times (m^{(p-1)(q-1)})^k \equiv m \mod N
The remaining case is when m has gcd(m, N) > 1.  This case requires the Chinese remainder theorem, which we have not covered so we skip this case but the identity m^{ed} \equiv m \mod N still holds true when gcd (m,N)>1.  (The proof using the Chinese remainder theorem is not difficult, here’s a good explanation that I found.)
This gives us y^d \equiv m \mod N, so RSA works.
RSA = Rivest-Shamir-Adelman published in 1977.
Clifford Cocks – working for UK intelligence agency described a similar system in ’73.
– Note if e=3 (or is small), one needs to make sure m^e > N otherwise mod N isn’t doing anything and to decrypt one can just take y^{\frac{1}{3}}
So often “pad” m.
– If send same m to e or more people all using the same e then can decrypt (see HW problems).