RSA Cryptosystem: Understanding Modular Arithmetic and Factoring for Encryption

These notes introduce the RSA cryptosystem, focusing on modular arithmetic and factoring. They cover modular exponentiation, modular division, and Euclid's algorithm for the greatest common divisor, and then discuss how the decryption exponent d is selected and why the RSA system is believed to be secure.

CS 161 Computer Security

Fall 2005 Joseph/Tygar/Vazirani/Wagner Notes 10

1 One-way function

A one-way function is a fundamental notion in cryptography. It is a function $f$ on $n$ bits such that given $x$ it is easy to compute $f(x)$, but given $f(x)$ it is hard to recover $x$ (or any other preimage of $f(x)$). One of the fundamental sources of one-way functions is the remarkable contrast between multiplication, which is fast, and factoring, for which no polynomial-time algorithm is known. The simplest procedures for factoring a number require an enormous effort if that number is large. Given a number $N$, one can try dividing it by $1, 2, \ldots, N-1$ in turn, returning all the factors that emerge. This algorithm requires $N-1$ steps. If $N$ is in binary representation, as is customary, then its length is $n = \lceil \log_2 N \rceil$ bits, which means that the running time is proportional to $2^n$, exponential in the size of the input. One clever simplification is to restrict the possible candidates to just $2, 3, \ldots, \sqrt{N}$, and for each factor $f$ found in this shortened list, to also note the corresponding factor $N/f$. As justification, witness that if $N = ab$ for some numbers $a$ and $b$, then at most one of these numbers can be more than $\sqrt{N}$. The modified procedure requires only $\sqrt{N}$ steps, which is proportional to $2^{n/2}$ but is still exponential. Factoring is one of the most intensely studied problems by algorithmists and number theorists. The best algorithms for this problem take $2^{c\, n^{1/3} \log^{2/3} n}$ steps. At the time of writing (Fall 2005), the record is the factoring of RSA576, a 576 bit challenge by RSA Inc. The factoring of 1024 bit numbers is well beyond the capability of current algorithms.

The security of the RSA public key cryptosystem is based on this stark contrast between the hardness of factoring and the ease of multiplication.
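The square-root shortcut described in this section can be sketched in a few lines of Python. This is only an illustration under our own naming (the function `trial_division_factors` does not appear in the notes), and it is practical only for small $N$.

```python
from math import isqrt
from typing import List

def trial_division_factors(N: int) -> List[int]:
    """Return the divisors of N other than 1 and N, testing only 2..floor(sqrt(N))."""
    factors = set()
    for f in range(2, isqrt(N) + 1):
        if N % f == 0:
            factors.add(f)       # the small factor found in the shortened list
            factors.add(N // f)  # the corresponding factor N / f
    return sorted(factors)

print(trial_division_factors(91))  # [7, 13]
print(trial_division_factors(97))  # [] because 97 is prime
```

The loop runs about $\sqrt{N}$ times, so doubling the bit length of $N$ squares the work, which is the exponential behaviour the text describes.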

2 Outline of RSA

In the RSA cryptosystem, each user selects a public key $(N, e)$, where $N$ is a product of two large primes $P$ and $Q$, and $e$ is the encryption exponent (usually $e = 3$). $P$ and $Q$ are unknown to the rest of the world, and are used by the owner of the key (say Alice) to compute the private key $(N, d)$. Even though $d$ is uniquely defined by the public key $(N, e)$, actually recovering $d$ from $(N, e)$ is as hard as factoring $N$; that is, given $d$ there is an efficient algorithm to recover $P$ and $Q$. The encryption function is a permutation on $\{0, 1, \ldots, N-1\}$. It is given by $E(m) = m^e \bmod N$. The decryption function is $D(c) = c^d \bmod N$, with the property that $D(E(m)) = m$; that is, for every $m$, $m^{ed} \equiv m \pmod{N}$. To establish these properties and understand how to choose $d$ and $e$, we must review modular arithmetic.

Before we do that, let us make some observations about RSA. First, what makes public key cryptography counter-intuitive is the seeming symmetry between the recipient of the message, Alice, and the eavesdropper, Eve. After all, the ciphertext $m^e \bmod N$ together with the public key $(N, e)$ uniquely specifies the plaintext $m$. In principle one could try computing $x^e \bmod N$ for all $0 \le x \le N-1$ until one hits upon the ciphertext. However, this is prohibitively expensive. RSA breaks the symmetry between Alice and Eve because RSA encryption is actually a trapdoor function: it is easy to compute and hard to invert, unless you have knowledge of $d$ (the hidden trapdoor), in which case it is easy to invert.

Secondly, public key encryption schemes including RSA are substantially slower than symmetric-key encryption algorithms such as DES and AES. For this reason, public key encryption is typically used to establish private session keys between two parties who then communicate using a symmetric encryption scheme. Thus public key encryption is used to solve the key distribution problem in symmetric encryption schemes, where if $n$ people wish to communicate it is necessary to establish $\binom{n}{2}$ keys. For a public key scheme they only need $n$ keys.
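To get a feel for the gap between $\binom{n}{2}$ and $n$, the following short Python sketch tabulates both counts. The function names are ours, chosen only for illustration.

```python
from math import comb

def symmetric_keys_needed(n: int) -> int:
    """One shared secret per unordered pair of users: C(n, 2) = n(n-1)/2."""
    return comb(n, 2)

def public_keys_needed(n: int) -> int:
    """One public/private key pair per user."""
    return n

for n in (10, 100, 1000):
    print(n, symmetric_keys_needed(n), public_keys_needed(n))
```

For 100 users this is 4950 pairwise keys against 100 key pairs.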

3 Algorithms for modular arithmetic

We start by considering two number-theoretic problems, modular exponentiation and greatest common divisor, for which the most obvious algorithms take exponentially long, but which can be solved in polynomial time with some ingenuity. The choice of algorithm makes all the difference.

3.1 Simple modular arithmetic

Two $n$-bit integers can be added, multiplied, or divided by mimicking the usual manual techniques taught in elementary school. For addition, the resulting algorithm takes a constant amount of time to produce each bit of the answer, since each such step only requires dealing with three bits (two input bits and a carry), and anything involving a constant number of bits takes $O(1)$ time. The overall time is therefore $O(n)$, or linear. Similarly, multiplication and division take $O(n^2)$, or quadratic, time.

Modular arithmetic can be implemented naturally using these primitives. To compute $a \bmod s$, simply return the remainder upon dividing $a$ by $s$. By reducing all inputs and answers modulo $s$, modular addition, subtraction, and multiplication are easily performed, and all take time $O(\log^2 s)$ since the numbers involved never grow beyond $s$ and therefore have size at most $\lceil \log_2 s \rceil$ bits.

3.2 Modular exponentiation

Modular exponentiation consists of computing $a^b \bmod s$. One way to do this is to repeatedly multiply by $a$ modulo $s$, generating the sequence of intermediate products $a^i \bmod s$, $i = 1, \ldots, b$. They each take $O(\log^2 s)$ time to compute, and so the overall running time to compute the $b-1$ products is $O(b \log^2 s)$, exponential in the size of $b$.

A repeated squaring procedure for modular exponentiation.

function ModExp1(a, b, s)
Input: A modulus s, a positive integer a < s, and a positive exponent b.
       Let b_{n−1} ··· b_1 b_0 be the binary form of b, where n = ⌊log_2 b⌋ + 1.
Output: a^b mod s

// Compute the powers p_i = a^(2^i) mod s.
p_0 = a mod s
for i = 1 to n − 1
    p_i = (p_{i−1})^2 mod s

// Multiply together a subset of these powers.
r = 1
for i = 0 to n − 1
    if b_i = 1 then r = r · p_i mod s
return r
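A runnable sketch of repeated squaring in Python follows. It is one possible rendering rather than the notes' own code: it folds the two loops of the pseudocode into a single pass over the bits of the exponent, and the name `mod_exp` is ours.

```python
def mod_exp(a: int, b: int, s: int) -> int:
    """Compute a**b mod s by repeated squaring, for s > 0 and b >= 0."""
    if s <= 0 or b < 0:
        raise ValueError("expects s > 0 and b >= 0")
    result = 1 % s   # 1 % s is 0 when s == 1
    power = a % s    # invariant: power == a**(2**i) mod s at bit i
    while b > 0:
        if b & 1:    # bit i of the exponent is set, so include this power
            result = (result * power) % s
        power = (power * power) % s
        b >>= 1
    return result

print(mod_exp(3, 200, 13))  # 9
```

The loop runs once per bit of $b$, so the number of modular multiplications is proportional to $\log b$ rather than to $b$. Python's built-in `pow(a, b, s)` computes the same value.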

Lemma: If $a > b$ then $\gcd(a, b) = \gcd(a \bmod b, b)$.

Proof: Actually Euclid noticed the slightly simpler rule $\gcd(a, b) = \gcd(a - b, b)$, from which the one above can be derived by the repeated subtraction of $b$ from $a$.

Why is $\gcd(a, b) = \gcd(a - b, b)$? Well, any integer which divides both $a$ and $b$ must also divide both $a - b$ and $b$, so $\gcd(a, b) \le \gcd(a - b, b)$. And similarly, any integer which divides both $a - b$ and $b$ must also divide both $a$ and $b$, so $\gcd(a, b) \ge \gcd(a - b, b)$. $\Box$

How long does the Euclid algorithm (below) take? We will see that on each successive recursive call one of its arguments gets reduced to at most half its value while the other remains unchanged. Therefore there can be at most $\lfloor \log_2 a \rfloor + \lfloor \log_2 b \rfloor + 1$ recursive calls before one of the arguments gets reduced to zero. The following lemma summarizes this key observation.

Lemma: If $a \ge b$ then $a \bmod b \le a/2$.

Proof: Consider two possible ranges for the value of $b$. Either $b \le a/2$, in which case $a \bmod b < b \le a/2$, or $b > a/2$, in which case $a \bmod b = a - b \le a/2$. $\Box$

For an input size of $n = \lceil \log_2 a \rceil + \lceil \log_2 b \rceil$, there are at most $n + 1$ recursive calls. Each call performs one division, which takes $O(n^2)$ time, and so the total running time is $O(n^3)$.

Euclid’s algorithm for finding the greatest common divisor of two numbers.

function Euclid(a, b)
Input: Two integers a, b with a ≥ b ≥ 0
Output: gcd(a, b)

if b = 0 then return a
return Euclid(b, a mod b)
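The recursion translates almost word for word into Python. The sketch below uses our own lower-case name `euclid` and assumes non-negative integer inputs.

```python
def euclid(a: int, b: int) -> int:
    """Greatest common divisor of integers a >= b >= 0, via gcd(a, b) = gcd(b, a mod b)."""
    if b == 0:
        return a
    return euclid(b, a % b)

# The calls made are (1035, 759) -> (759, 276) -> (276, 207) -> (207, 69) -> (69, 0).
print(euclid(1035, 759))  # 69
```

The standard library function `math.gcd` returns the same result and is the natural choice outside a teaching setting.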

3.4 An extension of Euclid’s algorithm

It turns out that $\gcd(a, b)$ can always be expressed as an integer linear combination of $a$ and $b$, that is, in the form $ax + by$ where $x, y$ are integers. It is not immediately obvious how one would calculate such $x, y$, even given exponential time, but in fact they can be found quickly by incorporating the following observation into the recursion in Euclid’s algorithm.

Lemma: If $\gcd(a \bmod b, b)$ is an integer linear combination of $a \bmod b$ and $b$, then $\gcd(a, b)$ is an integer linear combination of $a$ and $b$.

Proof: Write $a$ in the form $bq + r$, where $r = a \bmod b$. By hypothesis, there are some integers $x', y'$ for which $\gcd(a \bmod b, b) = bx' + ry'$. Let $x = y'$ and $y = x' - qy'$; these are also integers and

$$ax + by = ay' + b(x' - qy') = bx' + (a - bq)y' = bx' + ry' = \gcd(a \bmod b, b) = \gcd(a, b),$$

where the final equality is simply Euclid’s rule. $\Box$

The Extended-Euclid algorithm given below directly implements this inductive reasoning. One way to prove its correctness in detail (you should try this) is to use induction on $\max(a, b)$.

Theorem: For any positive integers $a, b$, the Extended-Euclid algorithm returns integers $x, y$ such that $ax + by = \gcd(a, b)$.

A simple extension of Euclid’s algorithm.

function Extended-Euclid(a, b)
Input: Two integers a, b with a ≥ b ≥ 0
Output: Integers x, y, d such that d = gcd(a, b) and ax + by = d

if b = 0 then return (1, 0, a)
by division find q, r such that a = bq + r
(x′, y′, d) = Extended-Euclid(b, r)
return (x = y′, y = x′ − qy′, d)
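As a sanity check on the pseudocode, here is a direct Python transcription. The name `extended_euclid` and the variable names `x1`, `y1` (standing in for x′, y′) are our own choices.

```python
from typing import Tuple

def extended_euclid(a: int, b: int) -> Tuple[int, int, int]:
    """Return (x, y, d) with d = gcd(a, b) and a*x + b*y == d, for a >= b >= 0."""
    if b == 0:
        return 1, 0, a
    q, r = divmod(a, b)                # a = b*q + r
    x1, y1, d = extended_euclid(b, r)  # b*x1 + r*y1 == d
    return y1, x1 - q * y1, d

x, y, d = extended_euclid(1035, 759)
assert 1035 * x + 759 * y == d == 69
```

The assertion checks the defining property $ax + by = \gcd(a, b)$ rather than particular values of $x$ and $y$, since many pairs satisfy it.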

This extension of Euclid’s algorithm is the key to dividing in the modular world. In real arithmetic every $a \ne 0$ has a multiplicative inverse $1/a$, and dividing is the same as multiplying by this inverse. In modular arithmetic, $a$ has a multiplicative inverse mod $s$ iff $\gcd(a, s) = 1$. By the extended Euclid algorithm, if $\gcd(a, s) = 1$, then there are integers $x, y$ such that $ax + sy = 1$. Reducing both sides of this sum modulo $s$, we get $ax \equiv 1 \pmod{s}$. In short: $x$ is the multiplicative inverse of $a$ modulo $s$, and we have a quick way of finding it.

Corollary: If $a$ is relatively prime to $s > a$, then $a$ has a multiplicative inverse modulo $s$, and this inverse can be found in time $O(\log^3 s)$.
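The corollary can be turned into a small routine. The sketch below uses an iterative variant of the extended Euclid algorithm, not the recursive form given in these notes, and tracks only the coefficient of $a$ because the coefficient of $s$ vanishes modulo $s$. The name `mod_inverse` is ours.

```python
def mod_inverse(a: int, s: int) -> int:
    """Return x in [0, s) with a*x ≡ 1 (mod s); raise ValueError if gcd(a, s) != 1."""
    old_r, r = a % s, s
    old_x, x = 1, 0  # invariants: old_r ≡ a*old_x and r ≡ a*x (mod s)
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
    if old_r != 1:   # old_r now holds gcd(a, s)
        raise ValueError("no inverse: gcd(a, s) != 1")
    return old_x % s

print(mod_inverse(3, 20))  # 7, since 3 * 7 = 21 ≡ 1 (mod 20)
```

Reducing the final coefficient modulo $s$ gives a representative in the range $0$ to $s - 1$, which is convenient when the inverse is later used as an exponent.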

We now return to the question of how the decryption exponent $d$ is selected. It turns out that $d$ is the multiplicative inverse of the encryption exponent $e$ modulo $(P-1)(Q-1)$. By the above discussion, such a $d$ exists and can be efficiently computed iff $\gcd(e, (P-1)(Q-1)) = 1$. Since for efficiency reasons we wish to choose $e = 3$, it follows that we must pick $P, Q$ each congruent to 2 mod 3. The exponent $d$ is then computed using the Extended Euclid algorithm.
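A toy calculation makes the condition on $P$ and $Q$ concrete. The sketch below assumes Python 3.8 or later, where the built-in `pow(e, -1, m)` returns a modular inverse; the function name `decryption_exponent` and the tiny primes are ours and are far too small for real use.

```python
from math import gcd

def decryption_exponent(P: int, Q: int, e: int = 3) -> int:
    """Return d with e*d ≡ 1 (mod (P-1)(Q-1)); raise ValueError if none exists."""
    phi = (P - 1) * (Q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e is not invertible modulo (P-1)(Q-1)")
    return pow(e, -1, phi)  # modular inverse, Python 3.8+

# 11 and 17 are both congruent to 2 mod 3, so e = 3 is invertible.
print(decryption_exponent(11, 17))  # 107, since 3 * 107 = 321 = 2 * 160 + 1

# 7 is congruent to 1 mod 3, so 3 divides P - 1 and no d exists.
try:
    decryption_exponent(7, 17)
except ValueError as err:
    print("rejected:", err)
```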

4 The Trapdoor

Let us now turn to the task of justifying the fact that RSA encryption is a permutation of the numbers modulo $N$, and the fact that for all $0 \le x \le N-1$, $(x^e)^d = x^{ed} \equiv x \pmod{N}$ whenever $ed \equiv 1 \pmod{(P-1)(Q-1)}$.

We need the following theorem of Fermat’s:

Theorem [Fermat’s Little Theorem]: If $p$ is prime then for every $1 \le a < p$,

$$a^{p-1} \equiv 1 \pmod{p}.$$

Proof: Let $S$ be the set consisting of the numbers $1, 2, \ldots, p-1$ modulo $p$. Now multiply each of them by $a$ modulo $p$ and call the resulting set $T$. We will show that $S$ and $T$ are identical: they have exactly the same elements. Therefore the products of their elements, $(p-1)! \bmod p$ and $a^{p-1} \cdot (p-1)! \bmod p$ respectively, are also identical. Dividing out by $(p-1)!$ completes the proof.

First we make sure $a$ is invertible modulo $p$. From $1 \le a < p$ we know $a$ and $p$ are relatively prime, and by Corollary 3.4 $a$ has a multiplicative inverse modulo $p$. Call it $a^{-1}$.

Next we show that $S$ and $T$ are identical because the elements $a \cdot i$, $i = 1, \ldots, p-1$, are all distinct and non-zero modulo $p$. If $a \cdot i \equiv a \cdot j \pmod{p}$, then multiplying both sides by $a^{-1}$ gives $i \equiv j \pmod{p}$, and hence $i = j$ since both lie between 1 and $p-1$. By similar reasoning, none of the numbers in $T$ are congruent to zero modulo $p$.
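The key step of the proof, that multiplying by $a$ merely rearranges $S$, is easy to check numerically for a small prime. The following Python sketch does so for $p = 13$; the helper name `fermat_sets` is ours.

```python
def fermat_sets(a: int, p: int):
    """Return S = {1, ..., p-1} and T = {a*i mod p : i in S}."""
    S = set(range(1, p))
    T = {(a * i) % p for i in S}
    return S, T

p = 13
for a in range(1, p):
    S, T = fermat_sets(a, p)
    assert S == T                 # multiplication by a permutes S
    assert pow(a, p - 1, p) == 1  # Fermat's Little Theorem

# With a composite modulus the sets can differ: here T contains 0.
S, T = fermat_sets(2, 6)
print(sorted(S), sorted(T))  # [1, 2, 3, 4, 5] [0, 2, 4]
```

The composite case shows why primality matters: 2 has no inverse modulo 6, so distinct elements of $S$ can collide in $T$.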

Bob chooses his public and private keys.

  • He starts by picking two large ($n$-bit) random primes $P$ and $Q$.
  • His public key is $(N, e)$ where $N = PQ$ and $e$ is a $2n$-bit number relatively prime to $(P-1)(Q-1)$. A common choice is $e = 3$ because it permits fast encoding.
  • His secret key is $d$, the inverse of $e$ modulo $(P-1)(Q-1)$, computed using the Extended-Euclid algorithm.

Alice wishes to send message $x$ to Bob.

  • She looks up his public key $(N, e)$ and sends him $y = x^e \bmod N$, computed using an efficient modular exponentiation algorithm.
  • He decodes the message by computing $y^d \bmod N$.

Figure 1: RSA.
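The protocol of Figure 1 can be exercised end to end with toy numbers. The sketch below is illustrative only: the function names are ours, the primes are tiny, and a real deployment would also need large random primes and message padding, which these notes do not cover. It relies on Python's built-in `pow`, including the modular inverse form available from Python 3.8.

```python
from math import gcd

def make_keys(P: int, Q: int, e: int = 3):
    """Bob's key setup: return the public key (N, e) and the secret exponent d."""
    N, phi = P * Q, (P - 1) * (Q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (P-1)(Q-1)")
    d = pow(e, -1, phi)  # inverse of e modulo (P-1)(Q-1)
    return (N, e), d

def encrypt(x: int, public_key) -> int:
    """Alice computes y = x^e mod N from Bob's public key."""
    N, e = public_key
    return pow(x, e, N)

def decrypt(y: int, N: int, d: int) -> int:
    """Bob computes y^d mod N with his secret exponent."""
    return pow(y, d, N)

public_key, d = make_keys(11, 17)  # N = 187, d = 107
N, e = public_key
y = encrypt(42, public_key)
assert decrypt(y, N, d) == 42
# Decryption undoes encryption for every message in {0, ..., N-1}.
assert all(decrypt(encrypt(x, public_key), N, d) == x for x in range(N))
```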

6 RSA

Let us review the resulting RSA cryptosystem, which is named after its inventors Rivest, Shamir, and Adleman. Anybody can send a message to anybody else using publicly available information, rather like addresses or phone numbers. Each person has a public key known to the whole world, and a secret key known only to himself. When Alice wants to send message $x$ to Bob, she encodes it using his public key $(N, e)$. He decrypts it using his secret key $d$ to retrieve $x$. Eve is welcome to see as many encrypted messages for Bob as she likes, but she will not be able to decode them, under certain simple assumptions.

There are two procedures involved in RSA: the initial choice of keys, and the message-sending protocol. These are described in Figure 1 above. Both make heavy use of the efficient number-theoretic primitives we have developed.

The security of RSA hinges upon a simple assumption:

Given $N$, $e$, and $y = x^e \bmod N$, it is computationally intractable to determine $x$.

How might Eve try to guess $x$? She could experiment with all possible values of $x$, each time checking whether $x^e \equiv y \pmod{N}$, but this would take exponential time. Or she could try to factor $N$ to retrieve $P$ and $Q$, and then figure out $d$ by inverting $e$ modulo $(P-1)(Q-1)$, but we believe factoring to be hard. Moreover, it can be shown that guessing $d$ is as hard as factoring $N$: once $d$ is known, it is easy to recover $P$ and $Q$. This intractability is normally a source of dismay; the insight of RSA lies in using it to advantage.
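Both of Eve's strategies are easy to write down, and on a toy modulus they succeed instantly. The Python sketch below uses our own function names and plain trial division for the factoring step; its point is that each attack takes a number of steps that grows exponentially with the bit length of $N$, so neither scales to real key sizes.

```python
from math import isqrt

def brute_force_plaintext(N: int, e: int, y: int) -> int:
    """Try every x in 0..N-1 until x^e ≡ y (mod N); up to N steps."""
    for x in range(N):
        if pow(x, e, N) == y:
            return x
    raise ValueError("no preimage found")

def factor_then_decrypt(N: int, e: int, y: int) -> int:
    """Factor N = P*Q by trial division (about sqrt(N) steps), rebuild d, decrypt y."""
    P = next(f for f in range(2, isqrt(N) + 1) if N % f == 0)  # assumes N is composite
    Q = N // P
    d = pow(e, -1, (P - 1) * (Q - 1))  # modular inverse, Python 3.8+
    return pow(y, d, N)

N, e = 187, 3      # toy public key with N = 11 * 17
y = pow(42, e, N)  # ciphertext Eve intercepts
assert brute_force_plaintext(N, e, y) == 42
assert factor_then_decrypt(N, e, y) == 42
```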