RSA Encryption
RSA is an encryption algorithm, used to securely transmit messages over the internet. It is based on the principle that it is easy to multiply large numbers, but factoring large numbers is very difficult. For example, it is easy to check that 31 and 37 multiply to 1147, but trying to find the factors of 1147 is a much longer process.
Overview
RSA is an example of public-key cryptography, which is illustrated by the following example: Suppose Alice wishes to send Bob a valuable diamond, but the jewel will be stolen if sent unsecured. Both Alice and Bob have a variety of padlocks, but they don't own the same ones, meaning that their keys cannot open the other's locks.
How did Alice send the diamond to Bob?
Solution:
- Bob first sends Alice an unlocked padlock. Note that Bob would give anyone an unlocked padlock, as the only use for one is to send Bob something.
- Alice adds Bob's lock and sends the package to Bob, and
- Bob removes his lock and opens the package.
This example demonstrates the ideas behind public-key cryptography, though the concept is actually slightly different. In public-key cryptography. Alice encrypts her message using Bob's public key, which can only be decoded by Bob's private key.
In RSA, the public key is generated by multiplying two large prime numbers \(p\) and \(q\) together, and the private key is generated through a different process involving \(p\) and \(q\). A user can then distribute his public key \(pq\), and anyone wishing to send the user a message would encrypt their message using the public key. For all practical purposes, even computers cannot factor large numbers into the product of two primes, in the same way that factoring a number like 414863 by hand is virtually impossible. However, multiplying two numbers is much less difficult, so a potential factorization can be verified quickly, as the following multiple-choice problem shows:
When the user receives the encrypted message, they decrypt it using the private key and can read the original text. A more detailed and technical explanation follows in the next section.
The Algorithm
The implementation of RSA makes heavy use of modular arithmetic, Euler's theorem, and Euler's totient function. Notice that each step of the algorithm only involves multiplication, so it is easy for a computer to perform:
- First, the receiver chooses two large prime numbers \(p\) and \(q\). Their product, \(n=pq\), will be half of the public key.
- The receiver calculates \(\phi(pq)=(p-1)(q-1)\) and chooses a number \(e\) relatively prime to \(\phi(pq)\). In practice, \(e\) is often chosen to be \(2^{16}+1=65537\), though it can be as small as \(3\) in some cases. \(e\) will be the other half of the public key.
- The receiver calculates the modular inverse \(d\) of \(e\) modulo \(\phi(n)\). In other words, \(de \equiv 1 \pmod{{\small\phi(n)}}\). \(d\) is the private key.
- The receiver distributes both parts of the public key: \(n\) and \(e\). \(d\) is kept secret.
Now that the public and private keys have been generated, they can be reused as often as wanted. To transmit a message, follow these steps:
First, the sender converts his message into a number \(m\). One common conversion process uses the ASCII alphabet:
A B C D E F G H I J K L M 65 66 67 68 69 70 71 72 73 74 75 76 77 N O P Q R S T U V W X Y Z 78 79 80 81 82 83 84 85 86 87 88 89 90 For example, the message "HELLO" would be encoded as 7269767679. It is important that \(m<n\), as otherwise the message will be lost when taken modulo \(n\), so if \(n\) is smaller than the message, it will be sent in pieces.
- The sender then calculates \(c \equiv m^e \pmod{n}\). \(c\) is the ciphertext, or the encrypted message. Besides the public key, this is the only information an attacker will be able to steal.
- The receiver computes \(c^d \equiv m \pmod n\), thus retrieving the original number \(m\).
- The receiver translates \(m\) back into letters, retrieving the original message.
Note that step 3 makes use of Euler's theorem.
Euler's theorem tells us that \(m^{\phi(n)} \equiv 1 \pmod n\). From the receiver's step 3, we deliberately chose \(d\) so that \(de \equiv 1 \pmod{{\small\phi(n)}}.\) Note that such a \(d\) exists by the definition of \(e\). This means that there exists an integer \(k\) satisfying \(de=k\phi(n)+1\). Then\[m^{de}\equiv m^{k\phi(n)+1}\equiv m^{k\phi(n)}\cdot m\equiv \big(m^{\phi(n)}\big)^k\cdot m\equiv 1^k\cdot m \equiv m \pmod{n}.\ _\square\]
For example, suppose the receiver selected the primes \(p=11\) and \(q=17\), along with \(e=3\).
- The receiver calculates \(n=pq=11 \cdot 17=187\), which is half of the public key.
- The receiver also calculates \(\phi(n)=(p-1)(q-1)=10 \cdot 16=160\). \(e=3\) was also chosen.
- The receiver calculates \(d=107\), since then \(de=321 \equiv 1 \pmod{{\small\phi(n)}}\) \(\big(\)since \(\phi(n)=160).\)
- The receiver distributes his public key: \(n=187\) and \(e=3\).
Now suppose the sender wanted to send the message "HELLO". Since \(n\) is so small, the sender will have to send his message character by character.
- 'H' is 72 in ASCII, so the message text is \(m=72\).
- The sender calculates \(m^e=72^3 \equiv 183 \pmod{187}\), making the ciphertext \(c=183\). Again, this is the only information an attacker can get, since the attacker does not have the private key.
- The receiver calculates \(c^d=183^{107} \equiv 72 \pmod{187}\), thus getting the message of \(m=72\).
- The receiver translates 72 into 'H'.
The rest of the letters are sent in the same way.
Applications and Vulnerabilites
Because of the great difficulty in breaking RSA, it is almost universally used anywhere encryption is required: password exchange, banking, online shopping, and even cable television. RSA is also used to ensure websites are legitimate since only the real website would have its private key. It therefore avoids man-in-the-middle attacks, in which an attacker intercepts a connection and shows the user a convincing fake, almost completely. All in all, a vulnerability in RSA would have catastrophic security consequences, so various attacks have been attempted.
The strength of RSA is measured in key size, which is the number of bits in \(n=pq\). 512-bit (155 digits) RSA is no longer considered secure, as modern brute force attacks can extract private keys in just hours, and a similar attack was able to extract a 768-bit (232 digits) private key in 2010. As of 2016, 1024-bit (309 digits) keys are considered risky, and most newly generated keys are 4096-bit (1234 digits).
One common attack on RSA bypasses the algorithm altogether. A computer can quickly compute the greatest common divisor of two numbers using the Euclidean algorithm, so an attacker can run this algorithm on two public keys. If their greatest common divisor is not 1, then the attacker has found a prime number dividing both keys, therefore breaking two keys at the same time. For example, suppose two public keys were 239149 and 166381. It is difficult to factor either of these two numbers by hand, but the Euclidean algorithm can be done by hand, revealing that the two numbers have a greatest common divisor of 379.
Other attacks are similar, taking advantage of poor random number generation. If \(p\) and \(q\) are too close together, \(e\) is so small that \(m^e<n\), or the private key is too small, then an attacker can efficiently determine the private key.
Another class of attacks focuses on hardware. By manipulating the power levels of a computer and causing power faults, Michigan researchers were able to decode a 1024-bit private key using only standard hardware[1]. Similarly, by studying the sounds a computer made while operating, Israeli researchers were able to extract a 4096-bit private key in under an hour[2].
In 1994, MIT mathematician Peter Shor developed a theoretical algorithm for quantum computers that factors numbers exponentially faster than current algorithms do. However, since only small quantum computers have been built, as of 2016 the algorithm has only managed to factor 16-bit numbers (the largest to date is 56153).
References
[1] University of Michigan. Fault based attack of RSA Authentication. Retrieved January 12th, 2016 from http://web.eecs.umich.edu/~valeria/research/publications/DATE10RSA.pdf
[2] Tel Aviv University. RSA Key Extraction via Low-Bandwidth Acoustic Cryptanalysis. Retrieved January 12th, 2016 from http://www.tau.ac.il/~tromer/papers/acoustic-20131218.pdf