Symmetric Ciphers
Symmetric ciphers use symmetric algorithms to encrypt and decrypt data. These ciphers are used in symmetric key cryptography. A symmetric algorithm uses the same key to encrypt data as it does to decrypt data. For example, a symmetric algorithm will use key \(k\) to encrypt some plaintext information like a password into a ciphertext. Then, it uses \(k\) again to take that ciphertext and turn it back into the password.
Symmetric ciphers are the opposite of asymmetric ciphers, like those used in public-key cryptography. These ciphers use asymmetric algorithms which use one key to encrypt data and a different key to decrypt ciphers. Typically, those two keys are called public and private keys, as is the case with RSA encryption. The public key is used to encrypt data, and the private key is used to decrypt data.
Symmetric ciphers have many important advantages, like speed. But they lack in other areas like security and key management. Due to these pros, however, there are a number of important symmetric ciphers in production today. The most popular of these is Advanced Encryption Standard (AES). Because of its security concerns, however, it is often used on a single machine for encryption and decryption. This eliminates the need to share the secret key. Symmetric ciphers are a good place to get started when learning cryptography as they were the first widespread systems used in modern computing.
Contents
Overview
Like all forms of cryptography, the general process of symmetric key cryptography is to first encrypt a message. This encryption algorithm will turn any plaintext data into ciphertext, an unreadable code. Then, that ciphertext is transmitted to another party who decrypts it to find the original message.
This process uses some sort of key in the encryption and decryption algorithms. Typically this key is only a series of bits, representing some number. What the key is exactly depends on the encryption being used. For symmetric ciphers, the same key is used in both the encryption and decryption algorithm.. So, the algorithm functions for encryption and decryption look like this:
\[ \textrm{encrypt(plaintext, key)} = \textrm{ciphertext} ,\] \[ \textrm{decrypt(ciphertext, key)} = \textrm{plaintext} .\]
Symmetric key cryptography is much simpler than public-key cryptography. Imagine that you want to exchange secret messages with your online friend. You want to be as careful as possible, but you unfortunately don't know what they look like. You decide to set up the following system.
- You buy a locker at the train station and a padlock with only 1 key.
- When it's your turn to send a message, you go down to station, leave a message, and lock it.
- You pick a time and place to meet with your online friend.
- You give them the key when you see them (however, you can't see their face).
- Your friend can then go to the locker, open it, and read your message.
- To send a message back to you, your friend must go through the same process.
Unfortunately, this protocl is very insecure. Check below for some pros and cons of this cryptographic system.
Simple Symmetric Ciphers
Caeser Cipher
Simple symmetric are the oldest forms of cryptography, dating back to the Caesar cipher, a cipher used by Julius Caesar to communicate in secret. This cipher, a type of substitution cipher, took any message that Caesar might write to someone, and shifted each character in that message by a certain amount. For example, the message "hello"
shifted by a value of 5 would result in "mjqqt"
. This cipher is symmetric because the same key, in this case 5, is used to encrypt and decrypt the message.
Caesar's cipher is especially prone to attacks like frequency analysis. Words and characters in lanaguage are not random. If an attacker intercepts enough messages, they might learn that they letter j
shows up a lot in the cipher text codes. Now the attacker has a clue that j
is Caesar cipher code for a common letter, probably an s
or an a
. Definitely not a z
. Repeating this process enough can break this code.
One-time pad
The one-time pad is another famous symmetric cipher. It's famous for its reported use by KGB and American spies during the Cold War. Let's say one spy wanted to get another spy a message, and for simplicity that message is in binary format. The two spies have already met up beforehand and decided on a key of 10110
for this one message. This key is usually decided at random to heighten security. The first spy, Alice, wants to send the message 01101
to Bob, the second spy. To encrypt Alice's message, she creates a new cipher text that has a 0
if the corresponding bit is the same between the original message and the key. Otherwise, it has a 1
. This is also called an XOR in boolean logic.
Now, Alice's encrypted message is 11011
. To decrypt this message, Bob does the exact same process. Two XOR operations result in a null operation in boolean logic. So,
\[11011 \oplus 10110 = 01101 \]
where \(\oplus\) is the mathematical symbol for XOR.
There is a problem with the one-time pad (apart from other general problems with symmetric ciphers). The key needs to be exactly as long as the message itself. Alice and Bob can get around this issue by simply looping around to the beginning of the key when they reach the end. However, this will make the code much easier to break by opposing spies using brute force or statistical analysis. However, one-time pads are perfectly secret in theory. This means that an attacker cannot know anything about a particular cipher text if they intercept it. In practice, however, it's insecure to distribute and exchange keys.
You and I are trying to get a message across enemy lines, and we don't want it intercepted. Luckily, we decided on an initial secret key of 01011
. Unluckily, I need to get you a message that's a bit longer than that. Decrypt this ciphertext to get my original message 10001011
.
You'll need to wrap around the key in order to solve this one. In other words, when I encoded this message and I got to the \( 6^\text{th}\) bit of my message, I would start back at the first bit of the key.
Types of Symmetric Ciphers
Modern symmetric key ciphers can be one of two types. The block cipher take in \(n\) bits of plaintext and \(n\) bits of key to produce \(n\) bits of ciphertext. The are known as block ciphers because they operate on blocks of \(n \ \textrm{x} \ n\) bits at a time. The block cipher is used in many current symmetric cryptosystems.
There a few important issues with basic block ciphers. They are malleable, which means the message they convey is subject to change by an attacker. If you encrypt a certain message, \(m\), to \(enc(m)\), an attacker can intercept that and change it to \(enc(m_0)\). The attacker then sends it along, and you're recipient decrypts your message to find \(m_0\), not \(m\). For example, an attacker could intercept a message that decrypts to "Meet me at 10PM" and change it to "Meet me at 11PM". Further, messages that are longer than one block long need to be split up into smaller messages, a process known as Electronic Code Book (ECB). Cipher Block Chaining (CBC) fixes some of these issues by introducing some elements of randomness into the process, however it has a terrible effect on performance.
The stream cipher is similar to the one-time pad in that is uses the XOR function on the plaintext with a pseudo-random sequence. So, instead of a secret key that it static, like 01011
from the TIY problem above, it constantly is changing and is always pseudo-random. The input plaintext is encrypted incrementally, often one byte at a time. A random sequence is not possible because distributing it to the receiver is inherently unsafe. So, a pseudo-random sequence is used. These pseudo-random sequence are the outputs of a generator, given an initial seed. A seed is simply a number used to initialize a pseudo-random number generator.
Stream ciphers are also malleable and often generate ciphertexts that are statistically correlated with their input plaintext. This is especially true if the randomness of the algorithm is poorly implemented. Often, sections of the key are thrown out completely. Even with popular stream ciphers like RC4, the few first kilobytes of the key are often discarded.
Advantages and Disadvantages
Choosing between symmetric and asymmetric (often called public-key) cryptography is very important because the choice will have severe impact on the entire system. Symmetric ciphers and systems are beneficial for performance because they operate at around 1000 times faster than a public-key cryptosystem. Plus symmetric ciphers are simpler and easier to implement. So, symmetric systems are used for bulk encryption, especially when security is not as big of as concern.
Some symmetric ciphers can be broken through a brute-force attack, lowering their overall security. If you decided to use Caesar's cipher to encrypt your data, you could be hacked immediately because breaking them with brute is trivial with current computers. Typically these attacks simply try every possible key until the correct one is found. Even with more advanced symmetric ciphers, this is still a problem that must be handled using key changes and randomness.
Symmetric ciphers are disadvantageous because they require a secret channel to exchange the key. This channel is inherently insecure, and so an attacker might steal the key, rendering the system vulnerable. Further, key management becomes an issue in large settings with symmetric keys. Say you have a company with 100 employees. If you used RSA encryption, you would only need 200 total keys (a public and private key for each person). However, if you used symmetric keys, you would need \(\frac{100*99}{2*1} = 4950\)! That's because each pair of people would need their own symmetric key.
Let's analyze our example to see how our system matches up with these pros and cons.
- Alice and Bob have a very simple system that was easy to implement. They only had to make one key (as opposed to 4, a public and private key for each person), and the whole thing was up and running quickly.
- A brute force attack (where the attacker tries every single possible secret key to find the correct one) could break down this system faster than if they needed 4 keys to do this, but that's not the main concern.
- If Alice and Bob wanted to scale their system up to more people, where each pair of people had their own locker for secure communication, they would need to buy \(\frac{n(n-1)}{2}\) keys, where \(n\) is the number of people in the system.
- The biggest concern with this system is the sharing of the key. They need to meet every time to hand it off. They don't know each other by sight, so Alice could give the key to the wrong person. Even if they did know each other by sight, someone could still come and take the key away from them while they are trading it. Both of these attacks (impersonation and man-in-the-middle attacks) happen all the time.
Symmetric Cipher Systems
Data Encryption Standard (DES)
DES is a symmetric system that was once a predominant standard in the 1970s but has since fallen our of favor due to its low security. Its introduction sparked heated debate about the role of standards in cryptography and led to much research and innovation in the field. However, DES is the archetype of block cipher systems, many systems today are based on its design.
DES uses block ciphers. The block ciphers in DES consist of 56 random bits, and 8 more bits are used for error detection. These error detecting bits make DES unmalleable - attackers can't change the cipher on its way to its destination because they might accidentally delete a bit used for error detection, and then receivers would know the data had been attacked. However, the relatively small key size was an issue of debate even in the 1970s. By 1999, DES could be broken in under a day. This was later solved by sequencing multiple DES systems together, called 3DES.
The data is first sent into the system and then cut into two 32-bit halves. Those two halves are sent through the entire system, criss-crossing using what's known as the Feistel system. There are 16 layers in DES, and at each layer, one half of the data goes through the Fiestel function. Once it's finished, it is XORd with the other half of the data. Each layer has its own subkey. The subkey is derived from the main, 56-bit key using a key scheduler.
The Fiestel function, which occurs in every block labeled \(F\) in the diagram to the right, has 3 steps:
- Expansion: The incoming 32-bit block has half of its bits duplicated, making it a 48-bit block.
- Mixing: The new, 48-bit input block is put through an XOR gate with this round's unique subkey.
- Substitution: The mixed, 48-bit block is divided into 8 6-bit pieces. Each of these 8 pieces is put through an S-block which will output only 4-bits using a non-linear-transformation. *
- Permutation: The 32 output bits are then arranged in a specific permutation that ensures that they will be distributed among different S-blocks in the next round.
*This is most important part of security in DES, it helps to avoid simple, algebra-based attacks.
The key scheduler:
- The 56-bit primary key is split into two 28-bit keys. These halves are hereafter treated separately.
- In each round, each half is rotated left or right by either one or two bits (depending on the round).
- 24 bits from the left half are chosen, and 24 from the right are chosen to make a 48-bit subkey.
Because we rotate on each round, each bit is only used in approximately 14 out of the 16 rounds. The key scheduler for encryption and decryption are the exact same except that the subkeys are in reverse order for decryption.
DES was vulnerable to brute force attacks as early as the 1970s, but there were other ways it was weak as well. Differential cryptanalysis, or the study of how changes in inputs can affect output, are very effective at breaking block ciphers and DES in particular. Linear cryptanalysis, which apply affine transformations to a cipher were also widely used.
Advanced Encryption Standard (AES)
AES is similar to DES in that it is symmetric and uses block ciphers. However, it is much more secure than DES and has become the international standard. It is at least 6 times faster than 3DES. Instead of Fiestel functions, AES uses a substitution-permutation network. This network is a series of operations than either replaces input with output bits (substitution) or shuffles the bits (permutation).
It uses 128-bit input plaintext, but it operates on bytes rather than bits. So, the input is represented as 16 bytes (because 128 bits = 16 bytes), arranged in a 4 x 4 matrix. This matrix, called the state, will be modified as the algorithm progresses. AES also operates in rounds, but the number of rounds is variable and is based on the length of the key used. A 128-bit key will run AES for 10 rounds, a 192-bit key for 12 rounds, and a 256-bit key will run for 14 rounds. Similar to DES, each round uses a different key. These subkeys are 128-bits in length and are calculated from the original key.
AES proceeds as follows:
Round 1
a. AddRoundKey
Rounds 2 through (n-1)
a. SubBytes
b. ShiftRows
c. MixColumns
d. AddRoundKey
Round n
a. SubBytes
b. ShiftRows
c. AddRoundKey
The first function, AddRoundKey
, takes the current state (a 16-byte matrix) and XORs it with the key for this particular round. The result is the new state.
SubBytes
is one of the substitution functions of AES. The 16 byte state matrix is substituted using a S-box from the design of the specific AES implementation. This step is very similar to the substitution
step in DES in that it uses non-linearity and an affine transformation to provide security to the system.
ShiftRows
shifts the bytes in each row with respect to each other. Typically, the top row of the state will remain unchanged, the second row will shift left one, the third row will shift left two and the fourth row will shift left 3. This step is done to ensure the columns are not linearly independent, which would turn AES into 4, independent block ciphers.
MixColumns
multiplies each column of the state by an invertible function, a fixed polynomial.
Decryption in AES is the same algorithm as encryption, but in a reverse manner. Decryption, unlike in Fiestel's structure, needs to be implemented separately because the functions are in reverse order, but they are very similar.
Blowfish
Blowfish is another symmetric, block cipher. It was created after DES but before AES. Its block size is 64 bits, and it can use key lengths of anywhere from 32 up to 448 bits. Like DES, it is a 16-round Fiestel cipher. It's S-boxes, unlike in DES, are key-dependent, so they are generated dynamically.
There are 5 subkey arrays in Blowfish, 1 18-entry P-array and 4 256-entry S-boxes. These subkey arrays store the subkeys used in every round of Blowfish. One P-array subkey is used in each round, and the remaining two entries are XOR'd with the final output after the last round. The S-boxes accept 8-bit input and produce 32-bit output. They are initialized using values from the hexadecimal digits of pi. Blowfish's key scheduler then takes the secret key and XORs it with all the P entries in order. Then, a 64-bit all zero block is encrypted with this algorithm and result replaces the first two entries in P. This process continues, with new subkeys, and the result replaces the following two entries in P, and so on. The entire P array (9 pairs of entries) and each of the 4 S-boxes (128 pairs each) will be replaced in this process. A total of 521 runs are needed to generate all the subkeys.
Each of the 16 rounds of Blowfish has 4 operations. In round \(r\):
- XOR the left half of the data with the \(r\)th entry in P.
- Use the output from (1) as the input for Blowfish's F-function\(\star\).
- XOR the output from the F-function with the right half of the data.
- Swap the left half and the right half of the data\(\star\)\(\star\).
\(\star\)The F-function breaks up the 32-bit input into 4 segments and feeds them to the 4 S-boxes, resulting in 4 32-bit outputs. These are added together modulo \(2^{32}\), and XOR'd to produce a 32-bit output.
\(\star\)\(\star\)After the very last round, undo this swap. Instead, XOR the left half of the data with the second to last entry in P, and XOR the right half of the data with the last entry in P.
The following example is how the first round Blowfish would proceed if every subkey was 8 bits instead of 32. For simplicity, the inner working of the S-boxes are not described, and their output is just a given.
P-array = [11010110, ...]
Input Data =
0110011011010010
XOR the left half of the data with the first entry in the P-array.
\(01100110 \oplus 11010110 = 10110000\)
Use the output from step one as the input to the F-function.
The 8-bit result is split into 4, 2-bit sections and fed into the 4 S-boxes. The result is 4 8-bit outputs: 11010110, 11001010, 00010101, 01101010. These are added together and XOR'd in the following way:
\[11010110 + 11001010 \mod 2^8 = 10100000 \textrm{ (adding outputs 1 and 2)},\] \[10100000 \oplus 00010101 = 10110101 \textrm{ (XORing the addition of 1 and 2 with output 3)},\] \[10110101 + 01101010 \mod 2^8 = 00011111 \textrm{ (adding the result of the previous equation with output 4)}.\]
XOR the output from the F-function with the right half of the data.
\[00011111 \oplus 11010010 = 11001101 \textrm{ (This is the new "left" half of the data)}.\]
Swap the left half and the right half of the data.
left:
11001101
right:
11010010
Now the data, after one round of Blowfish, is
1101001011001101
.
Decryption is the exact same as encryption (similar to DES), except for one thing. Instead of progressing through P from its first entry to its last, decryption will progress through P from its last entry to its first.
Twofish
Twofish is the successor to Blowfish and is similar in many ways. It almost beat AES to be the global standard. It is slower than AES when using 128-bit key sizes but slightly faster when using 256-bit key sizes. Twofish and Blowfish have a higher space complexit, meaning they take up more computer memory, than AES, making them less attractive. Due to its increased key sizes, Twofish is an improvement in security over Blowfish.
It uses block sizes of 128 bits and keys sizes of up to 256 bits. It, like DES and Blowfish, uses a Fiestel structure when encrypting and decrypting its messages and ciphers. It also uses pre-computed S-boxes, splitting it private key into two sections and using one to modify the encryption algorithm.
References
- Gothberg, D. Cryptography diagrams. Retrieved July 13, 2016, from https://commons.wikimedia.org/wiki/Category:Cryptography_diagrams
- Crypto, M. Cryptography diagrams. Retrieved July 13, 2016, from https://commons.wikimedia.org/wiki/Category:Cryptography_diagrams