The knapsack cryptosystem is a public-key cryptosystem based on a special case of the classic problem in combinatorics known as the knapsack problem. It was developed by Ralph Merklee and Martin Hellman in 1978 and is one of the earliest public key cryptosystems. This special application of the knapsack problem is also akin to the subset sum problem, where the solution is rather time-consuming to compute as it belongs in the complexity class of NP.
The knapsack cryptosystem relies heavily on the subset sum problem. This problem can be stated as follows:
Subset Sum Problem
Given a knapsack of volume and elements of various volumes , can a subset of these elements be found that will completely fill the knapsack?
Another alternative form of the problem can be stated as follows:
For positive integers and a sum , solve the equation
where the value of can be 0 or 1 for all to .
It is evident that there may be no solution or more than one solution to the subset sum problem, depending on the choice of the sequence and the integer . This is the same as saying that the equation
has no solution, whereas the equation
has 2 distinct solutions, which are
The solution to any randomly chosen knapsack problem is of time complexity NP-hard, meaning that it is not efficient for a computer to solve. That said, all the known algorithms are as time-consuming as conducting an exhaustive search, testing all the possibilities of . This is computationally infeasible when the number of variables, , is greater than .
In order to alleviate the need for the extensive time complexity presented in the subset sum problem, algorithms often attempt to identify special properties within the coefficients, or values, of each item in the knapsack, , that make the problem much easier to solve.
One of these characteristics arises when a sequence of integers is super-increasing, meaning that is greater than the sum of its preceding terms. Namely,
for to . A simple example of a super-increasing sequence is presented in the numbers where
If a particular knapsack problem is based on a super-increasing sequence, it has a unique solution, granted it is solvable at all. If we want to solve the problem
where is a super-increasing sequence of integers, then the equation is solvable if and only if
The solution can be attained by assigning
Then, the remaining terms are obtained by choosing
This assignment produces a computationally easy problem to solve, taking a marginal amount of time compared to the original problem.
For example, for a problem like
the largest coefficient in the equation is assigned first, , and the coefficient is assigned to be . The next coefficient is , and . In this case, the sum of the remaining coefficients is
which cannot fill in the knapsack. Therefore must be included, and now the original problem may be written as
which can be solved quite easily.
The knapsack cryptosystem, also known as the Merklee-Hellman cipher, begins by generating user encryption keys as two knapsacks, one public and one private. The public key is the hard knapsack and the private key is the super-increasing knapsack. Two numbers, one called a multiplier and the other a modulus, are used to convert the super-increasing knapsack into the hard knapsack. These same numbers are used to convert the sum of the subsets of the hard knapsack into the sum of the subsets of the easy knapsack, which is a problem solvable in polynomial time.
- Step 1. A typical user first chooses a super-increasing sequence .Then a modulus and a multiplier such that are selected. This ensures that the congruence has a unique solution, say, .
- Step 2. Then the sequence of integers defined by for all to where is selected. Carrying out this transformation generally destroys the super-increasing property enjoyed by .
- Step 3. The user keeps secret the original sequence and and , but publishes in a public directory. Anyone who wishes to send a message employs this publicly available sequence as the encryption key. The sender begins by converting the plaintext message into a string of 's and 's by using the ASCII alphabet binary representation of digits.
- Step 4. The string is then split into blocks of binary digits. The public encrypting sequence is then used to transform the given plaintext, say, into the sum where the number is the hidden information that the sender transmits over an insecure communication channel.
Since each is either or , the problem of obtaining the plaintext block from is equivalent to solving a difficult knapsack problem. However, the private key can be used to convert this hard knapsack into an easy one, allowing an efficient decryption process.
Knowing the values of and , the recipient computes where
Because was chosen such that , this leaves
must hold. The solution of this super-increasing knapsack problem furnishes a solution to a difficult problem, and the plaintext block is recovered, which is converted back to its alphabetic representation.
For a more thorough explanation of the knapsack cryptosystem, refer to this worked example of the algorithm.
This cipher aroused a great deal of interest at the beginning because it was based on a probably hard problem, but in 1982 Adi Shamir provided a fast algorithm for solving knapsack problems. The main weakness of this system is that the sequence is too special. The system can be made difficult by changing the values of and at regular intervals (iterating) but even that variant was cracked in 1985. Since the knapsack cryptosystem is obsolete, the RSA cipher is now the dominant public key cryptography algorithm.