×

# Genetic Algorithm for Cryptarithms

Among passionate puzzle solvers and recreational mathematicians, Cryptarithms are already pretty well-known. A Cryptarithmetic Problem is an equation consisting of words in which letters represent unique digits. To Solve a Cryptarithmetic Problem refers to the process of finiding out the relation between the letters and digits such that the substitution of the letters with their corresponding digits results a mathematically consistent equation.

A famous example of a cryptarithm, due to Henry E. Dudeney, is $\overline{\text{SEND}} + \overline{\text{MORE}} = \overline{\text{MONEY}}$

The solution to this puzzle is

$\left\{ (O \to 0),\\ (M \to 1),\\ (Y \to 2),\\ (E \to 5),\\ (N \to 6),\\ (D \to 7), \\ (R \to 8), \\ (S \to 9) \right\}$

Solution of cryptarithms is an NP-Complete problem.

Though the obvious way to solve the problem would be a depth first search, I found an implementation of Parallel Genetic Algorithms in this paper for solving them:

A. Encoding Individuals

The first part of solving a problem using GA is to define a way to encode the individuals into chromosomes. Assume that we are working with decimal numbers, then we can use an array of characters with length 10 which has indices 0 through 9; whenever we need to assign a digit to a letter we simply put that letter to the cell which has that index of the array. Since each chromosome represents a solution to the problem, changing the place of letters with each other or with empty cells may lead us to a different solution.

B. Fitness Function

Fitness function evaluates the quality of an individual. The fitness function for this problem is defined with the following formula: $FITNESS = |(F+S)-R|$ Assuming R as the result of the operation, F as the first operand and S as the second operand.

When the fitness of individuals gets closer to zero we will get better individuals. If the fitness of an individual is exactly zero then that individual is the correct solution.

C. Mutation

Mutation operator randomly generates two numbers between 0 and 9 and exchanges the content of the cells in these two indices of the chromosome.

The mutation operator should ensure that the exchange is not illegal, which means it should not allow an exchange that would result positioning any start letter of the words in the zero index of the chromosome. Furthermore, a more efficient mutation operator should not allow the exchange of empty cells.

D. Parallelization

Parallelization is applied to the process of creation of new generations. There is a coordinator thread which is in charge of organizing, finding and distributing the fittest individuals and some of generator threads which are responsible for making new generations. Each thread has an internal population. At the beginning each thread generates a random population of a fixed size and puts them into its internal population. As the thread generates a new individual, it places it in the right place of its internal population by using insertion sort in an increasingly manner according to individuals’ fitness. This way always the better individuals have lower indices and if there is a solution, it would be in index 0. When all of the threads generated new population, the coordinator thread picks a number of best individuals plus some randomly chosen individuals from each thread’s internal population and accumulates them in a list and then distributes these chosen individuals over all the threads as the individuals that the threads use to generate a new population. This process goes on until the coordinator thread finds an individual with a fitness equal to zero during the process of picking best individuals out of each thread’s internal population.

I highly recommend you read the paper for the full algorithm. The best thing about it is that it runs much faster than standard DFS.

Here is my python implementation of this algorithm. This implementation is a bit buggy and needs work.

2 years ago

Sort by:

Paste removed!! · 2 years ago