# Microstates

Much of physics is concerned with calculating exact outcomes of simple problems, e.g. the motion of a charged particle in a field, the curve of a string hung between two fixtures, or the expression of a high-copy enzyme in a cell. Simple is not to say easy, but rather that the number of pieces in the problem is relatively small, so that it is possible to calculate using the usual laws of dynamics. Many problems, however, involve a large number of relevant, possibly interacting parts: the spread of disease through a population, the material properties of polymers, the flow of traffic near rush hour, or the formation of memory in collections of neurons. For these, an exact approach is neither possible nor desirable, and we can instead attack the problem from a statistical perspective.

## Microstates as Incomplete Descriptions

One important feature of these kinds of problems is that we are not usually interested in a complete description of the states of their variables. Instead, we are after a small subset of them, e.g. instead of asking "will this particular PCR primer definitely bind to this particular target in the genome?", we ask "what fraction of targets in the genome will a PCR primer bind to?". Alternatively, we are after some condition that broadly classifies their behavior into one of several groups, e.g. "what is the ratio of infection rate to death rate above which *Ebola* will completely infect a population?" rather than "will person X contract Ebola?". To completely specify the state of a system of even modest complexity (so that we could begin to imagine a deterministic calculation) would take more time and effort than most other pursuits in life, and would yield virtually nothing of value.

To get a handle on this, let us consider the system of a poker hand drawn from a standard deck of 52 cards. If we wanted to know the exact state of the system, we would ask the question "what cards do you have?", to which we could expect to hear any of the $\binom{52}{5}$ possibilities, such as $5\spadesuit,\text{K}\heartsuit, \text{J}\clubsuit, 2\diamondsuit, \text{3}\heartsuit$. This kind of description is called the **microstate** of our system because it specifies exactly the value of every card. Further, if we ask the popular question "what is the chance of that?", the answer is simply $1/\binom{52}{5}$ because every hand has the same chance of coming out from the deck, i.e. every microstate is equiprobable.

Thankfully, in poker we are not usually interested in the exact microstate description of our hand, but instead in whether or not it is of a common pattern like a "flush", or a "straight", or "four of a kind", *et cetera*. This is called the **macrostate** description of our system, because it captures the essential information (a description of the pattern) without specifying the details (the exact cards).

Consider our task if we needed to determine winners in poker through the microstate description. For each set of hands, we would need to consult a lookup table of $\binom{52}{5} = 2,598,960$ entries, search through the list for the exact match to each hand, and compare their values.

In the macrostate description, we can focus our attention on a vastly smaller set of possibilities such as "hands for which any four of the cards have the same rank", or "hands for which the five cards can be arranged to give a consecutive sequence", whose probabilities can be easily calculated with combinatorics. In poker, the only macrostates we care about (and the number of microstates in each) are

$\begin{array}{|c|c|} \hline \text{Pattern} & \text{Microstates} \\ \hline \text{royal flush} & 4 \\ \text{straight flush} & (13-4)\times 4 =36 \\ \text{four of a kind} & 13\times 48 = 624 \\ \text{full house} & \binom{4}{2}\times\binom{4}{3}\times 13\times 12 = 3,744 \\ \text{flush} & 4\times\binom{13}{5} - 4 - 36 = 5,108 \\ \text{straight} & 10\times 4^5 - 40 =10,200\\ \text{any three of a kind} & 13\times\binom{4}{3}\times\binom{12}{2}\times\binom{4}{1}^2 = 54,912 \\ \text{two pairs} & \binom{13}{2}\times\binom{4}{2}^2\times 11\times 4 = 123,552 \\ \text{any two of a kind} & \binom{4}{2}\times 13\times \binom{12}{3} \times 4^3= 1,098,240 \\ \hline \end{array}$
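As a sanity check on this kind of counting, the standard combinatorial formula for each pattern can be evaluated directly. The sketch below is illustrative only: the dictionary name `counts` and the extra "no pattern" entry (hands matching none of the nine patterns) are additions made here so that the totals can be checked against $\binom{52}{5}$.

```python
from math import comb

# Number of microstates (5-card hands) in each macrostate (pattern).
counts = {
    "royal flush": 4,                                  # one per suit
    "straight flush": 9 * 4,                           # 10 straights per suit, minus the royal
    "four of a kind": 13 * 48,                         # rank of the four, then any fifth card
    "full house": 13 * comb(4, 3) * 12 * comb(4, 2),   # triple, then pair of another rank
    "flush": 4 * comb(13, 5) - 40,                     # same suit, minus the 40 straight flushes
    "straight": 10 * 4**5 - 40,                        # consecutive ranks, minus the 40 straight flushes
    "three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4**2,
    "two pairs": comb(13, 2) * comb(4, 2)**2 * 44,
    "pair": 13 * comb(4, 2) * comb(12, 3) * 4**3,
}
# Assumption for the check: everything else is "no pattern"
# (five distinct ranks that are not a straight, suits not all equal).
counts["no pattern"] = (comb(13, 5) - 10) * (4**5 - 4)

total = comb(52, 5)
assert sum(counts.values()) == total  # the macrostates partition all hands

for pattern, n in counts.items():
    print(f"{pattern:16s} {n:>9,d}  p = {n / total:.6f}")
```

The final assertion is the useful part: because every hand belongs to exactly one pattern, the counts must add up to the total number of hands.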

Managing nine possibilities instead of $2.6\times10^6$ (a $\sim$ 300,000-fold reduction) frees up quite a bit of thinking for other tasks.

## Microstates and Macrostates

We see in poker that macrostates enable us to cut through superfluous details, like whether we have $7\clubsuit,\text{K}\heartsuit, \text{2}\clubsuit, 2\heartsuit, \text{9}\heartsuit$ or $\text{A}\diamondsuit,\text{K}\spadesuit, \text{2}\heartsuit, 2\diamondsuit, \text{9}\clubsuit$, to deal in the likelihood of different card patterns, and thus give us a game to play. In other words, for most patterns there are many, many possible hands, and it doesn't make much difference in the end which one of them we have. Our principal interest is in the value of the pattern, which is mostly independent of the minute details (in both of the above hands, we just have a pair of 2s). The situation is analogous in statistical mechanics, where by working in the macrostate description of our system, we eliminate a great deal of needless complexity introduced by microstates, and give ourselves a better chance to make progress.
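The map from microstate to macrostate can be written down explicitly. The following is a minimal sketch, not a full poker evaluator: the function name `classify`, the `(rank, suit)` tuple encoding with ranks 2 to 14 (14 for the ace), and the "high card" label for hands with no pattern are all assumptions made for illustration.

```python
from collections import Counter

def classify(hand):
    """Map a 5-card hand (microstate) to its pattern (macrostate).

    `hand` is a list of (rank, suit) tuples with rank in 2..14 (14 = ace).
    """
    ranks = sorted(rank for rank, _ in hand)
    suits = {suit for _, suit in hand}
    # Multiplicities of the ranks, largest first, e.g. [2, 1, 1, 1] for a pair.
    counts = sorted(Counter(ranks).values(), reverse=True)

    is_flush = len(suits) == 1
    all_distinct = len(set(ranks)) == 5
    # The ace may also be low: A-2-3-4-5 counts as a straight.
    is_straight = all_distinct and (
        ranks[4] - ranks[0] == 4 or ranks == [2, 3, 4, 5, 14]
    )

    if is_straight and is_flush:
        return "royal flush" if ranks[0] == 10 else "straight flush"
    if counts[0] == 4:
        return "four of a kind"
    if counts == [3, 2]:
        return "full house"
    if is_flush:
        return "flush"
    if is_straight:
        return "straight"
    if counts[0] == 3:
        return "three of a kind"
    if counts == [2, 2, 1]:
        return "two pairs"
    if counts[0] == 2:
        return "pair"
    return "high card"

# The two hands from the text: different microstates, same macrostate.
hand_a = [(7, "C"), (13, "H"), (2, "C"), (2, "H"), (9, "H")]
hand_b = [(14, "D"), (13, "S"), (2, "H"), (2, "D"), (9, "C")]
print(classify(hand_a), "|", classify(hand_b))
```

Both hands are sent to the same label, which is the sense in which the macrostate discards the details of the individual cards.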

Note that in poker the value of a macrostate is directly related to the number of underlying microstates that correspond to it: the fewer the microstates, the rarer and more valuable the pattern. If we assume that every individual card hand is equally probable (which is true, averaged over all deck shuffles), then the likelihood of a macrostate is simply given by the number of microstates which correspond to it, divided by the total number of possible microstates. If we have a pattern $\gamma$ and a state $s$ that satisfies the pattern, we say $s \sim \gamma = 1$, otherwise $s \sim \gamma = 0$.

Thus, we can write the probability of the pattern $\gamma$ as

$p(\gamma) = \frac{\sum_{s \in \mathcal{S}} (s\sim\gamma)}{\sum_{s \in \mathcal{S}} 1}$

and its value as $v(\gamma) = p(\gamma)^{-1} = \frac{\sum_{s \in \mathcal{S}} 1}{\sum_{s \in \mathcal{S}} (s\sim\gamma)}$

where $\mathcal{S}$ is the set of all states.

For example

- In poker, $\mathcal{S}$ is the set of all $\binom{52}{5}$ poker hands.
- In a messy room, $\mathcal{S}$ is the set of all arrangements of the clothes (very few arrangements correspond to a clean room!).
- In a dilute solution of non-interacting copies of an RNA sequence, $\mathcal{S}$ is the set of all folded configurations of the RNA sequence.
- In a neural network, $\mathcal{S}$ is the set of all global neuron activities.
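The expressions for $p(\gamma)$ and $v(\gamma)$ translate almost literally into code: sum an indicator over every state, then divide by the number of states. The sketch below uses a deliberately small toy system, two-card hands from a 52-card deck, so that the enumeration is instant; the names `probability` and `is_pair` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import combinations

def probability(states, matches):
    """p(gamma): number of states matching the pattern over number of states."""
    states = list(states)
    hits = sum(1 for s in states if matches(s))  # numerator: sum of the indicator
    return Fraction(hits, len(states))           # denominator: size of the state set

deck = [(rank, suit) for rank in range(2, 15) for suit in "SHDC"]
two_card_hands = combinations(deck, 2)           # toy state set, 1,326 microstates

def is_pair(hand):
    """Indicator for the pattern 'both cards have the same rank'."""
    (rank_a, _), (rank_b, _) = hand
    return rank_a == rank_b

p = probability(two_card_hands, is_pair)
v = 1 / p                                        # value = inverse probability
print(p, v)
```

Using `Fraction` keeps the result exact, so the inverse $v(\gamma)$ comes out as a clean ratio rather than a rounded float.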

We can employ the following analogy to illustrate the connection between microstates, macrostates, and summing over irrelevant details.

**Microstates, macrostates, and coarse-graining**

In poker (physics), we perform a sort of summation (coarse-graining) over the different possible instances (microstates) of a given pattern (macrostate), and deal in a description of the pure patterns.
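Coarse-graining in this sense can be sketched as a tally: walk over every microstate, reduce it to its macrostate label, and count. The example below is a toy version under stated assumptions, using three-card hands and only three labels so that all 22,100 microstates can be enumerated quickly; the function name `macrostate` and the labels are chosen here for illustration.

```python
from collections import Counter
from itertools import combinations

deck = [(rank, suit) for rank in range(2, 15) for suit in "SHDC"]

def macrostate(hand):
    """Coarse-grain a 3-card hand: keep only how many distinct ranks it has."""
    distinct_ranks = len({rank for rank, _ in hand})
    return {1: "three of a kind", 2: "pair", 3: "no pair"}[distinct_ranks]

# Sum over microstates: each hand adds 1 to the tally of its macrostate.
tally = Counter(macrostate(hand) for hand in combinations(deck, 3))

total = sum(tally.values())
for pattern, n in sorted(tally.items(), key=lambda item: item[1]):
    print(f"{pattern:16s} {n:6d} microstates, p = {n / total:.4f}")
```

After the tally, the individual hands are gone and only three numbers remain, which is the reduced description that the analogy refers to.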