For example, by the time you read the first four letters of a word starting with
calc, you can be pretty sure it's going to end up being
calculation. You don't need all the extra letters to distinguish the remaining possibilities. Concretely, if we take a list of all the words of a given length that exist, and sort them into alphabetical order, we only need \(\log_2 N\) questions to identify any given word in the list (where \(N\) is the number of words in the list). On the other hand, if we were making full use of the language, we could manage \(\displaystyle 26^L\) unique words of length \(L\).
Using the English language dictionary built in to
UNIX operating systems, and filtering for words of length 5, I find 10230 unique words. Taking words of length 5 as a proxy for the entire English language, how short, on average, could we make five letter words before someone with perfect reasoning couldn't read them anymore?