×

# Ball-avoiding randomness in an Hamming metric

This is an open ended question, I don't know the answer and would like your ideas on the topic.

Let me first give you a practical setting where this kind of problem arise: Banks need to issue account numbers for their customer. For example , let's say that $$A, B$$ are two customers of Tartaglia Bank and that they are issued the following 6-digits account numbers:

• $$A:$$ 21-33-45
• $$B:$$ 27-33-45

Now, it should be pretty evident that, by chance, the two account numbers happened to be pretty similar: they only differ in the second digit. This is not good! One digit is easily mistaken and A could possibly receive the money that someone inteded to send to B and viceversa.

It's clear that we would like any two random account numbers to be as different as possible, that is, in this case, to differ in all 6 digits.

This has now the following generalization, that is our actual question:

Let $$V$$ be a vocabulary set (a set of "letters", in our previous case it was the set of natural numbers) , let $$n$$ be a positive integer and $$W = V^n$$ the set of n-length words (in our previous example $$n=6$$ and $$W$$ was the set of all possible 6-digits account numbers). Define for $$a,b \in W$$ $$H(a,b)$$ to be the number of letters in which the two words differ (this is called an Hamming Distance and it can be shown to induce a Metric Space on W). Given any two words $$a,b \in W$$ , what is the chance that their distance is maxmised, i.e. that $$H(a,b)=n$$ ? Generalising further, given $$d > 0$$ what is the chance that $$k \geq 2$$ words $$a_1,...,a_k \in W$$ are such that $$H(a_i,a_j)\geq d$$ for all $$1\leq a_i < a_j \leq k$$?

I am fairly sure this problem or one of its variant should be fairly doable and I'll try to spend some time on it when I finish publishing these notes. It was just one of those real life inspired problems that I felt like to share on this platform as a way to test it (this is my first post here) and perhaps to interest some other people, so I am looking forward for your answers/ideas/generalisations/link to papers/critiques/suggestions.

Note by Ermes Trismegisto
1 year, 1 month ago

Sort by:

Given any two words $$a,b \in W$$ , what is the chance that their distance is maxmised, i.e. that $$H(a,b)=n$$ ?

Isn't it $$(V-1)^ n$$ · 1 year, 1 month ago

Yeah, I agree with you that's pretty easy: you choose the letters the first word $$a$$ and then when you choose the letters of $$b$$ you can't use one character for each letter, so that one is farily easy. · 1 year, 1 month ago

Generalising further, given $$d > 0$$ what is the chance that $$k \geq 2$$ words $$a_1,...,a_k \in W$$ are such that $$H(a_i,a_j)\geq d$$ ?

I assume both $$i$$ and $$j$$ can take any value from $$1$$ to $$k$$ except the case $$i = j$$. Right? · 1 year, 1 month ago

Oh shit, good point ,yes! I'll edit it now, thanks! · 1 year, 1 month ago

Hey there! Nice job on your first post! Welcome to the community.

I do not know a lot about Hamming Codes. Could you explain the question to me in some informal way? · 1 year, 1 month ago

Thank you very much! To be honest with you, I don't know much about them myself, just happened to know the name and learnt about metric space and it's a pretty popular example. I was a bit worried that I might have added some useless complications , so I'll try to rexpress the final question informally: basically, take $$k$$ different $$n-$$ letters words . (Here the words "letter" and "word" are to be intended in a more general mathematical fashion: for example £$%&??£ is a valid 9-letters word where each letter belong to the set of "keyboard keys". This set is called in general a vocabulary set and we can assume it to be finite: so your answer is also going to contain the number of element of this set (the cardinality of the set) as one of the parameters!). So now you have $$k$$ $$n-$$letter words.Given any two of these, you can count in how many characters or letters they differ: for example: • % % £ & & %$ %

• % \$ £ & & ? ? %

Those two words above are both 8 letters long and they differ in the 2nd,6th and 7th postion. We say that their Hamming distance is three, that is , they differ in three places, very easy!

Now you fix a positive number $$d$$ and you ask what is the chance that taken any pair of the $$k$$ words is such that their distance is at least $$d$$, i.e. each pair ar at least $$d$$ words "apart".

I hope this reformulation makes sense, let me know! · 1 year, 1 month ago

With you, so far.

So, we are trying to find the probability that a set of $$k$$ $$n$$ length words over some alphabet has the property that no two strings are closer than $$d$$. Right? · 1 year, 1 month ago

Replace "closer" with "further apart" and that's precisely it. ($$H(a_i,a_j) \geq d$$) so we want the distance to be at least d. · 1 year, 1 month ago