Let \(m>n>d\) be positive naturals and \(X=\langle x_{i}\mid i<m\rangle\in\{0,1\}^m\). Define \(f_{X}:[m]^{n}\longrightarrow \{0,1\}\) via \(f_{X}(i_{0},i_{1},\ldots,i_{n-1})=1\) exactly in case \(|\{j<m\mid x_{i_{j}}=1\}|=d\). Note, that the map \(X\in\{0,1\}^{m}\mapsto f_{X}\) is not necessarily injective.

Now assume one has access to just \(m,n,d,f\). How can one recover the list of all possible sequences \(X\) via computational means, such that \(f_{X}=f\)? Theoretically one can go through (via lexical ordering) the list of all elements \(Y\in\{0,1\}^{m}\) and compute each \(f_{Y}\) and check if \(f_{Y}=f\). But this will have a long average running time.

Construct an algorithm, \(\mathcal{A}(\cdot,\cdot,\cdot,\cdot)\) which satisfies \(\mathcal{A}(m,n,d,f)\subseteq\{0,1\}^{m}\) and \(\forall{X\in\{0,1\}^{m}:~}X\in \mathcal{A}(m,n,d,f)\Leftrightarrow f_{X}=f\) for all \(m>n>d\) and \(f:[m]^{n}\longrightarrow\{0,1\}\). Further let \(\mathcal{T}(m,n,d,f)\) be the total number of calls to the oracle \(f\) during the execution of \(\mathcal{A}(m,n,d,f)\) and \[\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{|[m]^{n}|}}\sum_{f\in\{0,1\}^{[m]^{n}}}\mathcal{T}(m,n,d,f)\]

which is effectively the average running time. Find an algorithm, \(\mathcal{A}(\cdot,\cdot,\cdot,\cdot)\), such that \(\overline{\mathcal{T}}(m,n,d)\) is minimised for all \(m,n,d\), if this is at all possible.

There is a more relaxed version of this problem in which \(d = 1\), and is a fixed value. This version may be more likely to have a polynomial time solution algorithm, or with an algorithm such that \( \exists m\ \exists n\ \exists d \exists X: \mathcal{T}(m,n,d,f) < |X| \).

A basic flowchart of how the solution algorithm should function in practice:

An example procedure:

The problem instance generator decides values \( m = 3,\ n = 2, \ d = 1, \ X = \{0, 1, 1\} \). This information is sent to the oracle who can perform function \(f\), and the values of \(m\), \(n\) and \(d\) are sent to the algorithm.

The algorithm decides to call f(0, 1) to which the oracle returns 1 as it knows exactly one of \( x_{0}\) and \( x_{1} \) is 1 and the rest are 0. Then, the algorithm calls f(0, 2) which returns 1 and f(1, 2) which returns 0. Since \( x_{1} \) and \( x_{2} \) match and both are different to \( x_{0} \), the algorithm can determine that \( X = \{0, 1, 1\}\) or \(X = \{1, 0, 0\} \). The algorithm sends both, and they compared with the \(X\) sent by the problem instance generator. Since there is a match, the algorithm has done a successful run for one combination of \(X\) for values \( m = 3, n = 2, d = 1\).

Credit to R Mathe for helping rewrite and improve this.

No vote yet

1 vote

×

Problem Loading...

Note Loading...

Set Loading...

Easy Math Editor

`*italics*`

or`_italics_`

italics`**bold**`

or`__bold__`

boldNote: you must add a full line of space before and after lists for them to show up correctlyparagraph 1

paragraph 2

`[example link](https://brilliant.org)`

`> This is a quote`

Remember to wrap math in \( ... \) or \[ ... \] to ensure proper formatting.`2 \times 3`

`2^{34}`

`a_{i-1}`

`\frac{2}{3}`

`\sqrt{2}`

`\sum_{i=1}^3`

`\sin \theta`

`\boxed{123}`

## Comments

Sort by:

TopNewestPlease clearly and formally redo the first set of bullet points and the paragraph before this. Your text is both mathematically and grammatically incomprehensible. For example

pronumerals?determine all values of \(X=\{x_{1},\ldots,x_{n}\}\) …fine, I get it, that’s the representation form,such that \(x\) maps to \(\{0,1\}\). Sorry, what? Do you mean that each \(x_{i}\in\{0,1\}\) or that each \(x_{i}\) is a \(\{0,1\}\)-valued function, and if so with what domain?…for any \(d\) in all \(m\)…This makes no sense in the context.Log in to reply

Thanks for the feedback. While they are admittedly obvious mistakes now that I look back, I haven't had experience writing "mathematically" like this. I've rewritten it and taken away the example case to try and make it read smoother. Please let me know if it makes sense to you now or if I missed something else.

It seems important to keep both versions of the problem as the first version seems more likely to have a polynomial-time algorithm.

Log in to reply

Your current formulation is a lot clearer. Having both versions is absolutely fine. Remaining issues:

viz\(X=\{0,1\}\).Log in to reply

I've added a diagram and a more complete example to describe what I expect as a solution. What I am interested in is the lowest average case number of calls to f a successful algorithm needs and how it is affected by changing m, n and d.

Log in to reply

allor to just findany onevalue of \(X\)?Log in to reply

prove that it is possible to determine all values of \(X\)is thus thoroughly misleading. The known inputs are \(m\), \(n\), \(d\), the hidden inputs is the sequence \(X\) itself. Moreover, the function is NOT to be executed as \(f(x_{0},x_{1},\ldots,x_{m-1})\), as per the original description, but rather as \(f(i_{0},i_{1},\ldots,i_{n-1})\) where \(i_{0},i_{1},\ldots,i_{n-1}\in\{0,1,2,\ldots,m-1\}\) are distinctindices. The problem is thus:Let \(m>n>d\) be positive naturals and \(X=\langle x_{i}\mid i<m\rangle\in\{0,1\}^{m}\). Define \(f_{X}:[m]^{n}\longrightarrow \{0,1\}\) via \(f_{X}(i_{0},i_{1},\ldots,i_{n-1})=1\) exactly in case \(|\{j<m\mid x_{i_{j}}=1\}|=d\). Note, that the map \(X\in\{0,1\}^{m}\mapsto f_{X}\) is not necessarily injective.

Now assume one has access to just \(m,n,d,f\). How can one recover the list of

allpossible sequences \(X\) via computational means, such that \(f_{X}=f\)? Theoretically one can go through (via lexical ordering) the list of all elements \(Y\in\{0,1\}^{m}\) and compute each \(f_{Y}\) and check if \(f_{Y}=f\). But this will have a long average running time.Construct an algorithm, \(\mathcal{A}(\cdot,\cdot,\cdot,\cdot)\) which satisfies \(\mathcal{A}(m,n,d,f)\subseteq\{0,1\}^{m}\) and \(\forall{X\in\{0,1\}^{m}:~}X\in \mathcal{A}(m,n,d,f)\Leftrightarrow f_{X}=f\) for all \(m>n>d\) and \(f:[m]^{n}\longrightarrow\{0,1\}\). Further let \(\mathcal{T}(m,n,d,f)\) be the total number of calls to the oracle \(f\) during the execution of \(\mathcal{A}(m,n,d,f)\) and

\[\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{m}}\sum_{X\in\{0,1\}^{m}}\mathcal{T}(m,n,d,f_{X})\]

which is effectively the average running time. Find an algorithm, \(\mathcal{A}(\cdot,\cdot,\cdot,\cdot)\), such that \(\overline{\mathcal{T}}(m,n,d)\) is minimised for all \(m,n,d\), if this is at all possible.

Note: \([m]^{n}=[\{0,1,2,\ldots,m-1\}]^{n}=\{A\subseteq\{0,1,2,\ldots,m-1\}\mid |A|=n\}\), thus \(|[m]^{n}|=\left(\begin{array}[m]{l}m\\n\\\end{array}\right)\) elements.

Log in to reply

Now that I think about it, the algorithm might not need to compute all values of \(X\) to find the average to a reasonable degree of accuracy, but yes that is why computing all of them seemed important at the time of writing.

A concrete purpose of this problem is investigating if it is possible that \( \exists m\ \exists n\ \exists d\ \forall X: \overline{\mathcal{T}}(m,n,d) < |X|\) or the more loose condition \( \exists m\ \exists n\ \exists d\ \exists X: \mathcal{T}(m,n,d,f) < |X| \) which could lead to a multiple compression algorithm where \(n\), \(m\) and \(d\) are commonly known and the calls to \(f_{X}\) as decided by the algorithm is stored inside a new \(X\), which can be reduced using the algorithm again and so on.

Thanks again for all of your help.

Log in to reply

ie\[\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{|[m]^{n}|}}\sum_{f\in\{0,1\}^{[m]^{n}}}\mathcal{T}(m,n,d,f)\]

or over all \(X\)

ie\[\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{m}}\sum_{X\in\{0,1\}^{m}}\mathcal{T}(m,n,d,f_{X})\]

I came up with an algorithm. I’ll post it later. Not sure if it’s the most efficient, tho’.

Log in to reply

Log in to reply

the problem is a kind of \(d\)-SAT problem: one is given (coded via \(f\)) knowledge of \(m\) propositions and when for every \(n\)-Tuple exactly \(d\) are true, and wishes to prove the existence of a model, satisfying these conditions. Then we need to compute the average over all \(f\).

the problem is of the form: a hidden \(X\) is given and we only see \(f_{X}\). We are to compute a possible value of \(X\). Then one needs to compute the average over all \(X\).

My first algorithm so far is really inefficient: \(\mathcal{T}(m,n,d,f)=m!/(m-n)!\sim m^{n}\). If \(n,d\) are fixed then this is polynomial in \(|X|\). If not, then this is exponential time (or worse!). The algorithm I employ is a simple recursion and doesn’t dynamically react to partial results, which is probably why it is not very fast. I need to think about this further.

Whilst your problem isn’t exactly \(3\)-SAT, there seem to be connections, and the latter is NP-hard, as far as I recall, so finding an efficient algorithm may be doomed. On the other hand, I’ve heard that replacing NP-problems by randomised counterparts may make more efficient algorithms possible.

Log in to reply

Unfortunately, I can't help you. Sorry !

Log in to reply

That's okay! It's still reassuring to know that my idea is worth your time.

Log in to reply