On non-adaptive majority problems of large query size

We are given $n$ balls and an unknown coloring of them with two colors. Our goal is to find a ball that belongs to the larger color class, or to show that the color classes have the same size. We can ask sets of $k$ balls as queries, and the problem has different variants, according to what the answers to the queries can be. These questions have attracted several researchers, but most research has focused on the adaptive version, where queries are decided sequentially, after learning the answer to the previous query. Here we study the non-adaptive version, where all the queries have to be asked at the same time.


Introduction
A widely studied problem in combinatorial search theory is the so-called Majority Problem. We are given n indexed balls, say the set [n] = {1, 2, ..., n}, as an input, each colored in some way unknown to us with one of two colors. A ball i ∈ [n] is called a majority ball if there are more than n/2 balls in the input set that have the same color as i. We would like to find a ball of the majority color, or show that there is no majority color, by asking subsets of [n] that we call queries. We would like to determine the minimum number of queries needed in the worst case with an optimal strategy if all the queries are fixed at the beginning. We call this the non-adaptive version of the Majority Problem. If the queries may depend on the answers to the previous ones, we call it the adaptive version of the Majority Problem.
We still need to describe what kind of queries one can use. Each query corresponds to a subset of size k of the input set. There are different variants of this problem, according to what the answer to a query is. More precisely, a model of the Majority Problem is given by [n], the number of colors, the size k of the queries, the possible answers, and whether it is adaptive or non-adaptive. In this paper we deal only with two colors, in the non-adaptive case. Sometimes we look at the set of queries as a hypergraph Q, with the queries being the (hyper)edges. We also refer to balls as vertices.

Models
The most basic model is the pairing model. In this model the size of a query is two, and the answer is YES if the two balls have the same color and NO otherwise.
We note that the adaptive version of this problem, when the number of colors is not limited, was investigated by Fisher and Salzberg [8], who proved that ⌈3n/2⌉ − 2 queries are necessary and sufficient. If the number of colors is two, then Saks and Werman [14] proved that the minimum number of queries needed is n − b(n), where b(n) is the number of 1's in the binary representation of n. The non-adaptive version was studied in [9].
In this paper we deal with generalizations of the pairing model, where we ask queries of larger size. The first model of this kind was introduced and investigated by De Marco, Kranakis and Wiener [6]; then many related results appeared in the literature [2,3,5,7,10,11]. However, most of them studied only the adaptive case.
The authors considered some of these models in [12], and improved the existing bounds in the adaptive case. Here we investigate the non-adaptive versions. We remark that the first arXiv version of [12] contained a section on the non-adaptive case, and thus most of our results. Following the suggestion of an anonymous referee, we removed that part from [12], with the plan of publishing it separately. We also extended the results slightly.
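The two adaptive pairing bounds quoted above are easy to tabulate. The following is a minimal sketch; the function names are our own and are not taken from the cited papers.

```python
def ones_in_binary(n: int) -> int:
    """b(n): the number of 1's in the binary representation of n."""
    return bin(n).count("1")


def saks_werman_bound(n: int) -> int:
    """n - b(n): the adaptive pairing bound for two colors quoted above."""
    return n - ones_in_binary(n)


def fisher_salzberg_bound(n: int) -> int:
    """ceil(3n/2) - 2: the adaptive pairing bound for unlimited colors."""
    return (3 * n + 1) // 2 - 2
```

For instance, n = 7 has binary form 111, so the two-color bound is 7 − 3 = 4.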

Hypergraph language.
The queries can be considered as edges of a hypergraph, which we call the query hypergraph and usually denote by Q. We introduce some hypergraph properties that we will use later. A hypergraph has Property B if its vertices can be colored with two colors such that there is no monochromatic edge in the hypergraph, i.e. no edge whose vertices all have the same color. For k ≥ 1, let us denote by m(k) the cardinality of the edge set of a smallest k-uniform hypergraph that does not have Property B. This parameter is widely studied; the best lower bound on m(k) we are aware of is Ω(2^k √(k/ log k)), due to Radhakrishnan and Srinivasan [13].
For n ≥ 2k − 1 we also consider m(k, n), which is the cardinality of a smallest k-uniform hypergraph with n vertices that does not have Property B. Obviously if n is large enough, then we have m(k, n) = m(k).
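For very small hypergraphs, Property B can be tested by brute force over all two-colorings. The sketch below is only an illustration; the function name and the two example hypergraphs (a triangle and the Fano plane) are our own choices. Both examples lack Property B, while the Fano plane with one line removed has it.

```python
from itertools import product


def has_property_b(num_vertices, edges):
    """Return True if some 2-coloring leaves no edge monochromatic."""
    for coloring in product((0, 1), repeat=num_vertices):
        # An edge is fine when both colors occur among its vertices.
        if all(len({coloring[v] for v in edge}) == 2 for edge in edges):
            return True
    return False


# 2-uniform example: a triangle (an odd cycle is not 2-colorable).
TRIANGLE = [(0, 1), (1, 2), (0, 2)]

# 3-uniform example: the seven lines of the Fano plane.
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
```

The search takes time exponential in the number of vertices, so it is meant for sanity checks on tiny instances only.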
We also use a similar notion that we call Property C. Let k ≥ 2. We call a coloring of a k-set with two colors balanced if the cardinalities of the two color classes differ by at most one. We say that a hypergraph has Property C if its vertices can be colored with two colors such that every edge is balanced. Let us denote by d(k) the cardinality of the edge set of a smallest k-uniform hypergraph that does not have Property C.
We also consider d(k, n), which is the cardinality of the edge set of a smallest k-uniform hypergraph with n vertices that does not have Property C. Obviously, if n is large enough, then we have d(k, n) = d(k).
For odd k, Eppstein and Hirschberg [7] proved d(k) ≤ k + 3 log k + 4. If k is even, this problem can also be formulated as looking for the smallest hypergraph with positive discrepancy. If k ≡ 2 mod 4, then it is easy to see that d(k) = 3. Let snd(i) be the smallest positive integer that does not divide i. Alon, Kleitman, Pomerance, Saks and Seymour [1] and Cherkashin and Petrov [4] proved that there exist constants c_1 and c_2 such that if k ≡ 0 mod 4, then

Structure of the paper.
The rest of the paper is organised as follows. In Section 2 we introduce the models and state the known results in the adaptive setting. In Sections 3, 4 and 5 we state and prove our results regarding the different models.
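Property C can be checked by brute force in the same way. The sketch below is illustrative only; the names and the example are our own. The example takes three disjoint 3-sets A, B, C and the three 6-sets A ∪ B, B ∪ C, A ∪ C. If every edge had exactly three red vertices, then twice the number of red vertices would be 9, which is impossible, so these three edges already violate Property C for k = 6.

```python
from itertools import product


def is_balanced(coloring, edge):
    """True if the two color classes inside the edge differ by at most one."""
    reds = sum(coloring[v] for v in edge)
    return abs(len(edge) - 2 * reds) <= 1


def has_property_c(num_vertices, edges):
    """Return True if some 2-coloring makes every edge balanced."""
    return any(
        all(is_balanced(coloring, edge) for edge in edges)
        for coloring in product((0, 1), repeat=num_vertices)
    )


# Illustrative 6-uniform example built from three disjoint triples.
A, B, C = (0, 1, 2), (3, 4, 5), (6, 7, 8)
THREE_SIX_SETS = [A + B, B + C, A + C]
```

Dropping any one of the three edges restores Property C, since two of the parity constraints can be met simultaneously.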

Models
In this section we define the models we study, and state the results known in the adaptive case. Three of the models were introduced by De Marco and Kranakis [5], and their (adaptive) bounds were improved by Eppstein and Hirschberg [7] and by the authors [12]. The last one was introduced by Borzyszkowski [2], who found the exact answer in the adaptive case. In each of these models, we are given n indexed balls, colored with two colors, and we ask queries of size k (k ≥ 2); we denote a query by Q. The models differ only in the possible answers to the queries, thus we emphasize what the answers can be. We give each model an abbreviation that we indicate after its name. If the abbreviation is X, then we denote by A(X, k, n) and N(X, k, n) the number of queries needed in the worst case in the adaptive and non-adaptive version of that model, respectively.

Models, adaptive results
• Output (or Partition) Model = OM: Answer: {Q′, Q′′}, a partition of Q, where Q′ is the set of balls of one color and Q′′ is the set of balls of the other color. No indication is provided about which of the colors is in Q′.
In the following theorem, the upper bound if n is even is due to De Marco and Kranakis [5], while the other bounds are due to Eppstein and Hirschberg [7].
Theorem 1 (Eppstein, Hirschberg [7], De Marco, Kranakis [5]). For all 2 ≤ k ≤ n we have

• Counting Model = CM: Answer: a number i ≤ k/2 such that the query has exactly i balls of one of the color classes (thus k − i balls of the other color class).
In the following result, the upper bound is due to Eppstein and Hirschberg [7], while the lower bound is due to Gerbner and Vizer [12].
• General (or Yes-No) Model = GM: Answer: YES, if there exist two balls of different colors, NO otherwise.
Theorem 3 (Gerbner, Vizer [12]). For any 2 ≤ k < n with 2k − 1 ≤ n we have

• Borzyszkowski's Model: Answer: YES, if there exist two balls of different colors, and such a pair is pointed out; NO if all balls have the same color.
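The four answer types can be made concrete with a small simulator. This is a sketch under our own conventions: a coloring is a sequence of 0/1 values indexed by ball, a query is a tuple of ball indices, and all function names are hypothetical. The last function follows the answer format of the model introduced by Borzyszkowski [2]; which bichromatic pair it points out is an arbitrary choice here.

```python
def answer_om(coloring, query):
    """Output model: the unordered partition of the query into color classes."""
    zeros = frozenset(b for b in query if coloring[b] == 0)
    ones = frozenset(query) - zeros
    return frozenset({zeros, ones})


def answer_cm(coloring, query):
    """Counting model: size of the smaller color class inside the query."""
    ones = sum(1 for b in query if coloring[b] == 1)
    return min(ones, len(query) - ones)


def answer_gm(coloring, query):
    """General model: True (YES) iff the query contains both colors."""
    return len({coloring[b] for b in query}) == 2


def answer_borzyszkowski(coloring, query):
    """YES with one bichromatic pair pointed out, or NO if monochromatic."""
    balls = sorted(query)
    for x in balls:
        for y in balls:
            if coloring[x] != coloring[y]:
                return ("YES", (x, y))
    return ("NO", None)
```

Note that `answer_om` returns the same value when the two colors are swapped, reflecting that the answer gives no indication of which color is which.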

Basic inequalities
By definition, the following inequalities hold for the above models for all

Proof: A hypergraph is connected if we cannot partition the underlying set into two parts such that no edge contains vertices from both parts. Observe that the least number of edges of a connected k-uniform hypergraph on an n-element underlying set is ⌈(n − 1)/(k − 1)⌉. Indeed, we can build one by taking an arbitrary set of size k, and adding new edges that intersect the union of the earlier ones in one element. On the other hand, if we are given a connected hypergraph, and we take an arbitrary set from it, there must exist another set that intersects it, then another set that intersects the union of the earlier ones, and so on. Similarly, if a hypergraph has d connected components, then the least number of edges it can have is ⌈(n − d)/(k − 1)⌉. It is easy to see that if the query hypergraph Q is connected, we find more than a majority ball; we find the partition into color classes. This proves the upper bound in case n is even. If n is odd, we can ask a connected query hypergraph on n − 1 vertices. A ball that is a majority ball among those vertices is also a majority ball in the whole set of balls, while if there is no majority ball there, then the remaining ball is a majority ball. This finishes the proof of the upper bound.
Let us continue with the lower bound. If the query hypergraph Q is disconnected and n is even, we consider an arbitrary partition of the underlying set of balls into two parts with no edge intersecting both. Then the answers might be that in both parts one of the color classes is larger by the same number l > 0. What we mean is that we take a coloring with the above property, and then the answers do not give any further information. If the larger color class has the same color in both parts, there is a majority; otherwise there is not. We cannot find out which one is the case, hence we cannot show a majority ball. This finishes the proof in case n is even.
If n is odd, and the underlying set can be partitioned into three parts such that no edge contains vertices from two parts, then it is possible that in each part the difference between the sizes of the color classes is 1 or 2. In this case no majority ball can be shown. We have already mentioned that it is easy to see that at least ⌈(n − 2)/(k − 1)⌉ queries are needed to avoid this.
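The upper-bound argument can be simulated on small instances. The sketch below builds one particular connected k-uniform query hypergraph with ⌈(n − 1)/(k − 1)⌉ edges (a chain of k-sets, our own choice among many), recovers the full color partition from Output-model answers, and reads off a majority ball. All names are hypothetical, and the hypergraph is built on all n balls rather than on n − 1 of them as the proof does for odd n.

```python
from itertools import product


def min_connected_edges(n, k):
    """ceil((n - 1) / (k - 1)), computed with integer arithmetic."""
    return -(-(n - 1) // (k - 1))


def chain_queries(n, k):
    """A connected k-uniform hypergraph on range(n): consecutive k-sets
    overlapping in one ball, with the last edge shifted back to fit."""
    queries, start = [], 0
    while start + k < n:
        queries.append(tuple(range(start, start + k)))
        start += k - 1
    queries.append(tuple(range(n - k, n)))
    return queries


def om_answer(coloring, query):
    """Output-model answer as a pair of parts (no color names revealed)."""
    first = frozenset(b for b in query if coloring[b] == coloring[query[0]])
    return (first, frozenset(query) - first)


def recover_partition(queries, answers):
    """Propagate relative colors along overlapping queries."""
    side = {queries[0][0]: 0}
    changed = True
    while changed:
        changed = False
        for query, parts in zip(queries, answers):
            anchor = next((b for b in query if b in side), None)
            if anchor is None or all(b in side for b in query):
                continue
            for part in parts:
                label = side[anchor] if anchor in part else 1 - side[anchor]
                for b in part:
                    side[b] = label
            changed = True
    return side


def majority_ball(side):
    """A ball of the larger class, or None if the classes have equal size."""
    zeros = sorted(b for b, s in side.items() if s == 0)
    ones = sorted(b for b, s in side.items() if s == 1)
    if len(zeros) == len(ones):
        return None
    return zeros[0] if len(zeros) > len(ones) else ones[0]


def verify(n, k):
    """Check the strategy against every coloring of n balls."""
    queries = chain_queries(n, k)
    for coloring in product((0, 1), repeat=n):
        answers = [om_answer(coloring, q) for q in queries]
        ball = majority_ball(recover_partition(queries, answers))
        ones = sum(coloring)
        if (2 * ones == n) != (ball is None):
            return False
        if ball is not None and coloring[ball] != (1 if 2 * ones > n else 0):
            return False
    return True
```

Calling `verify(7, 3)` exhaustively checks all 128 colorings of seven balls with three queries.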

Counting model
Theorem 6. For 2 ≤ k, n with 2k − 1 ≤ n we have

Proof: If k is even, then we ask all the k-sets containing a given (k − 1)-set A ⊂ [n]. Note that, since k is even, we know that i, j ∈ [n] \ A have the same color if and only if the answers to the queries A ∪ {i} and A ∪ {j} are the same. Thus we can partition [n] \ A into two parts such that the balls in each part have the same color. Additionally, if we ever get different answers for two such queries, then we know the number of balls of the corresponding colors inside A and we can choose a majority ball or find out that there is none.
If this is not the case, then, since n ≥ 2k − 1, all balls in [n] \ A are of the majority color.
If k is odd, then the previous argument does not work, as the answers to some queries could be that there are (k + 1)/2 and (k − 1)/2 balls of the two colors, even if the balls added to A are of different colors.
However, this cannot happen if A contains different numbers of red and blue balls. So we use unbalanced colorings here. Let us take a (k − 1)-uniform hypergraph F on [n] that does not have Property C and has cardinality d(k − 1, n). Moreover, if every pair of edges in F has intersection of size at least (k − 3)/2, then we add another edge of size k − 1 that intersects one of them in a set of size less than (k − 3)/2. This way we get a hypergraph F′, and the query set consists of all k-sets containing edges of F′. There is a set F ∈ F′ that is unbalanced, i.e. it contains at least (k + 1)/2 balls of the same color, say blue.
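The even-k strategy can be written out and checked exhaustively for small parameters. This is a sketch under our own conventions: A is taken to be the first k − 1 balls, colors are 0/1, and all function names are hypothetical. The decoding uses the fact that |A| is odd: an outside ball whose query gives the smaller answer has the color that is larger inside A, and that smaller answer equals the size of the smaller class inside A.

```python
from itertools import product


def cm_answer(coloring, query):
    """Counting-model answer: size of the smaller class in the query."""
    ones = sum(coloring[b] for b in query)
    return min(ones, len(query) - ones)


def star_queries(n, k):
    """All k-sets A + {i}, where A is the fixed core {0, ..., k-2}."""
    core = tuple(range(k - 1))
    return [core + (i,) for i in range(k - 1, n)]


def decode_even_k(n, k, answers):
    """Return a majority ball, or None if the classes have equal size.
    answers[t] is the answer to the query containing ball k - 1 + t."""
    assert k % 2 == 0 and n >= 2 * k - 1
    outside = list(range(k - 1, n))
    low, high = min(answers), max(answers)
    if low == high:
        # All outside balls share a color and outnumber the core.
        return outside[0]
    low_balls = [b for b, a in zip(outside, answers) if a == low]
    high_balls = [b for b, a in zip(outside, answers) if a == high]
    # Color X: the larger class inside the core, shared by low_balls.
    size_x = (k - 1 - low) + len(low_balls)
    size_y = low + len(high_balls)
    if size_x == size_y:
        return None
    return low_balls[0] if size_x > size_y else high_balls[0]


def verify(n, k):
    """Check the decoder against every coloring of n balls."""
    queries = star_queries(n, k)
    for coloring in product((0, 1), repeat=n):
        ball = decode_even_k(n, k, [cm_answer(coloring, q) for q in queries])
        ones = sum(coloring)
        if (2 * ones == n) != (ball is None):
            return False
        if ball is not None and coloring[ball] != (1 if 2 * ones > n else 0):
            return False
    return True
```

The strategy uses n − k + 1 queries, one for each ball outside A.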
If every answer (to every query) were the number (k − 1)/2, then all the balls not in F would be red.
In this case there would be exactly (k + 1)/2 blue balls and every member of F′ would contain at least (k − 1)/2 of them. Then their intersection would have size at least (k − 3)/2, a contradiction. That is why we added the additional set to F.
Thus there is an answer different from (k − 1)/2. Now, similarly to the case when k is even, we can find F, and then we know the relation of the outside balls to each other, and the number of the balls of the corresponding colors inside F.
Theorem 7. For 2 ≤ k and sufficiently large n we have if k is even and n is even, if k is even and n is odd, n is odd and large enough.
Proof: To prove the lower bound for k even and n even, we first show that any query can contain at most one vertex of degree one. Indeed, suppose a query Q contains i and j (with i ≠ j) of degree one. Then it is possible that the answer to Q is the number (k − 2)/2 and there are (k − 2)/2 red and (k − 2)/2 blue balls besides i and j in Q. Furthermore, it is possible that altogether there are n/2 blue and (n − 4)/2 red balls besides i and j. We know that i and j have the same color, but we do not know if it is blue or red. Thus we do not know if there is a majority color or not.
Let q denote the number of queries, and x ≤ q be the number of vertices of degree one. First we show that there is no vertex of degree 0. Indeed, suppose that there is a vertex v of degree 0. Then it is possible that among the other vertices, the number of blue balls is larger than the number of red balls by one. In this case we do not know whether there is a majority ball or not, as we do not know anything about the color of v.
Therefore, we have n − x vertices of degree at least 2. Thus, the number of pairs (Q, w) where Q is a query, w is a vertex and w ∈ Q (i.e. the sum of the degrees) is at least 2(n − x) + x. On the other hand, this number is exactly kq. Thus we have kq ≥ 2n − x ≥ 2n − q and rearranging gives the bound.
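Spelled out, the rearrangement in the double counting above reads as follows, using only the quantities q, x, k and n already defined (deg(v) denotes the number of queries containing v):

```latex
\begin{align*}
kq = \sum_{v \in [n]} \deg(v) &\ge 2(n - x) + x = 2n - x \ge 2n - q, \\
(k + 1)\, q &\ge 2n, \\
q &\ge \frac{2n}{k + 1}.
\end{align*}
```

For k = 8 this gives q ≥ 2n/9, which agrees with the value attributed to Theorem 7 in the remark after Theorem 8.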
In the other cases below, we just state and prove the degree conditions that are needed to obtain the desired bound, and omit the similar easy calculation that finishes the proof.
In the case k is even and n is odd, it is enough to show that at most one query can contain two elements of degree one. Suppose otherwise; then we can get two monochromatic pairs, and it is possible that there are (n − 1)/2 blue and (n − 7)/2 red balls besides those four balls. In this case we cannot show a majority ball.
In the case k is odd, while n is even and large enough, it is enough to show that for every i ∈ [n], its degree is at least d(k − 1, n − 1). This is true, since otherwise we can color the hypergraph with vertex set ∪{Q : i ∈ Q} \ {i} and edge set {Q \ {i} : i ∈ Q} (the open neighborhood hypergraph or link hypergraph of i) in a balanced way, thus we do not get any information about the color of i. Then, using that n is large enough and even, we can color the remaining elements (i.e. [n] \ ∪{Q : i ∈ Q}) such that the coloring of all the balls is balanced. But then it depends on the color of i whether there is a majority color or not.
If n is odd, a similar argument shows that all but one of the balls have degree at least d(k − 1, n − 1)/2. Indeed, otherwise there are i, j ∈ [n] with i ≠ j such that fewer than d(k − 1, n − 1) queries contain at least one of i or j. Let F be the hypergraph that has these queries as edges. Let us remove i and j from them, and to those queries containing both i and j, we add a new ball s ∈ [n] instead. The resulting (k − 1)-uniform hypergraph F′ has fewer than d(k − 1, n − 1) edges, thus it has Property C. This gives a coloring that is balanced on every edge of F′. Let red be the color of s. This coloring can be extended to all the balls except for i and j in such a way that there are (n − 1)/2 blue and (n − 3)/2 red balls among the balls in [n] \ {i, j}.
Then the answers to queries containing neither i nor j are according to this coloring. Moreover, a query Q containing exactly one of them can also be answered according to this coloring without knowing the colors of i and j, as Q \ {i, j} is balanced. Finally, the answer to queries containing both i and j is the number (k − 1)/2. It is easy to see that if at least one of i and j is red, the answers are consistent with the coloring.
Thus any color can be minority (i.e. not majority), hence a ball different from i and j cannot be the majority ball, as we know its color. But i can be red and j blue, or the other way around. In that case the red ball is in minority, thus we cannot say that i or j is a majority ball, finishing the proof.
The above theorem can be improved with similar, but more involved arguments, as we show below.For simplicity, we only deal with the case when k and n are both even.
Theorem 8. If k is even and large enough, and n is even, then we have

for every k/2 < i < k. In particular, for every ε, there is a k_0 such that if k > k_0 is even, then there is an

We remark that for any specific k, one can easily obtain from the first part of the above theorem a lower bound for N(CM, k, n) with a simple calculation. For example, if k = 8, then i = 5 gives the best lower bound, and it is slightly larger than the lower bound 2n/9 from Theorem 7.
Proof: Let F be the subhypergraph of the query hypergraph Q having as edges those queries that contain at least i + 1 vertices of degree at most 2. Let F′ be the multi-hypergraph obtained from F by deleting the vertices of degree more than 2 from each edge. Note that F′ may be non-uniform.
Claim 9. The total size of F′, i.e. the sum of the edge sizes, is at most 2k + (i + 1)n/i.
Proof of Claim: First we show that F′ is a linear hypergraph. Indeed, otherwise there are two queries Q_1 and Q_2 that share two balls x and y which do not appear in any other query. Then one can color the other balls of Q_1 and Q_2 in such a way that both Q_1 and Q_2 have (k − 2)/2 red and (k − 2)/2 blue balls besides x and y. Furthermore, one can color the other balls such that there are n/2 blue and (n − 4)/2 red balls besides x and y. Then the answer to every query is according to this coloring, and the answer to Q_1 and Q_2 is the number (k − 2)/2. Then we know x and y have the same color, but we do not know if it is blue or red, thus we do not know if there is a majority color or not.
Next we show that F′ does not contain any linear cycle covering at most n/2 + 3 vertices. A linear cycle of length ℓ consists of ℓ edges h_1, ..., h_ℓ such that the only intersections among those are the singleton intersections v_j of h_j and h_{j+1} for every j, modulo ℓ (i.e. we also have the singleton intersection v_ℓ of h_ℓ and h_1). Let V = {v_1, ..., v_ℓ}.
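The definitions of a linear hypergraph and of a linear cycle can be tested mechanically on small examples. The sketch below is our own reading of the definition: edges are given in cyclic order, the length is assumed to be at least 3, and the singleton intersections v_1, ..., v_ℓ are required to be pairwise distinct (an assumption that only matters for ℓ = 3). The function names are hypothetical.

```python
from itertools import combinations


def is_linear(edges):
    """True if any two edges share at most one vertex."""
    return all(len(set(a) & set(b)) <= 1 for a, b in combinations(edges, 2))


def is_linear_cycle(edges):
    """True if edges h_1, ..., h_l, listed in cyclic order, form a linear
    cycle: consecutive edges meet in exactly one vertex, all other pairs
    are disjoint, and the meeting vertices are pairwise distinct."""
    length = len(edges)
    if length < 3:
        return False
    sets = [set(e) for e in edges]
    joints = []
    for a in range(length):
        for b in range(a + 1, length):
            common = sets[a] & sets[b]
            consecutive = (b - a == 1) or (a == 0 and b == length - 1)
            if consecutive:
                if len(common) != 1:
                    return False
                joints.append(next(iter(common)))
            elif common:
                return False
    return len(set(joints)) == length
```

For example, the edges (0, 1, 2), (2, 3, 4), (4, 5, 0) form a linear cycle of length 3, while three edges through a common vertex do not.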
Assume that there is a linear cycle of length ℓ covering m ≤ n/2 + 3 vertices, and for every j, let Q_j denote the query that h_j was obtained from. We define a coloring of all the balls not in V such that this coloring already determines the answers to every query in Q, but does not determine whether there is a majority ball (hence it leads to a contradiction).
We first color red the vertices that are in some Q_j but not in any h_{j′} with j = j′. Then we go through the edges h_j in an arbitrary order, and color the vertices in h_j \ V in such a way that every Q_j contains (k − 2)/2 blue balls and (k − 2)/2 red balls not in V. This is doable, as only the vertices of Q_j \ h_j, thus at most k − (i + 1) ≤ (k − 2)/2 balls of Q_j, had been colored (red) when we arrived at h_j. Observe that we colored blue only vertices covered by the linear cycle, other than the vertices v_1, ..., v_ℓ, and that ℓ ≥ 3.
Therefore, we have colored at most m − ℓ ≤ n/2 vertices blue so far. We claim that so far the number of blue balls is at least the number of red balls. Indeed, for each j ≤ ℓ, we colored (k − 2)/2 vertices in h_j \ V blue. As these sets are vertex-disjoint, there are ℓ(k − 2)/2 blue balls. Similarly, for each j ≤ ℓ, we colored (k − 2)/2 vertices in Q_j red, thus there are at most ℓ(k − 2)/2 red balls. Now we can color the remaining balls (besides those in V) so that there are n/2 blue and n/2 − ℓ red balls. Then we can give the number (k − 2)/2 as the answer to every Q_j, and answer every other query according to this coloring. Then we know that every ball in V has the same color, but we do not know if it is blue or red, thus we do not know if there is a majority color or not.
This implies that every linear cycle in F′ has more than n/2 + 3 vertices. Next we show that F′ contains at most three linear cycles. Let L and L′ be two linear cycles in F′ with edges h_1, ..., h_ℓ and h′_1, ..., h′_{ℓ′}. Then by the pigeonhole principle they share at least two vertices. We claim that L and L′ have to share a linear subpath and nothing more (a linear path is a linear cycle without one of the edges, or a single edge, or a single vertex). Indeed, otherwise they share two subpaths, thus following L′, one leaves L and returns to it at least twice. It is easy to see that this way we can find two vertex-disjoint linear cycles, a contradiction.
Therefore, for some vertices x and y there are three linear paths P_1, P_2, P_3 between x and y that share only these two vertices. These paths define three linear cycles. We claim that these are all the linear cycles in F′.
A fourth cycle intersects each of our three cycles multiple times; in particular it intersects two of the paths, say P_1 and P_2. It has a linear subpath P_4 between vertices u in P_1 and v in P_2 such that P_4 shares only u and v with our three cycles. Let l_j denote the number of vertices in P_j; then l_4 + (l_1 + l_2)/2 > n/2 + 3, because P_4 forms a cycle with any of the two paths between u and v in P_1 ∪ P_2. On the other hand, l_1 + l_3 > n/2 + 3 and l_2 + l_3 > n/2 + 3, thus (l_1 + l_2)/2 + l_3 > n/2 + 3. Adding up the above inequalities, we obtain l_1 + l_2 + l_3 + l_4 > n + 6, a contradiction.
Now we can delete two edges of F′ to obtain a linear hypergraph F′′ without any linear cycles. We show that this cycle-free property implies that the sum of the degrees of F′′ is at most (i + 1)n/i (which completes the proof of the claim). Indeed, we can build F′′ in the following way. We start with an arbitrary edge, and build a connected component by adding a new edge sharing one vertex with the component. If a component is finished, we start again with a disjoint edge and repeat this. This way every new edge adds j + 1 to the sum of the degrees for some j ≥ i, but also increases by at least j the number of the vertices used. As (j + 1)/j ≤ (i + 1)/i, we get the largest total size if we add an edge of size i + 1 every time, that is, at most n/i times.
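The final counting step can be written as a single chain of inequalities. In the sketch below, the notation s_1, ..., s_t for the edge sizes of F′′ is our own; each s_r is at least i + 1, and each edge contributes at least s_r − 1 new vertices when it is added, so the sum of the s_r − 1 is at most n.

```latex
\begin{align*}
\sum_{r=1}^{t} s_r
  = \sum_{r=1}^{t} (s_r - 1) \cdot \frac{s_r}{s_r - 1}
  \le \frac{i + 1}{i} \sum_{r=1}^{t} (s_r - 1)
  \le \frac{(i + 1)\, n}{i}.
\end{align*}
```

The first inequality uses that s/(s − 1) is decreasing in s and s_r − 1 ≥ i. Adding back the two deleted edges, each of size at most k, gives the bound 2k + (i + 1)n/i of Claim 9.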
We also use that any query in Q has at most one vertex of degree one, which we obtained at the beginning of the proof of Theorem 7.
Let us count the total sum D of the degrees in Q. On the one hand, D is obviously k|Q|, because every query adds k to the sum. But let us look at this more precisely. Let D′ be the total sum of the degrees of balls with degree 1 or 2, and D′′ be the total sum of the degrees of balls with degree greater than 2, thus we have k|Q| = D = D′ + D′′. Let us look at D′ + D′′ in a different way. For a query Q, let us denote by q′(Q) the number of balls in Q with degree 1 or 2, and by q′′(Q) the number of balls in Q with degree greater than 2.
We say that Q adds q′(Q) to D′ and q′′(Q) to D′′, since D′ and D′′ are the sums of q′(Q) and q′′(Q), respectively, over all queries Q. If a query is not in F, it adds to D′′ at least (k − i)/i times as much as it adds to D′. The queries in F altogether add at most 2k + (i + 1)n/i to D′. These imply that

Let a be the number of balls of degree 1, b be the number of balls of degree 2 and c be the number of balls of degree greater than 2. Then a ≤ |Q|, n = a + b + c, D′ = a + 2b and D′′ ≥ 3c. Thus n ≤ |Q|/2 + D′/2 + D′′/3.

... than m(k − 1) edges, thus it has Property B. This gives a coloring on a subset of ([n] \ {i, j}) ∪ {s} without a monochromatic edge; let red be the color of s. So far fewer than m(k − 1)(k − 1) balls have colors, thus we can extend this coloring to all the balls except for i and j in such a way that there are (n − 1)/2 blue and (n − 3)/2 red balls among the balls in [n] \ {i, j}. Then we answer according to this coloring, and in particular we answer YES to queries containing one of i and j (as both colors appear among the balls in those queries).
We also answer YES to the queries containing both i and j. We know that those queries contain a blue ball different from i and j. Therefore, it is easy to see that if one of i and j is red, the answers are consistent with the coloring. Thus any color can be minority, hence another ball cannot be the majority ball, as we know its color. But i can be red and j blue, or the other way around. In that case the red ball is in minority, thus we cannot claim i or j is a majority ball, finishing the proof for the General model.
In Borzyszkowski's Model, if n is even, the same proof works, as we can point out a pair of balls not equal to i in every query. If n is odd, the same proof does not work, as it is possible that only s is of color red in a query. In that case either i or j has to be put in the pair that is pointed out, thus we might find out its color. However, we can show that all but one of the balls have to be contained in at least m(k − 2)/2 queries.
Indeed, otherwise there are fewer than m(k − 2) queries containing either i or j. Let G be the hypergraph that has those queries as edges; we remove i and j from them, and from those queries containing only one of i and j, we remove another arbitrary ball. The resulting (k − 2)-uniform hypergraph G′ has fewer than m(k − 2) edges, thus it has Property B. From here it works as in the General model: we find a coloring without a monochromatic edge, then extend this coloring to all the balls but i and j, in such a way that there are (n − 1)/2 blue and (n − 3)/2 red balls. Then we answer according to this coloring; in particular we can answer YES and point out a pair of two balls of different colors without using i and j. By the same reasoning as in the case of the General model, any ball can be a minority ball, finishing the proof.