Sequential selection of the k best out of n rankable objects

The objective of this paper is to find, in a setting of n sequential observations of objects, a good online policy to select the κ best of these n uniquely rankable objects. This focus is motivated by the fact that it is hard to find closed-form solutions of optimal strategies for general κ and n. Selection is without recall, and the idea is to investigate threshold functions which maintain all present information, that is, thresholds which are functions of all selections made so far. Our main interest lies in the asymptotic behaviour of these thresholds as n → ∞ and in the corresponding asymptotic performance of the threshold algorithm.


Introduction
The so-called secretary problem is a classical problem of optimal stopping on a sequence of rankable observations. It is the problem of choosing the best of n rankable options appearing in random arrival order, each of the n! orders having probability 1/n!. It was first solved by Lindley [8] in 1961. See e.g. Ferguson [7] for an interesting review. This is the special case κ = 1 of the problem we consider here, namely to find with one stop without recall the best of n uniquely rankable objects or options. Arguably the shortest solution to this problem is provided by the more general odds-theorem of optimal stopping (Bruss [2]). The optimal strategy is to wait until the threshold index $s \in \mathbb{N}$ and to accept the first relative rank 1 from s onwards (if any). Here s is defined as the largest $m \in \mathbb{N}$ such that $\sum_{j=m}^{n} r_j \ge 1$, where $r_j$ denotes the odds of the jth event being a candidate for stopping. If no such m exists, then s := 1 by definition.
For the classical secretary problem $r_j = (1/j)/(1 - 1/j) = 1/(j - 1)$, because the best candidate among the first j is the jth in chronological order with probability 1/j. The corresponding win probability is then $V_n = \frac{s-1}{n} \sum_{j=s}^{n} \frac{1}{j-1}$. Putting $s_n := s(n)$, it is well known that $\lim_{n \to \infty} s_n/n = 1/e$. $V_n$ has the same limit as $s_n/n$, namely 1/e. Interestingly (Bruss [3]), the value 1/e is also the precise lower bound in much more general problems of stopping with maximum probability on any last specific event. For further generalizations for the case κ = 1 see the review by Dendievel [6].
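The odds rule for the case κ = 1 is easy to check numerically. The following is a minimal sketch, assuming the odds $r_j = 1/(j-1)$ stated above and the usual odds-theorem win probability, namely (s − 1)/n times the sum of the odds from s to n; the function names are ours and purely illustrative.

```python
from math import e

def odds_threshold(n):
    """Largest m in {2, ..., n} with sum_{j=m}^{n} 1/(j-1) >= 1; 1 if none exists."""
    total = 0.0
    for m in range(n, 1, -1):
        total += 1.0 / (m - 1)      # add the odds r_m = 1/(m-1)
        if total >= 1.0:
            return m
    return 1

def win_probability(n):
    """Win probability of the rule 'accept the first record from index s onwards'."""
    s = odds_threshold(n)
    if s == 1:
        return 1.0 / n              # accepting the very first object wins with probability 1/n
    odds_sum = sum(1.0 / (j - 1) for j in range(s, n + 1))
    return (s - 1) / n * odds_sum

if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, odds_threshold(n) / n, win_probability(n), 1 / e)
```

For growing n both printed ratios approach 1/e, in line with the limits quoted above.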

κ best choices
The problem of making sequentially the κ best choices, or in short the κ-best problem, is to find an efficient stopping rule to select (all) the κ best objects online from a sequence of n objects. The study of this problem is the main objective of the present paper. Indeed, if κ > 1 then the optimal solution as well as the limiting behaviour of the thresholds seem difficult. Actually, one would not expect such a difficulty because one might expect a tractable recursive structure. Some recursive structure is there, of course, but a little reflection shows that once one has selected 1 ≤ κ' < κ objects, the optimal continuation does not only depend on κ − κ' but also on the values of the chosen ones. This difficulty motivated several authors before, and the present paper complements earlier results by Vanderbei [13] and Tamaki [12] by an independent approach rather than really solving the problem. However, as we shall argue, this complement has interest on its own. The problem is mathematically interesting. We agree that there are probably not many situations in everyday life where our model would directly fit (except for κ = 1 or κ = 2). However, we know of several people who have tried to get answers and found the problem intractable. We felt that one should not give up and should try to give at least an (arguably) good answer.

Related work
Platen [9] studies the problem of choosing a fixed number of objects (secretaries) from a sequence of n, but his objective function is different. He thinks of the rank k as generating the cost k and considers the problem of minimizing the sum of the costs of all selections. Notice that this is different from the problem of minimizing the expected rank as studied by Chow et al. [5], and again quite different from our problem of maximizing the expected payoff, which is one if and only if all the κ best objects are chosen, and zero otherwise. Platen also considers more general payoff (cost) functions, but, motivated by thinking of secretary problems, limits his interest to additive functions, which do not apply to our problem.
A more weakly related problem has been studied by Rose [10] for the case κ = 2. Rose introduces in the same model two decision makers, one of which is defined as dominant. The objective is to select the two best objects, the best one being assigned to the dominant decision maker. Rose considers a strategy as successful if and only if both decision makers reach their goal under the online assignment constraint. Hence here the optimal success probability should be (and is, as we shall see) smaller than in our problem without assignments. Note also that the concurrent assignment constraint gives Rose's problem quite a different character.
Another κ-choice problem was studied by Ano et al. [1], who maximise the probability of finding the best (only) candidate with κ available sequential choices. Szajowski and Lebek [11] solve an interesting related investment problem for κ = 2.
Wilson [15] generalizes the model of Rose to general κ, also with simultaneous online assignment of ranks. The paper displays important optimality conditions for general κ, but closed-form solutions now seem to become hard.

Specifically related work of Tamaki and Vanderbei
Tamaki [12] looks at the same problem for κ = 3 and shows that, interestingly, in contrast to the case κ = 1 (secretary problem) and the case κ = 2, the optimal strategy is now no longer simple. Here simple means that, for the κ selections, the strategy waits until a threshold index $s_k$ and then selects consecutively the next k (if possible) record candidates. Tamaki's observation of the non-simplicity of the optimal rule for κ = 3 (and κ ≥ 3) is important in this context because it gives additional motivation to the approach presented in this paper as well as to earlier work of Vanderbei [13] described below.
Vanderbei [13] studies the corresponding problem without online assignments with an original approach, formulating the problem as a control problem defined on a Markov chain. An explicit solution is found for n even and κ = n/2. The corresponding optimal win probability equals $1/(n/2 + 1)$. For general κ, this article confines its interest to studying a simple sub-optimal solution. Thus Vanderbei studies the same problem as ours but the approach is different. As far as we understand, Vanderbei's sufficient condition (see the last line of [13], p. 481) is not a sufficient condition in the set of strategies which we consider. Indeed, we need in general also information about 2-records, 3-records, up to (κ − 1)-records, where an s-record is a candidate which is, at the time of its arrival, of relative rank s. Having said this, we see by Theorem 5.1 of [13] that the sufficient condition we require is implicit in the definition of "marginal" and the definition of the sets T and S, so that our strategies may coincide. A closer comparison between approaches is difficult, however, since Vanderbei's approach leads to a difference-differential equation (see (5.3) in [13]) whereas our approach is purely algorithmic. Without being able to say more in definite terms, we think of our contribution as an attractive alternative to Vanderbei's result and as a way of access to a good solution for general κ.
Let us finally mention another interesting result by Vanderbei [14]. In this paper, he considers a minor variant of the classical secretary problem. He wishes to pick not the best but the second best postdoc (the best is going to Harvard). In this case, an explicit solution can be given both for the optimal strategy and the associated optimal success probability. The probability of success is $k_0 (n - k_0)/(n(n - 1))$, where $k_0 = \lfloor n/2 \rfloor$. Clearly, as n goes to infinity, the probability of success tends to 1/4. Hence, it is easier to pick the best than the second best, a phenomenon which is not only true in several other selection problems but, often enough, felt to be true in real life.
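The limit 1/4 can be read off directly from the closed form. A small sketch with exact rational arithmetic follows; it assumes that $k_0$ is the integer part of n/2 (the rounding symbol is not legible in our copy, and by the symmetry of $k_0(n - k_0)$ rounding up gives the same value), and the function name is hypothetical.

```python
from fractions import Fraction

def second_best_success(n):
    """k0 (n - k0) / (n (n - 1)) with k0 = floor(n/2) (assumed rounding)."""
    k0 = n // 2
    return Fraction(k0 * (n - k0), n * (n - 1))

if __name__ == "__main__":
    for n in (4, 10, 101, 10**6):
        p = second_best_success(n)
        print(n, p, float(p))
```

For even n the expression simplifies to n/(4(n − 1)), which decreases to 1/4 and stays below the limit 1/e of the classical problem for all sufficiently large n.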

Algorithm
The approach to the algorithm we will propose consists of two parts: firstly, defining a suitable strategy; secondly, computing sequentially the necessary parameters we have to plug in. We are interested in understanding the asymptotic form of an optimal or, at least, efficient strategy, and its corresponding value as n → ∞.
The κ choices threshold strategy: Our strategy, based on thresholds, is defined as follows: we use κ thresholds $1 \le j_1 < j_2 < \ldots < j_\kappa \le n$. The way we compute these thresholds will be characterized later on. It will turn out that there are no general closed forms for the optimum values of these thresholds. Following Vanderbei's terminology, we first define two kinds of candidates:
1. compulsory candidates, i.e. candidates whom we must retain, given the relative ranks of the candidates we have already chosen. For instance, if we have chosen so far $\ell$ candidates, with relative ranks $1 < 2 < \ldots < \ell$ (relative rank 1 corresponds to the best candidate seen so far, rank 2 to the second best seen so far, etc.), we must retain any candidate with relative rank $\le \ell$.
2. marginal candidates, i.e. candidates whose relative rank is equal to $\ell + 1$.
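The two kinds of candidates can be made concrete in a few lines. This is only an illustrative sketch: it assumes, as in the example given in the definition, that the candidates chosen so far occupy the top relative ranks, and the names `relative_rank`, `classify` and `num_chosen` are ours.

```python
def relative_rank(seen_values, new_value):
    """Relative rank of new_value among everything observed so far (1 = best).

    Larger values are better; values are assumed distinct (unique rankability).
    """
    return 1 + sum(1 for v in seen_values if v > new_value)

def classify(rel_rank, num_chosen):
    """Classify an arriving candidate, given how many candidates were already chosen.

    Assumes the chosen candidates currently hold relative ranks 1, ..., num_chosen.
    """
    if rel_rank <= num_chosen:
        return "compulsory"
    if rel_rank == num_chosen + 1:
        return "marginal"
    return "other"

if __name__ == "__main__":
    seen = [0.31, 0.77, 0.52]
    r = relative_rank(seen, 0.60)            # second best so far
    print(r, classify(r, num_chosen=1))
```

With no candidate chosen yet, a record (relative rank 1) is classified as marginal, which is the convention used for the first selection.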
According to our strategy, we first choose the first record (if any) arriving at position $u_1 \ge j_1$. This is done for all possible values of $j_1$, where $j_1$ is some index between 1 and n. We then compute for each possible position $u_1$ an optimal threshold $j_2(u_1) \ge u_1$. The reason behind this is that we have two possibilities, namely:
1. We observe a compulsory candidate at position $u_2 < j_2$. We take this candidate and we start the strategy again at position $u_2$, with a new threshold $j_3(u_2)$. The strategy then iterates this procedure accordingly over the position values. At this stage it is difficult to say more, and the details are postponed to the beginning of Sec. 4.
2. We do not observe any compulsory candidate before $j_2$. Then, from $j_2$ on, we retain a compulsory candidate or a marginal candidate (if any) at position $u_2$. We now start the strategy again at position $u_2$, with a new threshold $j_3(u_2)$ and, as before, we iterate the procedure.

Examples
To give an example, we first define the 2-choice strategy. We define a $(j_1, j_2)$-strategy as the policy to act as follows:
1. wait until index $j_1$ without any action;
2. select, from $j_1$ onwards, the first record (if any). Thereafter instructions will split according to two possibilities:
3. (a) if we select only one record value up to a certain index $j_2 \in (u_1, n)$ then we accept a record value or 2-record value after $j_2$, or else, (b) we select two consecutive records up to $j_2$ (if any).
For κ = 3, the definition of a $(j_1, j_2, j_3)$-strategy is in principle analogous but, again, it is better to postpone the details to the beginning of Sec. 4. The corresponding optimal thresholds are denoted by $j_1^*, j_2^*, j_3^*$. The order of computation is backwards.
The asymptotic results for κ = 2 and κ = 3 are summarized in the following theorem:
• As n → ∞, the optimal thresholds $j_1^*$ and $j_2^*$ for the case κ = 2 satisfy the asymptotic relationship [equation missing]. The asymptotic success probability of the $(j_1^*, j_2^*)$-strategy equals 0.2254366561...
• In the case κ = 3, the corresponding asymptotic relationships are [equations missing], and the asymptotic success probability of the $(j_1^*, j_2^*, j_3^*)$-strategy equals 0.1625200069...
Our computational technique can be extended to any κ. We will illustrate the computation of the optimal thresholds and analyze the performance, i.e. the success probability, of our algorithm for the cases κ = 2 and κ = 3.
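For κ = 2 the strategy is simple enough to be checked by simulation. The sketch below is a Monte Carlo experiment under our own reading of the 2-choice rule: before $j_1$ nothing is taken, then the first record is taken, and afterwards a record is always taken while a 2-record is taken only from $j_2$ on. The first threshold ratio 0.2291147286 is the value quoted later in the text; the second ratio $e^{-1/2}$ is an assumption of ours, not a value taken from the text, and all function and parameter names are hypothetical.

```python
import random
from math import exp

def simulate_two_choice(n, j1, j2, trials=20000, seed=1):
    """Estimate the probability that the (j1, j2)-rule selects the two best of n."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ranks = list(range(1, n + 1))        # absolute ranks, 1 = best
        rng.shuffle(ranks)
        best = second = n + 1                # two best absolute ranks seen so far
        chosen = []
        for pos, r in enumerate(ranks, start=1):
            # relative rank of the arriving object: 1, 2, or "3 or worse"
            if r < best:
                rel, best, second = 1, r, best
            elif r < second:
                rel, second = 2, r
            else:
                rel = 3
            if pos < j1:
                continue                     # observation phase, no action
            if not chosen:
                take = rel == 1              # first selection: a record
            else:
                # a record is compulsory; a 2-record is marginal, accepted from j2 on
                take = rel == 1 or (rel == 2 and pos >= j2)
            if take:
                chosen.append(r)
                if len(chosen) == 2:
                    break
        wins += sorted(chosen) == [1, 2]
    return wins / trials

if __name__ == "__main__":
    n = 200
    j1 = round(0.2291147286 * n)             # ratio quoted in the text
    j2 = round(exp(-0.5) * n)                # assumed ratio for the second threshold
    print(simulate_two_choice(n, j1, j2))
```

For moderate n the estimate should land in the vicinity of the asymptotic value 0.2254 stated in the theorem, up to Monte Carlo noise and finite-n effects.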
As explained in Sec. 1, we have two possibilities: $u_1 < j_2 \le u_2$, or, alternatively, $u_1 < u_2 < j_2$. The probability of success of the first case is given by [equation missing]. To see this, note that after index $j_2$ we have two possible types of candidates, a compulsory or a marginal candidate at position $u_2$. After $u_2$, we must exclude any compulsory candidate.
We note that $u_1 F_1(j_2)$ is unimodal in $j_2$ because $u_1 F_1(j_2) - u_1 F_1(j_2 + 1)$ yields, after a few steps of simplification, the form [equation missing], which changes sign at most once according to the value of $j_2$.
The probability of success of the second case is given by [equation missing].
Here we note that after $u_1$ and before $j_2$, we have one possible compulsory candidate at position $u_2$. After $u_2$, we must exclude any compulsory candidates. $u_1 F_2(u_1, j_2)$ is clearly monotone in $j_2$ because it is, as seen in the last sum term, affine linear in $j_2$. Consequently $u_1 F_1(j_2) + u_1 F_2(u_1, j_2)$ is unimodal in $j_2$ because the sum of a unimodal function and an affine linear function is unimodal. Of course, $F_2(u_1, j_2)$ is quite simple here, but, viewing general κ, we prefer to keep the general notation.
As n → ∞, we use the Euler-Maclaurin formula to replace sums by integrals (see also the continuous-time model of Tamaki [12]). Moreover, we use continuous variables $u_1 := u_1/n$, $u_2 := u_2/n$, ... By a slight abuse of notation, in order to simplify our expressions, we will continue to use $u_1, u_2, \ldots$ as continuous time variables. Also, to avoid any confusion, we will use the notation $\chi_k$ for the (continuous) asymptotic of $j_k/n$.
We must now compute $j_1$. We note that the first record can occur after $j_2^*$. It is clear from the unimodality of $\phi(\chi_2)$ that, asymptotically, $j_2(u_1)$ is then exactly given by $u_1$. Actually [equation missing]. So we obtain the following success probability: $P(j_1) = P_1(j_1) + P_2(j_1)$, with, on the one hand, [equation (2) missing] and, on the other hand, [equation (3) missing].
Note that (3) is again affine linear in $j_1$. Therefore, if $P_1(j_1)$ defined in (2) is unimodal in $j_1$, then $P(j_1) = P_1(j_1) + P_2(j_1)$ is unimodal in $j_1$ because the sum of an affine linear function and a unimodal function is unimodal. However, since $j_1$ figures in (2) both in the numerator and as the starting point of summation, we cannot decide upon unimodality of $P_1(j_1)$ without being able to quantify the summands [expression missing]. We cannot do this; that is, here the problem becomes circular. Motivated by our determination to keep things tractable (see also below) and to focus our interest on the asymptotic case as n → ∞, we disregard at this point the problem of unimodality for all n. (Here it is good to know that unimodality, which is a sufficient condition for our approach to be solid, need not be necessary!)
The asymptotic total success probability is, independently of the problem of unimodality, always of interest and readily accessible, namely [equation missing]. Maximizing $V_2(\chi_1)$ then yields the asymptotically optimal threshold $\chi_1^*$ = [closed-form expression missing] = 0.2291147286..., where W(x) is the Lambert function, that is, the solution of $W(x) e^{W(x)} = x$. We use its principal branch, which is analytic at 0. This leads from the above to $V_2(\chi_1^*)$ = 0.2254366561..., and the value $\chi_1^*$ is unique since [equation missing].
We have compared, for n = 500, $V_2(\chi_1)$ with $P(j_1)$. This is shown in Figure 3 (line = $V_2(\chi_1)$, box = $P(j_1)$). The agreement is quite satisfactory. The relative error at $j_1^*$ is given by [value missing].
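The two constants for κ = 2 can be reproduced numerically from a short heuristic calculation. The sketch below is our own reconstruction, not the displayed formula of the text: it assumes that the first record from $\chi_1$ on arrives at $u_1$ with density $\chi_1/u_1^2$, that the conditional success probability is $u_1(\chi_2 - u_1 - 2\chi_2 \ln \chi_2)$ for $u_1 < \chi_2$ and $-2u_1^2 \ln u_1$ otherwise, and that $\chi_2 = e^{-1/2}$. Integrating gives $V_2(\chi_1) = \chi_1 (2 - 5c - 2c \ln \chi_1 + \chi_1)$ with $c = e^{-1/2}$; all names below are hypothetical.

```python
from math import exp, log

C = exp(-0.5)   # assumed asymptotic second threshold (our derivation, not quoted from the text)

def v2(chi1):
    """Reconstructed asymptotic success probability of the two-threshold rule."""
    return chi1 * (2.0 - 5.0 * C - 2.0 * C * log(chi1) + chi1)

def dv2(chi1):
    """Derivative of v2; it is decreasing on (0, C), so it has a single root there."""
    return 2.0 - 7.0 * C - 2.0 * C * log(chi1) + 2.0 * chi1

def argmax_v2(lo=0.05, hi=0.5, tol=1e-12):
    """Bisection on the derivative (positive at lo, negative at hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dv2(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    chi1 = argmax_v2()
    print(chi1, v2(chi1))
```

Under these assumptions the maximizer and the maximum agree with the values 0.2291147286... and 0.2254366561... quoted above, which supports the reconstruction without proving that it coincides with the authors' expression.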

Remark on the circular part of the problem
How good is the approximation for finite n? Here the problem is the same as above; the circular part of the problem persists. We do not know, and we cannot do better because we would have to know the optimal strategy and its precise value.
However, there are good reasons to be optimistic. The few comparisons we could make with simplified problems (comparisons with upper bounds or lower bounds stemming from problems for which we know the optimal strategy and its value) suggest that the resulting values should not be far off the optimum. Moreover, the asymptotic behaviour of the values (see Fig. 3) seems very nice.
Similarly, we cannot guarantee that the use of the asymptotic $j_2^*$ is mathematically rigorous. What we can say is that we do not see what may go wrong in doing so, and, in the worst case, our results are at least lower bounds for the optimal success probability.

4 The case κ = 3
The strategy for general κ proceeds as follows. Assume that we are at time $u \in [j_k^*, j_{k+1}^*)$:
• if we have already chosen k − 1 candidates or fewer, we select a compulsory or marginal candidate (if any) in $[u, j_{k+1}^*)$;
• if we have already chosen more than k − 1 candidates, we select a compulsory candidate (if any) in $[u, j_{k+1}^*)$.
The initial condition is the following: for $u \in [j_1^*, j_2^*)$, the marginal candidate is the first record, by definition.
Let us now consider the case κ = 3. We must first compute $j_3(u_2) \ge u_2$, given $u_2$ (at this time, we have selected the second candidate). Again, we have two possibilities: $u_2 < j_3 \le u_3$, or, alternatively, $u_2 < u_3 < j_3$. The probability of success of the first case is given by [equation missing]. Note that this expression is unimodal in $j_3$ if and only if [expression missing] is unimodal in $j_3$. This is true if [expression missing] changes its sign at most once as a function of j. An easy calculation shows that [equation missing]. The probability of success of the second case is given by [equation missing], which is clearly monotone in $j_3$.
Asymptotically, from $F_3(j_3) + F_4(u_2, j_3)$, conditioning on $u_2$, we must maximize, with respect to $\chi_3$, the function [equation missing]. Note the similarity with $V_1$ given in equation (1). This gives $\chi_3^* = e^{-1/3} = 0.7165313106\ldots$ Again we note that $V_3(u_2, \chi_3)$ has the form $\phi_3(u_2) + \phi_4(\chi_3)$, and that $\phi_4(\chi_3)$ is unimodal in $\chi_3$.
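The value $e^{-1/3}$ for the last threshold can be checked with a one-dimensional maximization. The sketch below assumes, by our own heuristic reading of the two cases and not as the formula of the text, that the part of the asymptotic success probability depending on the last threshold χ is proportional to $-\kappa \chi \ln \chi + (\kappa - 1)\chi$; for κ = 1 this is the familiar $-\chi \ln \chi$ of the secretary problem. Function names are hypothetical.

```python
from math import exp, log

def phi(chi, kappa):
    """Assumed threshold-dependent part of the asymptotic success probability."""
    return -kappa * chi * log(chi) + (kappa - 1) * chi

def dphi(chi, kappa):
    """Derivative of phi in chi: -kappa * ln(chi) - 1, decreasing on (0, 1]."""
    return -kappa * log(chi) - 1.0

def last_threshold(kappa, lo=0.01, hi=1.0, tol=1e-12):
    """Bisection on the derivative to locate the maximizer of phi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dphi(mid, kappa) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for kappa in (1, 2, 3):
        print(kappa, last_threshold(kappa), exp(-1.0 / kappa))
```

Under this assumption the maximizer is $e^{-1/\kappa}$, which gives 1/e for κ = 1 and reproduces $\chi_3^* = e^{-1/3}$ for κ = 3.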
We now compute $j_1$. We note that $u_1$ can occur after $j_2^*$. It is then clear, from the unimodality of $\phi_6(\chi_2)$, that, asymptotically, $j_2(u_1)$ is exactly given by $u_1$. Actually, the split at the index $j_2^*$ is now given by [equation missing]. We obviously need a concise notation for all cases we will consider (this will of course also be applied to the cases κ ≥ 4).
The idea is to create a bijection between the different cases and an urn model. Indeed, the adequate model here is the Bose-Einstein urn model, that is, balls correspond to chosen candidates $u_1, u_2, \ldots$ and urns to intervals $[j_k, j_{k+1} - 1]$. We throw κ indistinguishable balls into κ ordered urns labelled 1, 2, ..., κ. The number of possible cases is given by $\binom{2\kappa - 1}{\kappa}$, for example 10 cases for κ = 3. We next label the balls in increasing order. Denote then by $(\alpha_1, \alpha_2, \alpha_3)$ the urn labels of balls (1, 2, 3). For instance, the case (1, 3, 3) corresponds to $j_1 \le u_1 < j_2^* < j_3^* \le u_2 < u_3$, because it means that the arrival time $u_1$ falls in the first urn, defined by the borders $j_1$ and $j_2^* - 1$, and the second and third candidates fall in the third urn, with bounds $j_3^*$ and n. The corresponding probability is given by [equation missing]. We will not detail the tedious, but routine, computations corresponding to our 10 cases. We provide only the final asymptotic integrals.
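The case enumeration of the urn model is mechanical. A minimal sketch: each case is a non-decreasing tuple of urn labels, one label per chosen candidate in order of arrival, so the cases are exactly the multisets of size κ drawn from κ labels. The function name is ours.

```python
from itertools import combinations_with_replacement
from math import comb

def urn_cases(kappa):
    """All non-decreasing label tuples (alpha_1, ..., alpha_kappa) with labels in 1..kappa."""
    return list(combinations_with_replacement(range(1, kappa + 1), kappa))

if __name__ == "__main__":
    cases = urn_cases(3)
    print(len(cases), comb(2 * 3 - 1, 3))    # both counts should agree
    for case in cases:
        print(case)
```

The list for κ = 3 contains the case (1, 3, 3) discussed above, and its length matches the binomial count; for κ = 4 the same routine yields the 35 cases a general treatment would have to cover.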
We note that the much easier problem of predicting an index from which onwards we have exactly three record values has an optimal value of 0.22404... See Bruss and Paindaveine [4], Sec. 6, equation (13): $V(\ell) = \ell^{\ell} e^{-\ell}/\ell!$. Note that the three very best coincide with the last three record values only if the absolute ranks 3, 2, and 1 appear exactly in that order. There is a clear difference between 0.22404... and the lower bound of 0.1625200... we obtained for getting the three very best under optimal play. Still, to know now that it is actually not that much harder to get the three very best is at least not evident at all.
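The value 0.22404 can be reproduced in one line. The sketch below assumes that equation (13) of [4] reads $V(\ell) = \ell^{\ell} e^{-\ell}/\ell!$, a reading of the partly illegible display that is consistent with the quoted number for ℓ = 3 and with 1/e for ℓ = 1; the function name is hypothetical.

```python
from math import exp, factorial

def last_records_value(ell):
    """Assumed asymptotic value ell^ell * e^(-ell) / ell! for exactly ell last records."""
    return ell ** ell * exp(-ell) / factorial(ell)

if __name__ == "__main__":
    for ell in (1, 2, 3):
        print(ell, last_records_value(ell))
    # gap to the bound for selecting the three very best, as quoted in the text
    print(last_records_value(3) - 0.1625200069)
```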

Conclusion
It is in the nature of the studied problem that even asymptotic answers are already computationally involved. Still, our approach provides a feasible access. Using similar simple probabilistic arguments, we can compute a set of asymptotic thresholds and the final success probability for any κ. Further work would be to build a computer algebra system which could mechanically compute the different cases and the corresponding asymptotic probabilities. Such a system should bring the relevant functions into a sufficiently tractable form to prove the unimodality for all κ in the asymptotic continuous case. (Remember that unimodality is only a sufficient condition and perhaps not needed.) This would be all we need to prove asymptotic optimality also for the case κ ≥ 4.
Hence there remain two interesting open problems.

6 Open problems
PROBLEM 1: Is our procedure for finding the κ best secretaries out of n candidates optimal for all κ?
PROBLEM 2: Can this procedure be extended to the case where κ depends on n?
is given in Figure 2.