Group testing

An animation of the false coin problem being solved for 10 coins, where the goal is to find the single coin that is lighter than the others. Many fewer than 10 tests are needed since at each stage, at least half of the 'good' coins are eliminated.

In combinatorial mathematics, group testing refers to any procedure which breaks up the task of locating elements of a set which have certain properties into tests on groups of items, rather than on individual elements. A familiar example of this type of technique is the false coin problem of recreational mathematics. In this problem there are n coins and one of them is false, weighing less than a real coin. The objective is to find the false coin, using a balance scale, in the fewest number of weighings. By repeatedly dividing the coins in half and comparing the two halves, the false coin can be found quickly as it is always in the lighter half.[a]
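The halving strategy can be sketched in code. This is an illustrative implementation (the function name and list-of-weights input are our own): the balance scale is simulated by comparing the summed weights of two equal-sized piles, and an odd-sized pile is handled by setting one coin aside, as described in footnote [a].

```python
def find_light_coin(weights):
    """Find the index of the single lighter coin by repeatedly
    splitting the candidates in half and 'weighing' the halves.

    `weights` is a hypothetical list with exactly one entry
    smaller than the rest.
    """
    candidates = list(range(len(weights)))
    while len(candidates) > 1:
        aside = None
        if len(candidates) % 2 == 1:
            aside = candidates.pop()      # set one coin aside (footnote [a])
        half = len(candidates) // 2
        left, right = candidates[:half], candidates[half:]
        wl = sum(weights[i] for i in left)
        wr = sum(weights[i] for i in right)
        if wl == wr:
            return aside                  # both halves genuine: fake was set aside
        candidates = left if wl < wr else right   # fake is in the lighter half
    return candidates[0]
```

Each comparison discards at least half of the remaining candidates, so about log2(n) weighings suffice rather than n.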

Schemes for carrying out such group testing can be simple or complex and the tests involved at each stage may be different. Schemes in which the tests for the next stage depend on the results of the previous stages are called adaptive procedures, while schemes designed so that all the tests are known beforehand are called non-adaptive procedures. The structure of the scheme of the tests involved in a non-adaptive procedure is known as a pooling design.

The field of (combinatorial) group testing was introduced by Robert Dorfman in 1943. The motivation arose during the Second World War, when the United States Public Health Service and the Selective Service System embarked upon a large-scale project to weed out all syphilitic men called up for induction. Testing an individual for syphilis involves drawing a blood sample and analysing it to determine the presence or absence of the disease. At the time, however, performing this test was expensive, and testing every soldier individually would have been very costly and inefficient.

Supposing there are n soldiers, this naive method requires n separate tests. If 70-75% of the people were infected, this method would be reasonable. Our goal, however, is to achieve effective testing in the more likely scenario where it does not make sense to test 100,000 people to get (say) 10 positives.

The feasibility of a more effective testing scheme hinges on the following property: blood samples can be combined, and the combined sample tested to check whether at least one soldier in the pool has syphilis. This is the central idea behind group testing. If one or more of the soldiers in the pool has syphilis, then the test is wasted (more tests need to be performed to find which soldier(s) it was). On the other hand, if no one in the pool has syphilis, then many tests are saved.

Group-testing algorithms can be described as adaptive or non-adaptive. An adaptive algorithm proceeds by performing a test, and then using the result (and all past results) to decide which next test to perform. On the other hand, in non-adaptive algorithms all tests are decided in advance. Although adaptive algorithms offer much more freedom in design, it is known that adaptive group-testing algorithms do not improve upon non-adaptive group-testing algorithms by more than a constant factor in the number of tests required to identify the set of defective items.[2][3] In addition to this, non-adaptive methods are often useful in practice because one knows in advance all the tests one needs to perform, allowing for the effective distribution of the testing process.

Besides adaptivity, group-testing algorithms are either combinatorial or probabilistic. A combinatorial algorithm finds all the defectives with certainty. In contrast, a probabilistic algorithm has some non-zero probability of making a mistake (i.e. deciding a defective item is non-defective or vice versa). It is known that zero-error algorithms require significantly more tests asymptotically (in the number of defective items) than algorithms that allow asymptotically small probabilities of error.[4]

Another class of algorithms is the so-called noisy algorithms. These deal with the situation where, with some non-zero probability q, the result of a group test is erroneous (e.g. comes out positive when the test contained no defectives). A noisy algorithm will always have a non-zero probability of making a mistake.

Let the total number of items to be tested be n, and let d be an upper bound on the number of defective items. The (unknown) information about which items are defective is described by a vector x = (x_1, x_2, ..., x_n), where x_i = 1 if the i-th item is defective and x_i = 0 otherwise. x is called the input vector.

The Hamming weight of x is defined as the number of 1's in x. Hence |x| ≤ d, where |x| is the Hamming weight. The vector x is an implicit input, since we do not know the positions of the 1's; the only way to find them is to run tests.

The goal of group testing is to compute or estimate x while minimizing the number of tests required to do so. There are no direct questions or answers here: any piece of information can only be obtained through an indirect query. Group testing is ultimately a question of combinatorial search, and the major difficulty with such problems is that the search space can grow exponentially in the size of the input.

t(d,n) is defined as the minimum number of non-adaptive tests needed to detect all of up to d defective items among n items in total. Similarly, t^a(d,n) denotes the minimum number of adaptive tests needed to detect all the defective items.

Consider the case when only one person in the group will test positive. Testing in the naive way, in the best case the very first person tested turns out to be the infected one; in the worst case the entire group must be tested, and only the last person tested turns out to be the infected one. Hence 1 ≤ t(d,n) ≤ n. We also have t^a(d,n) ≤ t(d,n), since any non-adaptive test can be performed by an adaptive scheme that simply runs all of its tests in the first step. Adaptive tests can be more efficient than non-adaptive tests, since later tests can be changed in light of what earlier tests reveal.

A typical group-testing setup. A non-adaptive algorithm first chooses the matrix M, and is then given the vector y. The problem is then to find an estimate for x.

Algorithms for non-adaptive group testing consist of two distinct phases. First, it is decided how many tests to perform and which items to include in each test. This is usually encoded in a t×n binary matrix M, where t is the number of tests and n is the total number of items. Each column of M represents an item and each row represents a test, with a 1 in the (i, j)-th entry indicating that the i-th test includes the j-th item and a 0 indicating otherwise. In the second phase, often called the decoding step, the results of each group test are analysed to determine which items are likely to be defective.

As well as the vector x (of length n) that describes the (unknown) defective set, we introduce the vector y, of length t, describing the results of each test. A 1 in the j-th entry of y indicates that the j-th test was positive (i.e. contained at least one defective). With these vectors the problem can be reframed as follows: first we choose a testing matrix M, after which we are given y. The problem is then to analyse y to find an estimate for x.

These notions can be expressed more formally as follows. Let S ⊆ [n] be a test and define χ_S ∈ {0,1}^n by i ∈ S ⇔ χ_S(i) = 1, so that the k-th test S_k is described by χ_{S_k}. Let M be the t×n matrix whose i-th row is χ_{S_i}, so that (M)_{i,j} = χ_{S_i}(j). This construction reflects the fact that non-adaptive testing with t tests is specified by t subsets S_1, ..., S_t ⊆ [n].

Under this setup, y is defined by the matrix multiplication relation Mx = y, where multiplication is logical AND (∧) and addition is logical OR (∨). That is, y has a 1 in position i if and only if (M)_{i,j} and x_j are both 1 for some j; in other words, if and only if at least one defective item was included in the i-th test.
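A small worked example of this Boolean product (the matrix and defective set below are made up for illustration):

```python
import numpy as np

# Hypothetical 3-test scheme on 5 items: rows are tests, columns are items.
M = np.array([[1, 1, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 1, 1]])
x = np.array([0, 0, 1, 0, 0])   # item 2 is the only defective

# Boolean 'matrix multiplication': AND for products, OR for sums.
# Over 0/1 vectors this is the same as thresholding the ordinary product.
y = (M @ x > 0).astype(int)
# Only test 1 (the second row) contains item 2, so y = [0, 1, 0].
```

A test outcome is positive exactly when its row of M overlaps the support of x.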

Repeat the following step: test a group of size 2^α. If the outcome is negative, every item in the group is declared to be non-defective; set n := n − 2^α and go to step 1. Otherwise, use a binary search to identify one defective and some number, call it x, of non-defective items; set n := n − 1 − x and d := d − 1, and go to step 1.
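The step above can be sketched in code. The `test` oracle and the group-size rule below are illustrative assumptions (the shape of the procedure follows Hwang's generalised binary splitting, but the exact size rule is not taken from the text):

```python
import math

def generalised_binary_splitting(defectives, n, d):
    """Adaptive group testing: repeatedly test a group of size 2^alpha,
    discarding it wholesale on a negative outcome and binary-searching
    it for one defective on a positive outcome.
    `defectives` is the hidden set; `test` simulates one pooled test."""
    remaining = set(range(n))
    found = set()

    def test(group):
        # positive iff the group contains at least one defective
        return any(i in defectives for i in group)

    while d > 0 and remaining:
        if len(remaining) <= 2 * d - 2:      # few items left: test individually
            for i in sorted(remaining):
                if test([i]):
                    found.add(i)
            return found
        alpha = int(math.log2((len(remaining) - d + 1) / d))  # assumed size rule
        group = sorted(remaining)[:2 ** alpha]
        if not test(group):
            remaining -= set(group)          # negative: whole group is clean
        else:
            # binary search inside the positive group for one defective
            lo, hi = 0, len(group)
            cleared = set()
            while hi - lo > 1:
                mid = (lo + hi) // 2
                if test(group[lo:mid]):
                    hi = mid
                else:
                    cleared.update(group[lo:mid])   # sub-group tested clean
                    lo = mid
            found.add(group[lo])             # one defective identified
            remaining -= cleared | {group[lo]}
            d -= 1
    return found
```

A negative test clears 2^α items at once, while a positive test costs about α further tests but finds a defective, which is the trade-off the step above describes.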

Non-adaptive group-testing algorithms tend to assume that the number of defectives, or at least a good upper bound on it, is known.[7] We denote this quantity d. If no bounds are known, there are non-adaptive algorithms with low query complexity that can help estimate d.[8]

An illustration of the COMP algorithm. COMP identifies item a as being defective and item b as being non-defective. However, it incorrectly labels c as a defective, since it is “hidden” by defective items in every test in which it appears.

Combinatorial Orthogonal Matching Pursuit, or COMP, is a simple non-adaptive group-testing algorithm that forms the basis for the more complicated algorithms that follow in this section.

First, each entry of the testing matrix is chosen i.i.d. to be 1 with probability 1/d and 0 otherwise.

The decoding step proceeds column-wise (i.e. by item). If every test in which an item appears is positive, then the item is declared defective; otherwise the item is assumed to be non-defective. Equivalently, if an item appears in any test whose outcome is negative, the item is declared non-defective; otherwise it is assumed to be defective. Of particular note here is that this algorithm never creates false negatives, though a false positive occurs when all locations with 1's in the j-th column of M (corresponding to a non-defective item j) are "hidden" by the 1's of other columns corresponding to defective items.
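A minimal sketch of the COMP design and decoding steps; the parameters n, d, t and the defective positions below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, t = 20, 2, 15                 # items, defectives bound, tests (illustrative)

# Design: each entry is 1 i.i.d. with probability 1/d.
M = (rng.random((t, n)) < 1 / d).astype(int)

x = np.zeros(n, dtype=int)
x[[4, 13]] = 1                      # hidden defectives, for the demo
y = (M @ x > 0).astype(int)         # test outcomes (Boolean product)

# COMP decoding: item j is declared defective iff every test
# containing it came back positive.
declared = np.array([all(y[i] for i in range(t) if M[i, j])
                     for j in range(n)])
```

Every true defective appears only in positive tests, so `declared` can never miss one (no false negatives); it may, however, include non-defectives hidden by the true defectives.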

The COMP algorithm requires no more than ed(1+δ)ln⁡(n){\displaystyle ed(1+\delta )\ln(n)} tests to have an error probability less than or equal to n−δ{\displaystyle n^{-\delta }}.[9]

In the noisy case, we relax the requirement in the original COMP algorithm that the set of locations of 1's in any column of M corresponding to a positive item be entirely contained in the set of locations of 1's in the result vector. Instead, we allow for a certain number of "mismatches"; this number of mismatches depends both on the number of 1's in each column and on the noise parameter q. This noisy COMP algorithm requires no more than 4.36(√δ + √(1+δ))² (1−2q)^(−2) d log n tests to achieve an error probability of at most n^(−δ).[10]

The definite defectives (DD) method is an extension of the COMP algorithm that attempts to remove any false positives. Performance guarantees for DD have been shown to strictly exceed those of COMP.[11]

The decoding step uses a useful property of the COMP algorithm: that every item that COMP declares non-defective is certainly non-defective (that is, there are no false negatives). It proceeds as follows:

We first run the COMP algorithm and remove any non-defectives that it detects. All remaining items are now "possible defectives".

Next the algorithm looks at all the positive tests. If an item appears as the only “possible defective” in a test, then it must be defective, so the algorithm declares it to be defective.

All other items are assumed to be non-defective. The justification for this last step comes from the assumption that the number of defectives is much smaller than the total number of items.

Note that steps 1 and 2 never make a mistake, so the algorithm can only make a mistake if it declares a defective item to be non-defective. Thus the DD algorithm can only create false negatives.
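The three DD steps can be sketched as follows (the decoder function and its name are our own):

```python
import numpy as np

def dd_decode(M, y):
    """Sketch of the Definite Defectives (DD) decoding steps:
    M is the t-by-n 0/1 test matrix, y the 0/1 outcome vector."""
    t, n = M.shape
    # Step 1 (COMP pass): anything appearing in a negative test is
    # non-defective; the rest are "possible defectives".
    possible = np.array([all(y[i] for i in range(t) if M[i, j])
                         for j in range(n)])
    # Step 2: an item that is the *only* possible defective in some
    # positive test must be defective.
    definite = np.zeros(n, dtype=bool)
    for i in range(t):
        if y[i]:
            in_test = np.where((M[i] == 1) & possible)[0]
            if len(in_test) == 1:
                definite[in_test[0]] = True
    # Step 3: all other items are assumed non-defective.
    return definite
```

Steps 1 and 2 are error-free in the noiseless model, so any mistake the decoder makes is a false negative introduced by step 3.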

SCOMP is an algorithm that makes use of the fact that DD makes no mistakes until the last step, where the remaining items are assumed to be non-defective. Let the set of declared defectives be K. We say that a positive test is explained by K if it contains at least one item in K. The key observation behind SCOMP is that the set of defectives found by DD may not explain every positive test, and that every unexplained test must contain a hidden defective.

The algorithm proceeds as follows:

First carry out steps 1 and 2 of the DD algorithm to obtain K{\displaystyle K}, an initial estimate for the set of defectives.

If K{\displaystyle K} explains every positive test, terminate the algorithm: K{\displaystyle K} is our final estimate for the set of defectives.

If there are any unexplained tests, find the “possible defective” that appears in the largest number of unexplained tests, and declare it to be defective (that is, add it to the set K{\displaystyle K}). Go to step 2.
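The steps above can be sketched as follows; this is a toy implementation for the noiseless model, with helper names of our own:

```python
import numpy as np

def scomp_decode(M, y):
    """Sketch of SCOMP: start from DD's definite defectives, then
    greedily add possible defectives until every positive test is
    explained. Returns the estimated defective set K."""
    t, n = M.shape
    # COMP pass: items appearing only in positive tests are "possible".
    possible = np.array([all(y[i] for i in range(t) if M[i, j])
                         for j in range(n)])
    # DD pass: unique possible defective in a positive test is definite.
    K = set()
    for i in range(t):
        if y[i]:
            in_test = np.where((M[i] == 1) & possible)[0]
            if len(in_test) == 1:
                K.add(int(in_test[0]))
    # Greedy loop: cover the unexplained positive tests.
    while True:
        unexplained = [i for i in range(t)
                       if y[i] and not any(M[i, j] for j in K)]
        if not unexplained:
            return K                     # K explains every positive test
        candidates = [j for j in range(n) if possible[j] and j not in K]
        # add the possible defective hitting the most unexplained tests
        best = max(candidates, key=lambda j: sum(M[i, j] for i in unexplained))
        K.add(best)
```

Each iteration of the loop explains at least one more positive test, so the greedy phase terminates after at most t additions.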

In simulations, SCOMP has been shown to perform close to optimally.[12]

Fix a valid group-testing scheme with t tests. For two distinct input vectors x and x′ with |x|, |x′| ≤ d, the resulting outcome vectors must differ, i.e. y(x) ≠ y(x′), where y(x) denotes the outcome vector for input x. If two valid inputs ever gave the same outcome, the scheme could not distinguish them, and we would always make an error on at least one of x and x′. The number of possible input vectors is the volume of a Hamming ball of radius d in {0,1}^n, written Vol_2(d,n), and each input must map to a distinct t-bit outcome vector, of which there are only 2^t. Hence 2^t ≥ Vol_2(d,n), and taking logs on both sides gives t ≥ log Vol_2(d,n).

Now Vol_2(d,n) ≥ C(n,d) ≥ (n/d)^d. Therefore, a minimum of d log(n/d) tests must be performed.

Thus we have proved that t^a(d,n) ≥ d log(n/d).

Since we know that the upper bound on the number of positives is d, we run a binary search at most d times, or until there are no more values to be found. To simplify the problem, we first give a testing scheme that uses O(log n) adaptive tests to find an i such that x_i = 1. This subproblem is solved by splitting [n] into two halves, querying to find a 1 in one of them, and then proceeding recursively to find the exact position within the half whose query returned a 1. This takes 2⌈log n⌉ tests, or ⌈log n⌉ + 1 if the first query is performed on the whole set. Once a 1 is found, the search is repeated after removing the i-th coordinate. This can happen at most d times, which justifies the running time of O(d log n). For a full proof and an algorithm for the problem refer to: CSE545 at the University at Buffalo
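The adaptive scheme described above can be sketched as follows (the membership check on the hidden vector `x` stands in for a pooled test):

```python
def adaptive_find_defectives(x, d):
    """Adaptive group testing by repeated binary search: find a
    position i with x_i = 1 in O(log n) tests, remove it, and repeat
    at most d times. `x` is the hidden 0/1 list; each `any(...)`
    over a slice simulates one pooled test."""
    alive = list(range(len(x)))       # coordinates not yet removed
    found = []
    for _ in range(d):
        if not any(x[i] for i in alive):     # one test on the whole set
            break                            # no defectives remain
        lo, hi = 0, len(alive)
        while hi - lo > 1:                   # binary search for one 1
            mid = (lo + hi) // 2
            if any(x[i] for i in alive[lo:mid]):
                hi = mid                     # a 1 lies in the left half
            else:
                lo = mid                     # so it must be in the right half
        found.append(alive.pop(lo))          # remove the i-th coordinate
    return sorted(found)
```

Each defective costs about ⌈log n⌉ + 1 tests (one on the whole set plus the search), giving the O(d log n) total stated above.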

t(1,n) ≤ ⌈log n⌉. This upper bound is for the special case d = 1, i.e. there is at most one positive. In this case the matrix multiplication simplifies: if the j-th column of M is the binary representation of j, then the resultant y is the binary representation of the index of the defective item. Decoding is trivial, because the binary representation gives the location directly, and this matches the lower bound of ⌈log n⌉. The group-test matrix here is just the parity-check matrix H_m of the [2^m − 1, 2^m − m − 1, 3] Hamming code.
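A small sketch of this construction for m = 3 (so n = 2^3 − 1 = 7 items and 3 tests):

```python
import numpy as np

n, m = 7, 3                     # n = 2^m - 1 items, m tests
# Column i (1-indexed) of M is the binary representation of i:
# this is the parity-check matrix of the [7, 4, 3] Hamming code.
M = np.array([[(i >> b) & 1 for i in range(1, n + 1)] for b in range(m)])

x = np.zeros(n, dtype=int)
x[4] = 1                        # the single defective is item 5 (1-indexed)
y = (M @ x > 0).astype(int)     # test outcomes

# Decoding is trivial: y reads off the defective's index in binary.
index = sum(bit << b for b, bit in enumerate(y))
print(index)   # 5
```

With at most one defective, at most one column of M contributes to y, so y simply equals that column, i.e. the binary representation of the defective's index.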

Thus as the upper and lower bounds are the same, we have a tight bound for t(d,n){\displaystyle t(d,n)} when d=1{\displaystyle d=1}. Such tight bounds are not known for general d{\displaystyle d}.

For non-adaptive group-testing upper bounds we shift focus to disjunct matrices, which have been used for many of the known bounds because of their nice properties. It has been shown that t(d,n) ≥ Ω((d²/log d) log n). For upper bounds we currently have t(d,n) ≤ O(d² log n), with a strongly explicit construction. So the smallest upper bound and the biggest lower bound are only off by a factor of log d, which is fairly small.

Separately, we can see that the current known lower bound for t(d,n) is already a factor of d/log d larger than the upper bound for t^a(d,n).

^ A bit more precisely: if there is an odd number of coins to be weighed, pick one to put aside and divide the rest into two equal piles. If the two piles have equal weight, the bad coin is the one put aside; otherwise the coin put aside was good, and the bad coin lies in the lighter pile.

Hwang, Frank K. (September 1972). "A Method for Detecting All Defective Members in a Population by Group Testing". Journal of the American Statistical Association. 67 (339): 605–608. doi:10.2307/2284447. JSTOR 2284447.