Sometimes you may need to randomly select items from a list so that some items are selected more frequently than others. For example, you might take a list of applications and their download counts, and randomly pick a “featured application” based on the number of downloads.

There are several ways to accomplish this in PHP. In this post I’ll show you two approaches to weighted random selection – one suited for a small list of possible choices, and another optimized for a larger number of items.

Simple Weighted Random Selection

Here’s one simple and very common algorithm:

1. Pick a random number between zero and the sum of all weights.

2. Scan down the list of choices, adding each element’s weight to a counter.

3. Check if the counter is above or equal to the picked random number. If yes, return the current element. Otherwise go back to step 2.

This algorithm is easy to implement and pretty fast when the number of possible choices is small, or when you only need to do the selection once. Below is a function that takes an array of possible choices and a matching array of weights, and returns a randomly selected element from the first array. You can use any positive integer as a weight.
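The original listing is not reproduced here, so what follows is only a minimal sketch of the steps described above, not the author's exact code. The function name weighted_random_simple and the sample data are assumptions made for illustration. The sketch also draws the random number from 1 to the total weight rather than from zero, so that a weight of w corresponds to exactly w possible outcomes. It assumes both arrays are non-empty, use the same keys, and contain only positive integer weights.

```php
<?php
/**
 * Pick one element of $values at random, with probability proportional
 * to the matching entry in $weights (positive integers).
 * Hypothetical sketch; the name and signature are assumptions.
 */
function weighted_random_simple(array $values, array $weights)
{
    // Step 1: pick a random number up to the sum of all weights.
    // Drawing from 1..$total (an assumption of this sketch) gives each
    // element exactly as many winning numbers as its weight.
    $total = array_sum($weights);
    $num = mt_rand(1, $total);

    // Steps 2 and 3: accumulate weights until the counter reaches $num.
    $counter = 0;
    foreach ($values as $i => $value) {
        $counter += $weights[$i];
        if ($counter >= $num) {
            return $value;
        }
    }

    // Not reached when all weights are positive integers.
    return null;
}

// Example: pick a "featured application" by download count (made-up data).
$apps = ['Editor', 'Browser', 'Game'];
$downloads = [10, 30, 60];
echo weighted_random_simple($apps, $downloads), "\n";
```

With these sample weights, 'Game' should come up in roughly 60% of calls and 'Editor' in roughly 10%.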

Randomly Choosing From Thousands Of Elements

The above algorithm can become very slow when the list of choices is large and you need to do several selections. This is because every selection recalculates the total weight and then scans the list linearly, so the cost of each pick grows with the number of choices.

However, the algorithm can be extended to make it significantly faster. Instead of calculating the total weight (step #1) and the counter (step #2) every time, let’s do it only once and store the running counter values in an array. Because the weights are positive, that array is already sorted in ascending order, so we’ll be able to use binary search to quickly select the right element.

The above script also contains two new utility functions – calc_lookups, which calculates the lookup data, and binary_search, which is used to find a randomly picked number in the lookup array. Use the new functions like this:
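The original listings are not reproduced here. The sketch below is one possible reconstruction of the two helpers named above, followed by a usage example. The return shape of calc_lookups follows the call quoted in the comments further down; the wrapper name weighted_random, the type hints (which assume PHP 7.1 or later), the sample data, and the choice to draw from 1 to the total weight are all assumptions of this sketch.

```php
<?php
/**
 * Build the cumulative-weight table once.
 * Returns [$lookup, $total_weight], where $lookup[$i] is the sum of
 * $weights[0] through $weights[$i]. $values is unused here and is kept
 * only to match the call signature quoted in the comments.
 */
function calc_lookups(array $values, array $weights): array
{
    $lookup = [];
    $total_weight = 0;
    $count = count($weights);
    for ($i = 0; $i < $count; $i++) {
        $total_weight += $weights[$i];
        $lookup[$i] = $total_weight;
    }
    return [$lookup, $total_weight];
}

/**
 * Return the index of the first entry in $haystack that is >= $needle.
 * Assumes $haystack is sorted ascending and $needle does not exceed
 * its last entry.
 */
function binary_search(int $needle, array $haystack): int
{
    $low = 0;
    $high = count($haystack) - 1;
    while ($low < $high) {
        $mid = intdiv($low + $high, 2);
        if ($haystack[$mid] < $needle) {
            $low = $mid + 1;   // answer lies to the right of $mid
        } else {
            $high = $mid;      // $mid itself may be the answer
        }
    }
    return $low;
}

/**
 * Hypothetical wrapper: pass precomputed lookup data to avoid
 * rebuilding the table on every call.
 */
function weighted_random(array $values, array $weights, ?array $lookup = null, ?int $total_weight = null)
{
    if ($lookup === null || $total_weight === null) {
        list($lookup, $total_weight) = calc_lookups($values, $weights);
    }
    $r = mt_rand(1, $total_weight);
    return $values[binary_search($r, $lookup)];
}

// Usage: compute the lookup data once, then select as often as needed.
$values = ['Editor', 'Browser', 'Game'];   // made-up sample data
$weights = [10, 30, 60];
list($lookup, $total_weight) = calc_lookups($values, $weights);
for ($k = 0; $k < 5; $k++) {
    echo weighted_random($values, $weights, $lookup, $total_weight), "\n";
}
```

Each selection now costs one random number plus a binary search over the lookup array, instead of a linear scan of all the weights.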

In Conclusion

To give you an idea of how fast these two algorithms are: I used each one to select a random entry from a list of 10 000 possibilities, 10 000 times in a row. The first algorithm took 13 seconds in total. The improved algorithm took only 0.09 seconds.

Of course, this is not the limit. You can find some interesting hints about even faster algorithms here (Python).

10 Responses to “Fast Weighted Random Choice In PHP”

With the calc_lookups function, you can remove the count($weights) from the parameters of the for loop to speed up execution:

$cweights = count($weights);
for ($i=0; $i<$cweights; $i++) {

instead of:

for ($i=0; $i<count($weights); $i++) {

that way the 'count' function isn't called each iteration of the loop.

You could also do small things like remove function parameters from the main function and just force a call to list($lookup, $total_weight) = calc_lookups($values, $weights);, or use incrementation like ++$i; instead of $i++, which executes faster.

@Nitin: Compiler writers are good at making such optimizations, like assigning values only once if the variable is independent of the loop, and many more. We just need to write code that looks good to us. Even then, with your solution, it would be more to write as