Select a Random Node from a Singly Linked List

Given a singly linked list, select a random node from linked list (the probability of picking a node should be 1/N if there are N nodes in list). You are given a random number generator.

Below is a Simple Solution
1) Count number of nodes by traversing the list.
2) Traverse the list again and select every node with probability 1/N. The selection can be done by generating a random number from 0 to N-i for i’th node, and selecting the i’th node node only if generated number is equal to 0 (or any other fixed number from 0 to N-i).

We get uniform probabilities with above schemes.

i = 1, probability of selecting first node = 1/N
i = 2, probability of selecting second node =
[probability that first node is not selected] *
[probability that second node is selected]
= ((N-1)/N)* 1/(N-1)
= 1/N

Similarly, probabilities of other selecting other nodes is 1/N

The above solution requires two traversals of linked list.

How to select a random node with only one traversal allowed?
The idea is to use Reservoir Sampling. Following are the steps. This is a simpler version of Reservoir Sampling as we need to select only one key instead of k keys.

(1) Initialize result as first node
result = head->key
(2) Initialize n = 2
(3) Now one by one consider all nodes from 2nd node onward.
(3.a) Generate a random number from 0 to n-1.
Let the generated random number is j.
(3.b) If j is equal to 0 (we could choose other fixed number
between 0 to n-1), then replace result with current node.
(3.c) n = n+1
(3.d) current = current->next

Note that the above program is based on outcome of a random function and may produce different output.

How does this work?
Let there be total N nodes in list. It is easier to understand from last node.

The probability that last node is result simply 1/N [For last or N’th node, we generate a random number between 0 to N-1 and make last node as result if the generated number is 0 (or any other fixed number]

The probability that second last node is result should also be 1/N.

The probability that the second last node is result
= [Probability that the second last node replaces result] X
[Probability that the last node doesn't replace the result]
= [1 / (N-1)] * [(N-1)/N]
= 1/N

Similarly we can show probability for 3rd last node and other nodes.

This article is contributed by Rajeev. Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above