Here's an interesting one. The goal is to rank questions by the probability that each will be asked in an interview.

Let's say we have a databank full of all kinds of questions. The subject matter of the questions is unknown. Let's also say that the list of questions for each interview is compiled from this central databank. Each interview can have a different list, but all questions must come from the databank. Questions may be reused, and the goal is to gauge the relative probability that a given question will be used in the next interview.

Here's what we can track:

1. How often a question has been used in prior interviews. Let's call it "F" for frequency.
2. The time when each question was used, which we can refer to as "t." Individual points in time will be written as t0, t1, t2, and so on.
3. The time when the question was added to the databank, which we call "tA."
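To make the tracked data concrete, here is a minimal sketch of how one question's record might be stored. The class name `QuestionRecord`, its field names, and the choice to measure time as an interview index are all my own assumptions for illustration, not part of the problem statement.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionRecord:
    """Hypothetical record for one databank question."""
    text: str
    added_at: int  # tA: assumed to be the interview index when the question was added
    used_at: list[int] = field(default_factory=list)  # t0, t1, t2, ...: index of each use

    @property
    def frequency(self) -> int:
        """F: how many times the question has been used so far."""
        return len(self.used_at)


q = QuestionRecord("What is your degree in?", added_at=0, used_at=[205, 210, 215])
print(q.frequency)
```

Note that F falls out of the list of use times, so only "t" and "tA" really need to be stored.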

The rules:

Probabilistically, we know that the more often a question has been used, the more likely it is to be used in the next interview (since it gets asked often).
We also know that the more recently a question was added to the databank, the more likely it is that someone intends to ask it during the next interview (whoever added it presumably did so with the intent to use it in the very near future).
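One way these two rules could be combined into a single score is sketched below. This is only an assumption about the shape of a solution: the additive form, the exponential decay, and the parameters `half_life` and `weight_new` are all hypothetical choices, and the result is a ranking score rather than a calibrated probability.

```python
import math


def score(uses: int, added_at: int, now: int,
          half_life: float = 50.0, weight_new: float = 1.0) -> float:
    """Ranking score from the two rules; higher means more likely to be asked.

    uses      -- F, the number of prior uses
    added_at  -- tA, as an interview index (assumption)
    now       -- index of the upcoming interview
    half_life -- assumed number of interviews for the "newly added" bonus to halve
    """
    age = now - added_at
    # Rule 1: questions used more often score higher (uses per interview since added).
    usage_rate = uses / max(age, 1)
    # Rule 2: recently added questions get a bonus that decays with age.
    novelty = weight_new * math.exp(-math.log(2) * age / half_life)
    return usage_rate + novelty


# A question added 10 interviews ago outranks an equally unused one added 200 ago.
print(score(0, added_at=290, now=300) > score(0, added_at=100, now=300))
```

The relative weight of the two terms would have to be tuned against real interview history.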

A further consideration, involving the time intervals between uses of a given question, is best shown by example:

From t0 (a point in time) to t100 (another point in time), we can measure the mean time interval between uses of the question. We can plot a curve with this data. Let's say it's like this example:

"Who's your daddy?" is a question. There was an early period (t0 to t100) when the question was asked frequently, say on average every 5th interview. Then, from t100 to t200, it was not asked as often, say every 20th interview on average. Then, from t200 to t300, it was used more frequently again, say every 10th interview.

Now, let's have a second question, "What is your degree in?" While it has been in the databank a long time, it wasn't used much. However, from t200 to t300, it was used on average in every 5th interview. Let's say the overall use count of this question (F) is not as high as that of the previous question. For example, the previous question might have been asked 1,000 times, while this question was asked only 100 times. However, the uptick during the most recent interval, t200 to t300, might make this question more likely to be used than the previous one.
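To illustrate how a recent uptick could outweigh a higher lifetime count, here is a rough sketch that weights the usage rate in each time window by how recent the window is. The function names, the window size of 100, the `decay` factor of 0.5, and the exact use times are assumptions chosen to mirror the rates in the example, not the 1,000 and 100 totals.

```python
def window_rates(used_at, window, now):
    """Uses per interview in each window, most recent window first."""
    rates = []
    for k in range(now // window):
        hi = now - k * window
        lo = hi - window
        count = sum(1 for t in used_at if lo <= t < hi)
        rates.append(count / window)
    return rates


def recency_weighted_rate(used_at, window=100, now=300, decay=0.5):
    """Average of the per-window rates, with each older window weighted by `decay`."""
    rates = window_rates(used_at, window, now)
    weights = [decay ** k for k in range(len(rates))]
    return sum(w * r for w, r in zip(weights, rates)) / sum(weights)


# Use times built from the example: every 5th, then every 20th, then every 10th interview.
daddy = list(range(0, 100, 5)) + list(range(100, 200, 20)) + list(range(200, 300, 10))
# Unused at first (a simplification), then every 5th interview from t200 to t300.
degree = list(range(200, 300, 5))

history = {"Who's your daddy?": daddy, "What is your degree in?": degree}
ranking = sorted(history, key=lambda name: recency_weighted_rate(history[name]), reverse=True)
print(ranking)
```

With these assumed numbers, "What is your degree in?" ranks first even though it has fewer total uses (20 versus 35). A smaller `decay` makes the ranking react faster to recent changes, and a value of 1 reduces it to the plain lifetime rate.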

Given the available information, can a formula be derived that expresses the probability that an interviewee will be asked each question in the databank, so that the questions can be ranked from most to least likely to be asked?

Also, we know the total number of questions in the databank and the number of previous interviews that were held. I think this might have been apparent from the way the problem was presented, but I want to make it explicit.