Advertisements

In article <>,
G G <> wrote:
> This program is an example from Deitel and Deitel, "C How to Program"
> My question is about the logic in the shuffle function.
>
> j = rand() % 52;
>
> What keeps j from having the same value more than once?
>

Nothing. But it doesn't matter if it does or not. The cards have already
been assigned, so you won't have any duplicate cards. j is only the
position in the deck that you will be switching with the card in
position i. It's not an optimal shuffling routing, but without using a
better PRNG than rand(), it won't make any difference.

Advertisements

G G <> writes:
> This program is an example from Deitel and Deitel, "C How to Program"
> My question is about the logic in the shuffle function.
>
> j = rand() % 52;
>
> What keeps j from having the same value more than once?

Nothing, if rand() is a good pseudo-random generator.

If, by some cleverness, you could be sure that you could evaluate that
expression 52 times with no duplicates, then you would not really need a
shuffling algorithm -- just put cards 0, 1, 2 ... 51 in the positions
return by the magic expression.

That is in effect what a shuffling algorithm must do. It must take some
sequence of uniform random numbers and produce the numbers 0 to 51 in
some order, with every order being equally likely.

The shuffling algorithm you show is not very good at doing that. It's
main problem is that it is not uniform -- not all permutations of the
deck are equally probable.

On 9/19/2013 7:27 PM, G G wrote:
> This program is an example from Deitel and Deitel, "C How to Program"
> My question is about the logic in the shuffle function.
>
> j = rand() % 52;
>
> What keeps j from having the same value more than once?

Mark Storkamp <> writes:
> In article <>,
> G G <> wrote:
>
>> This program is an example from Deitel and Deitel, "C How to Program"
>> My question is about the logic in the shuffle function.
>>
>> j = rand() % 52;
>>
>> What keeps j from having the same value more than once?
>>
>
> Nothing. But it doesn't matter if it does or not. The cards have already
> been assigned, so you won't have any duplicate cards. j is only the
> position in the deck that you will be switching with the card in
> position i. It's not an optimal shuffling routing, but without using a
> better PRNG than rand(), it won't make any difference.

Do you mean that the quality of the shuffle routine won't matter if rand
is bad? I don't see that. Even if rand() is bad, why introduce more
bias with a non-uniform shuffle?

Mark wrote:
>Nothing. But it doesn't matter if it does or not. The cards have already
>been assigned, so you won't have any duplicate cards. j is only the
>position in the deck that you will be switching with the card in
>position i. It's not an optimal shuffling routing, but without using a
>better PRNG than rand(), it won't make any difference.

Ok, so what may happen is that as i goes to 51, there may be some cards
that are switch back to there original position?

Mark Storkamp wrote:
) In article <>,
) G G <> wrote:
)
)> This program is an example from Deitel and Deitel, "C How to Program"
)> My question is about the logic in the shuffle function.
)>
)> j = rand() % 52;
)>
)> What keeps j from having the same value more than once?
)>
)
) Nothing. But it doesn't matter if it does or not. The cards have already
) been assigned, so you won't have any duplicate cards. j is only the
) position in the deck that you will be switching with the card in
) position i. It's not an optimal shuffling routing, but without using a
) better PRNG than rand(), it won't make any difference.

Yes it will, a huge difference even. I once made a histogram for a
couple thousand runs of a 10-card shuffle, and there were clear outliers.

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

On 2013-09-20, G G <> wrote:
> Ok, so what may happen is that as i goes to 51, there may be some cards
> that are switch back to there original position?

That is certainly possible.

If it weren't, that would be a pretty bad shuffle...

-s
--
Copyright 2013, all wrongs reversed. Peter Seebach / http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
Autism Speaks does not speak for me. http://autisticadvocacy.org/
I am not speaking for my employer, although they do rent some of my opinions.

Seebs wrote:
) On 2013-09-20, G G <> wrote:
)> Ok, so what may happen is that as i goes to 51, there may be some cards
)> that are switch back to there original position?
)
) That is certainly possible.
)
) If it weren't, that would be a pretty bad shuffle...

No, it wouldn't. Of course, a card should have a chance to remain in its
original position (a 1/52 chance), but there is no reason why a shuffling
algorithm should switch a card to another position, and then back to its
original position.

For shits and giggles, you should make a program that uses the OP's
algorithm to shuffle a deck, and then runs that a large number of times
to build up a 52x52 matrix where each cell holds the probability of
card X ending up in position Y.

Or for bonus points, do it quantummechanically. I.E., after each step of
the algorithm, record the probability that each card is in each position.
You can use the 52x52 matrix for that.

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

In article
<>,
Ben Bacarisse <> wrote:
> Mark Storkamp <> writes:
>
> > In article <>,
> > G G <> wrote:
> >
> >> This program is an example from Deitel and Deitel, "C How to Program"
> >> My question is about the logic in the shuffle function.
> >>
> >> j = rand() % 52;
> >>
> >> What keeps j from having the same value more than once?
> >>
> >
> > Nothing. But it doesn't matter if it does or not. The cards have already
> > been assigned, so you won't have any duplicate cards. j is only the
> > position in the deck that you will be switching with the card in
> > position i. It's not an optimal shuffling routing, but without using a
> > better PRNG than rand(), it won't make any difference.
>
> Do you mean that the quality of the shuffle routine won't matter if rand
> is bad? I don't see that. Even if rand() is bad, why introduce more
> bias with a non-uniform shuffle?

As I understand the problem, you are looking at either 52**52
combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
0, there is some pigeon holing, and my guess is that any such error is
swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
(2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
shuffles anyway.

Mark Storkamp <> writes:
> In article
> <>,
> Ben Bacarisse <> wrote:
>
>> Mark Storkamp <> writes:
>>
>> > In article <>,
>> > G G <> wrote:
>> >
>> >> This program is an example from Deitel and Deitel, "C How to Program"
>> >> My question is about the logic in the shuffle function.
>> >>
>> >> j = rand() % 52;
>> >>
>> >> What keeps j from having the same value more than once?
>> >>
>> >
>> > Nothing. But it doesn't matter if it does or not. The cards have already
>> > been assigned, so you won't have any duplicate cards. j is only the
>> > position in the deck that you will be switching with the card in
>> > position i. It's not an optimal shuffling routing, but without using a
>> > better PRNG than rand(), it won't make any difference.
>>
>> Do you mean that the quality of the shuffle routine won't matter if rand
>> is bad? I don't see that. Even if rand() is bad, why introduce more
>> bias with a non-uniform shuffle?
>
> As I understand the problem, you are looking at either 52**52
> combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
> 0, there is some pigeon holing, and my guess is that any such error is
> swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
> (2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
> shuffles anyway.

My point is that it seems unwise to say that since rand is, by
specification, unable to give you every sequence, you should not care
about adding yet more bias. It may be that the one form of bias is more
or less significant to whatever the program is ultimately doing, so I
don't think you can be casual about one just because of the other.

In article
<>,
Ben Bacarisse <> wrote:
> Mark Storkamp <> writes:
>
> > In article
> > <>,
> > Ben Bacarisse <> wrote:
> >
> >> Mark Storkamp <> writes:
> >>
> >> > In article <>,
> >> > G G <> wrote:
> >> >
> >> >> This program is an example from Deitel and Deitel, "C How to Program"
> >> >> My question is about the logic in the shuffle function.
> >> >>
> >> >> j = rand() % 52;
> >> >>
> >> >> What keeps j from having the same value more than once?
> >> >>
> >> >
> >> > Nothing. But it doesn't matter if it does or not. The cards have already
> >> > been assigned, so you won't have any duplicate cards. j is only the
> >> > position in the deck that you will be switching with the card in
> >> > position i. It's not an optimal shuffling routing, but without using a
> >> > better PRNG than rand(), it won't make any difference.
> >>
> >> Do you mean that the quality of the shuffle routine won't matter if rand
> >> is bad? I don't see that. Even if rand() is bad, why introduce more
> >> bias with a non-uniform shuffle?
> >
> > As I understand the problem, you are looking at either 52**52
> > combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
> > 0, there is some pigeon holing, and my guess is that any such error is
> > swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
> > (2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
> > shuffles anyway.
>
> My point is that it seems unwise to say that since rand is, by
> specification, unable to give you every sequence, you should not care
> about adding yet more bias. It may be that the one form of bias is more
> or less significant to whatever the program is ultimately doing, so I
> don't think you can be casual about one just because of the other.

True, but I wouldn't want to try to explain, through USENET, the details
to someone just learning to program. IMO, better to let them know of the
potential problem, and they can research it on their own, or ask for
clarification, if they wish. And even though I haven't tried to prove
it, I still believe that rand() is the greater source of bias. So it's
being casual based on intuitions. It might be an interesting exercise to
find out for sure. I've got a PRNG with enough states to get a proper
shuffle, but I'm not sure how long I'd have to run it before noticing an
effect from the two shuffle routines.

You're missing the point. As Willem points out,
the shuffle routine is simply defective. You don't
need simulations to see this; you can count cases for
small decksizes. There are 4^4 cases for a 4-deck and
one will see
0th goes to 0'th 64 times
1st goes to 0'th 75 times
2nd goes to 0'th 63 times
3rd goes to 0'th 54 times
This bias will dwarf any flaw in your PRNG.

Mark Storkamp <> writes:
> In article
> <>,
> Ben Bacarisse <> wrote:
>
>> Mark Storkamp <> writes:
<snip>
>> > As I understand the problem, you are looking at either 52**52
>> > combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
>> > 0, there is some pigeon holing, and my guess is that any such error is
>> > swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
>> > (2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
>> > shuffles anyway.
>>
>> My point is that it seems unwise to say that since rand is, by
>> specification, unable to give you every sequence, you should not care
>> about adding yet more bias. It may be that the one form of bias is more
>> or less significant to whatever the program is ultimately doing, so I
>> don't think you can be casual about one just because of the other.
>
> True, but I wouldn't want to try to explain, through USENET, the details
> to someone just learning to program. IMO, better to let them know of the
> potential problem, and they can research it on their own, or ask for
> clarification, if they wish. And even though I haven't tried to prove
> it, I still believe that rand() is the greater source of bias.

I don't think that measuring the bias and deciding how much is due to
what is necessarily very useful. As I say below, the exact *kind* may
be much more significant.
> So it's
> being casual based on intuitions. It might be an interesting exercise to
> find out for sure. I've got a PRNG with enough states to get a proper
> shuffle, but I'm not sure how long I'd have to run it before noticing an
> effect from the two shuffle routines.

No, I think it should be easy. The killer is that the kind of bias
introduced by the "simple shuffle" algorithm is easy to analyse. For
example, the relative probability of the last card ending up in the
first position is 2 * (n-1)^(n-1) * n^(1-n) for an n-card deck. That's
0.742913... (rather than 1) when n is 52. That's an edge I'd be
prepared to bet on in some circumstances!

The bias introduced by a limited-state rand is going to be huge (there
will be many impossible sequences, for example) but I would not want to
try to work out any particular exploitable property given the PRNG and
shuffle algorithm. (Actually, I'd have a go if I knew that the rand()
was *really* awful and had say, say, an alternating low bit, but it's
still going to be quote hard and, anyway, are we even considering that
kind of bad?)

Of course, your original point was about which is worse when the two are
combined -- a bad rand *and* the simple shuffle -- and that's hard to
predict, but I'd guess that the order bias would still be detectable. I
might try it and see (there's lots of bad rands out there!) if I have
time.

If there is a big picture here, it's that the amount of bias (from one
source or another) is often far less important that how that bias
interacts with the overall task.

Mark Storkamp <> writes:
> In article
> <>,
> Ben Bacarisse <> wrote:
>
>> Mark Storkamp <> writes:
>>
>> > In article <>,
>> > G G <> wrote:
>> >
>> >> This program is an example from Deitel and Deitel, "C How to Program"
>> >> My question is about the logic in the shuffle function.
>> >>
>> >> j = rand() % 52;
>> >>
>> >> What keeps j from having the same value more than once?
>> >>
>> >
>> > Nothing. But it doesn't matter if it does or not. The cards have already
>> > been assigned, so you won't have any duplicate cards. j is only the
>> > position in the deck that you will be switching with the card in
>> > position i. It's not an optimal shuffling routing, but without using a
>> > better PRNG than rand(), it won't make any difference.
>>
>> Do you mean that the quality of the shuffle routine won't matter if rand
>> is bad? I don't see that. Even if rand() is bad, why introduce more
>> bias with a non-uniform shuffle?
>
> As I understand the problem, you are looking at either 52**52
> combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
> 0, there is some pigeon holing, and my guess is that any such error is
> swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
> (2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
> shuffles anyway.

Your guess is wrong. If you try actually playing around with
various RNG implementations I expect you'll find the observed
behaviors contradict your intuition in this case. Even the
notoriously bad RANDU(), with only 30 bits of state, does a fair
job of producing good 52-element shuffles using an "ideal"
shuffle algorithm, but miserable results using the "bad" shuffle
algorithm. It's not hard to get tolerable results out of RNG's
having as little as 20 bits of internal state (I didn't try
anything smaller) using an unbiased shuffle, but basically
impossible using the "just swap each element once to anywhere"
algorithm.

Tim Rentsch <> writes:
> Mark Storkamp <> writes:
[...]
>> As I understand the problem, you are looking at either 52**52
>> combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
>> 0, there is some pigeon holing, and my guess is that any such error is
>> swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
>> (2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
>> shuffles anyway.
>
> Your guess is wrong. If you try actually playing around with
> various RNG implementations I expect you'll find the observed
> behaviors contradict your intuition in this case. Even the
> notoriously bad RANDU(), with only 30 bits of state, does a fair
> job of producing good 52-element shuffles using an "ideal"
> shuffle algorithm, but miserable results using the "bad" shuffle
> algorithm. It's not hard to get tolerable results out of RNG's
> having as little as 20 bits of internal state (I didn't try
> anything smaller) using an unbiased shuffle, but basically
> impossible using the "just swap each element once to anywhere"
> algorithm.

Hmm. I think it depends on what you mean by a good shuffle.

Suppose you have an algorithm that produces a perfect shuffle given
an ordered deck of cards and a sequence of truly random numbers,
where I define "perfect shuffle" to mean that all 52! (about 8.1e67)
possible orders are equally probable.

Using that algorithm with a PRNG that has only 30 bits of state
can generate no more than 2**30 distinct shuffles, which is smaller
than the set of all possible orderings by a factor of about 7.5e58.

Perhaps those 2**30 shuffles will be satisfactory in some sense,
but they're only a tiny fraction of all possible orderings.
For example, it will very probably *never* produce an ordering
that's the exact reverse of the input (or, much less likely, it
will do so far more often than it should); a perfect shuffle would
produce such an ordering one out of 52! times.

Depending on the application, this may or may not matter.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

On Sunday, 22 September 2013 23:03:26 UTC+1, Keith Thompson wrote:
> Tim Rentsch <> writes:
>
> > Mark Storkamp <> writes:
>
> [...]
>
> >> As I understand the problem, you are looking at either 52**52
>
> >> combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
>
> >> 0, there is some pigeon holing, and my guess is that any such error is
>
> >> swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
>
> >> (2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
>
> >> shuffles anyway.
>
> >
>
> > Your guess is wrong. If you try actually playing around with
>
> > various RNG implementations I expect you'll find the observed
>
> > behaviors contradict your intuition in this case. Even the
>
> > notoriously bad RANDU(), with only 30 bits of state, does a fair
>
> > job of producing good 52-element shuffles using an "ideal"
>
> > shuffle algorithm, but miserable results using the "bad" shuffle
>
> > algorithm. It's not hard to get tolerable results out of RNG's
>
> > having as little as 20 bits of internal state (I didn't try
>
> > anything smaller) using an unbiased shuffle, but basically
>
> > impossible using the "just swap each element once to anywhere"
>
> > algorithm.
>
>
>
> Hmm. I think it depends on what you mean by a good shuffle.
>
>
>
> Suppose you have an algorithm that produces a perfect shuffle given
>
> an ordered deck of cards and a sequence of truly random numbers,
>
> where I define "perfect shuffle" to mean that all 52! (about 8.1e67)
>
> possible orders are equally probable.
>
>
>
> Using that algorithm with a PRNG that has only 30 bits of state
>
> can generate no more than 2**30 distinct shuffles, which is smaller
>
> than the set of all possible orderings by a factor of about 7.5e58.
>
>
>
> Perhaps those 2**30 shuffles will be satisfactory in some sense,
>
> but they're only a tiny fraction of all possible orderings.
>
> For example, it will very probably *never* produce an ordering
>
> that's the exact reverse of the input (or, much less likely, it
>
> will do so far more often than it should); a perfect shuffle would
>
> produce such an ordering one out of 52! times.
>
>
>
> Depending on the application, this may or may not matter.
>

I remember reading an article on just this point recently (though forget where!)

If your random number generator has only a 32-bit seed, then the number of possible shuffles is very low compared to the actual possibilities, such that it is theoretically possible to deduce the the whole pack after seeing only the first six or seven cards. Clearly this would be unsuitable for a gambling application. Indeed, as the most normal way of getting a seed involves reading the time, the number of shuffles becomes relatively tiny. The article considered the case where a gambling application rashly published theprogram that did the shuffle. If the game is played once, it is possible (after about two days of number-crunching) to establish what seed was used, and thus to see what the time is on the shuffling computer and how much it differs from yours. So next time, if you know the time of the shuffle say to within 10 seconds, there are only about 10000 possible shuffles, and yourcomputer can run through them in the time allowed for playing the game.

On 9/22/2013 6:03 PM, Keith Thompson wrote:
> [...]
> Hmm. I think it depends on what you mean by a good shuffle.
>
> Suppose you have an algorithm that produces a perfect shuffle given
> an ordered deck of cards and a sequence of truly random numbers,
> where I define "perfect shuffle" to mean that all 52! (about 8.1e67)
> possible orders are equally probable.
>
> Using that algorithm with a PRNG that has only 30 bits of state
> can generate no more than 2**30 distinct shuffles, which is smaller
> than the set of all possible orderings by a factor of about 7.5e58.
>
> Perhaps those 2**30 shuffles will be satisfactory in some sense,
> but they're only a tiny fraction of all possible orderings.
> For example, it will very probably *never* produce an ordering
> that's the exact reverse of the input (or, much less likely, it
> will do so far more often than it should); a perfect shuffle would
> produce such an ordering one out of 52! times.

<topicality level="low" better_suited_for="comp.sci.math">

Consider an experiment in which you are confronted with two
shuffle routines A and B. One of them (you don't know which) uses
the well-known "fair" shuffle, the other uses the faulty shuffle
from the O.P.'s code. Both use PRNG's of identical quality.

Now, the question: How many shuffles must you perform with
each routine to have a better-than-even chance of identifying
which is which?

Follow-up question: How many shuffles ... to make the
identification with X% confidence, instead of the 50% of the
original question?

Follow-up follow-up question: If you could perform one
shuffle per nanosecond (with whichever routine you select for
that particular trial), what confidence level could you have
attained if you'd been shuffling continuously since the Big
Bang, which in this context we might call the Big Deal? (I
estimate you could do not quite 5e26 shuffles in that time;
would that be enough to discern which routine achieved the
fairer distribution over 8e67 possibilities?)

</topicality>
> Depending on the application, this may or may not matter.

Keith Thompson <> writes:
> Tim Rentsch <> writes:
>> Mark Storkamp <> writes:
> [...]
>>> As I understand the problem, you are looking at either 52**52
>>> combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
>>> 0, there is some pigeon holing, and my guess is that any such error is
>>> swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
>>> (2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
>>> shuffles anyway.
>>
>> Your guess is wrong. If you try actually playing around with
>> various RNG implementations I expect you'll find the observed
>> behaviors contradict your intuition in this case. Even the
>> notoriously bad RANDU(), with only 30 bits of state, does a fair
>> job of producing good 52-element shuffles using an "ideal"
>> shuffle algorithm, but miserable results using the "bad" shuffle
>> algorithm. It's not hard to get tolerable results out of RNG's
>> having as little as 20 bits of internal state (I didn't try
>> anything smaller) using an unbiased shuffle, but basically
>> impossible using the "just swap each element once to anywhere"
>> algorithm.
>
> Hmm. I think it depends on what you mean by a good shuffle.
> [snip elaboration]

True, it does. Here's a thought experiment.

Compare a PRNG-based shuffle against a truly random shuffle (eg,
based on the outcome of quantum mechanical events). Do this by
perfoming a million iterations of shuffling followed by dealing
out seven five-card poker hands, tabulating results for each of
the two methods. Compute some set of statistical measures, eg,
how many of each kind of hand, how often one pair is beaten by
two pair, etc. If the PRNG-based shuffle is less than, say,
one standard deviation away from the truly random shuffle, then
I think it's reasonable to say the PRNG has done a fair job of
producing a good shuffle (all the foregoing made with the
understanding that the PRNG-based shuffle is done using one of
the standard unbiased algorithms).

As the number of iterations increases, the truly random shuffle
will get closer and closer to an ideal distribution, whereas
a PRNG-based shuffle will not have this property if its internal
state is substantially less than, eg, 52! in this case.

However, if all we're interested in is a few tens of millions
of shuffles, there's no reason a PRNG with only 30 bits of
internal state can't do a reasonable job of approximating
a truly random shuffle.

Note that a million shuffles is more than enough to clearly
differentiate an unbiased shuffle algorithm and the "just swap
each element once to anywhere" algorithm, using the simple
measure of which card ends up in which position. The results
are quite stark, and not at all hard to reproduce if one is
interested in doing so.

Share This Page

Welcome to The Coding Forums!

Welcome to the Coding Forums, the place to chat about anything related to programming and coding languages.

Please join our friendly community by clicking the button below - it only takes a few seconds and is totally free. You'll be able to ask questions about coding or chat with the community and help others.
Sign up now!