Hi. For clarity, are you simply trying to use a computer to generate secure random numbers in the range 0 to 10,000? Or is the incorporation of seeds and nonces an integral requirement? Surely rolls should be independent of the clients, no?
– Paul Uszak, Aug 27 '18 at 23:31

@PaulUszak These schemes are built for clients that do not completely trust the server. This way a random value can be calculated that depends on both client and server. It doesn't matter if either one cannot be trusted because the result depends on both. Well, kind of anyway; the server can always change the source of the page, of course.
– Maarten Bodewes♦, Aug 27 '18 at 23:35

@PaulUszak Trust is layered as well. This can help people trust the server. But besides not gambling, you won't see me enter my credit card. But given the track record of the Dutch "Staatsloterij" (state lottery), I'd rather go to some shady server that uses this kind of scheme.
– Maarten Bodewes♦, Aug 28 '18 at 0:05

No, it is not a typo. Random.Next(1, x) / (x / 10) will always produce something between 1 and 10 (common sense); this is the same logic. This assumes Random.Next(inclusiveLowerBound, inclusiveUpperBound).
– Nikolai Frolov, Aug 28 '18 at 0:10

2 Answers

$7295$ values each occur with probability exactly $429497$ in $2^{32}$.

$2705$ values each occur with probability exactly $429496$ in $2^{32}$.

One value, namely $10000$, occurs with probability exactly $1$ in $2^{32}$.
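These counts can be reproduced with exact integer arithmetic. The sketch below assumes the roll is computed as $\lfloor 10000x / (2^{32}-1) \rfloor$ for a 32-bit input $x$; that formula is inferred from the figures above, not taken from the question, and the names `M`, `ROLLS` and `count` are illustrative only.

```python
from collections import Counter

M = 2**32 - 1      # largest 32-bit input (assumed divisor, see lead-in)
ROLLS = 10000      # largest roll value


def ceil_div(a, b):
    """Ceiling of a / b using integers only."""
    return -(-a // b)


def count(k):
    """Number of 32-bit inputs x with floor(x * ROLLS / M) == k."""
    lo = ceil_div(k * M, ROLLS)
    hi = min(M, ceil_div((k + 1) * M, ROLLS) - 1)
    return hi - lo + 1


# Tally how many roll values share each bucket size.
tally = Counter(count(k) for k in range(ROLLS + 1))
print(tally[429497])  # 7295
print(tally[429496])  # 2705
print(count(ROLLS))   # 1
```

Working in integers avoids the floating-point rounding that could otherwise shift a boundary by one.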

The mean value of the distribution is approximately $4999.5$. It is approximately a uniform distribution over the range $[0, 9999]$. It is much more uniform than a real-world fair die roll's distribution, but the bias is still huge by cryptographic standards.

It's not a bias large enough to discourage gamblers. Cryptographers aren't going to be inclined to gamble, even at fairer odds, if they judge only by the odds and payouts. (A lot like the story/rumor about physicists' conferences being unwelcome in Vegas. However, that's not to say mathematically minded people can't still be irrational or find other reasons for playing.)

Ignoring the value $10000$, each of the $7295$ over-represented values deviates from its expected probability by a factor of $1.0000006298$ and each of the other $2705$ values deviates by a factor of $0.9999983015$.

The multiplication method of transforming one uniform distribution into another uniform distribution over a smaller range (using exact math; floating-point arithmetic is not necessarily exact) has the same amount of bias as the naive division-remainder method (without rejection sampling) that people warn programmers not to use. The bias of the multiplication method, however, is more subtle because the over-represented elements are not all the low elements.
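A small comparison makes the difference visible. The sketch below assumes the input is uniform over all $2^{32}$ values and maps it to $[0, 9999]$ either by remainder (`x % 10000`) or by multiplication (`x * 10000 // 2**32`); both formulas and all names are illustrative assumptions, not the protocol from the question.

```python
N = 2**32   # number of equally likely inputs (assumption)
R = 10000   # size of the target range (assumption)


def count_mod(k):
    """Inputs x in [0, N) with x % R == k."""
    return N // R + (1 if k < N % R else 0)


def count_mul(k):
    """Inputs x in [0, N) with (x * R) // N == k."""
    lo = -(-k * N // R)          # ceil(k * N / R)
    hi = -(-(k + 1) * N // R)    # ceil((k + 1) * N / R)
    return hi - lo


base = N // R  # 429496, the size of an under-represented bucket
over_mod = [k for k in range(R) if count_mod(k) > base]
over_mul = [k for k in range(R) if count_mul(k) > base]

print(len(over_mod), len(over_mul))  # 7296 7296
print(over_mod[:4])                  # [0, 1, 2, 3]
print(over_mul[:4])                  # [0, 1, 2, 4]
```

Both methods over-represent the same number of outputs; the remainder method piles them at the low end, while the multiplication method spreads them across the range.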

For this specific distribution, $0$, $1$, and $2$ are all slightly over-represented. $3$ is slightly under-represented. $4$, $5$, $6$ over-represented, $7$ under-represented, $8$, $9$, $10$ over-represented, $11$ and $12$ over-represented, $13$ under-represented. I think most (or all) of the over-represented elements occur in runs of two or three, and the under-represented ones in runs of one.

As for the protocol you described, I can't say much about the specifics because you aren't precise in your description.

I can say that the output of algorithms like SHA-512, and by extension HMAC-SHA-512, can safely be treated as uniformly distributed over all 512-bit strings. Hash functions are often modeled as a random oracle. Wikipedia describes it as:

In cryptography, a random oracle is an oracle (a theoretical black box) that responds to every unique query with a (truly) random response chosen uniformly from its output domain. If a query is repeated it responds the same way every time that query is submitted.

The hash output can begin with 00000000 or FFFFFFFF or any other 32-bit value. In fact, every possible prefix of a given length $n$ is expected to occur with probability $2^{-n}$.
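As a minimal sketch of what "the first 32 bits of the hash output" means in code, the snippet below takes the leading four bytes of an HMAC-SHA-512 digest as an unsigned integer. The key and message values (`server_seed`, `client_seed:nonce`) are hypothetical placeholders; how the actual protocol builds its HMAC input is not specified here.

```python
import hashlib
import hmac


def first_32_bits(key: bytes, message: bytes) -> int:
    """Return the first 4 bytes of HMAC-SHA-512(key, message) as an integer."""
    digest = hmac.new(key, message, hashlib.sha512).digest()
    return int.from_bytes(digest[:4], "big")


# Hypothetical inputs, for illustration only.
key = b"server_seed"
message = b"client_seed:1"

value = first_32_bits(key, message)

# Reading the first 8 hex characters gives the same number.
hex_prefix = hmac.new(key, message, hashlib.sha512).hexdigest()[:8]
print(value == int(hex_prefix, 16))  # True
```

Under the random-oracle model every value from `0x00000000` to `0xFFFFFFFF` is equally likely for this prefix.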

It is possible to create a fair shared random value between multiple parties using hash functions. Two parties each choose a random secret number, exchange those numbers with each other simultaneously (so no one can cheat by changing their number), and create a new agreed-upon shared random value by hashing the two secrets. (They obviously need to agree on the order in which they hash the numbers.)

A commitment scheme is used to prevent parties from changing their secret number after observing the other player's number. They first send each other hashes of their own secrets; exchanging hashes does not reveal the actual value of either secret. It's not possible to determine the input from just the hash function's output, assuming the input is sufficiently unpredictable (that is, it has high entropy). A player cannot swap out the secret value they committed to without finding a collision in the hash function. (Using a prepended random nonce makes finding collisions even harder because you cannot use precomputed collisions.)

The final hash of both secrets, the nonce, and whatever other data is involved results in a random value from a uniform distribution (under the assumption that the hash function behaves like a random oracle). Any change in input to a hash function results in unpredictable changes to the output. This means that one party doesn't have to trust that the other party's secret is random as long as their own number was chosen randomly.
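The commit, reveal and combine steps can be sketched as follows. This is a toy, single-process illustration under stated assumptions: SHA-512 as the hash, 32-byte secrets and nonces, and the function names `commit`, `opens_correctly` and `shared_value` are all made up for the example. It does not model the network exchange or the abort problem.

```python
import hashlib
import hmac
import secrets


def commit(secret):
    """Return (nonce, commitment) with commitment = SHA-512(nonce || secret)."""
    nonce = secrets.token_bytes(32)
    return nonce, hashlib.sha512(nonce + secret).digest()


def opens_correctly(commitment, nonce, secret):
    """Check that a revealed (nonce, secret) pair matches the commitment."""
    expected = hashlib.sha512(nonce + secret).digest()
    return hmac.compare_digest(commitment, expected)


def shared_value(secret_a, secret_b):
    """Hash both fixed-length secrets in an agreed order."""
    return hashlib.sha512(secret_a + secret_b).digest()


# 1. Each party picks a secret.
alice_secret = secrets.token_bytes(32)
bob_secret = secrets.token_bytes(32)

# 2. They exchange commitments only.
alice_nonce, alice_commitment = commit(alice_secret)
bob_nonce, bob_commitment = commit(bob_secret)

# 3. They reveal nonce and secret, and each checks the other's commitment.
alice_ok = opens_correctly(alice_commitment, alice_nonce, alice_secret)
bob_ok = opens_correctly(bob_commitment, bob_nonce, bob_secret)

# 4. Both compute the same shared value.
result = shared_value(alice_secret, bob_secret)
```

Fixed-length secrets keep the concatenation unambiguous; with variable-length inputs a length prefix or separator would be needed.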

However, even with commitment schemes, it is still possible to cheat by aborting: if refusing to reveal their secret costs the second person less than they would lose by completing the protocol, they can simply walk away. (Sort of like people who hit the reset button on their game consoles when they're losing to another player. Or flipping a chess board off the table.) The second person has enough information to calculate the result of the game on their own, and so they might pretend that their computer crashed to avoid paying out larger winnings.

And again, your description of the protocol is vague, so we can't say whether the actual protocol is secure or not. (And even if the protocol were secure it's another matter whether or not the implementation is correct.)

This is the mechanism of the rolling system of one of the best-known crypto faucet websites. I think that the previous answer is a bit misleading. However, when the math is confusing, a simple test with some code and some pseudo-randomly generated data should not leave any doubt.

Your statement:

"..the way I see it, it is 50% less likely to hit the first number (1 or 0) or the last number (10000).."

Assuming good entropy in the 32-bit numbers, which are taken from the first $4$ bytes of the resulting SHA512 digest ($8$ hex characters read as a 4-byte unsigned integer), you are right:

$0$ (not $1$) and $10000$ occur less frequently than every other roll: about $50\%$ less on average (a probability of ~$0.005\%$ instead of ~$0.01\%$).

Why? The key is the rounding step, which cuts the ranges at the two ends.

Only 32-bit integers greater than or equal to $429496.7295 \times 9999.5 = 4294752546.64$ are rounded to $10000$, that is, the integers from $4294752547$ to $4294967295$. That is only $214749$ values, because the input is capped at $2^{32} - 1$.

The same goes for $0$: every 32-bit integer less than or equal to $214748$ is rounded to $0$, a total of $214749$ different values.

Every other roll, for example $9999$, is the result of rounding a 32-bit integer in the range from

$429496.7295 \times 9998.5$ to $429496.7295 \times 9999.499\ldots$, which contains $429496.7295$ integers on average, and so never fewer than $429496$.

Every 32-bit integer in the interval $[4294323050, 4294752546]$
produces a roll of $9999$, which gives $429497$ different values.

By contrast, only the integers in the interval $[5368710, 5798205]$ round to the roll $13$, which gives $429496$ different values.

Either count is almost twice the number of 32-bit integers that round to $10000$ or to $0$.
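The bucket boundaries can be checked with exact integer arithmetic. The sketch below assumes the roll is `round(x / 429496.7295)` with $429496.7295 = (2^{32}-1)/10000$ and $x$ a 32-bit unsigned integer; the helper names `bucket` and `size` are illustrative only. No input lands exactly on a half, so the rounding mode for ties does not matter.

```python
M = 2**32 - 1   # largest 32-bit input; M / 10000 == 429496.7295
ROLLS = 10000   # largest roll value


def ceil_div(a, b):
    """Ceiling of a / b using integers only."""
    return -(-a // b)


def bucket(k):
    """Smallest and largest 32-bit input x that rounds to roll k."""
    lo = max(0, ceil_div((2 * k - 1) * M, 2 * ROLLS))
    hi = min(M, ceil_div((2 * k + 1) * M, 2 * ROLLS) - 1)
    return lo, hi


def size(k):
    """Number of 32-bit inputs that round to roll k."""
    lo, hi = bucket(k)
    return hi - lo + 1


print(bucket(0), size(0))          # (0, 214748) 214749
print(bucket(10000), size(10000))  # (4294752547, 4294967295) 214749
print(bucket(9999), size(9999))    # (4294323050, 4294752546) 429497
print(size(13))                    # 429496
```

The two end buckets are half-width because there are no inputs below $0$ or above $2^{32}-1$ to fill the other half.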

How many rolls are under-represented in the interval $[1,9999]$?

About $1$ in $4$; more precisely, $1$ in $3.696858\ldots$

We use the fractional part of $429496.7295$ to calculate it:

$f = (1-0.7295) = 0.2705$

$t = (1/f) = 3.696858..$

$r = round(9999/t) = 2705$ numbers

$u = 2705 \times 429496 = 1161786680$ (32-bit) input integers
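The estimate of $2705$ can be cross-checked by counting bucket sizes directly. As before, this sketch assumes the roll is `round(x / 429496.7295)` over 32-bit inputs, and the helper name `size` is illustrative only.

```python
from collections import Counter

M = 2**32 - 1   # largest 32-bit input; M / 10000 == 429496.7295
ROLLS = 10000


def ceil_div(a, b):
    """Ceiling of a / b using integers only."""
    return -(-a // b)


def size(k):
    """Number of 32-bit inputs x that round to roll k."""
    lo = max(0, ceil_div((2 * k - 1) * M, 2 * ROLLS))
    hi = min(M, ceil_div((2 * k + 1) * M, 2 * ROLLS) - 1)
    return hi - lo + 1


# Bucket sizes for the interior rolls 1..9999.
sizes = Counter(size(k) for k in range(1, ROLLS))
under = sizes[429496]

print(under)            # 2705
print(sizes[429497])    # 7294
print(under * 429496)   # 1161786680
```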

It's quite simple to generate the k-th number, for every k in the interval $[1, 2705]$, using this formula: