Dimitri DeFigueiredo Ph.D.
Computing, finance, security and collaborative decisions
http://dimitri.xyz/
Mon, 26 Nov 2018 12:40:03 +0000
<h2>Longest simple cycles for fun and profit</h2>
<p>Algorithms are an incredibly interesting field. In this post, we explore a rather profitable application of their power: finding arbitrage cycles in cryptocurrency markets.</p>
<p>What does it mean to find an arbitrage cycle? We want to find a cycle of crypto currencies that we can trade such that, after we complete all the trades, we have more than we started with. There are many automated solutions that do this for pairs of crypto currencies.</p>
<p>For example, traders can look at the price differences in the Bitcoin/Ethereum markets between two exchanges (such as Coinbase and Binance). If the price is different between those exchanges, we may be able to find a cycle such as:</p>
<ol>
<li>Buy Ethereum at Binance (paying with Bitcoins)</li>
<li>Sell Ethereum at Coinbase (receiving Bitcoins)</li>
</ol>
<p>If we sell as much Ether as we buy, the amount of Ether we have will not change, but we may end up with more (or less) Bitcoins than we started with. We could describe this as a short cycle:</p>
<p>BTC ⟶ ETH ⟶ BTC</p>
<p>In this example, each cryptocurrency is a vertex in a graph and there is a directed edge from \(V_1\) to \(V_2\) (e.g. from BTC to ETH) if there is a market where we can buy some \(V_2\) paying with \(V_1\). The above cycle may or may not be profitable to trade; we need to investigate. There may also be cycles with more edges. For example:</p>
<p>BTC → LTC → ETH → EOS → BCH → ADA → BTC</p>
<p>To make a profit using this cycle, we would need to successfully execute 6 orders, but it may still be profitable. In fact, it might be more profitable than all other opportunities available. This is a key insight.</p>
<h3 id="solving-the-wrong-problem">Solving the Wrong Problem</h3>
<p>The problem we are trying to solve is very similar to but <em>not the same as</em> the problem of finding negative cycles in a directed graph. If we label each edge in the graph described in the example above with the logarithm of the (buy) price, there will be an arbitrage cycle if and only if there is a negative cycle in the graph.</p>
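<p>To make the logarithm trick concrete, here is a minimal sketch. It assumes each edge carries a single buy price, quoted as units of the currency we hold per unit of the currency we buy, and it ignores fees and order book depth. The function names are made up for illustration. Going around the cycle multiplies our holdings by the inverse of each price, so we end up with more than we started with exactly when the product of the prices is below 1, i.e. when the sum of their logarithms is negative.</p>

```javascript
"use strict";
// Hypothetical helper: prices[i] is the buy price on the i-th edge of the cycle,
// i.e. how many units of the currency we hold we pay per unit of the next one.
function cycleLogWeight(prices) {
  return prices.reduce((sum, p) => sum + Math.log(p), 0)
}

// A cycle is an arbitrage cycle exactly when its total log-weight is negative.
function isArbitrageCycle(prices) {
  return cycleLogWeight(prices) < 0
}

// BTC -> ETH at 0.05 BTC per ETH, then ETH -> BTC at 19 ETH per BTC:
// 1 BTC buys 20 ETH, which buys 20/19 BTC, so the cycle is profitable.
console.log(isArbitrageCycle([0.05, 19]))
// With 21 ETH per BTC on the way back, 20 ETH only buys 20/21 BTC.
console.log(isArbitrageCycle([0.05, 21]))
```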
<p>Given the popularity of single source and all-pairs shortest path algorithms, finding negative cycles in graphs has become a textbook problem <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>. These algorithms are fast and elegant. However, they do not do what we want.</p>
<p>The problem is that the Floyd–Warshall algorithm and others will tell us if there is a profitable arbitrage cycle (i.e. a negative cycle), but they will <em>not</em> tell us if the negative cycle found is the only one available or which cycle is the most profitable when there are many. In short, they do not solve the problem we’re interested in solving. We want to find <em>all</em> negative cycles (all profitable arbitrage opportunities), not just one.</p>
<p>To find the most profitable cycle, we actually need to solve the longest simple cycle problem, instead. We say a cycle is simple if there are no repeated vertices in it. Unfortunately, this has been shown to be NP-hard <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>. To see why, consider a reduction from the Hamiltonian path problem: when the edge weights are all 1, a graph with \(n\) vertices has a Hamiltonian path if and only if its longest simple path has \(n - 1\) edges. The same argument works for cycles. Shall we despair?</p>
<h3 id="separating-topology-from-profitability">Separating Topology From Profitability</h3>
<p>I would not be writing this blog post if the answer was to give up and go home! There are two facts that help us:</p>
<ol>
<li>Our graph is sparse - It is much easier to enumerate all simple cycles in a sparse graph as there are far fewer possible paths to check. So, even an algorithm with an exponential running time in the worst-case may perform reasonably well for our purposes.</li>
<li>We have plenty of time to calculate - Although the prices of crypto currencies change multiple times a second, new markets become available and/or disappear only every day or so.</li>
</ol>
<p>Again, consider the graph where each vertex represents a crypto currency and there is a directed edge between vertices A and B if there is a market for buying B by making payments in A. We can separate changes in the topology of this graph from changes in the weight of each edge. Changes in edge weight are very frequent, as the prices change multiple times a second, but changes in the overall topology of the graph (i.e. which markets exist) are much less frequent. Exchanges have to do engineering work to list a new crypto token, so it takes a while for those markets to be created, thus changing the topology.</p>
<p>There is one more restriction that we will not consider in detail here. We do not actually need to find all profitable cycles. We need all profitable cycles <em>that we can use</em>. It turns out that it is too risky to use cycles with many edges in them, even if the sum of all edge values is quite large. If we have 34 edges in a cycle, we need to execute 34 orders to capture the arbitrage opportunity. This is too risky. Order placement is not perfect and the market prices change constantly. Maybe it is best to look for cycles with at most 11 edges, for example. This means that we are actually looking for cycles with a large sum of edge values but with only a small number of edges.</p>
<p>We can use Johnson’s elementary circuit enumeration algorithm <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup> to find all simple cycles in our sparse graph. Because the topology of the graph does not change very often, it does not matter if this calculation takes a long time. This is what lets us separate changes in the topology of the graph from changes in edge values (which represent changes in price).</p>
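<p>Johnson’s algorithm is too long to reproduce here. As a rough illustration of the enumeration step, the sketch below uses a much simpler technique instead: a depth-first search that is cut off at a maximum number of edges. This is <em>not</em> Johnson’s algorithm and it has none of its running-time guarantees. The graph representation (an object mapping each currency to the currencies it can buy directly) and the name <code class="highlighter-rouge">simpleCycles</code> are assumptions made for this example. To avoid reporting the same cycle once per rotation, each cycle is only reported from its alphabetically smallest vertex.</p>

```javascript
"use strict";
// graph: object mapping each currency to the list of currencies it can buy directly.
// Returns every simple cycle with at most maxEdges edges, as a list of vertices
// that starts and ends at the same currency.
function simpleCycles(graph, maxEdges) {
  const vertices = Object.keys(graph).sort()
  const rank = new Map(vertices.map((v, i) => [v, i]))
  const cycles = []
  for (const start of vertices) {
    const path = [start]
    const onPath = new Set(path)
    const visit = (v) => {
      for (const w of graph[v] || []) {
        if (w === start) {
          cycles.push([...path, start])   // closed a cycle back to the start
        } else if (rank.has(w) && rank.get(w) > rank.get(start) &&
                   !onPath.has(w) && path.length < maxEdges) {
          path.push(w); onPath.add(w)     // extend the current simple path
          visit(w)
          path.pop(); onPath.delete(w)    // backtrack
        }
      }
    }
    visit(start)
  }
  return cycles
}

const markets = { BTC: ["ETH", "LTC"], ETH: ["BTC", "LTC"], LTC: ["BTC"] }
console.log(simpleCycles(markets, 11))
```

<p>Because this only depends on which markets exist, it can be run once per topology change and its result reused while prices move.</p>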
<h3 id="maximal-profitability">Maximal Profitability</h3>
<p>After we use Johnson’s algorithm to find all simple cycles, our task becomes much simpler.</p>
<p>Assume we have obtained a list of all simple cycles (or elementary circuits) in the graph. Call these cycles \(C_i\). These \(C_i\) represent all possible arbitrage cycles. We can then calculate the profit or loss we would get by performing the trades described by each cycle. There may be a very large number of such cycles, but <em>we can do this calculation in parallel</em>, so it doesn’t take very long!</p>
<p>We do have to be careful when two distinct cycles trade on the same markets. If we buy up some Bitcoin at Coinbase when executing trades related to our most profitable cycle, that initial Bitcoin price may no longer be available when we want to execute the trades in the second most profitable cycle. We need an adjustment here, but this is not an insurmountable problem.</p>
<p>Assume there are two profitable cycles in the graph \(C_1\) and \(C_9\), which one should we pick? The answer is always: the most profitable one. Unfortunately, the profit or loss we obtain in a cycle is not a single value, but a function of the volume traded. For example, we may obtain a profit when trading 1 BTC, but suffer a loss when trading 2 BTC. That is, the profit or loss depends on the volume traded.</p>
<p>For each available cycle \(C_i\), we need to obtain its “transfer function” \(T_i(x)\). This function represents the amount we get out of the cycle when we put \(x\) units in. If \(T_i(x) &gt; x\) for some \(x\), the cycle can be profitable as we can get more out than we put in.</p>
<p>We know that any transfer function \(T_i(x)\) starts at the origin (you get zero when you pay zero) and is monotonically increasing (the more you pay, the more you get). Assuming all orders on the order books can be partially executed, we can also deduce the \(T_i\) are piecewise linear and <em>concave</em> (as better prices always execute first). The concavity is very important because it gives us a simple algorithm to maximize our profit when trading in all available cycles: <em>water filling</em>.</p>
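<p>As a sketch of where these properties come from, consider the transfer function of a single market. The code below assumes an order book given as a list of asks <code class="highlighter-rouge">{price, amount}</code> sorted from the best (lowest) price upwards, with partial execution allowed and no fees. The names and the data layout are invented for this example. Each price level contributes one line segment with slope <code class="highlighter-rouge">1 / price</code>, and since prices only get worse, the slopes only decrease, which is exactly concavity. The transfer function of a whole cycle is then the composition of the transfer functions of its markets.</p>

```javascript
"use strict";
// asks: [{price, amount}] sorted by ascending price.
// price is what we pay per unit bought, amount is how many units are on offer.
// Returns how many units we receive when we pay x.
function transfer(asks, x) {
  let paid = x
  let received = 0
  for (const { price, amount } of asks) {
    const cost = price * amount            // cost of consuming this whole level
    if (paid <= cost) return received + paid / price
    received += amount
    paid -= cost
  }
  return received                          // book exhausted: nothing more to buy
}

// Transfer function of a cycle: feed the output of each market into the next.
function cycleTransfer(books, x) {
  return books.reduce((amount, asks) => transfer(asks, amount), x)
}

const book = [{ price: 2, amount: 1 }, { price: 4, amount: 1 }]
console.log(transfer(book, 1), transfer(book, 2), transfer(book, 4))
```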
<p>The concavity of the \(T_i\) means it is optimal to be greedy. Starting at 0, for each extra \(\delta x\) of crypto currency we want to trade, we evaluate all derivatives \(T_i'(0)\) and choose the highest one, i.e. \(\max_i \; T_i'(0)\). This is the most profitable cycle for any initial amount we want to trade. Because the \(T_i\) are all piecewise linear, we can keep adding volume to the chosen cycle until the end of the current line segment. We are guaranteed this will be the best deal until the initial segment ends.</p>
<p>Once the current segment ends, \(\frac{dT_i}{dx}\) needs to be re-evaluated for the new segment. After we take into account the new (lower) available returns at the new segment (after we already traded \(\delta x\)), it may be that another cycle \(C_j\) becomes more profitable than cycle \(C_i\) and we repeat the process. We keep doing this while \(\max_i \; \frac{dT_i}{dx} &gt; 1\) and \(T_i(x) &gt; x\); beyond this point, trading ceases to be profitable. We may also stop earlier, at any desired minimum rate of return, for example 1.05 for a minimum of 5% return, instead of 1.</p>
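<p>The greedy procedure can be sketched in a few lines. The representation here is an assumption made for the example: each \(T_i\) is given as a list of segments <code class="highlighter-rouge">{slope, width}</code> with decreasing slopes, all cycles start and end in the same currency, and the cycles do not share markets (so the adjustment for overlapping cycles mentioned earlier is ignored). The function name <code class="highlighter-rouge">waterFill</code> is hypothetical.</p>

```javascript
"use strict";
// cycles[i] is a list of segments {slope, width} describing T_i,
// with slopes in decreasing order (concavity).
// budget is the total volume we are willing to trade.
// minReturn is the marginal rate of return below which we stop.
function waterFill(cycles, budget, minReturn = 1) {
  const next = cycles.map(() => 0)      // index of the current segment per cycle
  const invested = cycles.map(() => 0)  // volume allocated to each cycle so far
  let remaining = budget
  let output = 0
  while (remaining > 0) {
    // pick the cycle whose current segment has the highest slope
    let best = -1
    for (let i = 0; i < cycles.length; i++) {
      const seg = cycles[i][next[i]]
      if (seg && seg.slope > minReturn &&
          (best < 0 || seg.slope > cycles[best][next[best]].slope)) best = i
    }
    if (best < 0) break                 // nothing left that beats minReturn
    const seg = cycles[best][next[best]]
    const dx = Math.min(seg.width, remaining)
    invested[best] += dx
    output += dx * seg.slope
    remaining -= dx
    next[best]++                        // this segment is used up (or the budget is)
  }
  return { invested, output }
}

const cycles = [
  [{ slope: 1.10, width: 1 }, { slope: 1.02, width: 2 }],
  [{ slope: 1.05, width: 3 }]
]
console.log(waterFill(cycles, 3))        // fills cycle 0 first, then cycle 1
console.log(waterFill(cycles, 3, 1.05))  // demands strictly more than 5% at the margin
```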
<h3 id="conclusion">Conclusion</h3>
<p>There is still a lot more to be done to run a profitable trading operation. For example, we still need to calculate exactly which orders to place in each market. Thankfully, this can be done through a different interpreter on the same arbitrage cycle description. We also need to take into account the probability of execution failures and other “real-world” factors, but I hope this exposition convinced you that finding arbitrage cycles is an interesting algorithmic problem. I also hope it is easy to see how important it is to clearly understand exactly which algorithmic problems we need to solve and which we can avoid (or work around).</p>
<p><em>I would like to thank Fabricio Oliveira and Balaji Venkatachalam for their many helpful suggestions to improve this post.</em></p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p><a href="https://courses.csail.mit.edu/6.046/spring04/handouts/ps7sol.pdf">https://courses.csail.mit.edu/6.046/spring04/handouts/ps7sol.pdf</a> <a href="#fnref:1" class="reversefootnote">&#8617;</a></p>
</li>
<li id="fn:2">
<p>Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2001. Introduction to Algorithms, Second Edition (2nd ed.). The MIT Press <a href="#fnref:2" class="reversefootnote">&#8617;</a></p>
</li>
<li id="fn:3">
<p><a href="https://epubs.siam.org/doi/abs/10.1137/0204007">Donald B. Johnson “Finding All the Elementary Circuits of a Directed Graph”, SIAM J. Comput., 4(1), 77–84, 1975</a> <a href="#fnref:3" class="reversefootnote">&#8617;</a></p>
</li>
</ol>
</div>
Mon, 26 Nov 2018 00:00:00 +0000
http://dimitri.xyz/crypto-arbitrage-cycles/
<h2>Generating random integers from random bytes</h2>
<p>Here is a problem that appears simpler than it is. Programmers often have a source of random bits at their disposal but need to generate uniformly distributed random integers in a given range. The immediate algorithms for doing this, using remainders or scaling, lead to biased distributions. A cursory review of open source software shows that these algorithms are in common use. They even show up in cryptographic libraries, usually with minor effects. If you want to learn how to do this “conversion” properly, read on.</p>
<p>Imagine you are running a simple lottery that works like a raffle. All the tickets are sold at the same price. People buy tickets throughout the week and a single winning ticket is announced at the end of the week. There is a winner every week; the total prize does not accumulate week-to-week.</p>
<p>A basic requirement for this lottery is that every ticket must have the same chance of winning as every other ticket. In fact, this may be a legal requirement. It is not fair for a ticket to have a higher chance of winning than any other ticket as they all cost the same.</p>
<p>The number of people buying tickets changes from week to week; but no matter how many tickets are sold, the lottery must always be fair. Each week, the winning ticket must be drawn from a uniform distribution over all the tickets sold. You don’t want to break the law.</p>
<p>To aid you in executing the lottery you are given a <em>perfect</em> source of random bytes. Imagine that by using some oscillators, heat and quantum wizardry Intel has just released a wonderful new chip. This chip will give you as many random bits as you want. Each bit is drawn from a Bernoulli distribution with probability \(1/2\). In the javascript world, this is equivalent to a perfect version of <code class="highlighter-rouge">getRandomValues()</code> or Node.js’ <code class="highlighter-rouge">randomBytes()</code>.</p>
<p>Armed with your perfect source of random bits you set out to code the software that will draw the winning lottery ticket. Here are a few ways <em>not</em> to do it.</p>
<h3 id="bias-from-remainders">Bias from Remainders</h3>
<p>Assume that in a given week \(N=10\) tickets were sold. Using our source of randomness we use the next power of 2, that is 16, and pick a random number between 0 and 15 by using 4 perfectly random bits. We then proceed to calculate the winning ticket using simple modular arithmetic.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s2">"use strict"</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">crypto</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="s1">'crypto'</span><span class="p">)</span>
<span class="cm">/* use just 4 random bits - a value from 0 to 15 */</span>
<span class="kd">const</span> <span class="nx">randomBits</span> <span class="o">=</span> <span class="nx">crypto</span><span class="p">.</span><span class="nx">randomBytes</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nx">readUInt8</span><span class="p">()</span> <span class="o">&amp;</span> <span class="mh">0x0F</span>
<span class="kd">const</span> <span class="nx">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="kd">let</span> <span class="nx">winner</span> <span class="o">=</span> <span class="nx">randomBits</span> <span class="o">%</span> <span class="nx">N</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`And the winner is ticket number </span><span class="p">${</span><span class="nx">winner</span><span class="p">}</span><span class="s2"> !`</span><span class="p">)</span>
</code></pre></div></div>
<p>Unfortunately, this generates a biased distribution because \(N\) is not a power of 2. We will be grouping the 16 values that can be returned by <code class="highlighter-rouge">randomBytes</code> into 10 “equivalence classes”. Some classes will collect more values than others, as the next diagram shows.</p>
<p><img src="http://dimitri.xyz/assets/mod-16-diagram.png" alt="Using mod 16 bits" /></p>
<p>This is really bad for the lottery as tickets 0 through 5 are twice as likely to win as the other tickets. We can mitigate this problem by using more random bits. If we use 6 random bits to pick a number between 0 and 63 and then apply the modulus (i.e. <code class="highlighter-rouge">%</code>) operation, we have the following situation:</p>
<p><img src="http://dimitri.xyz/assets/mod-64-diagram.png" alt="Using mod 64 bits" /></p>
<p>Two points to note:</p>
<ol>
<li>Because \(N = 10\) is not a power of 2, there will <em>always</em> be some favored tickets that are more likely to win.</li>
<li>The discrepancy between the most likely and the least likely tickets is a function of the number of random bits used. If we use \(m\) random bits (we used \(m = 6\) in the last diagram), the difference in probability between the most likely and the least likely tickets is \(1/2^m\).</li>
</ol>
<p>The second point above means that we can make the unfairness imperceptibly small by using lots of random bits. This is good. However, the difference in probability never actually goes away. We will never generate a truly uniform distribution this way.</p>
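<p>Both points are easy to check by brute force. The following sketch (the helper name <code class="highlighter-rouge">remainderCounts</code> is made up for this example) counts how many of the \(2^m\) equally likely inputs land on each ticket after taking the remainder. It is only meant for small \(m\), since it enumerates every input.</p>

```javascript
"use strict";
// For every m-bit value v, count which ticket v % N selects.
function remainderCounts(m, N) {
  const counts = new Array(N).fill(0)
  for (let v = 0; v < 2 ** m; v++) counts[v % N]++
  return counts
}

console.log(remainderCounts(4, 10)) // [ 2, 2, 2, 2, 2, 2, 1, 1, 1, 1 ]
console.log(remainderCounts(6, 10)) // [ 7, 7, 7, 7, 6, 6, 6, 6, 6, 6 ]
```

<p>In both cases the most and least favored tickets differ by exactly one input out of \(2^m\), which is where the \(1/2^m\) gap comes from.</p>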
<h3 id="bias-from-scaling">Bias from Scaling</h3>
<p>Another attempt at generating a uniform distribution goes like this. Imagine we have a random number \(r\) in the interval \([0,1)\). We can multiply \(r\) by the number of tickets sold \(N = 10 \) to obtain a random number \(r N\) in the interval \([0,10)\) and then just take the integer part.</p>
<p>This mistaken algorithm <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random#Getting_a_random_integer_between_two_values">is used a lot</a> in javascript because the output of <code class="highlighter-rouge">Math.random</code> looks a lot like the random number \(r\). Here’s what it would look like in Node.js code:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s2">"use strict"</span><span class="p">;</span>
<span class="cm">/* get random number in [0,1) */</span>
<span class="kd">const</span> <span class="nx">r</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">random</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="kd">let</span> <span class="nx">winner</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span><span class="nx">r</span> <span class="o">*</span> <span class="nx">N</span><span class="p">)</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`And the winner is ticket number </span><span class="p">${</span><span class="nx">winner</span><span class="p">}</span><span class="s2"> !`</span><span class="p">)</span>
</code></pre></div></div>
<p>Unfortunately, this idea also generates a biased distribution. This is because we don’t actually have a random number \(r\) in the interval \([0,1)\). All we have are random bits. We would need an <em>infinite</em> number of random bits to be able to generate a random number in \([0,1)\). Because we do not have an infinite number of bits, the best we can do is to approximate \(r\) and this approximation leads to the bias. To see why, first consider using 4 bits to generate a random number in \([0,1)\). This can be done by sticking a “0.” in front of the bits generated and reading them as a binary number. This is equivalent to dividing the output of 4 bits by \(16=2^4\) as shown below.</p>
<p><img src="/assets/scaling-4-bits-diagram.png" alt="random 4 bit fraction" style="width: 400px;" /></p>
<p>These are the possible values that we can generate inside the \([0,1)\) interval with only 4 random bits. Now consider what happens in the expression <code class="highlighter-rouge">Math.floor(r * N)</code> for \(N = 10 \). Any value of \(r\) smaller than 0.1 will be floored to zero. Similarly, values of \(r\) in the range \([0.1 , 0.2)\) will be output as 1, values in the range \([0.2 , 0.3)\) will be output as 2, and so on. In short, this function partitions the \([0,1)\) interval into 10 distinct regions. Unfortunately, we have a different number of possible inputs falling in each region as the next diagram shows.</p>
<p><img src="/assets/scaling-16-line.png" alt="regions of unit interval" style="width: 250px;" /></p>
<p>This asymmetry makes some outputs more likely than others. For example, zero, which corresponds to values in the region \([0 , 0.1)\), is twice as likely as 2, which corresponds to the region \([0.2 , 0.3)\). Again, this happens because some regions or “equivalence classes” contain more possible input values. Each equivalence class corresponds to a lottery ticket, so some lottery tickets are more likely to win.</p>
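<p>The same brute-force check used for remainders works here. The sketch below (with a made-up helper name, <code class="highlighter-rouge">scalingCounts</code>) builds every \(m\)-bit fraction \(r = v / 2^m\) and counts how many of them fall on each ticket after scaling and flooring.</p>

```javascript
"use strict";
// For every m-bit value v, form r = v / 2^m in [0,1) and count Math.floor(r * N).
function scalingCounts(m, N) {
  const counts = new Array(N).fill(0)
  const size = 2 ** m
  for (let v = 0; v < size; v++) counts[Math.floor((v / size) * N)]++
  return counts
}

console.log(scalingCounts(4, 10)) // [ 2, 2, 1, 2, 1, 2, 2, 1, 2, 1 ]
```

<p>Unlike the remainder case, the favored tickets are spread across the range rather than bunched at the low end, but the distribution is just as uneven.</p>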
<p>Unless \(N\) is a power of 2, this will always happen. Just like in the case of remainder bias, we can mitigate the problem by using a large number of bits. If we use a standard 53 bits, the difference in probability between the most likely and the least likely tickets will be \(2^{-53}\approx \frac{1}{10^{16}}\). That is virtually undetectable and good for most (non-cryptographic) applications, but this is not a truly uniform distribution. We can do better.</p>
<p><strong>Warning:</strong> The trick of using more random bits to obtain a smaller bias works if we don’t also have to increase the number of equivalence classes \(N\), otherwise we might be back where we started.</p>
<h3 id="doing-it-right--rejection-sampling">Doing it right — Rejection Sampling</h3>
<p>There is, in fact, a simple way to generate our winning ticket from a perfectly uniform (i.e. 100% fair) distribution. We sample our source of random bits and then simply reject samples we don’t like.</p>
<p>For example, we use our perfect source of random bits to generate an integer and simply reject samples outside our range. We can generate an integer between 0 and 15 using 4 bits of our perfect source. If the output is smaller than 10 then it is accepted; otherwise, it is rejected and we try again. Here’s what it looks like in code:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s2">"use strict"</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">crypto</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="s1">'crypto'</span><span class="p">)</span>
<span class="kd">function</span> <span class="nx">sample</span><span class="p">(){</span><span class="k">return</span> <span class="nx">crypto</span><span class="p">.</span><span class="nx">randomBytes</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nx">readUInt8</span><span class="p">()</span> <span class="o">&amp;</span> <span class="mh">0x0F</span><span class="p">}</span>
<span class="kd">const</span> <span class="nx">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="cm">/* Rejection Sampling */</span>
<span class="kd">var</span> <span class="nx">s</span>
<span class="k">do</span> <span class="p">{</span>
<span class="nx">s</span> <span class="o">=</span> <span class="nx">sample</span><span class="p">()</span> <span class="c1">// s is a value from 0 to 15</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="nx">s</span> <span class="o">&gt;=</span> <span class="nx">N</span><span class="p">)</span> <span class="c1">// reject if outside our desired range</span>
<span class="kd">let</span> <span class="nx">winner</span> <span class="o">=</span> <span class="nx">s</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`And the winner is ticket number </span><span class="p">${</span><span class="nx">winner</span><span class="p">}</span><span class="s2"> !`</span><span class="p">)</span>
</code></pre></div></div>
<p>Our source of randomness is perfect, so rejecting out of bounds samples will <em>not</em> bias the samples that are allowed through. However, there is something slightly troubling about this algorithm. It may never terminate! The random source may continually output random bits whose value is larger than 9 forcing us to sample forever. This is highly unlikely as our source is random, but we cannot set a hard worst-case bound on how long this algorithm will run.</p>
<p>This is a trade-off we face: in the previous biased-distribution examples we could guarantee that our algorithms would terminate, but had to use more and more bits to make the bias in the generated probability distributions small. With rejection sampling, we guarantee that there is no bias in the generated probability distribution, but we may have to use more and more bits to ensure our algorithm terminates.</p>
<h3 id="termination">Termination</h3>
<p>When one does have a good source of randomness, it is easy to ensure “timely” termination. It will be extremely unlikely that the rejection sampling algorithm will have to sample many times to be able to produce a valid sample. More precisely, provided the probability that any single sample is accepted is at least \(1/2\), the likelihood that the first \(k\) samples are all rejected decreases exponentially as \(k\) increases and is at most \(2^{-k}\).</p>
<p>Consider our running example. There is a \(10/16 = 5/8\) chance that one sample will fall in our desired range. In other words, there is a \(3/8\) chance that it will be rejected. What is the likelihood that we have to try 10 or more samples to get one in the desired range?</p>
<p>We only try 10 or more samples if on the previous 9 attempts we got a sample that was larger than 9. As the samples are independent, that is going to happen with probability \( (\frac{3}{8})^9 \approx 0.00015 \). This is once every 6818 runs. The probability we will need more than 50 samples is a minuscule \(5 \times 10^{-22} \).</p>
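<p>These numbers follow directly from the independence of the samples, and a short sketch reproduces them. The helper names are invented for this example. Needing at least \(k\) samples means the first \(k - 1\) were rejected, and needing more than \(k\) means the first \(k\) were rejected.</p>

```javascript
"use strict";
// Probability that at least k samples are needed: the first k - 1 are all rejected.
function probNeedsAtLeast(k, pReject) {
  return Math.pow(pReject, k - 1)
}

// Probability that more than k samples are needed: the first k are all rejected.
function probNeedsMoreThan(k, pReject) {
  return Math.pow(pReject, k)
}

const pReject = 6 / 16 // 6 of the 16 possible 4-bit values are 10 or larger
console.log(probNeedsAtLeast(10, pReject))  // about 0.00015
console.log(probNeedsMoreThan(50, pReject)) // about 5e-22
```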
<p>In the previous example, we only used 4 bits of randomness to generate an integer from 0 to 15. If we had used 32 bits from our random source to obtain a number between 0 and \( 2^{32} - 1 = 4294967295 \) and rejected all samples larger than 9, we would have a problem because we would be rejecting most of the samples. In this case, the probability we will have to sample at least 10 times becomes \( ( \frac{2^{32}-10}{2^{32}})^9 \approx 0.99999998 \). And the probability we will need more than 50 samples becomes 0.99999988. In other words, we will most likely be sampling more than 50 times every time we run our algorithm. This is definitely not what we want.</p>
<h3 id="extending-our-range">Extending our range</h3>
<p>The problem with using 32 bits as above is that we rejected too many samples. The probability of finding an acceptable sample was too low. Again, we have to ensure that the probability that any single sample is accepted is at least \(1/2\).</p>
<p>Consider the simpler example of using 8 bits, that is a number between 0 and 255, to pick a lottery winner among \(N = 10\) tickets. Instead of rejecting all samples equal to 10 or above, we calculate the largest multiple of 10 that is no greater than \(2^8\). That number is 250. We then only reject values at or above this number. Obviously, if we do this we will get a number between 0 and 249, which is not what we want. But because 250 is an exact multiple of 10, we can now safely apply the remainder to get a number in the desired range without biasing our perfectly uniform distribution.</p>
<p>Here’s the code:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s2">"use strict"</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">crypto</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="s1">'crypto'</span><span class="p">)</span>
<span class="kd">function</span> <span class="nx">sample</span><span class="p">(){</span><span class="k">return</span> <span class="nx">crypto</span><span class="p">.</span><span class="nx">randomBytes</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nx">readUInt8</span><span class="p">()}</span>
<span class="kd">const</span> <span class="nx">maxRange</span> <span class="o">=</span> <span class="mi">256</span>
<span class="kd">const</span> <span class="nx">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="cm">/* extended range rejection sampling */</span>
<span class="kd">const</span> <span class="nx">q</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span> <span class="nx">maxRange</span> <span class="o">/</span> <span class="nx">N</span> <span class="p">)</span>
<span class="kd">const</span> <span class="nx">multiple_of_N</span> <span class="o">=</span> <span class="nx">q</span> <span class="o">*</span> <span class="nx">N</span>
<span class="kd">var</span> <span class="nx">s</span>
<span class="k">do</span> <span class="p">{</span>
<span class="nx">s</span> <span class="o">=</span> <span class="nx">sample</span><span class="p">()</span> <span class="c1">// 0 to 255</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="nx">s</span> <span class="o">&gt;=</span> <span class="nx">multiple_of_N</span><span class="p">)</span> <span class="c1">// extended acceptance range</span>
<span class="kd">let</span> <span class="nx">winner</span> <span class="o">=</span> <span class="nx">s</span> <span class="o">%</span> <span class="nx">N</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`And the winner is ticket number </span><span class="p">${</span><span class="nx">winner</span><span class="p">}</span><span class="s2"> !`</span><span class="p">)</span>
</code></pre></div></div>
<p>Furthermore, the probability each sample is accepted is \( \frac{250}{256} \) and that is larger than \( \frac{1}{2} \) as we wanted.</p>
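<p>We also need to make sure that the range of our random samples is at least as large as the range of integers we need, i.e. that \(2^m \geq N\) when we draw \(m\) random bits. Whenever that holds, the largest multiple of \(N\) that fits in the sample range covers more than half of it, so each sample is accepted with probability greater than \( \frac{1}{2} \).</p>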
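<p>As a sanity check on the acceptance probability, the sketch below computes it for every possible \(N\) when sampling one byte. The helper name <code class="highlighter-rouge">acceptanceProbability</code> is an assumption of this example and is not part of the code above.</p>

```javascript
"use strict";
// Fraction of the maxRange possible samples that fall below the largest
// multiple of N that fits in the sample range.
function acceptanceProbability(maxRange, N) {
  const accepted = Math.floor(maxRange / N) * N
  return accepted / maxRange
}

console.log(acceptanceProbability(256, 10)) // 0.9765625, i.e. 250/256

// Worst case over every N that fits in one byte.
let worst = 1
for (let N = 1; N <= 256; N++) {
  worst = Math.min(worst, acceptanceProbability(256, N))
}
console.log(worst) // 0.50390625, reached at N = 129
```

<p>Even in the worst case a sample is accepted slightly more often than not, so the \(2^{-k}\) bound on \(k\) consecutive rejections still applies.</p>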
<h3 id="polishing-up">Polishing up</h3>
<p>We have assumed, so far, that we have a perfect source of random bits. This gave us assurance that our algorithm will (with high probability) eventually be done. However, if we have a buggy or biased source of randomness, our rejection sampling algorithm can go into an infinite loop by rejecting all samples. This should never happen with a truly random source. Actually, never is a strong word, let’s just say it should not happen in the next billion years. So, we want to signal there is an error when this happens, rather than go into an infinite loop.</p>
<p>The final version of our algorithm sets an upper bound on how many times sampling is attempted before a valid sample is found. We will set this upper bound to be 100 attempts. The probability that we need more than 100 random samples is less than \( 2^{-100} \), so it is more likely we have a bug in our source of randomness than that this long sequence of rejected samples actually happened by chance.</p>
<p>We also provide a simple utility function <code class="highlighter-rouge">getRandIntInclusive</code> to generate integers in an arbitrary range that <em>includes</em> the upper and lower bounds. We also made everything work sensibly when those bounds are fractional numbers: it works as long as there is an integer in the range. So that,</p>
<ul>
<li><code class="highlighter-rouge">getRandIntInclusive(2, 3)</code> returns 2 or 3;</li>
<li><code class="highlighter-rouge">getRandIntInclusive(2.1, 2.9)</code> fails; and</li>
<li><code class="highlighter-rouge">getRandIntInclusive(2.1, 3.9)</code> always returns 3.</li>
</ul>
<p>Here’s the final version:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s2">"use strict"</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">crypto</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="s1">'crypto'</span><span class="p">)</span>
<span class="c1">// 32 bit maximum</span>
<span class="kd">const</span> <span class="nx">maxRange</span> <span class="o">=</span> <span class="mi">4294967296</span> <span class="c1">// 2^32</span>
<span class="kd">function</span> <span class="nx">getRandSample</span><span class="p">(){</span><span class="k">return</span> <span class="nx">crypto</span><span class="p">.</span><span class="nx">randomBytes</span><span class="p">(</span><span class="mi">4</span><span class="p">).</span><span class="nx">readUInt32LE</span><span class="p">()}</span> <span class="c1">//Node.js, change for Web API</span>
<span class="kd">function</span> <span class="nx">unsafeCoerce</span><span class="p">(</span><span class="nx">sample</span><span class="p">,</span> <span class="nx">range</span><span class="p">){</span><span class="k">return</span> <span class="nx">sample</span> <span class="o">%</span> <span class="nx">range</span><span class="p">}</span>
<span class="kd">function</span> <span class="nx">inExtendedRange</span><span class="p">(</span><span class="nx">sample</span><span class="p">,</span> <span class="nx">range</span><span class="p">){</span><span class="k">return</span> <span class="nx">sample</span> <span class="o">&lt;</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span><span class="nx">maxRange</span> <span class="o">/</span> <span class="nx">range</span><span class="p">)</span> <span class="o">*</span> <span class="nx">range</span><span class="p">}</span>
<span class="cm">/* extended range rejection sampling */</span>
<span class="kd">const</span> <span class="nx">maxIter</span> <span class="o">=</span> <span class="mi">100</span>
<span class="kd">function</span> <span class="nx">rejectionSampling</span><span class="p">(</span><span class="nx">range</span><span class="p">,</span> <span class="nx">inRange</span><span class="p">,</span> <span class="nx">coerce</span><span class="p">){</span>
<span class="kd">var</span> <span class="nx">sample</span>
<span class="kd">var</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">do</span><span class="p">{</span>
<span class="nx">sample</span> <span class="o">=</span> <span class="nx">getRandSample</span><span class="p">()</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">i</span> <span class="o">&gt;=</span> <span class="nx">maxIter</span><span class="p">){</span>
<span class="c1">// do some error reporting.</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">"Too many iterations. Check your source of randomness."</span><span class="p">)</span>
<span class="k">break</span> <span class="cm">/* just returns biased sample using remainder */</span><span class="p">}</span>
<span class="nx">i</span><span class="o">++</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span> <span class="o">!</span><span class="nx">inRange</span><span class="p">(</span><span class="nx">sample</span><span class="p">,</span> <span class="nx">range</span><span class="p">)</span> <span class="p">)</span>
<span class="k">return</span> <span class="nx">coerce</span><span class="p">(</span><span class="nx">sample</span><span class="p">,</span> <span class="nx">range</span><span class="p">)</span>
<span class="p">}</span>
<span class="c1">// returns random value in interval [0,range) -- excludes the upper bound</span>
<span class="kd">function</span> <span class="nx">getRandIntLessThan</span><span class="p">(</span><span class="nx">range</span><span class="p">){</span>
<span class="k">return</span> <span class="nx">rejectionSampling</span><span class="p">(</span><span class="nb">Math</span><span class="p">.</span><span class="nx">ceil</span><span class="p">(</span><span class="nx">range</span><span class="p">),</span> <span class="nx">inExtendedRange</span><span class="p">,</span> <span class="nx">unsafeCoerce</span><span class="p">)}</span>
<span class="c1">// returned value is in interval [low, high] -- upper bound is included</span>
<span class="kd">function</span> <span class="nx">getRandIntInclusive</span><span class="p">(</span><span class="nx">low</span><span class="p">,</span> <span class="nx">hi</span><span class="p">){</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">low</span> <span class="o">&lt;=</span> <span class="nx">hi</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">l</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">ceil</span><span class="p">(</span><span class="nx">low</span><span class="p">)</span> <span class="c1">//make also work for fractional arguments</span>
<span class="kd">const</span> <span class="nx">h</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span><span class="nx">hi</span><span class="p">)</span> <span class="c1">//there must be an integer in the interval</span>
<span class="k">return</span> <span class="p">(</span><span class="nx">l</span> <span class="o">+</span> <span class="nx">getRandIntLessThan</span><span class="p">(</span> <span class="nx">h</span> <span class="o">-</span> <span class="nx">l</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="p">}}</span>
<span class="kd">var</span> <span class="nx">winner</span> <span class="o">=</span> <span class="nx">getRandIntInclusive</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">9</span><span class="p">)</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`And the winner is You! Ticket number </span><span class="p">${</span><span class="nx">winner</span><span class="p">}</span><span class="s2"> !`</span><span class="p">)</span>
</code></pre></div></div>
<p>(Here’s a <a href="https://gist.github.com/dimitri-xyz/ba6f6d81a9db39d2a918fb8ecece9a76">haskell version</a> of the same code.)</p>
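<p>To make the bias that rejection sampling removes concrete, here is a small deterministic sketch. It uses 8-bit samples instead of 32-bit ones (an assumption made only to keep the enumeration tiny) and counts how many sample values map to each ticket from 0 to 9, with and without rejection. The function name <code class="highlighter-rouge">countOutcomes</code> is hypothetical.</p>

```javascript
"use strict";
const sampleSpace = 256 // all possible values of one random byte
const range = 10        // tickets 0..9

// Counts how many byte values map to each ticket number.
function countOutcomes(useRejection){
  const counts = new Array(range).fill(0)
  const limit = Math.floor(sampleSpace / range) * range // 250
  for (let sample = 0; sample < sampleSpace; sample++){
    if (useRejection && sample >= limit) continue // this sample would be redrawn
    counts[sample % range]++
  }
  return counts
}

console.log(countOutcomes(false)) // tickets 0..5 get 26 values, tickets 6..9 only 25
console.log(countOutcomes(true))  // every ticket gets exactly 25 values
```

<p>With a plain remainder, tickets 0 to 5 are slightly more likely to win. Discarding the six samples from 250 to 255 makes every ticket equally likely.</p>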
<p>That’s all folks! Now you know how to properly generate uniformly distributed random integers from random bytes. Tricky, isn’t it!?</p>
<p><em>Acknowledgments: I would like to thank <a href="http://jelv.is/">Tikhon Jelvis</a> for some great suggestions to improve this blog post.</em></p>
Fri, 21 Apr 2017 00:00:00 +0000 http://dimitri.xyz/random-ints-from-random-bits/
Flushing Mortgage Payments Down the Toilet
<p>Nobody likes to pay rent. You never see that money again. From the point of view of the renter, paying rent is just like flushing your money down the toilet. The money is gone.</p>
<p>Many proud home owners live under the impression that they do not flush their money down the toilet each month. Unfortunately, that is not entirely true. Most home owners have to make monthly mortgage payments. Part of these payments does pay for the house, but another part pays for interest on the debt. Paying interest on debt is very similar to paying for rent. The money is gone, you will never see it again. To find out how much of your mortgage payments will be flushed down the toilet each month, we need to do some calculations.</p>
<p>Buying a house is a highly leveraged and, therefore, risky bet. Buying a property with a 20% down payment is a bet with 5× leverage. An 8% decrease in the value of the home means that the home owner just lost 5×8 = 40% of his down payment! The calculations that follow ignore changes in home value, so they should either be combined with an estimate of those changes or be applied when home prices are expected to remain stable over the long run.</p>
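<p>The leverage arithmetic can be written as a one-line function. This is only a sketch: the name <code class="highlighter-rouge">downPaymentLoss</code> is hypothetical, and it ignores transaction costs and any principal already paid off.</p>

```javascript
"use strict";
// Fraction of the down payment lost when the home value drops.
// Both arguments are fractions of the purchase price (0.20 means 20%).
function downPaymentLoss(downPaymentFraction, priceDropFraction){
  const leverage = 1 / downPaymentFraction // 20% down gives 5x leverage
  return leverage * priceDropFraction
}

console.log(downPaymentLoss(0.20, 0.08).toFixed(2)) // 0.40, i.e. 40% of the down payment
```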
<h3 id="how-much-interest-are-you-paying">How much interest are you paying?</h3>
<p>As you pay your mortgage, how much is owed to the bank decreases with time. The home owner only pays interest on the remaining balance, not the whole amount borrowed.</p>
<p>To illustrate this point, consider a simple mortgage payment scheme with decreasing payments. Assume you borrow $500,000 from your bank at an interest rate of 5% per year for 25 years (300 months). That corresponds to a monthly rate of \(1.05^{1/12} - 1\), about 0.407%; the figures below are computed with the unrounded rate of 0.407412%, even where it is written as 0.00407. One way to pay down the mortgage debt is to pay, each month, a fixed part of the principal plus all interest accrued during that month. With 300 monthly payments, you pay 1/300th of the initial balance of $500,000 each month together with that month’s interest. This way, any interest owed never compounds for more than a month. The amount of principal paid each month is</p>
<p>$500,000/(300 payments) = $1,666.67 per month.</p>
<p>The interest accrued during the <em>first</em> period is</p>
<p>500,000 × 0.00407 = $2,037.06.</p>
<p>So, the first month’s payment is</p>
<p>1,666.67 + 2,037.06 = $3,703.73</p>
<p>You owe an initial balance \(B_{0}\) of $500,000 when you take out the money. After you make your first payment, at the end of your first month, you will owe</p>
<p>\(B_{1} = 299/300 × 500,000 = $498,333.33 \)</p>
<p>At the end of the second month, you will again pay $1,666.67 corresponding to 1/300th of the principal, but this time your remaining balance has decreased to $498,333 and so you will pay a little less interest. The total payment at the end of the second month is given by</p>
<p>1,666.67 + (299/300 × 500,000) × 0.00407 = 3,696.94</p>
<p>At the end of the second month, the debt is reduced to 298/300ths of its original value.</p>
<p>\(B_{2}\) = 298/300 × 500,000 = $496,667</p>
<p>Fast forward almost 25 years and you will only owe 1/300th of $500,000</p>
<p>\(B_{299}\) = 1/300 × 500,000 = $1,666.67</p>
<p>during the last month of your mortgage. You will pay a correspondingly very small amount of interest on your debt then, only 0.00407 × $1,666.67 = $6.79. Your final payment will be</p>
<p>1,666.67 + 6.79 = 1,673.46</p>
<p>Notice that the amount of interest paid on the first month is 300 times larger than the amount of interest paid on the last month! This large difference makes this scheme impractical for many people who can easily afford the final payments but not the initial ones.</p>
<p>In this decreasing payments scheme, the monthly payments decrease linearly by \( 0.00407 × 1,666.67 = $ 6.79 \) per month. Simply adding up all the payments shows that you will end up paying a total of $806,579 on $500,000 of debt. In other words, 1.613 times the original amount.</p>
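<p>The whole decreasing-payments schedule can be reproduced in a few lines. This is a minimal sketch: the function name <code class="highlighter-rouge">decreasingSchedule</code> is ours, and the monthly rate is assumed to be \(1.05^{1/12} - 1\), which is what the 0.407% figure corresponds to. Because the text rounds intermediate values, the total may differ from the quoted one by a dollar or so.</p>

```javascript
"use strict";
// Builds the decreasing-payments schedule: each month pays a fixed slice of
// principal plus the interest accrued on the balance still owed.
function decreasingSchedule(principal, yearlyRate, months){
  const monthlyRate = Math.pow(1 + yearlyRate, 1 / 12) - 1 // about 0.00407 for 5%
  const principalPart = principal / months
  const payments = []
  let balance = principal
  for (let k = 0; k < months; k++){
    payments.push(principalPart + balance * monthlyRate)
    balance -= principalPart
  }
  return payments
}

const payments = decreasingSchedule(500000, 0.05, 300)
const total = payments.reduce((sum, p) => sum + p, 0)
console.log(payments[0].toFixed(2))   // 3703.73, the first payment
console.log(payments[299].toFixed(2)) // 1673.46, the last payment
console.log(Math.round(total))        // total paid over the 300 months
```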
<h3 id="interest-paid-through-fixed-payments">Interest paid through Fixed Payments</h3>
<p>Most mortgages are paid through fixed (rather than decreasing) monthly payments.</p>
<p>In a fixed-payments mortgage, the home owner borrows an initial amount \( B_0 \) from the bank and makes a total of \(n\) fixed monthly payments of \(P\) dollars to pay off his debt. As the home owner makes his payments, the remaining balance decreases and a progressively smaller fraction of each payment goes towards paying interest. Consequently, more and more money goes towards paying the principal. This is shown in the figure below.</p>
<p>The red areas depict how much interest is being paid each month and the \(a_k\) represent the corresponding amount of principal being paid on the k-th month. The remaining balance after the k-th payment is \(B_k\). After the home owner makes his last payment this balance is zero.</p>
<p><img src="http://dimitri.xyz/assets/fixed-payments1.png" alt="Fixed Payments" /></p>
<p>Let us calculate what the monthly payments should be. Instead of using the interest rate \(i\), we will use the <em>gross</em> interest rate given by \(r=1+i\) to simplify the calculation. For the monthly rate in our example above, \(i=0.00407\) and \(r=1.00407\).</p>
<p>The initial balance is \(B_0\). After the first month, the remaining balance will be</p>
<script type="math/tex; mode=display">B_1 = r B_0 - P</script>
<p>The balance at the end of the second month is</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align}
B_2 & = r B_1 - P \\
& = r ( r B_0 - P) - P \\
& = r^2 B_0 - r P - P
\end{align} %]]></script>
<p>At the end of the third month</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align}
B_3 & = r B_2 - P \\
& = r ( r B_1 - P) - P \\
& = r ( r ( r B_0 - P ) - P ) - P \\
& = r^3 B_0 - r^2 P - r P - P
\end{align} %]]></script>
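<p>The expansion above can be checked numerically. The sketch below applies the month-by-month rule and compares it with the expanded form. The function names and the payment of 3,000 are illustrative assumptions, not values from the example.</p>

```javascript
"use strict";
// Applies B_k = r * B_{k-1} - P for `months` steps.
function balanceAfter(months, initialBalance, r, payment){
  let balance = initialBalance
  for (let k = 0; k < months; k++){
    balance = r * balance - payment
  }
  return balance
}

// Expanded form: r^k * B_0 - P * (r^(k-1) + ... + r + 1).
function expandedBalance(months, initialBalance, r, payment){
  let geometricSum = 0
  for (let j = 0; j < months; j++){
    geometricSum += Math.pow(r, j)
  }
  return Math.pow(r, months) * initialBalance - payment * geometricSum
}

// Illustrative numbers only: B_0 = 500,000, r = 1.00407, P = 3,000.
const stepwise = balanceAfter(3, 500000, 1.00407, 3000)
const expanded = expandedBalance(3, 500000, 1.00407, 3000)
console.log(Math.abs(stepwise - expanded) < 1e-6) // true: both forms agree
```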
<p>In short, after each month the remaining balance is being multiplied by the gross interest rate \(r\) and then a payment of \(P\) is subtracted. At the end of \(n\) months the home owner will have paid his mortgage and we will have</p>
<p>\( 0 = B_n = r^n B_0 -r^{n-1} P - r^{n-2} P - \ldots - r P - P \)</p>
<p>\( 0 = r^n B_0 - P ( r^{n-1} + r^{n-2} + \ldots + r + 1) \)</p>
<p><a href="http://dimitri.xyz/topic-pages/gp-sum">Calculating the sum</a> of the standard geometric progression on the right, we obtain</p>
<p>\[ 0 = r^n B_0 - P \frac{r^n - 1}{r - 1} \]</p>
<p>Finally, rearranging the formula for \(P\)</p>
<p>\[ P = B_0 \frac{r^n(r - 1)}{r^n - 1}\]</p>
<p>This formula is all we need to calculate how much your monthly payments should be. For our $500,000 loan with 5% yearly interest we obtain \(P\) = $2,890.69. Adding up all 300 payments shows that you will pay a total of $867,207, which is 1.73 times the borrowed amount. Put another way, each payment is 73% larger, an extra 2,890.69 - 1,666.67 = $1,224.02, than it would be for a zero-interest loan.</p>
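<p>The payment formula is easy to evaluate directly. The following sketch uses a hypothetical helper, <code class="highlighter-rouge">fixedPayment</code>, and assumes the gross monthly rate is \(r = 1.05^{1/12}\). It also replays the month-by-month balance to confirm that the debt reaches zero after the last payment.</p>

```javascript
"use strict";
// Monthly payment P = B_0 * r^n * (r - 1) / (r^n - 1).
function fixedPayment(initialBalance, r, months){
  const growth = Math.pow(r, months)
  return initialBalance * growth * (r - 1) / (growth - 1)
}

const r = Math.pow(1.05, 1 / 12) // gross monthly rate, assuming 5% compounds yearly
const payment = fixedPayment(500000, r, 300)
console.log(payment.toFixed(2))                  // 2890.69
console.log(Math.round(payment * 300))           // total of all 300 payments
console.log((payment - 500000 / 300).toFixed(2)) // extra per month versus a zero-interest loan

// Sanity check: applying B_k = r * B_{k-1} - P for 300 months clears the debt.
let balance = 500000
for (let k = 0; k < 300; k++){ balance = r * balance - payment }
console.log(Math.abs(balance) < 0.01) // true
```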
<p>So far we have left out the very important tax breaks. The U.S. government gives a tax break for interest paid on mortgages. If we assume that the home owner is in a 33% tax bracket and can benefit from both federal and state tax incentives, the tax saving is about one third of the interest paid, so the net cost of interest is reduced to 2/3 of its nominal value. In our example, this means that only \(\frac{2}{3}\) × 1,224.02 = $816.01 is flushed down the toilet each month. Add to this amount the extra costs of owning a home and compare the total to your rent payments. If you are still paying more than the total in rent and you don’t expect home prices to fall, buying a home may be a good move for you.</p>
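<p>The tax adjustment is a single multiplication. The sketch below assumes, as the arithmetic in the text does, that one third of the interest comes back as a tax saving and that all of the interest is deductible. The function name <code class="highlighter-rouge">interestAfterTax</code> is hypothetical.</p>

```javascript
"use strict";
// Interest that stays "flushed" after the tax break, assuming every dollar of
// interest is deductible at the given marginal tax rate.
function interestAfterTax(monthlyInterest, taxRate){
  return monthlyInterest * (1 - taxRate)
}

console.log(interestAfterTax(1224.02, 1 / 3).toFixed(2)) // 816.01
```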
<h3 id="conclusion">Conclusion</h3>
<p>Don’t fool yourself into thinking buying a home is always a good investment because you will not be paying rent. Calculate how much interest you will be paying, what tax breaks you will get, and the extra costs of owning a property (recurring property tax, insurance, maintenance and depreciation, plus the one-time costs of closing a deal). Add the opportunity cost of not having savings, and compare all that to your rent payments before making a decision. The $816 above only takes into account the interest itself.</p>
<p><em>This entry has been updated to include the opportunity cost and make explicit that the calculation above does not include all costs. The layout has also been changed.</em></p>
Sat, 15 May 2010 16:41:01 +0000 http://dimitri.xyz/2010/05/15/flushing-mortgage-payments-down-the-toilet/
The Liar&#8217;s Paradox
<blockquote>
<p><strong>“This sentence is false.”</strong></p>
</blockquote>
<p>If this statement is true, then what it states must be true. It states that it is false. So, if we assume the statement is true we conclude it must be false. If the statement is false, then what it states should be false, but it correctly states that it is false. So, it is true. Thus, if we assume the statement is false, we conclude that it must be true. The statement appears to be neither true nor false and yet it must be either true or false. That is a paradox.</p>
<p>Now consider another statement: “This statement is true.” Is this true or false? Could it be both? Be careful with self-referential statements, they are tricky and can prove anything. The following statement can never be false, “I’m the hottest Brazilian in the world or this statement is false.” Think about it. I have just proved I’m the hottest Brazilian alive!</p>
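<p>One way to explore these statements is to try both truth values and keep the ones that do not contradict themselves. The sketch below does that under the assumption of ordinary two-valued logic. The helper <code class="highlighter-rouge">consistentValues</code> is hypothetical, and each statement is modelled as a function that returns what the statement claims, given a candidate truth value for itself.</p>

```javascript
"use strict";
// A self-referential statement S is modelled as a rule giving what S claims,
// given a candidate truth value for S. A value is consistent when the claim
// evaluates to the same truth value that was assumed for S.
function consistentValues(claim){
  return [true, false].filter(s => claim(s) === s)
}

console.log(consistentValues(s => !s)) // [] : "This sentence is false" has no consistent value
console.log(consistentValues(s => s))  // [ true, false ] : "This statement is true" can be either

// "A or this statement is false", tried with A false and with A true.
console.log(consistentValues(s => false || !s)) // [] : with A false it is the liar again
console.log(consistentValues(s => true || !s))  // [ true ] : only consistent when A is true
```

<p>If we insist that the last statement has a truth value at all, the only consistent option is that both the statement and A are true, which is exactly the trick used above.</p>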
Mon, 08 Feb 2010 23:04:43 +0000 http://dimitri.xyz/2010/02/08/the-liars-paradox/
counterintuitive