Time complexity of update and lookup in binary random access list - Software Engineering Stack Exchange
https://softwareengineering.stackexchange.com/q/297021
Asked by alf (https://softwareengineering.stackexchange.com/users/29108) on 2015-09-11; licensed under https://creativecommons.org/licenses/by-sa/4.0/
<p>I'm trying to get through one of the exercises in Okasaki's "Purely Functional Data Structures," where he presents zeroless binary numbers as the basis for a random-access list and asks the reader to</p>
<blockquote>
<p>9.6 Show that <code>lookup</code> and <code>update</code> on element <em>i</em> now run in <em>O(log i)</em> time.</p>
</blockquote>
<p>"Now" here contrasts the zeroless representation with a plain binary random-access list, for which lookup and update were shown to run in <em>O(log n)</em> time.</p>
<p>The approach is similar in both cases: we maintain a cons list of "digits", with each non-zero digit holding trees whose size equals the digit's weight, one tree per unit of the digit's value. The list as a whole is the binary (or zeroless binary) representation of the number of elements in the list.</p>
<p>For example, a list of 6 elements, <code>[1, 2, 3, 4, 5, 6]</code> (binary for 6 is 110, zeroless binary is 22; least significant digit first below), looks like</p>
<pre><code>[
Zero,
One(Node(Leaf(1), Leaf(2))),
One(Node(Node(Leaf(3), Leaf(4)), Node(Leaf(5), Leaf(6))))
]
</code></pre>
<p>in binary, or</p>
<pre><code>[
Two(Leaf(1), Leaf(2)),
Two(Node(Leaf(3), Leaf(4)), Node(Leaf(5), Leaf(6)))
]
</code></pre>
<p>in zeroless binary representation. Now, the <code>lookup</code> and <code>update</code> times are determined by the time spent locating the tree that contains the <em>i</em>-th element, plus the time spent navigating within that tree.</p>
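<p>To make the two phases concrete, here is a minimal sketch of <code>lookup</code> on the binary representation (in Python rather than Okasaki's ML; the tuple encoding and function names are my own):</p>

```python
# Trees are tuples: ("leaf", x) or ("node", size, left, right).
# Digits are None (Zero) or a tree (One).

def tree_size(t):
    return 1 if t[0] == "leaf" else t[1]

def lookup_tree(i, t):
    """Phase 2: descend a perfect tree, halving the candidate range each level."""
    if t[0] == "leaf":
        assert i == 0
        return t[1]
    _, w, left, right = t
    if i < w // 2:
        return lookup_tree(i, left)
    return lookup_tree(i - w // 2, right)

def lookup(i, digits):
    """Phase 1: scan the digit list for the tree holding index i."""
    for d in digits:
        if d is None:          # a Zero digit holds no elements
            continue
        if i < tree_size(d):
            return lookup_tree(i, d)
        i -= tree_size(d)
    raise IndexError("index out of bounds")
```

<p>With the six-element list from the example encoded as <code>[None, ("node", 2, ...), ("node", 4, ...)]</code>, both phases are visible: skip digits until the cumulative size covers <em>i</em>, then descend.</p>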
<p>What I don't get is why lookup of the <em>i</em>-th element in a binary random-access list takes <em>O(log n)</em> rather than <em>O(log i)</em>. We need to locate the tree containing the <em>i</em>-th element, which is <em>O(log i)</em>; navigation within the tree is capped by <em>O(log w)</em>, where <em>w</em> is the size of the tree, which is clearly less than <em>2i</em>.</p>
<p>What am I missing?</p>
Answer by alf, 2015-09-11 (https://softwareengineering.stackexchange.com/questions/297021/-/297051#297051)
<p>Well, compliments to @snowman for the advice to rewrite the question in plain English: I came up with an answer while doing that.</p>
<p>The <em>O(log n)</em> boundary comes from the way binary numbers work: they have zeroes. For instance, binary for 8 is 1000, or, reversed, <code>0001</code>. So the binary random-access list of 8 elements (e.g. <code>[1, 2, 3, 4, 5, 6, 7, 8]</code>) will have the following structure:</p>
<pre><code>[
Zero,
Zero,
Zero,
One(
Node(
Node(
Node(Leaf(1), Leaf(2)),
Node(Leaf(3), Leaf(4))
),
Node(
Node(Leaf(5), Leaf(6)),
Node(Leaf(7), Leaf(8))
)
)
)
]
</code></pre>
<p>Here, it's clear that getting to any element takes about <em>2 log n</em> steps: roughly <em>log n</em> digits to traverse, then <em>log n</em> levels to descend (if you don't trust me on this one, play with <em>n = 2^k</em>, e.g. 256, 1024, 65536...). So the premise that "we need to locate the tree containing the <em>i</em>-th element, which is <em>O(log i)</em>, and navigation within the tree is capped by <em>O(log w)</em>, where <em>w</em> is less than <em>2i</em>" was basically all wrong: the size of the tree does not depend on the element's index, and the position of the tree is dictated by the bits of <em>n</em>, not of <em>i</em>.</p>
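<p>The step count can be checked mechanically. The Python sketch below (helper names <code>pow2_list</code> and <code>lookup_steps</code> are mine, not Okasaki's) builds the list for <em>n = 2^k</em> and counts digit visits plus tree descents; the total is <em>2k + 1</em>, i.e. &Theta;(log n), for every index, including index 0:</p>

```python
# Trees are tuples: ("leaf", x) or ("node", size, left, right); Zero is None.

def perfect(h, base):
    """Perfect tree of height h holding base .. base + 2**h - 1."""
    if h == 0:
        return ("leaf", base)
    half = 2 ** (h - 1)
    return ("node", 2 ** h, perfect(h - 1, base), perfect(h - 1, base + half))

def pow2_list(k):
    """The list for n = 2**k: k Zeros followed by a single One."""
    return [None] * k + [perfect(k, 0)]

def lookup_steps(i, digits):
    """Count one step per digit visited plus one per tree level descended."""
    steps = 0
    for d in digits:
        steps += 1
        if d is None:
            continue
        size = 1 if d[0] == "leaf" else d[1]
        if i < size:
            t = d
            while t[0] == "node":      # descend toward the i-th leaf
                _, w, left, right = t
                steps += 1
                if i < w // 2:
                    t = left
                else:
                    i -= w // 2
                    t = right
            return steps
        i -= size
    raise IndexError("index out of bounds")
```

<p>For <em>n</em> = 256 (<em>k</em> = 8) the count is 17 steps no matter which index you look up, even index 0: the 8 Zeros and the 8 levels of the one big tree are unavoidable.</p>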
<p>In zeroless numbers, by contrast, every digit holds at least one tree of its weight, so the tree containing the <em>i</em>-th element will indeed be located in <em>O(log i)</em> steps, and its size will be at most <em>2i</em>.</p>
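<p>A zeroless sketch (again my own Python encoding: a non-empty list of trees stands in for One/Two/Three) shows why the bound drops to <em>O(log i)</em>. Since the first <em>k</em> digits hold at least <em>1 + 2 + ... + 2^(k-1) = 2^k - 1</em> elements, index <em>i</em> is always found within the first ~<em>log i</em> digits, in a tree of size at most about <em>2i</em>:</p>

```python
# Trees are tuples: ("leaf", x) or ("node", size, left, right).
# A digit is a non-empty list of one, two, or three trees of equal size.

def tree_size(t):
    return 1 if t[0] == "leaf" else t[1]

def lookup_tree(i, t):
    """Descend a perfect tree to the i-th leaf."""
    if t[0] == "leaf":
        assert i == 0
        return t[1]
    _, w, left, right = t
    if i < w // 2:
        return lookup_tree(i, left)
    return lookup_tree(i - w // 2, right)

def lookup(i, digits):
    # Every digit contributes at least one tree, so the scan reaches
    # index i within O(log i) digits, never scanning all O(log n) of them.
    for trees in digits:
        for t in trees:
            if i < tree_size(t):
                return lookup_tree(i, t)
            i -= tree_size(t)
    raise IndexError("index out of bounds")
```

<p>The six-element zeroless list from the question becomes <code>[[leaf, leaf], [node, node]]</code> in this encoding, and looking up a small index never touches the later, larger digits at all.</p>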