Division 1 saw an interesting match, with a reasonably easy 250 pointer, a not-so-standard medium
problem that startled quite a lot of coders, and a hard problem worth only 900 points.

Petr blazed through the coding phase; he was the first one who had all three problems submitted,
and he enjoyed a comfortable lead before the challenge phase. But as the clock was ticking away,
Revenger's challenges rocketed him upwards, and soon the difference between the top two was
only 1.1 points. In the last few minutes Revenger's challenge streak continued and brought him
a well-deserved win in this SRM. Petr finished second. A similar battle was fought for the third
place, where Mg9H's five successful challenges decided the bronze medal for this SRM. Twelve other coders solved all three problems correctly.

In Division 2 the problem set contained a 250 that was simple to implement once you understood
the problem statement correctly, a tricky 500 pointer, and a hard 1000 point problem.
The only one who solved all three problems correctly was omax. Second place went to enefem21,
and third place to IgorTPH – the only other coder who successfully solved the hard problem.

This was one of those problems that reward coders who are able to calm down, read the whole
problem statement and look at the examples.

The general idea of this problem was clear: given some measurements, identify which of them
are valid, and compute the average of the valid ones.

The definition of an invalid measurement was similar to the approach most of us know from
secondary school physics: If a measurement gives a value wildly different from all others,
throw it away.

The trouble with temperature sensors is that the temperature can vary during the day. Thus
the definition said a measurement is invalid if and only if it is wildly different from
all other measurements made at roughly the same time. In other words, a measurement
is valid if and only if at least one other measurement made at roughly the same time
gives a similar value. (The problem statement gave an exact definition, where
"roughly the same time" was "within two minutes", and "similar" was "differs by at most 2".)

When implementing a solution probably the best approach is to split it into two parts:
first identify the valid measurements, then compute their average. Java code follows.
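
A minimal sketch of that two-part structure. The input representation is an assumption: `times[i]` is taken to be the minute at which `values[i]` was measured, and the class and method names are illustrative only. What should be returned when no measurement is valid is not covered by the text above, so the sketch returns NaN in that case.

```java
class ValidMeasurements {
    // Part one: a measurement is valid iff some OTHER measurement was taken
    // within two minutes of it and differs from it by at most 2.
    static boolean[] findValid(int[] times, int[] values) {
        int n = times.length;
        boolean[] valid = new boolean[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i == j) continue;
                boolean closeInTime = Math.abs(times[i] - times[j]) <= 2;
                boolean similar = Math.abs(values[i] - values[j]) <= 2;
                if (closeInTime && similar) {
                    valid[i] = true;
                    break;
                }
            }
        }
        return valid;
    }

    // Part two: average the valid measurements.
    // Assumption: NaN is returned when nothing is valid.
    static double averageOfValid(int[] times, int[] values) {
        boolean[] valid = findValid(times, values);
        long sum = 0;
        int count = 0;
        for (int i = 0; i < values.length; i++) {
            if (valid[i]) {
                sum += values[i];
                count++;
            }
        }
        return count == 0 ? Double.NaN : (double) sum / count;
    }
}
```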

The direct approach

There almost has to be a simple way to compute everything necessary just by looking at the three given integers:
smallest, largest, and number.
All we have to do is a careful case analysis.
First, two clear facts:

If number > largest, it's surely not a valid prefix.

If number lies in [smallest, largest] then it is a valid prefix.

Now we may assume that number < smallest.

Let us first solve an easier task: assume that smallest and largest have the same number of digits.
In this case the solution is simple:

Let d be the number of digits in number.

Let s1 be the number formed by the first d digits of smallest.

Let l1 be the number formed by the first d digits of largest.

number is a valid prefix if and only if s1 ≤ number ≤ l1.

For example, 479 is a valid prefix for the range [3138, 5120], because 479 lies between 313 and 512, inclusive.
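
The equal-length case can be sketched as follows. The class and method names are made up for illustration, and the sketch assumes positive integers, that smallest and largest have the same number of digits, and that number has no more digits than they do.

```java
class SameLengthPrefixCheck {
    // Assumes smallest and largest have the same digit count and that
    // number has at most that many digits.
    static boolean isValidPrefix(long smallest, long largest, long number) {
        int d = Long.toString(number).length();
        // s1 and l1: the first d digits of smallest and largest.
        long s1 = Long.parseLong(Long.toString(smallest).substring(0, d));
        long l1 = Long.parseLong(Long.toString(largest).substring(0, d));
        return s1 <= number && number <= l1;
    }
}
```

With the range [3138, 5120], this accepts 479 and 31 but rejects 52 and 30.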

Now for the general case (when smallest can have fewer digits than largest): We may split the
whole valid range into several shorter ranges, and then conclude that number is a valid prefix
if and only if it is a valid prefix for one of the shorter ranges.

For example, we may split the range [879, 12345] into [879, 999], [1000, 9999], and [10000, 12345].

This would already be enough to write a program, however we can make one more observation. If digit counts for
smallest and largest differ by more than one, all numbers smaller than smallest are
valid prefixes. (In the example above note that one of the ranges is [1000, 9999]. For this range all
numbers with up to four digits are valid prefixes.)

The trying-some-possibilities approach

Wow, that was quite a lot of cases one can easily overlook, wasn't it? A general piece of advice for
such situations: Grab a pen and paper, write everything down, and make damn sure you didn't
miss any case before you start coding.

Or think of a different approach.

The problem statement asked us whether we can add digits to (the right end of) number so that
the result lies in the range [smallest, largest]. First, how many digits will we add?
Clearly, largest has at most 10 digits, number at least one, thus we will only add
up to 9 digits.

Now, for example, let number=347. What numbers can we make by adding three new digits? The
smallest one is 347000, the largest one is 347999, and clearly we can make all the numbers between
these two values.

And that's almost all the insight we need.

The entire solution will look as follows: Try all possibilities saying how many digits you'll add.
For each possibility, compute the range of numbers you can make. If this range intersects
[smallest, largest], you can make a valid value, and thus the input is valid.
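
A short sketch of this loop, assuming all three inputs are positive integers that fit into 10 digits (so every intermediate value fits into a long). The class and method names are illustrative.

```java
class PrefixByRanges {
    static boolean isValidPrefix(long smallest, long largest, long number) {
        // [lo, hi] is the range of values reachable by appending a fixed
        // number of digits to number; we start with zero digits appended.
        long lo = number;
        long hi = number;
        for (int added = 0; added <= 9; added++) {
            // Two closed ranges intersect iff each one starts before the other ends.
            if (lo <= largest && hi >= smallest) return true;
            // Appending more digits only makes the values larger.
            if (lo > largest) break;
            lo = lo * 10;      // append a 0
            hi = hi * 10 + 9;  // append a 9
        }
        return false;
    }
}
```

For the range [879, 12345] this accepts 13 (for example via 1300), and for [3138, 5120] it rejects 52, because 5200 is already too large.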

This problem was maybe a little bit different than most Div1-mediums tended to be in the past months.
The problem statement was pretty clear on what you had to do: generate temperatures, compute medians,
and (just to minimize output size) keep their sum. The only thing the problem statement didn't mention
was how to do it.

And there were quite a few ways. The necessary first observation was that the most
straightforward ones will time out. But beyond that, there were still plenty of approaches to choose
from. We will present some of the more efficient approaches below.

Heavy artillery approach

Any theoretical computer scientist would just wave his hand and say: Simply store the current set of
numbers in a balanced tree. You can add new elements, delete old elements, and query for the median,
all in O(log K) time.

Of course, that's easier said than done. The trouble is that none of the tree-like data structures
available in the standard libraries support a fast k-th element query ("given k, find the k-th smallest
of the stored numbers"). You are on your own here. With a pre-written tree structure you have already won;
with a textbook and more than half an hour of time you stand a chance.

The thing you have to add to a canonical balanced tree structure is information about subtree sizes:
for each node in the tree, keep the count of elements in the subtree rooted at this node.
For particularly easy-to-implement balanced tree structures, check out splay trees and treaps.
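
To illustrate how subtree sizes give a k-th element query, here is a deliberately simplified sketch. It uses a plain binary search tree with no rebalancing, so it is not fast enough on its own; in a real solution the same size field would be maintained inside the rotations of a treap or splay tree. All names are illustrative.

```java
class SizedBst {
    static final class Node {
        final int key;
        Node left, right;
        int size = 1; // number of elements in the subtree rooted here

        Node(int key) { this.key = key; }
    }

    static int size(Node n) { return n == null ? 0 : n.size; }

    // Plain (unbalanced) insertion; duplicates go to the right subtree.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else root.right = insert(root.right, key);
        root.size = 1 + size(root.left) + size(root.right);
        return root;
    }

    // Returns the k-th smallest stored key, k counted from 1.
    static int kth(Node root, int k) {
        Node cur = root;
        while (cur != null) {
            int leftSize = size(cur.left);
            if (k <= leftSize) {
                cur = cur.left;          // answer is in the left subtree
            } else if (k == leftSize + 1) {
                return cur.key;          // this node is the answer
            } else {
                k -= leftSize + 1;       // skip the left subtree and this node
                cur = cur.right;
            }
        }
        throw new IllegalArgumentException("k out of range");
    }
}
```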

Cleverly using what is available

Can we circumvent the need for k-th element queries somehow? The answer is yes. The trick is to
split the active elements into a "lower half" and an "upper half".

More precisely, we will keep two multisets of numbers: the first one will contain the ceil(K/2) smallest
elements, and the other one will contain the remaining ones. The median is the largest element in
the first multiset.

The time complexity of this approach is the same as for the previous solution, O(N log K). Note that
in each loop we only have to move O(1) elements between the multisets to maintain the balance.

A note to Java users: Java doesn't have a multiset data structure. In the official solution
I used a TreeMap to store records of the type "value->multiplicity". For example, if the
values were {2, 3, 4, 4, 4, 5, 5}, then lowerHalf would be "2->1 3->1 4->2" and upperHalf
would be "4->1 5->2".
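
The following is a sketch of the two-multiset idea on top of TreeMap, not the official solution. The temperature generator is left out: the sketch assumes the temperatures are already available in an array, and the class name and the method sumOfMedians are made up for illustration.

```java
import java.util.TreeMap;

class SlidingMedian {
    // value -> multiplicity, as described above
    private final TreeMap<Integer, Integer> lowerHalf = new TreeMap<>();
    private final TreeMap<Integer, Integer> upperHalf = new TreeMap<>();
    private int lowerSize = 0, upperSize = 0;

    private static void add(TreeMap<Integer, Integer> m, int v) {
        m.merge(v, 1, Integer::sum);
    }

    private static void remove(TreeMap<Integer, Integer> m, int v) {
        if (m.merge(v, -1, Integer::sum) == 0) m.remove(v);
    }

    // Restore lowerSize == ceil(total / 2) by moving boundary elements.
    private void rebalance() {
        while (lowerSize > upperSize + 1) {
            int v = lowerHalf.lastKey();
            remove(lowerHalf, v); add(upperHalf, v);
            lowerSize--; upperSize++;
        }
        while (lowerSize < upperSize) {
            int v = upperHalf.firstKey();
            remove(upperHalf, v); add(lowerHalf, v);
            lowerSize++; upperSize--;
        }
    }

    void insert(int v) {
        if (lowerHalf.isEmpty() || v <= lowerHalf.lastKey()) { add(lowerHalf, v); lowerSize++; }
        else { add(upperHalf, v); upperSize++; }
        rebalance();
    }

    // Assumes v is currently stored in one of the halves.
    void erase(int v) {
        if (lowerHalf.containsKey(v)) { remove(lowerHalf, v); lowerSize--; }
        else { remove(upperHalf, v); upperSize--; }
        rebalance();
    }

    int median() { return lowerHalf.lastKey(); }

    // Sum of the medians of all windows of k consecutive elements.
    static long sumOfMedians(int[] a, int k) {
        SlidingMedian window = new SlidingMedian();
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            window.insert(a[i]);
            if (i >= k) window.erase(a[i - k]);
            if (i >= k - 1) sum += window.median();
        }
        return sum;
    }
}
```

Each insert or erase is followed by at most a constant number of moves between the two maps, which matches the O(1) balancing claim above.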

Using the limited range of possible values, part one

Another possibility was to notice that the possible values are only integers from a limited range [0,65535].
In this case we can use a data structure that is really simple to implement. I've seen it under many
names, the one I like most is "interval/segment trees".

The idea is pretty simple: For each half-open interval of the type [ a*2^b, (a+1)*2^b ) we will keep the
count of elements that lie inside the interval.

For each of the segments shown in the picture we will keep the count of elements it contains.

This data structure allows a wide range of operations to be done in logarithmic time:
insert an element, erase an element, find the k-th smallest element, find the count of elements
within a given range, and so on. We will only need the first three.

Insert and erase are both pretty simple: If there are R=65536 different values, then there are
1+log2 R = 17 levels in our image, and on each level exactly one of the intervals
contains the element in question. Thus we have to update 17 stored values (or, in general, the
count of operations is logarithmic in the length of the range).

Finding the k-th smallest element works much like binary search: If there are enough
elements in the left half, look for the k-th element there, otherwise subtract the count of
elements on the left from k, and look at the right half. This can also be done in logarithmic time.
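
The three operations can be sketched as below. This is one common layout, assumed here for illustration: the segments are stored in a single array in heap order (node i has children 2i and 2i+1, and the leaf for value v sits at index R+v). The class and method names are not from the original solution.

```java
class CountingSegmentTree {
    private static final int R = 1 << 16;       // values lie in [0, 65535]
    private final int[] count = new int[2 * R]; // count[i]: elements inside segment i

    // Walk from the leaf up to the root, updating one segment per level.
    private void update(int value, int delta) {
        for (int i = R + value; i >= 1; i /= 2) count[i] += delta;
    }

    void insert(int value) { update(value, +1); }

    // Assumes value is currently stored.
    void erase(int value) { update(value, -1); }

    // Returns the k-th smallest stored value, k counted from 1.
    // Assumes 1 <= k <= number of stored elements.
    int kth(int k) {
        int i = 1; // start at the root, which covers the whole range
        while (i < R) {
            int left = 2 * i;
            if (count[left] >= k) {
                i = left;          // enough elements in the left half
            } else {
                k -= count[left];  // skip everything in the left half
                i = left + 1;
            }
        }
        return i - R;
    }
}
```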

A final note: Our implementation is lazy and it uses O(R log R) space, where R is the size of the
range of allowed values. Clearly the total number of segments is 2R-1. Moreover, the information
contained in them is redundant. A simple optimization is to store the information for the left half
only. Using this optimization, a single R-element array is enough to store the tree. Bitwise
operations can be used to simplify the implementation even more.
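
The optimization described here closely resembles what is usually called a Fenwick tree (binary indexed tree). The sketch below is that standard structure, which may differ in detail from what the author had in mind. The names are illustrative.

```java
class FenwickCounts {
    private static final int R = 1 << 16;       // values lie in [0, 65535]
    private final int[] tree = new int[R + 1];  // 1-based; index v + 1 belongs to value v

    // Add delta (+1 to insert, -1 to erase) to the count of value.
    void add(int value, int delta) {
        for (int i = value + 1; i <= R; i += i & -i) tree[i] += delta;
    }

    // Returns the k-th smallest stored value, k counted from 1.
    // Assumes 1 <= k <= number of stored elements.
    int kth(int k) {
        int pos = 0;
        // Binary descent: find the largest position whose prefix count is below k.
        for (int step = R; step > 0; step >>= 1) {
            if (pos + step <= R && tree[pos + step] < k) {
                pos += step;
                k -= tree[pos];
            }
        }
        // The answer has 1-based index pos + 1, which corresponds to value pos.
        return pos;
    }
}
```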

Using the limited range of possible values, part two

Yet another possible optimization: The pseudorandom number generator generates the next number
based on the previous number only. This means that once a generated value repeats, we entered a cycle,
and from now on the same sequence will be generated over and over.

Thus we can do a brute-force solution until all the active temperatures lie in the cycle. Then we compute
the medians for one complete iteration of the cycle, and count each of them the right number of times.

Other approaches

A few contestants managed to squeeze in solutions that maintained the active temperatures in
a sorted array. The constraints suggested that this solution may pass, if it can be optimized enough,
but still it was a gamble, and I suspect that there were many who failed, and only a few who managed
to optimize their solutions enough. (E.g., one solution I've seen used memcpy to move large chunks of the
sorted array.) Only go this way if you are out of any other ideas, and make sure to test your solution
on large timeout cases before submitting.

There were also some other ways to solve the problem, e.g. instead of the complete interval tree one could use
256-element buckets to store partial information about the temperatures. This may not be simpler to
implement, but it surely is easier to think of if you have never implemented an interval tree before.

Final notes

This problem (a floating median) is a common problem in meteorological software. How would you
compute a floating minimum? And how about a floating average? (If you found the last one too easy:
what about a floating average that doesn't accumulate rounding errors over time?)

One image is better than a thousand words, so we will start by
looking at the images from the problem statement once again.

The most problematic part of the surface is the top side.
From some of the faces only a half is visible, from some of them
three fourths... looks complicated, doesn't it?

Well, it doesn't. The partial areas may have any shapes they please, still the total
area is an integer. To see why it is so, just look at the pyramid from the top. The total
area we see is the total area of all the top sides – but also it is the total area
of the bottom side of the pyramid.

We are left with two tasks: determine what the bottom of the pyramid looks like, and count
the total area of the vertical faces.

The bottom side will almost always be a square. If we have enough cubes to build a pyramid
of size N≥3, we have enough cubes to build a (N+1) by (N+1) square. So the only cases
where this has to be handled are when we have fewer than 14 cubes. More precisely, the bottom
side is not a square only for K=2, 3, 6, 7, and 8 cubes. The values 6, 7, and 8 made good challenge
cases.

A simple way to handle this: if we don't have enough cubes to build the bottom square, they
all sit on the bottom, thus the total area of the bottom side is the count of cubes we have.

The most natural solution when counting the vertical faces is to process the pyramid by levels,
just as it is built. For each level, find out the number of "rows" and "columns" of cubes
you can (at least partially) fill. The total number of vertical faces on this level is then
2*(rows+columns), and this holds even if the last row is not complete. (Look at the image
of the incomplete pyramid, and use the same argument as for the top side.)

Usually the number of columns will be equal to the "size" of the current level, but this
doesn't have to be true all the time. For example, for K=10 the result is a 3 by 3 square
with a single cube standing on it. This was also one of the more popular challenge cases.

This task didn't cause much trouble for people comfortable with memoization and/or
exponential-time dynamic programming. Still, not every coder at TopCoder has years of
experience under his belt. Without seeing and solving ten similar problems, finding
the right idea on how to solve this problem is hard. And this is why this problem
was used as the hard problem, even if the count of successful solutions was similar
to the medium problem.

Suppose that we have a partially built tower. The only things that matter now
are: which boxes are still unused, and what does the top side of the current tower look like.

The top side is a side of one of the used boxes. Each box has (up to) three different sides.
Thus the total count of different situations we can get into is 2^N * N * 3.
For each of the situations we will compute the best tower.

As usual, there are two isomorphic views of this problem. We can either turn it into
a dynamic programming solution that processes the situations with fewer free boxes first,
or we can write a recursive function and memoize the values it computes.
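
A sketch of the memoized variant follows. The stacking rule is an assumption made for this illustration: a box may be placed on the current top if its bottom side, possibly rotated by 90 degrees, is no larger than the top side in both dimensions. The class and method names are hypothetical as well.

```java
class BoxTowerSketch {
    private final int n;
    private final int[][] dims;   // dims[b] = the three dimensions of box b
    private final int[][][] memo; // memo[mask][top][o], -1 means not computed yet

    BoxTowerSketch(int[] x, int[] y, int[] z) {
        n = x.length;
        dims = new int[n][];
        for (int b = 0; b < n; b++) dims[b] = new int[]{x[b], y[b], z[b]};
        memo = new int[1 << n][n][3];
        for (int[][] byTop : memo)
            for (int[] byOrientation : byTop)
                java.util.Arrays.fill(byOrientation, -1);
    }

    // Horizontal side of box b when dimension o is vertical, smaller edge first.
    private int[] side(int b, int o) {
        int p = dims[b][(o + 1) % 3], q = dims[b][(o + 2) % 3];
        return new int[]{Math.min(p, q), Math.max(p, q)};
    }

    // Largest height that can still be added on top of box "top" in
    // orientation o, given that the boxes in mask are already used.
    private int best(int mask, int top, int o) {
        if (memo[mask][top][o] >= 0) return memo[mask][top][o];
        int[] t = side(top, o);
        int result = 0;
        for (int b = 0; b < n; b++) {
            if ((mask >> b & 1) != 0) continue; // box b is already used
            for (int bo = 0; bo < 3; bo++) {
                int[] s = side(b, bo);
                // Assumed rule: the new box must fit onto the current top side.
                if (s[0] <= t[0] && s[1] <= t[1]) {
                    result = Math.max(result, dims[b][bo] + best(mask | 1 << b, b, bo));
                }
            }
        }
        return memo[mask][top][o] = result;
    }

    int tallest() {
        int answer = 0;
        for (int b = 0; b < n; b++)
            for (int o = 0; o < 3; o++)
                answer = Math.max(answer, dims[b][o] + best(1 << b, b, o));
        return answer;
    }
}
```

The memo table has exactly 2^N * N * 3 entries, one per situation described above, and each entry is filled by trying every free box in every orientation.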

A slow but cute solution

One should note that there is a neat O(3^N * poly(N)) solution: For each box,
try 3 possible orientations (it only matters which side is vertical). Now, order the
boxes according to the area of their bottom side, and use a simple DP to compute
the height of the best tower. (This solution can handle up to 11 boxes within the
time limit, maybe you can squeeze 12 out of it, if you try hard.)