Monday, June 25, 2012

As I write this, there are 7 minutes left before the end of the coding phase and I have supposedly finished coding 250 and 500. Supposedly, because I have so many doubts... It is scary.

Div1 250: Expect hypotenuse!

This problem is cool. I hope it is original though it would be surprising to see nobody thought of this before. Even though it is so cool.

You are given w,x and y. At position 0, a line of integer height picked uniformly at random between 1 and x. At position w, another line this time picked between 1 and y. Return the expected length of a straight line connecting the top of each line. PS x,y < 1000000.

We cannot do a O(x*y) search trying each case. The second best choice would be a O(x) search - fix a height px for the first line and then somehow calculate the rest.

The somehow can be done by using an array acumy[i] that returns the sum of the hypotenuses of a 90 degrees triangle with base w and heights < i. This array can be accumulated in O(y) time.

With that array, you can calculate the sum between all the straight lines that could start from the first line and end in the second line. You just need to be clever about how to use it. You can later divide the sum by y and add it to a result. Then divide the total result by x.

Tons of doubts about this problem. The approach is correct, but I wonder if I have made an error somewhere. To calm myself, I coded a bruteforce solution and began running random test cases comparing the results. So far, so good.

When I submitted this I was shocked to find that almost everyone else in the room already submitted it. And a red coder already solved 500...

Div1 500: The one with the big table

You are given a table with some height and some width. The numbers on the table are such that the cell at (i,j) has value (i*width + j). Find a subrectangle of that table such that the total sum of the rectangle is equal to S and minimize the area of the rectangle. If no rectangles exist, return -1

The trick to this problem is to notice that, given a width w for the subrectangle, you already know something about it.

It is hard to explain, but given a rectangle of width w and height h, then it will be something like this:

0 1 2 ... w
0 1 2 ... w
...
0 1 2 ... w

Added to this:

0 0 0 ... 0
W W W ... W
...
2*W 2*W 2*W ... 2*W

(Where W is the total width) And this:

K K K ... K
K K K ... K
...
K K K ... K

Where K is the value of the cell at the top-left corner of the subrectangle.

Long story short, with that knowledge, you can, given w and h, calculate K. And given K you can verify that such a rectangle exists. The constraints seem too long for this approach, except that the formula for K actually allows you to break very early. So the approach seems to be something like O(width * sqrt(height)) if not better.

I am nervous because this solution is VERY straightforward, plus the contest writers decided to include the supposed worst case in the examples. (That is never good, it might mean there is another case that is slower). I am also concerned about overflow. My solution overflows in theory, but I cannot catch a case in which that matters.

Saturday, June 16, 2012

I figured I should post something about this SRM. I've been very busy these weeks because the semester is ending and I tried to win a t-shirt in the Marathon AND I am not assigned more editorials. But there we go.

Div1 250 - The one with the bit operations

I was just returning to the topcoder mindset after working a lot on an assignment. At the start of the match I felt like running in 10% of steam capacity. I truly felt dumb. I knew what to do to solve the problem, but I was having issues translating it to code.

Kleofas tail : If x is even, x = x / 2. If x is odd then x = x - 1. Repeat and repeat until you will eventually have a sequence of tons of 0s. Given K, A and B, return the number of numbers between A and B, inclusive such that the Kleofas pairs starting from number x eventually include K.

This is heavily related to the binary representation of a number. Let us say the binary representation of x is 10010010101. The sequence goes like this: 10010010101, 10010010100, 1001001010, 100100101, 100100100, 10010010, 1001001, 1001000, 100100, 10010, 1001, 100, 10, 1, 0, 0, 0, ... - If the right most bit is 1, it becomes 0 , else it gets removed. This means that there are two ways for a number x to eventually become K:

The binary representation of K is a prefix of the binary representation of x.

If K is even, the binary representation of (K+1) is a prefix of the binary representation of x.

Assuming that you have a function that counts the number of times a given number K' appears as a prefix in numbers between A and B, then you have an instant solution. But we can simplify it a bit more: Let f(A, K) return the number of elements X between 0 and A, inclusive such that K appears in their Kleofas tail:

We just need to calculate prefixes(A, K): The number of times K is a prefix of X for every X between 0 and A, inclusive.

The problem is how to do that... This is the moment at which my brain froze. At first I tried to do it arithmetically, but I could not. (It is possible, but slightly tricky). Eventually, I recognized that I was not at full brain capacity and that the best I could do would be a bullet (and idiot)-proof dynamic programming solution for the simple counting problem.

Now that I am in my senses, here is a quick approach. Imagine there are i bits after the K part in x. For example, if K=101 then x = 101?????. The minimum x that follows that condition is: 10100000 and the maximum x is : 10111111. Both values can be calculate easily given i. (The minimum is K multiplied by 2i, the maximum is the same plus (2i-1) ). Then the true maximum is min( K*2i + 2i-1, A). There are (true_maximum - minimum + 1) values of x. Then you can just iterate through all values of i until the minimum is larger than A.

Mixed feelings about this problem. Although it was good, we have had too many of these (tricky binary operations) division 1 250s.

Div1 500: the one with digits

I was better now. I knew that if I wanted to save my rating I had to solve this problem. Luckily it turned out to be a typical digit dynamic programming. I am not sure why, but I find these problems very easy. I still needed a lot of time though, because when I coded it my brain was still not functioning and I left as many bugs as you would ever find.

Thanks to the notes, we know that the result always fits a signed 64 bits integer. This means 18 digits. We will say that the maximum number has 20 digits at most. The idea is to use a recurrence relation. f(count1, count2, p, eq, zero): We have already decided the p first digits. eq means that the number is currently equal to N. zero means that the number is currently equal to 0. count1 is the minimum number of times we need to add digit1, and count2 works the same way. For convenience, we will say that numbers are strings, strings of digits. Once we get the final result we can convert it to long long. f() will actually find a string of MAX digits, it will have leading zeros when the result needs less than MAX digits.

Let us say p==MAX, this means that we have already decided all digits, what is the minimum remaining number? Well, if count1 or count2 is positive, then there is no way to fulfill this requirement anymore. Thus there is no result. We will mark instances with no result as infinite. If count1 and count2 are 0, then the result is the empty string - do not modify any new bit.

For another p, then we can try all digits from 0 to 9. Well, actually, we sometimes cannot. If eq==1, it means that all of the previous digits were equal to N, this means that the digit at first position cant be smaller than the digit at the same position in N - Because that would make our result smaller than N. If eq==0, then we can use any digit from 0 to 9. After picking a digit, the values of count1, count2, eq and zero will change according to it and we will have a new instance of the recurrence. (Just note that when zero==1, we cannot count digit 0 as part of the digits to reduce from the count1 requirement, because it is a leading 0 and will not actually appear in the number. The same happens with digit2)

This problem was better. But it is odd to have two problems in the same match and same division that can be solved with the same approach. (I used a similar digit dp to solve 250).

Challenge phase, etc.

The example cases for div1 500 seemed quite strong. I knew div1 250 was tricky, but I also knew that I would likely mistaken if I tried to challenge. I preferred to go to lunch during the challenge phase. Returned, and after all the hours it takes results to arrive lately, I noticed I passed everything and somehow even my slow submissions were enough for top 100. I cannot believe I managed to increase my rating and even reach (again) the area that is close to 2200. I was just lucky though that div1 500 was my style of problem.

As far as the problem set goes. It is not misof's best problem set. But that is only because he set a great standard in previous problem sets. It was a good match, nevertheless. But we writers need to avoid those bit operations in div1 250 for a while.

Saturday, June 09, 2012

No surprises here, I did not advance. Even though I knew my chances and that at best I would have to battle for a good position for honor and thus. I kinda harbored the fantasy of advancing. Mind you, I really think that with the right random shuffle and combination of problems I could end in top 25. But it was unlikely. Nevertheless, I am kind of proud of my 133-th place.

Problem A: The breather

This problem was one to reward intuition. I guess. It was much easier than anything else in the round.

Let us say you know the order you picked. Then, the expected time after you solve 0 levels is: f(0) = L0 + p0*f(0) + (1-p0)*f(1). Where f(1) is the expected time after solving 1 level. After some magic: f(0) = L0/(1-p0) + f(1)

After some iterations, you will find a very messy formula. But it should become clear that it is best to pick the levels in non-decreasing order of L/p. It is actually something that can happen by intuition. At least for A-small, what I did was to think that it is best to get rid of the harder levels first, so that the probability to repeat a lot of levels goes smaller with time. Then when factoring the variable Li it is also guessable that a division can occur. But after getting correct in A-small I did not want to risk it (I didn't have a proof). So I skipped to B.

If two L/p values are equal, pick the one with the smallest index, to make sure the lex first order is picked.

Problem B: The implementation hell

This problem was a sand trap. I knew it was a sand trap. But I sort of thought that I could do it decently fast, and in fact I had hops to solve the large version.

I think most of the top 500 would not get scared of the hexagonal grid. Yet in this case, the grid was the least of the implementation issues. The three cases each have their complications.

You need to turn the hexagonal grid into a 2d grid. It is possible with T=2*S-1. Then you can have a TxT grid. Some of the elements in the TxT grid are invalid (don't exist in the hexagonal grid)., you have to mark them as such.

Then you need to analyze what rules involve connecting two hex cells. (x,y) is adjacent to (x-1,y-1), (x-1,y) , (x,y-1), (x,y+1), (x+1,y) and (x+1,y+1).

We need to differentiate between normal cells, edges and corners. In fact, each edge needs a different id for the cells in that edge (Forks are a pain to detect, see below). More implementation weight.

Rings: The statement help us here. We need to detect an empty cell that is not reachable to the boundary with any path of empty cells. Thus we can do a dfs starting from each of the boundary cells (borders or corners). Then if we try each empty cell, if it was not reached by the DFS, we have found a ring. A BFS also works, I used a BFS for no reason.

Bridges: Another use for a DFS, count the number of connected components of (non-empty) cells. If the number of components is less than 6, we got a bridge.

Forks: These were a pain to debug. Once I understood what was needed to detect forks, I was no longer hopeful that I could solve B-large. From each non-empty cell in an edge, do a DFS of non-empty cells. Then verify if there are two or more cells from different borders that have been visited. If there are two, we got a bridge. If you only find one of such cells, make sure to mark it out so that you don't count it as visited by a dfs from another edge...

Once I debugged my B-small, I already spent over an hour in it (fork mistakes). And it was still wrong (WA). I spent another good half an hour finding the bug (the forks again) and got a correct answer. But there were less than 30 minutes left...

The rest

I was shocked. Although I spent ages solving B-small, I somehow was still in a high position (150-th-ish). It seems that a lot of people had issues. So, I tried more problems. I could not understand what D asks for, and I did not understand what the difference between c-small and C-large was. Then I decided to take back A-large. This time, I decided to prove my intuition - it was correct. So I submitted it. It took me a while to code it because I did not have precision errors (We are sorting by (L/p, id), if we used floating points to calculate L/p, there is a risk that two different expressions L/p that give the same result are not considered the same in floating point arithmetic). So I decided to implement it only in integers (It is easy, just compare p2*L1 and p1*L2 instead of L1/p1 and L2/p2). I wonder if this was necessary.

Then I spent my last minutes watching the score boards. Many interesting surprises there. rng_58 was not in the qualifying top 25. But then the match ended, and results were updated. rng_58 got 25-th place!

Wednesday, June 06, 2012

So I first read the title and thought "oh, not another 25K Marathon in which even fifth place barely makes 200 bucks. But it seems it is something different.

The NASA Tournament Lab is excited to announce to the TopCoder community a special weekend algorithm contest to be held Thursday, June 21 to Sunday, June 24, 2012 featuring a $25,000+ prize purse. This unrated contest will challenge the TopCoder community members to generate solutions to an interesting algorithmic problem. We strongly encourage all members interested in SRMs as well as Marathon Match events to participate. Google and NASA Jet Propulsion Laboratory are supporting this challenge and look forward to interacting with TopCoder community members about their organizations through special chat rooms on Wednesday June 20, 2012. Engineers and scientists from Google and NASA JPL will be available to answer any questions you may have about these leading organizations.
Please save the dates! Registration begins on the June 13, 2012 at 13:00 UTC -4 and ends on the June 19, 2012 at 13:00 UTC -4.
More details will be available soon as well as a registration reminder.

interested in "SRMs and Marathon Match events" - So I am not sure if it is a Marathon.

Weekend contest - If it is a Marathon will be a very unusual one with 4 days long.

I am looking forward to more info. Even if it was a Marathon, 4 days sounds like the perfect length. I hope the 25K prize purse translates into a decent chance to translate effort into prize, though.

Friday, June 01, 2012

It is June. There are few hours till Saturday and it is going to be quite a decisive day for me.

Today was quite a rough start for June. I was writing the editorial for SRM 544 and was supposed to finish it by 9:35 AM. But I really could not and took 12 more hours. No, not because of the hard problem. But because of division I 275. Let us say that I lost hour after hour of my life trying to prove the fact that 201 is the maximum answer.

My original plan for today was to make this post and then focus on a semester project. No luck there.

IPSC

The IPSC at 6:00 AM. I will have to go to bed right now and prepare. This is a great tournament, and if you are not registered yet you really should get ready to participate. It is a team contest, but I am allergic to team work so I'll be participating alone. If you want a team, try looking for someone in topcoder or codeforces, I am sure there are plenty of people that want teammates.

I really cannot miss IPSC. It is a very unique tournament and has perhaps the best format out of all the contests ever. There are some very unusual problems too.

TCO 2012 Round 2C

The last chance to advance/win a t-shirt. Just one hour after the IPSC ends. You would think that I would have avoided going to IPSC to maximize my energy in this round. No, really, I can't miss the IPSC. My current hope is that going to the IPSC before this round will make me be in coding mode before the contest and will actually energize me. But I could be very wrong.

Will also try posting something for the TCO blog after this match ends. Although I'll admit that after US embassy rejected US visa request rejected in a very rude and frankly outstandingly wrong way I am not the most motivated person to post in that blog.

Later in June, we got other matches like the third GCJ round. I also will try participating in the on going round 3 of the TCO marathon track.