Tuesday, May 29, 2012

As I write this, the challenge phase has ended and my 275 seems alive. Then again, no one in my room was an aggressive challenger. Let us see what happens.

Div1 275: The one with vote fraud.

You are given an array of integer percentages from 0 to 100. Like {0, 1, 100}. The real percentages have all been rounded to the closest integer. If there was no fraud, the non-rounded percentages will sum up to 100.0 (important). If there was fraud, they would not. So, given the rounded percentages, is there a non-fraudulent scenario that yields those percentages? If so, return the minimum number of voters. Else return -1.

Oh well, it was 7:00 AM and my brain was slower than usual. I could not do much for the first minutes. The persistent idea was to iterate through many possible values for the total number of voters. Once we find a correct one, return it. We need an upper bound after which we stop testing and return -1. And we also need to know how to test if a given voter count is possible.

How to test if a sum is correct? Let us say that a rounded percentage is p and the real number of votes behind it is x. Then we have: floor( ( x*100 + floor(sum/2) ) / sum ) = p. (Try it, this is the same as rounding the percentage).

Let us say that *somehow* with that formula, for each p we can find the minimum value of x and the maximum value of x. Then the sum of all the minimums will be minim and the sum of all maximums will be maxim. If sum is between minim and maxim, inclusive, then the sum is valid.

In order to find those minimums and maximums, you will need to do some magic with the formula. I just used binary search (e.g.: find the minimum x such that p <= floor( ( x*100 + floor(sum/2) ) / sum )), because my brain was not cooperating... The good thing about binary search is that I am quite certain it will give correct results. The bad thing is that it makes the whole thing slower, and thus my upper bound for the number of voters had to be small in order to avoid a time out.

It turns out the largest valid number of voters is 200. I do not have a clue how to show it. Empirical testing made me choose 250000 for the limit, I just manually tested with a large array until it did not time out... Spent most of the match concerned about the possibility that this constraint was not enough.
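The check described above can be sketched in Python roughly like this. The function names and the default limit of 1000 are made up for this illustration (the limit is only an assumption that comfortably covers the 200 mentioned above), so take it as a sketch of the idea rather than the submitted code.

```python
def rounded_percent(votes, total):
    # Integer version of "round 100*votes/total to the closest integer".
    return (100 * votes + total // 2) // total


def vote_range(p, total):
    """Smallest and largest vote counts whose rounded percentage is p.

    Returns (low, high); the range is empty when low > high.
    """
    # Binary search: smallest votes with rounded_percent >= p.
    lo, hi = 0, total + 1
    while lo < hi:
        mid = (lo + hi) // 2
        if rounded_percent(mid, total) >= p:
            hi = mid
        else:
            lo = mid + 1
    low = lo
    # Binary search: smallest votes with rounded_percent > p.
    lo, hi = 0, total + 1
    while lo < hi:
        mid = (lo + hi) // 2
        if rounded_percent(mid, total) > p:
            hi = mid
        else:
            lo = mid + 1
    return low, lo - 1


def min_voters(percentages, limit=1000):
    # limit is a hypothetical upper bound on the number of voters to try.
    for total in range(1, limit + 1):
        ranges = [vote_range(p, total) for p in percentages]
        if any(low > high for low, high in ranges):
            continue
        minim = sum(low for low, _ in ranges)
        maxim = sum(high for _, high in ranges)
        if minim <= total <= maxim:
            return total
    return -1
```

For the {0, 1, 100} array from the statement, min_voters([0, 1, 100]) gives 200: one voter out of 200 is 0.5%, which rounds to 1, and the other 199 round to 100.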

Results for the first room are up, and it seems that mistakes in this problem are very common. I would not be shocked at all if my 275 fails.

Div1 500: The one with the binary matrix

Let us say you have a binary matrix

101010101
010101010
101011010
111001101

You want to make it full of zeros. In each step, you pick the first few cells in each row, but making sure that you pick at least as many cells in row i as you pick in row (i-1). This is the simplification of the problem statement about paths.

The statement says that it is always possible to make a matrix full of 0s with finite steps. Of course it is! Just think about the first row in the input that contains a 1. You will eventually have to set the right-most 1 to 0. This means picking all the cells up to that rightmost 1. If we have to do this move, let us do it. Let us say that the number of cells we toggle in this row is (req).

Now, let us say that in a later row, the rightmost 1 does not exist or appears at a position lower than (req). We have no choice, we still have to toggle (req) cells in this row. We should not toggle more cells, because it would only add unnecessary 1s that we would have to toggle later...

In the next row to it, the right-most 1 is at a position greater than req. It makes no sense to toggle fewer cells than that. Let us update req to the new value.

Repeat for each row. And repeat this process until the matrix is full of 0s. This approach will always find the minimum number of moves, because we never actually have a decision to make, all the moves we do are forced on us. Also, for each row, you will need at most (row length) steps. Thus the number of steps is at most O(w*h), and each step needs at most O(w*h) operations. O(w*h*w*h) in total.
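A minimal Python sketch of that forced-move simulation, assuming the matrix comes as a list of strings of '0' and '1' and that a move toggles the picked prefix of every row. The function name is invented for this example.

```python
def min_steps(matrix):
    # matrix: list of equal-length strings of '0'/'1' (assumed input format).
    grid = [[int(c) for c in row] for row in matrix]
    steps = 0
    while any(1 in row for row in grid):
        req = 0  # number of cells toggled in the previous row
        for row in grid:
            # Position just past the rightmost 1 in this row (0 if none).
            rightmost = max((i + 1 for i, v in enumerate(row) if v), default=0)
            # Never fewer cells than the row above, never more than needed.
            req = max(req, rightmost)
            for i in range(req):
                row[i] ^= 1
        steps += 1
    return steps
```

For instance, a single row "01" takes two moves: the first toggles both cells and leaves "10", the second clears the remaining 1.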

Outcome

Yay, it turns out that 275 was very tricky and a lot of people had a wrong submission. My binary search may take a lot of time and code, but it made me very certain that the code was correct (assuming the upper bound was large enough). So I had a good position and a room win.

Problem A - Swinging Wild

The first degree of difficulty was in understanding the statement. Let us simplify by imagining there is a vine of length 0 at position D. We want to know if we can reach this last vine.

Let us say we just reached vine i. There is a maximum length we can use in that vine. For Vine 0, it is d[0]. For the later vines, this length depends on the position of the previous vine. So, if we used vine j to reach vine i, the maximum length we can use for vine i is min(length i, d[i] - d[j]).

The first idea is a dynamic programming one. f(i, j) returns whether it is possible to reach the last vine if you have just reached vine i and the previous one was j. Then you can get the maximum length you can use for vine i. This is the maximum distance between i and the next vine you use. Thus you can pick any of those vines within reach, and continue the recursion until i = the last one. This is enough to solve A-small.

But we want to solve A-large, which would not allow such an O(n^3) complexity. Let us instead define maxlength[i], the maximum length possible at which we can reach the i-th vine. You can see that this maximum depends only on the vines before i. Once we know this maximum length, we can use i to find possible lengths for later vines.

For example, for vine 0, the maximum length is d[0]. Pick each vine j within d[0] distance of vine 0, then a possible length for vine j is min( d[j] - d[0], length[j] ). After this step, the maximum length for vine 1 and vine 0 is already known, and we can use vine 1's maximum length to update more vines. This approach is O(n^2) in time complexity and needs only O(n) memory.
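Here is a rough Python sketch of the maxlength idea. It assumes the vine positions d are sorted in increasing order and that d[0] does not exceed the length of vine 0. The function name and argument layout are invented for this example.

```python
def can_reach(d, length, target):
    # Append the imaginary vine of length 0 at position target (D).
    d = list(d) + [target]
    length = list(length) + [0]
    n = len(d)
    maxlength = [-1] * n  # -1 means "not reachable so far"
    maxlength[0] = d[0]   # assumption: d[0] <= length[0]
    for i in range(n):
        if maxlength[i] < 0:
            continue
        for j in range(i + 1, n):
            if d[j] - d[i] > maxlength[i]:
                break  # positions are sorted, later vines are even further
            maxlength[j] = max(maxlength[j], min(d[j] - d[i], length[j]))
    return maxlength[-1] >= 0
```

By the time the outer loop gets to vine i, every earlier vine has already pushed its update, so maxlength[i] is final. That is what keeps this at O(n^2) time with O(n) memory.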

Problem C - Mountain view

It seemed like C-small was less tricky than B-small, because of the accuracy rate from other competitors. So I first tried to solve this one. I actually barely got an integer programming solution, and I am still not sure I actually know how to code an integer programming solver.

After the match, I found solutions that merely pick random numbers for the (at most 10) heights and verify that the answer is correct. Then you can output the correct answer. If after enough random attempts, no answer is found, it is impossible. The small number of peaks makes the probability to find a correct answer (if it exists) quite large.

I feel so lame for not thinking of this.

Problem B - Aerobics

So, my initial idea was to consider the circles as squares. And then the problem is just to place those squares somewhere in the rectangle. My theory was that if you sort the square lengths in non-ascending order, and then always place each square in the closest valid position to the bottom-left, you would find a solution (the area is at least 5 times larger than the area of the circles, so it is quite unlikely this solution won't work). Indeed, even in cases where your circles have quite large radius, but the rectangle has only 1 as width, this is possible.

You can also verify that this approach needs only integral coordinates. Somehow, during the match I thought that it was possible to have 0.5 coordinates, and thus I multiplied everything by 2 and did other unnecessary things.

I was pretty sure that would work, the issue is how to get the best location. With W,L <= 10^9, we cannot just try each coordinate for the center until we find one that does not intersect. So much that I even tried thinking of a different strategy. For a second, I thought of random (ironic as random would have helped in the other problem, but not this one).

I wish I noticed the obvious solution to this predicament in less time than a whole hour: mundane binary search. For each square you want to place, binary search for the minimum x at which there is enough space to place it. How do you know if there is enough space at a given x? Simply use another binary search, but this time for y. Since you place the squares in order, and the previous squares are all together in the bottom-left position, both situations are binary search friendly.

Then I noticed that this approach works for B-large as well. So I just submitted it.

The last minutes

By the time I finished B, I had around 50 minutes left. But I had no idea what to do. Kept trying to think of something for C. Read D and then got baffled and tried to think of something too. There was a time at which I thought to solve D-small with a random solution. I just wish I was so creative with C...

As time progressed, my ranking got closer and closer to 500. In the very last minute it reached 508 and then 520... Some people failed the large inputs in some problems (not lucky for them, but lucky for me). And I advanced to round 3.

To be honest I was not even expecting to be in the top 1000 today.

I really liked this match. B, C and D were all really heavyweight problems in "interesting-ness" and difficulty, and A was a good distraction.

Saturday, May 19, 2012

Here it is, a very late blog post about this. I would not put any confidence in my solutions before system tests, so I did not write this post during the challenge phase as usual.

Div1 250, the one with xor

You have (1 <= L <= R <= 4000000000), return the total xor between all numbers between L and R, inclusive.

So, you can assume that 4*10^9 is quite large and just simulating the xors is not a great idea.

During the match: Let us say you have a function f(X,i) that counts the total number of times that bit position #i is turned on (equal to 1) in all numbers between 1 and X, inclusive. Then, for each bit position from 0 to 31, inclusive, that bit will have a 1 in the result if and only if { f(R,i)-f(L-1,i) is odd }. You can test this assertion by just trying out how xor works at each position. We split the problem into two instances of f(X,i) to just have fewer variables to handle.

I thought f(X,i) was going to be easy to implement. And it sort of is. But it took me longer than I wanted to come up with it. It is becoming a habit that my 250 is usually surprisingly slow, even though I do not remember really slowing down while implementing it. It is almost as if I am initially calibrated to work slower and the speed increases during the game. I need to come up with a way to start in fast mode.

Anyway, let us say we have a number (in binary) 1101110?1011, the ? marks the bit at position i. Imagine that to the left of ? we placed any number strictly less than 1101110, for example: 110010?---- then ? must be 1 (we are counting those) and the ---- can be any number you want. Thus for each possible value for the left side below the current one, there are 2^i ways to make the right side.

Imagine that the left side was exactly 1101110. Then ? still has to be 1 (but if the i-th bit of X is 0, then it cannot be): 11011101---- . This time, you cannot put just any value in ----. It has to be less than or equal to the right side: 1011 in binary = 11. Thus, if the i-th bit of X is 1, then there are an additional (right side + 1) different values.
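The counting argument above translates to something like the following Python sketch. The names count_bit and xor_range are invented here, and this is an illustration of the idea rather than the code from the match.

```python
def count_bit(X, i):
    """How many numbers in 1..X have bit i turned on."""
    left = X >> (i + 1)           # the part to the left of bit i
    right = X & ((1 << i) - 1)    # the part to the right of bit i
    # Every left side strictly below the current one allows 2^i right sides.
    total = left * (1 << i)
    # With the left side exactly equal, bit i must be 1 in X itself.
    if (X >> i) & 1:
        total += right + 1
    return total


def xor_range(L, R):
    result = 0
    for i in range(32):
        # Bit i survives the xor only if it appears an odd number of times.
        if (count_bit(R, i) - count_bit(L - 1, i)) % 2 == 1:
            result |= 1 << i
    return result
```

As a quick sanity check, xor_range(3, 10) is 8, the same as xoring 3, 4, ..., 10 by hand.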


That approach may have been a little too long to implement. There are various other approaches. For example, you can move the R-(L-1) stuff so that it is done in the main function. Let f(X) be the function that calculates the xor of all the numbers between 1 and X, inclusive. Then f(R) ^ f(L-1) is equal to the requested result (that is how xor works, it is its own inverse operation).

Then, to calculate f(X), you can do the same business picking each bit position between 0 and 31. Counting the number of times it appears between 1 and X as turned on. And if the number is odd, then the i-th bit position is 1 in the result.

The two approaches are mostly the same, but I like this second one a little better.

How come it does not time out? Many were saying that it was unfair that a solution as easy as this one would pass. But honestly, xOberon knew what he was doing. If you were to try simple iteration in C++, it would time out unless you considered the key optimization: The maximum value of R is too large for signed 32 bits integers. But it is still small enough to fit an unsigned 32 bits integer. This means that by using unsigned integers, you can keep the operations in the 32 bits world. Keeping in mind that the topcoder servers are 32 bits, this means that xor operations will need only one instruction if done right. The next component is the g++ compiler optimizations. Topcoder uses -O2, and I am sure this code gets optimized so that R is a single register, which allows R ^= L to be a single XOR assembly instruction. That allows the code to run in 1.7 seconds even though it needs 4*10^9 steps!

Div1 500, the one with traveling through islands

The first idea is to see it as a typical dynamic programming. Dynamic programming because it is obvious that you will always move right, up, or right AND up, but will never need to go back down or left, thus the sub-problems are acyclic. dp[x][y] returns the minimum time needed to reach top right, if you are currently standing in the x-th island and the y-th dock in that island. The base case is dp[w][length] = 0 (w = |width|). Because you are already in the objective place. The wanted result is dp[0][0]. For every pair (x,y), then dp[x][y] can either be:

1/walk + dp[x][y+1]: Meaning that you move 1 unit up at walk speed and reach (x,y+1), from then you must still solve the rest of the problem, and the minimum time from that place is dp[x][y+1].

EuclideanDistance( (0,y), (width[x],y2) )/speed[x] + dp[x+1][y2]. For every y2 > y. This means that instead of moving up, you decide to cross the river in a diagonal (maybe straight line if y=y2) and reach dock #y2 at island (x+1). You move width[x] units horizontally and y2-y units vertically in a straight line, thus we need the Euclidean distance. But you move at velocity speed[x]. After reaching (x+1,y2), you still need to reach the final objective and the minimum cost is dp[x+1][y2].

If for every dp[x][y] we try all O(length) values for y2, we will likely fail, because length <= 100000 is too large for an O(length*length*w) solution.
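To make the recurrence concrete, here is the slow version as a Python sketch. It follows the two options exactly as written above, including walking on every island. The function name and the rolling-array layout are my own choices for the example, and it is only usable for small length.

```python
import math


def min_time_naive(length, walk, width, speed):
    # dp for the last island: only walking up is left.
    nxt = [(length - y) / walk for y in range(length + 1)]
    for x in range(len(width) - 1, -1, -1):
        cur = [math.inf] * (length + 1)
        for y in range(length, -1, -1):
            best = math.inf
            if y < length:
                # Option 1: walk one unit up on the current island.
                best = 1 / walk + cur[y + 1]
            for y2 in range(y, length + 1):
                # Option 2: cross river x in a straight line to dock y2.
                cost = math.hypot(width[x], y2 - y) / speed[x] + nxt[y2]
                best = min(best, cost)
            cur[y] = best
        nxt = cur
    return nxt[0]
```

With length = 4, walk = 1, a single river of width 3 and speed 1, the best plan is to cross straight to the top dock, a 3-4-5 triangle, so the answer is 5.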

I had two ideas initially. The first one was "Maybe ternary search?" and the second one was "Maybe you can find the result for dp[x][y] based on the result of dp[x][y+1]?"

The ternary search idea was, for each dp[x][y], simply do the ternary search to find the best y2 that yields the minimum result. The question is whether the function : EuclideanDistance( (0,y), (width[x],y2) )/speed[x] + dp[x+1][y2] is ternary search-friendly for y2. To me, it was intuitive at first that it was. But here is a proof anyway: The key realization is that dp[x][a] will always be greater than dp[x][a+K] for any a, and positive K. In other words, dp[x][a] increases as a decreases. This makes sense because a lower dock is always further from the top position. Thus dp[x+1][y2] will increase as y2 decreases. The EuclideanDistance() function, on the other hand, will decrease as y2 decreases, because the straight line will be closer. The sum between a strictly decreasing function and a strictly increasing function will have only one local minimum.

Assume that dp[x+1][a] makes a ternary-search-friendly curve, then EuclideanDistance( (0,y), (width[x],a) )/speed[x] + dp[x+1][a] will also be ternary-search-friendly; the Euclidean distance part is a single straight line equation if a is the dependent variable; adding a curve with only one local minimum to a straight line, will result in a curve with only one local minimum still. Then, just demonstrate that dp[w][a] makes a good curve (it does because it is a straight line too!)

So, ternary search gives a correct answer. I implemented ternary search and passed examples. But I always test the largest case I can imagine in Topcoder's servers before submitting the solution. Time out! It seems that O(log(length)*length*w) is too slow!
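For reference, an integer ternary search for one dp cell could look like this in Python. It assumes the cost really has a single valley in y2, which is what the argument above is trying to establish. The names argmin_unimodal and best_dock are invented for the sketch.

```python
import math


def argmin_unimodal(f, lo, hi):
    """Index in [lo, hi] minimizing f, assuming f has a single valley."""
    while hi - lo > 2:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if f(m1) <= f(m2):
            hi = m2  # a minimum cannot be to the right of m2
        else:
            lo = m1  # a minimum cannot be to the left of m1
    return min(range(lo, hi + 1), key=f)


def best_dock(y, river_width, boat_speed, next_dp):
    # next_dp[y2] plays the role of dp[x+1][y2] (assumed already computed).
    def cost(y2):
        return math.hypot(river_width, y2 - y) / boat_speed + next_dp[y2]

    y2 = argmin_unimodal(cost, y, len(next_dp) - 1)
    return y2, cost(y2)
```

Each call makes O(log(length)) cost evaluations, so filling the whole table this way is the O(log(length)*length*w) approach that timed out.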

That is when I panicked. The only other idea I had was that there was somehow a way to optimize the search for the answer for dp[x][y] based on what we found for dp[x][y+1]. But how?

I tried many ideas that were either wrong or did not optimize things too well.

I was about to give up, but around 4 minutes before the end of the coding phase I had an idea!

Remember that the Euclid() is a straight line when we make a graph for the values of y2? In fact, Euclid() depends only on the difference between y2 and y. This means that the Euclid used between y2 and y2 + M is the same as the one between y and y + M. Now, if for y2 we know the value of M, then we should not pick a smaller new M for y. And we cannot pick an M larger than M+1.

But why? It is difficult to explain without drawing it:

Well, the two top pictures are the values of y2 that we will try. Let us see why is it not necessary to try the bottom ones.

For the bottom versions, we will simplify notation by defining E(M), the cost to move in a straight line between (x,y) and (x+1,y+M), and D(y), which is equal to dp[x+1][y]. So, the total cost to move from (x,y) to (x+1,y+M) and then finish from there will be: E(M)+D(y+M).

From the ternary search analysis we can see that D(y1) >= D(y2) for (y1 <= y2). But there is a more interesting thing to conclude: D(y - K - 1) - D(y - K) <= D(y) - D(y-1). The increase of cost between D(y) and D(y-1) is greater than or equal to the increase of cost between D(y - K) and D(y - K - 1) (for positive K). In order to verify this, simply notice that the more distance between y and the top position, the higher the straight lines we can use to cross rivers, so the increase (the derivative) is either equal or smaller.

Given the optimal M for (y+1), we have the following inequality:

Which means that picking (x,y) -> (x,y+M) is always at least as good as picking (x,y) -> (x,y+M-1). Invalidating the bottom left idea.

For the bottom right example, consider that E(M+1)-E(M) is smaller than E(M+2)-E(M+1). Once again, the derivative. Just imagine the straight lines, as the height increases, the differences of the lengths of the lines increases. Take the derivative of the Euclidean distance in case of doubt.

I will admit that I did not come even close to a formal proof during the match. Instead I just assumed things would work like this after imagining the straight lines rotating. I was desperate and gave it a try. I submitted with about a minute left before the end of the coding phase.
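As a Python sketch, the idea looks roughly like this. It leans entirely on the unproven assumption described above, namely that the best M for y is either the best M for y+1 or that value plus one. The function name is invented, and walking is allowed on every island as in the recurrence.

```python
import math


def min_time_pointer(length, walk, width, speed):
    # dp for the last island: only walking up is left.
    nxt = [(length - y) / walk for y in range(length + 1)]
    for x in range(len(width) - 1, -1, -1):

        def cross(y, m):
            # Cost of crossing river x from dock y to dock y+m, then finishing.
            return math.hypot(width[x], m) / speed[x] + nxt[y + m]

        cur = [0.0] * (length + 1)
        cur[length] = cross(length, 0)
        m = 0  # M chosen for the dock just above the current one
        for y in range(length - 1, -1, -1):
            # Assumption: only M and M+1 need to be tried for this y.
            if cross(y, m + 1) < cross(y, m):
                m += 1
            cur[y] = min(cross(y, m), 1 / walk + cur[y + 1])
        nxt = cur
    return nxt[0]
```

Every dock now costs O(1), so the whole thing is O(length*w) instead of O(log(length)*length*w).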

Challenge phase, outcome, etc

So, I knew two things. My 250 got low score and was more complicated than what other people did. My 500 stood on an idea that I was not incredibly sure about. I really did not feel optimistic.

I opened some 250s, while other people were already challenging the wrong ones. I found a 250 with: if ( L==R ) return 0; Which seemed like a bug to me, since when L=R, the result is L. I read the constraints and "noticed" that L was strictly smaller than R. So, this was not a corner case... Some time later, this submission got challenged. I re-read the constraints once again, and L=R was actually allowed in the constraints... what a waste of 50 points. Though it is strange, because L=R=5 was an example case, so this was a strange mistake to find.

Got 1992 rating. It is always good to rebound after a bad match with an even better score than before.

I think that, regardless of whether I am able to understand Coding Horror's triggering post or not, this is a good discussion some people out there are having. And I love to at least post my opinions in this blog so I can read them later, and maybe so can some of the few guys who read this blog.

Coding, that non-essential skill

After thinking and overthinking, I think Jeff's main annoyance with this whole thing is the notion that coding is essential.

So, is coding an essential skill? Do you have to learn to code to live a good life? The answer is, to me, nope. Surely, there really are few skills and knowledge that you could call essential, and programming is not one of them... But wait, neither are reading, writing nor math... I mean, surely you need *some* writing, *some* reading and *some* math (mere arithmetic, in fact). You also need *some* basic social skills. Everything else is optional, really.

It seems that most energy spent by the strong opponents of the learn to code movement is to remind us that coding is not essential. To which I have to ask: Big deal? If a guy makes learning to code his new year resolution, is it really a big crisis for us?

The relevant question is whether the Code Year campaign is really saying this: that coding is as essential a life skill as basic reading and arithmetic. I do not think so. It seems that Code Year is simply an advertisement for Codecademy, and that they are really saying that, if you want to learn something new, code is an option and not one that is as obscure as some engineers would like you to think. That you can learn to code this year. I also do not think anyone is being forced to learn to code. It is a completely voluntary thing to do.

Learn to write?

If writing is not an essential skill beyond a certain basic level, the question is then: why should someone who has already acquired decent writing and communication skills be compelled to comply with the request to focus on writing and ignore code? There is certainly no reason. If you have run out of essential things to learn, then you should feel happy because you can find new things to learn. If you pick coding I will not make a big deal about it. You can also improve your writing. It has always been a choice you can make. You can even choose to do everything and improve your writing while you also learn to code.

Some people really should learn programming

A slight problem with the [it is not essential!] meme is that just because something is not an essential thing, it does not mean there is nothing good for society if you learn it. There are increasingly more careers that need some knowledge of how coding works.

The knowledge of how code works should be, to me, no different from how some judges and lawyers have to learn about environmental science. (We have some judges out there making very important decisions regarding programming.) They are just an example. I also think that many people in authority positions in companies related to this should get to know at least some coding.

I say this because, unlike plumbing, programming is not really something that people can intuitively know what it is about. e.g. consider Hollywood. So, really, I think this information about the basics of programming is something that is needed for certain groups with a job description different from software engineer.

Become a programmer?

At least to me it seems that the {Learn to code} people are not saying: Become a programmer. So, I don't really think Bloomberg is planning to replace any programmer in the city hall or anything. Because I cannot find evidence of Code Year movement to ever say that after you learn to code you would replace part of your job description with "being a crack at javascript". That would be silly.

Hello world

Jeff mocks the notion of Bloomberg's discovery of programming being a hello world-like program. Well, what would be so bad about it? Perhaps more people just getting to run their code just once will make the world slightly better? It is difficult for me to imagine a way in which more overall knowledge about something would be a bad thing. In fact, it seems to me that historically, those who thought that some knowledge was not appropriate for some people were the ones in the wrong.

If more people learn to code

Is it going to be the end of the world? We will have more bad coders. That is for sure. There will be many people that will hate code after having been encouraged to learn to code only to find out they do not have the patience for it. There will be some guys and women who have found a cool new hobby, and they might spend some of their free time making cool minor projects, and eventually finding out the reasons to have a good methodology. Some other people will find a new career path. Some of them will become bad programmers. But a couple of them will become great programmers. More people will actually know what programming is all about though. There will be more knowledge around. And the next time a jerk says that he invented the double click, more people will know that such a claim is bogus.

I think that regardless of whether the learn to code group is based on real points or not, the net outcome will be positive. As I mentioned before, it is difficult for me to imagine a good justification to think that more people knowing about a subject is ever going to be a bad thing.

There, that is my conclusion. It has become my new ideal to tell people that they have to learn everything. You shall never stop learning because your final objective is to learn everything. You can, however, pick the order in which to learn your stuff. Enjoy.

Edit:

You too can earn 100k dollars a year?

This post reminds me of another thing that is being assumed about the learn to code meme. Many people seem to interpret it as "you can learn to code in one year and make millions with code". I will be honest that I have not really seen much from this movement, but the little on the Code Year home page does not seem to imply that to me.

Of course, that premise is terribly wrong. Once again though, if people really fall for it, then this whole experience will help more people find out that no, 100K a year with coding is not as easy as learning to code. And that is a great thing to learn.

We may be missing something, that learning is so great that even if your starting premise is wrong, you still learn useful things. Let these beginners be wrong about their expectations.

Tuesday, May 15, 2012

Imagine if Mike Bloomberg, mayor of New York, wanted to learn to play the guitar. To which the guitarist guild would reply: "NO! Don't learn to play the guitar! There are millions of terrible guitar players, we don't need any more!".

Then Mike Bloomberg decided to learn astrophysics. To which Neil deGrasse Tyson would take offense. "Can you explain to me how Michael Bloomberg would be better at his day to day job of leading the largest city in the USA if he woke up one morning knowing the total mass of Andromeda?"

So, these thoughts sound nonsensical. Yet somehow, in regards to programming, they are not instantly nonsensical. At least not for many people, as Coding Horror's post Don't learn to code and many of the comments placed in there would show.

The "everyone should learn to code" movement isn't just wrong because it falsely equates coding with essential life skills like reading, writing, and math.

Look, I love programming. I also believe programming is important … in the right context, for some people. But so are a lot of skills. I would no more urge everyone to learn programming than I would urge everyone to learn plumbing. That'd be ridiculous, right?

Which makes me wonder what is Jeff's problem with more people learning plumbing?

So, why not? If you want to learn plumbing, why not? Worst case scenario, plumbing is not for you, and trying to learn it will make you figure that out.

Our schools teach us music, calculus, sports, chemistry and a lot of stuff that we won't necessarily use in our lives. So what? And again, what is wrong with learning for the sake of learning? That is part of what makes us human.

What Jeff is saying sounds to me like this: If more people learn to code, we will have more bad coders. Boohoo. Suddenly we are back to medieval times, and we are suddenly afraid of other people learning our precious knowledge, really? Is this much better than the Pythagoreans hiding the square root of 2? (Just been reading Carl Sagan lately, sorry.)

Also, what is up with this?:

Please don't advocate learning to code just for the sake of learning how to code. Or worse, because of the fat paychecks. Instead, I humbly suggest that we spend our time learning how to:

Research voraciously, and understand how the things around us work at a basic level.

Why not, if you want to, learn coding and also learn those things mentioned? I mean, it is not like we had to choose between the two. I see no issue with a human being learning all those things AND coding. If that is what she would like.

Rant: Just because industrial engineers exist, it does not mean you shouldn't ever give carpentry a try. And in that regard, just because your job uses something as lovely and wonderful as programming as part of the million times more ridiculous, silly and frustrating process that is making software for boring business, it does not mean that everyone else should be denied the joy of programming. There are a lot of ways amateur programming can work as an entertaining hobby that is outside of the lame thing that software development is. We have modding, the demo scene, Scratch, algorithm contests, games.

Moreover, more programmers means not only more bad programmers but also, by any law of proportion, more good programmers.