I was planning to give this test a skip after taking part in the Sudoku Cup earlier, but I am happy that I managed to scrape in just enough time to take it. Loved the Snake Fillomino very much; puzzles are meant to be solved for fun, and I can't think of a better paragon of fun on this test than the Snake Fillomino. The Skyscrapers Fillomino was good too, though I was clean bowled on seeing the second one. I wonder how people even start such a Fillomino without a single given number and still manage to finish in a decent time. I would love a walkthrough of the second Skyscrapers Fillomino this time. Overall a CLASSY monthly test!!! Great work, melon and grant!!!

kiwijam - 2012-10-29 2:52 AM
But who is this "sai" who put in an unbelievable world-champion-like performance on their first ever test...? Or is it someone that competes under a different name usually?

Well, we have had exactly the same thought ever since the test started (because sai was the first official competitor). It would be nice if sai filled in his/her real name.

As best we can tell, sai is the same as gorogoro from crocopuzzle and formerly went by Sphinx at nikoli.com. Given his track record, we're treating the performance as legitimate.

We'll think about making the grids larger next time, but it's hard to do so without doubling the page count. I made the grids as large as they could be while still making all the formatting/positioning work.

MellowMelon - 2012-10-29 10:19 AM
As best we can tell, sai is the same as gorogoro from crocopuzzle and formerly went by Sphinx at nikoli.com. Given his track record, we're treating the performance as legitimate.

Impressive performance, then! There are no Fillomino puzzles on nikoli.com, and the ones on croco-puzzle, based on a sample of three specimens, are mostly artless, albeit more likely to contain implied polyominoes than the ones generated by Tatham's applet. It is imaginable that sai, as a Japanese solver subscribed to nikoli.com, solves many artful Fillomino puzzles in print in Nikoli's beautiful publications, and has developed killer Fillomino skills as a result. :)

Fantastic test! I loved the hard Walls, and I was literally laughing out loud while solving the Sum, thanks to all the cute combinations and the lovely "impenetrable sum walls". The No Rectangles were lots of fun too.

I agree with Kota and would love to see bigger grids, even if it means a larger page count, especially for things like the Liar that seem to require more notetaking (at least for me).

I also found the darkness of the circles a bit of an obstruction to my eye; I had to keep reminding myself that polyominoes could cross the circles.

I look forward to spending some time later to finish the remaining puzzles -- especially the second snake, which I "proved" was unsolvable at least two or three times, as I kept coming back to it hoping I'd see something else after doing one or two other puzzles. I'm sure I'll be very embarrassed when I see how it's supposed to work!

Quick note: the frequency of misplaced submissions surprised us. In hindsight, the fact that the answer key format is the same for almost every puzzle on the test probably didn't make things any easier. We have decided to credit these answers, and we believe all of them, claimed or not, have now been appropriately dealt with.

Fillomino-fillia 2 is now over. Thanks, everyone, for your participation in the test. We hope you enjoyed the puzzles, both those you solved within the time limit and those you did not. 222 players participated, with about 75% of them submitting answers.

Congratulations to EKBM, sai, and deu for topping Fillomino-fillia 2 by finishing in 77, 85, and 101 minutes respectively. Very impressive performances from all three of them. After a number of impressive runs in past tests where the time bonus was lost to errors, EKBM's win here is well-deserved. Also worth mentioning are uvo, xevs, S_Aoki, and Kota, the other players who managed to solve all 18 puzzles. Japan had a very good showing on this test (well, it is Fillomino).

Other players worth mentioning are tamz29, Prasanna16391, and swaroop2011, who participated in FF1 last year and dramatically improved on that performance for FF2.

Amidst other discussions like your favorite puzzles, one thing we are interested in hearing from players is your opinion of the grading, including the penalty amount and the stricter manual grading compared to most tests. Any feedback on this would be appreciated.

In case you want walkthroughs, I think I can write some (only for puzzles I've solved :P).

The penalty amount seems somewhat too strict in penalizing unsolved puzzles. Maybe count incorrect submissions as penalties only if the corresponding puzzle is eventually solved? This is mostly inspired by the ACM-ICPC penalty rules for programming contests, where a problem contributes to the penalty only if it is eventually solved.

Anyway, amazing puzzles. Thanks to the authors for making such great puzzles. Will there still be an author-guessing contest?

chaotic_iak - 2012-10-30 3:06 AM
The penalty amount seems somewhat too strict in penalizing unsolved puzzles. Maybe count incorrect submissions as penalties only if the corresponding puzzle is eventually solved? This is mostly inspired by the ACM-ICPC penalty rules for programming contests, where a problem contributes to the penalty only if it is eventually solved.

Well, that's a Catch-22. Oops, I submitted a wrong answer to this puzzle; should I submit a correct answer and decrease my score on the rest of the test by 2%, or not? (2% of 120 is 2.4, so it's probably worth it most of the time, but still.) :)

Well, similar to the point I made during the test: if a person is deliberating that much about whether to submit, that hesitation is itself a penalty during the test.

The main point of the penalty, as I understand it, is to stop repeated tries until a solution is reached. That much is fulfilled without penalizing puzzles that aren't solved correctly within the test duration.

Well, that's a Catch-22. Oops, I submitted a wrong answer to this puzzle; should I submit a correct answer and decrease my score on the rest of the test by 2%, or not? (2% of 120 is 2.4, so it's probably worth it most of the time, but still.) :)

Well, if the puzzle is worth a lot and you have time, then obviously you'll fix and resubmit it. The problem is that small puzzles carry a relatively high penalty.

The situation with the current system that seems likely to me (it happens in most of my tests) is:
I've done lots of puzzles, and now have 3 minutes left at the end.
I rush through a 2-pointer, and with 10 seconds left I think I have a solution but am not sure.
If it's correct I get +2, if it's wrong I get -2 and won't have time to find the issue and resubmit.
So should I even bother submitting it?
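The arithmetic behind that last-minute dilemma can be sketched quickly. This is only an illustration under assumed numbers (a 2-point puzzle and the 2%-of-~120-points penalty mentioned elsewhere in the thread); the function names are made up for the example.

```python
# Illustrative numbers: a 2-point puzzle, and a wrong-answer penalty of
# 2% of roughly 120 points scored elsewhere on the test (= 2.4 points).
PUZZLE_VALUE = 2.0
PENALTY = 0.02 * 120  # 2.4 points

def expected_gain(p_correct):
    """Expected score change from submitting, given probability p_correct
    that the answer is right and no time left to fix a wrong one."""
    return p_correct * PUZZLE_VALUE - (1 - p_correct) * PENALTY

# Break-even confidence: submit only when p > penalty / (value + penalty).
break_even = PENALTY / (PUZZLE_VALUE + PENALTY)

print(f"break-even confidence: {break_even:.3f}")
print(f"expected gain at 50% confidence: {expected_gain(0.5):+.2f}")
```

Under these assumptions you need to be more than about 54.5% sure to make submitting worthwhile, which is why a penalty larger than the puzzle's own value feels harsh on small puzzles.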

For me, the purpose of penalties in an instantly graded test is to resist the temptation to guess the last couple of key digits; proper key choice should handle the rest. So the ideal penalty for a wrong submission should be about 2 × puzzle value × probability of guessing the last two digits.

Tying the penalty to the overall score introduces the rather unpleasant meta-game described above; also, guessing on harder puzzles becomes relatively cheaper, which seems to me an undesirable property. I would also want penalties to apply only to solved puzzles, and only up to that puzzle's value.

I don't have data to back it up, but I suspect most of the errors are the result of either trivial solving mistakes or mangling key entry/extraction one way or another. If the site isn't facing a horde of furious guessers and cheaters, treating minor mistakes harshly just sours the fun. I suspect that losing a lot of points, on top of losing a lot of time, on a puzzle you don't finish correctly works against fun.

And what fun it was! I'd like to thank both authors for two (well, three) hours spent with inspired puzzles. From the puzzles I did get to solve under test conditions, I appreciated the neat frame of 3s on the last Classic, the linearity of the solution in the Sum and the bottom Snake, the somewhat surprising end of the bottom No Rectangles, and the strong character of the bottom Walls. Due to experience with the previous contest, I shied away from the big pointers, expecting beautiful terrors. In hindsight, I should have been braver, as they turned out much more approachable than expected.

Glad to know I'm not the only one who found the grids a bit too small to work with comfortably. Besides constructing flatter grids, you could give yourselves a little more room by putting the answer-extraction area on the side rather than below, or a whole lot of room by formatting the instructions into a column.

Thank you both for the feedback. It's a little depressing to realize we didn't avoid complaints about answer entry the second time around either, but given how mistake-prone Fillomino is we might be cursed to deal with them forever.

term: I don't think I agree on the purpose of penalties. I was never of the opinion that instant grading is supposed to be a departure from the original system, where it's important to check your work on finished puzzles. If you look back through the threads on previous instantly graded tests, you'll notice I'm always complaining that the penalty was so low that I did increase / could have increased my score by letting the instant grader do all the checking for me. The reason for the high penalty amount, and for the penalty applying even to unsolved puzzles, is to discourage instantly clicking submit each time you solve a puzzle.

It's true that losing points to minor mistakes is no fun, but one reason I prefer instant grading is that the original system seems worse about this: here, at least, you can recover the points for puzzles you've worked through. On last year's test, at least a few players lost a 10+ point puzzle to one silly error; here, losing 10 points requires making lots of silly errors.

Anyway, this is one reason I wanted opinions on the penalty amount. Looking at the score page, the goal of having a penalty that actually mattered to the rankings (and hence encouraged checking) was achieved, but how much solvers liked it is something that can't be seen there.

The situation kiwijam points out, where the penalty is higher than the puzzle's value, is a possibility I was aware of before the test, but I didn't think I could achieve the above aims if I decreased the 2%, so I left it. Next time I may do something like cap an individual penalty at half the puzzle's value. Finally, with regard to rushing to finish a puzzle in the last minute, I maintain penalties should still apply. If you finish that puzzle with 30 seconds to go, the instant grader can quickly tell you if you're wrong, which is a huge advantage over being in the same situation with a non-instant grading system. That this comes with a small loss in score seems appropriate.

I think some people like to solve and submit a bunch of puzzles together. They're definitely not thinking of letting the grader do the checking, and if, say, someone submits 14 puzzles together and 4 of the small pointers turn out to be wrong, that's a big cut to their points if they don't end up solving those. There are many other scenarios where the system can discourage people who are solving honestly, without a thought of exploiting the grader, just for the sake of stopping a minority who might take advantage of the checking.

I maintain that a person can only take advantage of the grader if the answer is eventually submitted correctly, so I think there could at least be a smaller penalty percentage for puzzles that don't get solved correctly.

Also, the individual penalty cap for smaller-point puzzles is a good idea. While it's important not to encourage letting the grader do the checking, it's just as important to let competitors submit freely instead of weighing gains and losses mid-competition.

What if the penalty simply reduced that particular puzzle's value by N% or by a fixed number of points?
In this way:
- Submitting a wrong solution without eventually finishing the puzzle doesn't decrease your score from other puzzles.
- Guessing is still sort of discouraged, because it usually takes multiple tries (each further reducing the profit), and the puzzle's value would become so dismal that it's not even worth guessing.

tamz29, I think the problem with your proposal is that then you might as well guess on every puzzle -- especially ones that you started but didn't finish -- because the worst you can get on any one puzzle is 0 points.
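The expected-value point being made here can be sketched numerically. The puzzle values, guess probability, and function name below are illustrative assumptions only.

```python
# Compare a blind guess under the value-reduction scheme (a wrong guess
# only shrinks that puzzle's value, floored at 0) with a global penalty.
# All numbers are illustrative assumptions.

def ev_blind_guess(value, p_guess, penalty_if_wrong):
    """Expected score change from a single blind guess."""
    return p_guess * value - (1 - p_guess) * penalty_if_wrong

# Value-reduction scheme: on a puzzle you won't otherwise solve, the
# effective cost of a wrong guess is 0, so guessing never has negative
# expectation -- you might as well guess everywhere.
ev_floor_zero = ev_blind_guess(value=5, p_guess=0.01, penalty_if_wrong=0)

# Global penalty (2% of a 120-point test): guessing is clearly a loss.
ev_global = ev_blind_guess(value=5, p_guess=0.01, penalty_if_wrong=2.4)
```

Under these assumed numbers, the floored scheme gives a guess a small but strictly positive expectation, while the global penalty makes it sharply negative, which is the objection in a nutshell.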

I lost points due to typing a solution key into the wrong puzzle's blank (annoying! Edit to add: I see I got that penalty refunded now, thanks. But maybe all kinds of carelessness in solving/entering should get at least some small penalty?) and due to making a mistake while solving the puzzle (a lot better than getting 0 points for that puzzle!).

joshuazucker - 2012-10-31 2:57 PM
I lost points due to typing a solution key into the wrong puzzle's blank (annoying! Edit to add: I see I got that penalty refunded now, thanks. But maybe all kinds of carelessness in solving/entering should get at least some small penalty?)

If I ever do a test by myself, I'm tempted to ask Deb to try an instant grading system where entering a correct answer in the wrong blank instantly gives you both the wrong-answer penalty and the right-answer credit. That, or just have one blank that you fill out multiple times, so there's no wrong blank to worry about.

What's the point of not including FF2 in the LMI puzzle ratings? Since this is going to be held at most once a year, and the first Fillomino-fillia counted toward the ratings, I think this one should count too. (Likewise, I think the four TVCs should count as one puzzle performance for the ratings.) These ratings should determine who the best solver of LMI tests is, and if certain competitions are ruled out, they will not be as accurate.

When we started a new section for annual contests last year, I saw them as "additional contests" alongside the monthly contests. Because of that, it was decided that annual contests by default won't be considered in the ratings, since we don't want more than one contest a month to count. There is nothing against FF2 in particular; it's just that in my earliest interactions with Palmer, I had mentioned to him that this would be an annual contest. So even though no other monthly test happened in October, we didn't change our earlier stand.

Including the four TVCs as one puzzle performance is something I have often wondered about, and we'll try to do that next year. Also, to reduce the number of tests at LMI, we are considering making the next annual contest double as the December monthly test, which would be included in the ratings. But there is more than one annual contest in December, and we cannot include everything.