Internally, LocalDateTime uses a single millisecond-based value to represent the local datetime.

So I assume the representation will change if an object moves from one time zone to another.

Are you saying that if I'm in EST and store "2013-10-21 10:50:05" and send that to someone who is in PST, you want it to show up as "2013-10-21 10:50:05"? Or do you want it to show up as "2013-10-21 7:50:05"?

I'm not getting what the point of not using timezones or an epoch date is. If you want it to show up as "2013-10-21 10:50:05" in both (or keep the numbers the same) store them as numbers or a string. If you want it to adjust based on timezone then I'm not seeing the issue with either converting it to an epoch when you store it or storing some timezone with it.
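For what it's worth, the two interpretations really do differ only in what you persist. A quick Python sketch (my stand-ins: the IANA zones America/New_York and America/Los_Angeles for "EST"/"PST" - on 21 Oct 2013 both are actually on daylight time, hence the 3-hour gap):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

sender_zone = ZoneInfo("America/New_York")       # the "EST" poster
receiver_zone = ZoneInfo("America/Los_Angeles")  # the "PST" recipient

# Interpretation 1: the value names an instant -> store an epoch number.
instant = datetime(2013, 10, 21, 10, 50, 5, tzinfo=sender_zone)
epoch_millis = int(instant.timestamp() * 1000)

# The receiver renders the same instant in their own zone: 07:50:05.
received = datetime.fromtimestamp(epoch_millis / 1000, tz=receiver_zone)
assert received.strftime("%Y-%m-%d %H:%M:%S") == "2013-10-21 07:50:05"

# Interpretation 2: the value names wall-clock digits -> store the string
# (or a zone-free LocalDateTime); it shows "10:50:05" everywhere.
wall_clock = "2013-10-21 10:50:05"
```

Which one is correct depends entirely on what the value is supposed to mean, which is exactly the question being asked here.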

Oh fuck me. It seems like some Brazilian Android phones are converting a timestamp in millis from 20 Oct 2013 to "31 Dec 1969". Probably a "-1" return value from some C API that breaks on the October DST change date because of broken tzdata. It works on every Samsung/Sony/Nexus device we have here, though.

One of the most frustrating aspects of software development is that there are about a million ways to solve any given problem that will work just fine, half of which are even considered 'best practices'. Yet in any given design process, there will be individuals who cling to their approach as the One True Way beyond any logic or reason. If you're unlucky enough to get two such individuals, the whole process comes screeching to a halt and becomes a battle of politics, and who can convince the highest ranking (and largely clueless...) member of management to back their approach.

My wife works as a structural engineer, and settles such arguments with official code books, regulations, and math. I hope one day that software development, as a discipline, matures to this point.

(the above rant is completely unrelated to the fucked timezones discussions, which I thankfully don't have to deal with, but is entertaining and educational.)

I really, really hate dealing with One True Way people - as you say, there is almost never a case where they're right. And even if they're right today, they'll probably be wrong next year. The ability to adapt to changing technologies, requirements, and circumstances is at least as important as technical knowledge.

Also, there is a parallel discussion of timezone fuckery going on in the Boardroom right now - there is a surprising amount of overlap

I was just thinking the same thing (about the timezone overlap). It's most likely due to the overlap in posters between the two forums.

Quote:

I really, really hate dealing with One True Way people - as you say, there is almost never a case where they're right. And even if they're right today, they'll probably be wrong next year. The ability to adapt to changing technologies, requirements, and circumstances is at least as important as technical knowledge.

What gets me is that in software development, there is often more than one right answer. So you get two people, both with different good/correct/viable solutions, and they decide to butt heads for weeks over which one to go with.

The other lead guy and I have opposite styles - both work well, but the avg(me + him) solution is much faster and just as maintainable. I often create the pragmatic/practical/fast solution first, and he does a pass over it, refactoring the ugly bits.

Yesterday my next-door cube neighbor was ranting about some example code on StackOverflow being wrong, people shouldn't post broken examples online, blah, blah blah. Turns out it was fine - he was under the mistaken impression that code in a "finally" block in C++ wouldn't execute if control flow left the "try" block via a "return" statement.
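(For the record: a finally block does run when control leaves the try via return - in Java and C# by spec, and likewise in Python, which is easy to check. Standard C++ has no finally keyword at all, which may be where the confusion started.) A minimal Python demonstration:

```python
log = []

def read_value():
    try:
        return "from try"      # control leaves the try block here...
    finally:
        log.append("cleanup")  # ...but the finally block still executes

# The return value comes through AND the cleanup ran.
assert read_value() == "from try"
assert log == ["cleanup"]
```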

It means that as long as the program is still running this will be executed. There's not a lot that you can do if the user kills the process or if there is a power cut.

-1 useless.

In the context of writing programs in a high-level language, discussing shit like this adds nothing to the conversation.

Given the other 2 comments I'd say not really. What it adds to the conversation is the realisation that you should try to leave stored data in a recoverable position where possible. Just because the finally should clean everything up if something goes wrong doesn't mean that you shouldn't save your writes until all of the processing has been completed where possible. After all, you want to minimise the risk that the system could leave data in your datastore that could break the system next time it tries to use it.

Lovely, another case of being overly pedantic. I love how it evolved from a level zero newbie discussion

Quote:

he was under the mistaken impression that code in a "finally" block in C++ wouldn't execute if control flow left the "try" block via a "return" statement.

to a discussion more suitable for database ACID compliance in just a few posts.

--

In a side project, I discovered that I'd really, really like to see DiskKeeper/NTFS for Linux. EXT3 doesn't need it, they say. Well, my little .py script (based on this) says otherwise. 35% of all files are affected by excessive fragmentation (avg frag < 256K):

No wonder disk queue length is too high, for "simple to serve" files read in linear order at a rate of ~2/second. It's basically random read all over the place.

And with no pre-allocation API (truncate(size) does not work for allocating in 1 extent), defragging in user space is going to suck. This is what needs to be done: create copy, check frag count, if better, keep, if not, try again, and do many passes of it. Bleargh.
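For reference, the "check frag count" step can be done from user space by shelling out to e2fsprogs' filefrag (an assumption: filefrag is installed; the lower-level route would be the FIEMAP ioctl). A minimal sketch, using the 256K average-fragment threshold from above:

```python
import re
import subprocess

_EXTENTS = re.compile(r"(\d+) extents? found")

def parse_filefrag(line: str) -> int:
    """Pull the extent count out of filefrag's summary line."""
    m = _EXTENTS.search(line)
    if m is None:
        raise ValueError("unexpected filefrag output: %r" % line)
    return int(m.group(1))

def fragment_count(path: str) -> int:
    """Ask filefrag how many extents a file occupies."""
    out = subprocess.run(["filefrag", path], check=True,
                         capture_output=True, text=True).stdout
    return parse_filefrag(out)

def excessively_fragmented(size_bytes: int, extents: int,
                           avg_threshold: int = 256 * 1024) -> bool:
    """Flag files whose average fragment is smaller than the threshold."""
    return extents > 0 and size_bytes / extents < avg_threshold
```

For example, a 1.67 MB file in 36 fragments averages ~46K per fragment - well under the 256K threshold.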

I remember them teaching us the limitations of try/catch/finally at university (granted, they glossed over it with a joke, something like "unless someone pulls the plug, of course", but it served its purpose of highlighting that we don't just need to think about our programs but also the environment they're running in), so it's not so low-level that you don't care for normal programming. It's something that should never be a problem in normal running. However, knowing that there are limitations can be the difference between staring at a program wondering how something could have happened when the code says it's impossible, and having a clue about where to start.

Quote:

In a side project, I discovered that I really, really like to see DiskKeeper/NTFS for Linux. EXT3 doesn't need it, they say. Well, my little .py script (based on this) says otherwise. 35% of all files are affected by excessive fragmentation (avg frag < 256K):

There are a few pathological use-cases that can cause ext3 to fragment like crazy, like writing multi-gigabyte files at 1 kB/sec, or allocating tons of small files and then growing them by several orders of magnitude. Turns out the latter is basically what a large IMAP host storing mail in maildir format can end up with, while the former is what mbox mail storage can degenerate into.

That being said, you might want to look into Shake, an attempt at a userspace defragger for ext3, which has a few neat ideas about how to force contiguous block allocation in ext3.

Quote:

There are a few pathological use-cases that can cause ext3 to fragment like crazy,

Is there a filesystem that doesn't fragment like crazy when it's almost out of space on the device and you do very fragmentish things? A lot of the auto-defragment stuff falls on its face in low disk space situations.

Well, the files average at ~4MB each, range is 2M-16M, and there are tons of them, but the filesystem itself is at 50% full, 1T free out of 2T total so the free space map shouldn't be that crazy.

EXT3 seems to be completely stupid though; the attempt to "defrag" has a pretty low success rate. This is what I tried:

ftruncate(size) - tell the filesystem we're going to write $filesize bytes. On Windows, File.truncate() in Java pre-allocates the file so it doesn't fragment too badly. On Linux/EXT3 it maps to ftruncate(), which is the next best thing available - except it just creates a sparse file and does no actual allocation.

copy the data with read/writes in 32MB increments. For 99.999% of the files, that's one write() call. Hint hint, please create that in one piece, not too many.

copy over the mode/stat stuff

check if it's an improvement, mv tempfile original if it is.

This is pretty much what Shake does; their fcopy() in executive.c calls fadvise() to say the source will be read sequentially, and it handles sparse files, which I don't. I prefer .py as I can customize the logic; it's super readable and less error-prone.
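The copy / compare / swap pass described in the steps above might look like this in Python (a sketch; frag_count is passed in as a callable - e.g. a filefrag wrapper - so the measurement side stays pluggable, which is my choice, not how Shake structures it):

```python
import os
import shutil

def defrag_pass(path, frag_count):
    """One pass: rewrite the file, keep the copy only if it is less
    fragmented than the original. Returns True if the copy won."""
    tmp = path + ".defrag"
    shutil.copy2(path, tmp)        # copies the data plus the mode/stat stuff
    if frag_count(tmp) < frag_count(path):
        os.replace(tmp, path)      # the "mv tempfile original" step
        return True
    os.unlink(tmp)                 # no improvement: discard, maybe retry later
    return False
```

Run it repeatedly over the tree to get the multi-pass behavior; each pass either keeps its winnings or throws the copy away.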

The first pass results in pretty huge improvements, like

Quote:

Done. Before: 6626 fragments. After: 4527 fragments. (-2099)

But after fixing the most obvious issues in the first 2-3 passes, it's just down to luck if a defrag will improve things or not.

I don't know wtf EXT3 is thinking, but allocating a 1.67 MB file in 36 fragments is kinda crazy when there's a TB of free space to play with. Oh well, that's one reason to go with EXT4, you can at least manually defrag by requesting extents, but I'm not in a hurry to redo all my storage.

That should work, but the filesystem has gone to hell for some reason.

dd if=/dev/zero of=zero.bin bs=4096 count=4096 gives me a zero.bin with 500+ fragments, and cp file file.2 has the same issue. But if you cp file to file.2, file.3, file.4, file.5, etc., by about #10 it runs out of the stupid and starts allocating the file in just a few chunks instead of tens or hundreds. So this works:

For i in 1..32 {

- Create 4MB file filled with junk data

- Calculate how fragmented this junk file is, if <= 16, break

}

cp old new

mv new old

The first for loop is critical: it seems to eat up the space with hundreds to thousands of fragments, and after that's exhausted, the cp part will allocate new files with fewer fragments. Not always, but with a way higher success rate than without the fragment-eating part. Most folders go from 10k excess fragments to "zero" / 128k+ average fragment size in 5 loops. Which is fine; modern disks do 128K reads almost as fast as fully sequential ones.
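The junk-file trick above, as runnable Python (the helper names are mine; frag_count is again any callable that returns a file's extent count, e.g. a filefrag wrapper; the junk files should be deleted afterwards to give the space back):

```python
import os
import shutil

JUNK_SIZE = 4 * 1024 * 1024  # 4MB, roughly the average file size here

def eat_fragments(scratch_dir, frag_count, max_junk=32, target=16):
    """Soak up the badly fragmented free space with junk files until the
    allocator starts handing out reasonably contiguous runs."""
    junk = []
    for i in range(max_junk):
        path = os.path.join(scratch_dir, "junk%d.bin" % i)
        with open(path, "wb") as f:
            f.write(b"\0" * JUNK_SIZE)   # plain zeros still allocate blocks
        junk.append(path)
        if frag_count(path) <= target:   # allocator ran out of the stupid
            break
    return junk  # caller deletes these once the real copies are done

def copy_swap(path):
    """The `cp old new; mv new old` step: reallocates the file's blocks."""
    tmp = path + ".new"
    shutil.copy2(path, tmp)
    os.replace(tmp, path)
```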

Then I read about EXT3's block allocator working one block at a time, so it all makes sense now: it seems like a stupid, brain-dead algorithm that allocates blocks left to right with a bit of locality, so some folders are fubar while others are fine. Makes sense - since you need to alloc for every 4KB, you can't spend too much CPU time scheming. "No need to defrag" indeed.

EXT4's delayed allocation and multi block allocator should do much better, but that's for the next hardware refresh and OS install.

Side effect: e2freefrag says that the number of free extents in the sizes 4-8k, 8-16k, and 16-32k doubled from 100k to 200k. Oh well, that's what you get, I guess, for removing 250k excess fragments from real data. 41% of the free space is still in 64M+ chunks.

[edit] Filesystems: until a few days ago, I just assumed they just worked and there wasn't much need to improve them, but EXT3 proved me wrong. Good thing there are competitors - the BTRFS guys wrote a very good, in-depth overview paper on it, with comparisons, at http://dl.acm.org/citation.cfm?id=2501623 (Aug 2013, so very up to date; ACM DL paywalled). Maybe it isn't such a crazy idea after all to go create a competitor to ZFS.

I realize that our customers want to do geocoding of locations on networks that don't have access to Google. I also recognize that they have a GEE server which will sit inside the network. However, attempting to use their database query code (which is not an API) to provide geocoding is not going to work. For starters, that code returns JavaScript code, not JSON. Trying to treat it like JSON has now had you and me waste a day and a half. Secondly, it doesn't handle Unicode correctly (because it's returning JS and not JSON). Third, it takes 17 seconds to return. That's never going to be acceptable for autocomplete performance, even if our end users don't have any other options. Finally, because it's not an API, there's no guarantee that an update won't change the internal code, rendering all our efforts moot.

I get that the customer wants this to happen, but the customer's expectations are unrealistic given the technical constraints. The right answer is not to keep hammering in the hopes that eventually, our square peg will turn round.

I don't find it that bad compared to its full-featured competition. Granted, I don't have too many add-ins installed and the last time I used IntelliJ was on far inferior hardware (but VS 2008 compared favorably to it at the time). Oh, and any source control plugin using the VSS API can slow VS down to a crawl.