Archive for the ‘Programming’ Category

AngularJS is a great front-end framework, and I love it, but it appears that a lot of people are complaining about memory leaks. I spent the last 2 days trying to isolate the cause of my own memory leak, but it turns out that mine was rather Heisenbergian; that is, the act of observing caused the problem! Before you start looking into memory leaks in AngularJS, you should make sure that you do two things:

1. Comment out all of the console.log() calls in your source. The log function apparently retains a reference to the variable that it outputs. So, if you do something like console.log(data), that “data” would never get garbage-collected, and your memory usage will keep on increasing (the hallmark of a memory leak). Voila! You just created a memory leak by trying to observe it.

2. Turn off (disable) AngularJS Batarang (the developer plugin for Chrome). The same logic applies here. If you keep it on, your memory will not get garbage-collected.

After doing these two things, if you still see the memory usage keep climbing, you have a real memory leak (congrats).
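Commenting out every console.log() call by hand gets tedious. One possible alternative, sketched below, is to route logging through a small wrapper that can be switched off in one place. The names `createLogger` and `enabled` are made up for this illustration; they are not part of AngularJS or Chrome.

```javascript
// Hypothetical helper: a logger that can be silenced in one place.
// When disabled, console.log is never called, so the console never
// receives (and cannot hold on to) the objects passed in.
function createLogger(enabled) {
  return {
    log: function () {
      if (!enabled) {
        return false; // nothing is handed to the console
      }
      console.log.apply(console, arguments);
      return true;
    }
  };
}

// Pass false here before profiling memory.
var logger = createLogger(false);
logger.log({ big: 'payload' }); // ignored while the logger is disabled
```

With something like this in place, turning logging off before a profiling session is a one-line change rather than a search through the whole source.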

Another thing to keep in mind: Chrome is a bit slow in garbage collecting. So, even if you have no memory leak, your memory usage on the Timeline page of Developer Tools may keep on increasing. Give it a little time. Or, click on the trash button at the bottom of the window (forces garbage collection).

The most likely culprit of a memory leak in Angular is jQuery used in your directives. If you attach an event listener in your directive using a jQuery plugin, the plugin keeps a reference to your DOM even after Angular deletes its own reference to it. That means the DOM is never garbage-collected by the browser, which in turn means a “Detached DOM tree” in your memory. (You can see it in the “Profiles” page of Chrome Developer Tools: take a snapshot and search for “detached”.) As you navigate around your app (loading different views/controllers), you will end up creating more of these detached DOMs.

Launch your app, take a snapshot of your memory in Developer Tools, navigate around for a bit, and take another snapshot. You would probably see the number of detached DOMs increase. Even if the numbers remain the same, the “retained size” might keep increasing for each. This is because the same exact DOMs are repeatedly being created in memory without garbage collection. By going down the tree browser, you can see what these DOMs are. Then take a look at your directives to see if there are any event listeners attached to them. Each jQuery plugin should offer a way to “destroy” itself. Listen for Angular’s $destroy event, and unbind the event listener.
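Here is a minimal sketch of that cleanup, assuming a directive whose link function attaches a click handler. The function name `linkWidget` and the `clicked` flag are made up for illustration; `scope.$on('$destroy', ...)` and the element's `on`/`off` methods are the standard AngularJS and jqLite/jQuery calls.

```javascript
// Hypothetical link function for a directive that wires up a click handler.
// `scope` and `element` are what AngularJS passes to a directive's link function.
function linkWidget(scope, element) {
  function onClick() {
    scope.clicked = true;
  }

  // Attach the listener (jqLite and jQuery both provide on/off).
  element.on('click', onClick);

  // When Angular tears down this scope, remove the listener so that
  // nothing keeps a reference to the detached DOM node.
  scope.$on('$destroy', function () {
    element.off('click', onClick);
    // If a jQuery plugin was initialised on this element, its own
    // destroy method would be called here as well.
  });
}
```

In a real app this would be registered as the `link` property of a directive definition. The important part is that every listener attached in the link function has a matching removal in the $destroy handler.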

I’m pretty conservative when it comes to selecting any sort of technologies. My biggest concern with selecting any web development framework is the longevity of support. Most programmers select frameworks or languages that are the purest and most elegant from a theoretical or academic point of view, but in reality, they are all human products; as such, they are prone to human frivolity and vanity. There are countless great applications and operating systems that were clearly superior to most Microsoft products, but they died. Naturally, developers would want to believe that product is everything in business, but in the end, it’s more like the cherry on top; everything else actually matters more.

In my view, most of the frameworks currently available are more than good enough. The vast majority of websites don’t require serious programming. It’s mostly about pulling, pushing, and slicing data. It’s like selecting a car for your daily commute; does it really matter whether I’m driving a Ferrari, Porsche, or Honda? In terms of what these cars can do for me, it makes no difference; they can all get me from point A to B. But it does make a difference in the sense that the car I drive every day to work would probably need to be serviced or repaired often. For that, a Ferrari or Porsche would be a bad idea, as it would probably take a lot longer to fix (and cost more). So, I select a framework equivalent to a Honda.

I particularly avoid frameworks with a lot of hype, an equivalent of Hummer. When people become really excited about something, it is inevitably followed by a quick deflation because it’s not possible to sustain the same level of excitement for a long time. It’s like trying to sustain an orgasm for hours; what goes up quickly tends to come down quickly also. It’s better to select something that is low-key and modest if you need it to last for a while.

In contrast, broadcast design is all about whatever is the hippest and most exciting at the moment, because as soon as it airs a few times on TV, it is discarded. There is no point in designing anything that would stand the test of time. But when you design a logo, you want to resist the temptation to do something trendy, hip and exciting, because it would look dated in a few years. Young designers tend to have a hard time resisting, and the older design directors often have to curb their enthusiasm. This is necessary in IT too. We have to tell the younger programmers, “OK, let’s not get excited about this hot new programming language. We’ll see where it goes in a year or two, then decide.”

For our clients, we don’t use those inexpensive shared hosting services, but the site for my daughter’s school that I maintain pro bono is hosted on a shared server at DreamHost, which has been experiencing a series of hacking incidents. They host non-profit websites for free, so I’m not complaining, and am thankful for them. I just want to share the things I discovered on our site so that others may be able to benefit from it.

A few days ago, I noticed a file named installer12.php in one of our tmp directories. This file is designed to self-destruct via its last line of code, which is:

@unlink(__FILE__);

At the top of installer12.php is an array with hundreds of random words, and it randomly combines two words to create a file name. What this file with a random name does is explained by Leo Parker Dirac on his blog. In his case, installer12.php happened to pick “ainslie” and “turing” to create “ainslieturing.php”. Both of these words are in installer12.php.

The reason installer12.php did not self-destruct on our site is that our tmp directory is not publicly accessible. We have an .htaccess file that sets the web root lower down in the directory structure. So, the hacker somehow managed to copy installer12.php into our tmp directory, but could not trigger it because it’s not publicly accessible. So, it remains undeleted.

The installer12.php in our tmp directory has a Linux user of rp_admin and a group of pg7029. Neither is ours, which means that the hacker did not copy installer12.php from any script on our site. If he had run a compromised or malicious script on our website to copy this file, it would have our Linux user and group (just like the JPEG files that we allow users to upload to the site). My guess, therefore, is that the hacker had shell access to the shared server (where our site is hosted) and was able to copy installer12.php to any directory on the server with permissions set to 777. In fact, our tmp directory had many subdirectories, and installer12.php was copied into all of them (about 100). So, some sort of script searched the server for any directory with 777 and automatically copied the file into all of them.

After I reported this incident to DreamHost, they ran an automated script to scan our website for any suspicious files. It is supposed to delete any known malicious files but it didn’t delete installer12.php, which leads me to believe that they are not aware of it.

Here’s the code part of installer12.php (right above this part is a big array with random words):

This is definitely a shocker. I did not think they would do this. If they had to allow Flash, I thought they would allow Flash in their iOS Safari before allowing developers to use Flash to build apps. The latter, I think, has more severe consequences for Apple. If developers can use Flash to build iPhone/iPad apps, those who haven’t learned Objective-C would never bother to. Objective-C is a rather esoteric language. In comparison, the differences between ActionScript (Flash), JavaScript, Java, and C++ are pretty minor. It will open a huge floodgate of Flash developers creating iPhone/iPad apps.

This may sound like a good thing for Apple, but it’s not. If Flash dominates the market for iOS development platforms, Apple will lose control over their developers. Even if they release new iOS or hardware technologies, they would be at the mercy of Adobe to implement them effectively. Even the timing and the speed of the propagation of these new technologies would be determined by Adobe. They could even refuse or deliberately delay the implementation of some key technologies if that is in their own interest (e.g. because they have ties to competing technologies). Steve Jobs implied this to some degree in his Thoughts on Flash earlier this year.

Part of what motivated this move by Apple is probably the fact that the market share of iPhones is declining in relation to Android. This is a serious matter to consider for Apple. The developers are finding comfort in the fact that Google hardly regulates the app market on Android. The last thing developers want is to work for months developing an app only to be rejected by Apple. So, a safer approach is to develop the app on Android first, then port it over to iPhone; the opposite of what many developers were doing before. So, this might be part of an effort to create the perception that Apple is going to be less of a Nazi from here on.

It is commonly argued that “security through obscurity” is false security. I think this whole debate is poorly defined. Ultimately security is all about obscurity and nothing more. Take passwords, for instance. “123456” is the most common password, so if you are smart, you would not use it. Your birthday would be more obscure but it is still relatively easy to crack, especially by someone who knows something about you. So, you might use the name of your cat. But, you might feel that this too might be crackable. So, you combine the name of your cat with the name of your first grade teacher. And, so on… The more important the information you are trying to protect, the more obscure you make your password. This is security through obscurity. There is no security system that does not use security through obscurity. Even fingerprint scanners rely on obscurity. The chance of someone sharing the same fingerprint as yours is 1 in 64 billion. Again, this is not perfect. It is still relying on obscurity; the only difference is the degree.

An Access Control List (“ACL”) is a way to control user access to a website. It manages different groups of users like administrators, managers, employees, customers, etc., where each group accesses different areas of the website. ACL comes built into many web development platforms. We are using CakePHP, which has a sophisticated ACL built in, but we’ve never used it before. So, I recently looked into how ACL is implemented in CakePHP. After Googling for about an hour, I found a whole bunch of articles and blog posts about how “hard” it is. I then created a test project with ACL to look into the details of it. Oy. I now see what everyone is complaining about.

Personally, I have no idea why anyone needs this type of complex access control. What sort of systems are people building that actually require this level of complexity? A system for CIA?

In the past, I’ve simply added another column to the users table called “security_level”. I’ve never even bothered to create a “groups” table, because we’ve never come across a situation where it was necessary. (I simply store the security_level value in the session and check it wherever I need it.) I’m a pragmatist, so I never bother to create anything that reality does not require. Having 3 different levels of access seems to take care of pretty much everything.
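To make the idea concrete, here is a rough sketch of such a check. It is not the code from any of our sites: the level names, the numeric values, and the `canAccess` helper are all assumptions for illustration, and it further assumes the levels are ordered so that a higher level can see everything a lower one can.

```javascript
// Hypothetical levels; the only given is that there are three of them.
var LEVELS = { customer: 1, employee: 2, admin: 3 };

// `session` stands in for whatever per-user session object the
// framework provides; security_level is copied into it at login.
function canAccess(session, requiredLevel) {
  // A missing session or missing level counts as level 0 (no access).
  var level = session && session.security_level ? session.security_level : 0;
  return level >= requiredLevel;
}

// Example: an employee can open employee pages but not admin pages.
var session = { security_level: LEVELS.employee };
var seesReports = canAccess(session, LEVELS.employee); // true
var seesAdmin = canAccess(session, LEVELS.admin);      // false
```

The whole access policy fits in one comparison, which is the point: anyone who knows a user’s level knows what that user can reach.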

From a pragmatist’s point of view, I see a serious problem with having a complex ACL. If you need a complex ACL, it means that you must be managing a system that is used by thousands of people working within a complex organizational structure. When you have a complex ACL with thousands of users, managing the access list becomes a full-time job. As the security needs change in real life, someone has to modify the ACL to reflect the new reality. Having the ability to fine-tune the privileges of individual users means that nobody could possibly have a clear picture of what everyone is accessing unless they specifically look it up on the system. This can easily create security holes that nobody is aware of. For instance, one specific user may have access to a top-secret area of the site without anyone being aware of it, until someone suspects something and looks him up on the system. (For instance, you meant to temporarily grant him access to a very specific section of the site, but you forgot to revoke it later.)

In other words, complexity of a security system is itself a security risk. So, a complex security system defeats the whole point of having a security system. When you simplify the security system, it may create some inconveniences in reality, but the simplicity allows many people to intuitively understand how the security works, which makes it more secure with less room for mistakes and holes.

For instance, with my scheme of just having 3 levels, all I would need to know is what security_level you have. I would then immediately know what you can access and what you cannot. Not just me, but everyone else who has the same security_level would know what that means. Every user in this situation can act as a potential auditor who can keep an eye on other users. Once you start fine-tuning each individual, nobody would have any idea who has access to what, and who should have access to what.

Am I wrong here? What am I missing? Why is everyone going nuts trying to implement such a complex ACL? In reality, the number of websites that actually require that type of complexity would be very small, and those who require it can afford to write their own ACL (such as large government institutions or financial institutions), so what is the point of writing a reusable library? Wouldn’t it make more sense to create a reusable library that is very simple, so that 99% of websites can use it with ease?

I find that many programmers, especially those who studied computer science in college, tend to get so excited about certain abstract ideas like flexibility, scalability, re-usability, and controllability that they ignore what reality needs. It reminds me of hardware geeks who get really excited about building super-fast computers even though they have no use for them personally. (All they do is run benchmark utilities to prove their speed.) This lack of central coherence is often absurd.

I think the power to control users is a particularly exciting area for some programmers because it involves controlling actual power (political or organizational), and because the programmers often get to be in the most powerful position (“superuser”). But, they really need to stop masturbating and start focusing on what the reality really needs.

It’s common these days to convert URL arguments into what looks like a directory structure. Here is an example:

http://example.org/words/2009/05/a-quick-take

WordPress and CakePHP do this for you. I never liked this idea, and it can become a real hassle when implementing a web-based application that offers a variety of features. For instance, say you want to add the ability to change the background on your blog page by passing an argument:

http://example.org/words/2009/05/a-quick-take/blue

Say you also want to have background music:

http://example.org/words/2009/05/a-quick-take/blue/on

And you also want to have the option of displaying banners or not:

http://example.org/words/2009/05/a-quick-take/blue/on/hide

Now, suppose you just want to hide the banner, and you don’t care about the background color or music. Ideally, you would want to do this:

http://example.org/words/2009/05/a-quick-take/hide

But you can’t, because the 5th argument is defined as the background color. So, even if you don’t care about the background color or the music, you still have to specify all the arguments.

And furthermore, what if you wanted the ability to break up a long post into multiple pages? (That is, AFTER you have already implemented all the options above.) You want to add an ability to append a page number like this:

http://example.org/words/2009/05/a-quick-take/2

But you can’t, because you have the 5th argument already reserved for the background color. In order to change this, you will have to go back to all the links and shift all the positions by one. This is a huge pain. And remember, it’s not just you who has to shift the arguments; everyone linking to you (including search engines) must now shift them, or else the links will break.

So, this scheme may work for a closed system like WordPress (where it serves only one purpose), but it’s a real pain for a system that needs to remain flexible and extendable. It’s one of those things that you need to be aware of and be able to weigh the cost and benefit when you are designing the whole system.
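The difference is easy to see in code. Below is a rough sketch, not taken from WordPress or CakePHP, that reads the same options once by position and once from a plain query string. The function names and the query parameter names (`background`, `music`, `banner`) are made up for illustration.

```javascript
// Positional scheme: meaning depends on the slot, so every slot must be filled.
function parsePositional(url) {
  var parts = new URL(url).pathname.split('/').filter(Boolean);
  // parts: ['words', year, month, slug, background, music, banner]
  return {
    slug: parts[3],
    background: parts[4],
    music: parts[5],
    banner: parts[6]
  };
}

// Query-string scheme: each option is named, so any subset can be passed.
function parseQuery(url) {
  var params = new URL(url).searchParams;
  return {
    background: params.get('background'),
    music: params.get('music'),
    banner: params.get('banner')
  };
}

var wrong = parsePositional('http://example.org/words/2009/05/a-quick-take/hide');
// wrong.background is 'hide' and wrong.banner is undefined:
// the value landed in the background slot.

var right = parseQuery('http://example.org/words/2009/05/a-quick-take?banner=hide');
// right.banner is 'hide'; background and music are null (not specified).
```

Adding a page number later is also painless with the query string: it is just one more named parameter, and no existing link has to change.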

CakePHP implemented what they call “named parameters” to get around this problem. Here is an example:

http://example.com/controller/action/param1:value1/param2:value2/

I believe named parameters are order-insensitive. So, you could leave out the ones you don’t care about. This feature was added after the fact because, I believe, many developers realized the same thing I realized: the positional scheme was a real hassle in many situations. So, it’s like a work-around, which is unfortunate. The combination of two schemes makes the whole thing more convoluted than it needs to be. Also, we need to keep in mind that search engines would not understand what those colons mean. Even if CakePHP sees them as order-insensitive, search engines would not know that; they have to treat them as order-sensitive, which means that when you flip the order, they would consider the results to be separate URLs.
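A small sketch shows both halves of this. The parser below is my own illustration of the `name:value` idea, not CakePHP’s actual implementation, and `parseNamedParams` is a made-up name.

```javascript
// Collect every 'name:value' path segment into an object.
// Segments without a colon (controller, action) are skipped.
function parseNamedParams(path) {
  var result = {};
  path.split('/').forEach(function (segment) {
    var i = segment.indexOf(':');
    if (i > 0) {
      result[segment.slice(0, i)] = segment.slice(i + 1);
    }
  });
  return result;
}

var first = '/controller/action/param1:value1/param2:value2/';
var second = '/controller/action/param2:value2/param1:value1/';

var a = parseNamedParams(first);
var b = parseNamedParams(second);
// a and b hold the same two key/value pairs, so the application treats
// both requests alike. The URL strings themselves are still different,
// which is all a crawler comparing URLs has to go on.
var sameString = first === second; // false
```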

Furthermore, the slash schemes are often hard to read and understand. For instance:

http://example.com/portfolios/2/3/1061

You have no idea what those 3 numbers mean. If it was using a straight URL, it would look like this:

Myth: “Dynamic URLs are okay if you use fewer than three parameters.”
Fact: There is no limit on the number of parameters, but a good rule of thumb would be to keep your URLs short (this applies to all URLs, whether static or dynamic).

www.example.com/article/bin/answer.foo/en/3
Although we are able to process this URL correctly, we would still discourage you from using this rewrite as it is hard to maintain and needs to be updated as soon as a new parameter is added to the original dynamic URL. Failure to do this would again result in a static looking URL which is hiding parameters. So the best solution is often to keep your dynamic URLs as they are.

The second one is particularly interesting because it’s not just humans who have a hard time understanding what the parameters mean and what to do when adding more parameters to the existing order. Google would have no idea either. So, contrary to popular belief, those readable URLs are actually SEO UN-friendly.

When you want to play back a sound with AVAudioPlayer on a user event (like pressing a button), you need to check the player to see if it’s already playing. Otherwise, the user event would not trigger the sound as you would expect. What you expect is something like a drum machine, where each press restarts the sound from the beginning. To achieve this effect, you have to first pause the player (not stop it), and then set the currentTime property to 0. Like this:

The other weird thing about AVAudioPlayer is that, if it’s a class member of a UIViewController, you need to explicitly release it. Otherwise, your instance of UIViewController would not be released. I had a situation where I needed to release a bunch of UIViewControllers, but they wouldn’t get released no matter what I tried. After struggling for a while, I discovered that they would be released if I manually released the AVAudioPlayers that were the members of the UIViewController. Weird. Theoretically, the dealloc method of the UIViewController should release the AVAudioPlayers, but somehow it doesn’t happen.

Learning about Objective-C has been quite interesting, especially the histories of Objective-C and C++. They were two different schools of thought that extended C to accommodate object-oriented programming. As we can see today, C++ has been more popular and we have already seen several permutations of it. In a way, my own history of programming has followed that particular school, although I did not know that an alternative school existed. I learned C, then C++, then Java, and lastly ActionScript.

The main difference between the two schools is in typing: static vs. dynamic. It gets rather philosophical and I find it fascinating in that sense. Static typing (C++) assumes that the world can be categorized (abstracted) perfectly. In other words, categorization is assumed to be inherent in nature. If it fails, it means you made a mistake in understanding the underlying structure of the universe. (This is analogous to Structuralism in the modern philosophy, like Noam Chomsky.)

Dynamic typing assumes that categorization is never perfect because it is an order that we humans impose on nature. As such, the flaws are unavoidable. By leaving the typing dynamic (by leaving the definitions of objects as dynamic as possible until run time), Objective-C is able to accommodate situations that do not fit neatly into predefined categories. These situations do come up often in real life situations.

Dynamic typing, in this sense, is analogous to post-structuralism. Its fundamental assumptions are similar to the philosophies of Derrida and Wittgenstein. I’m a big fan of both philosophers, so I find Objective-C to be fascinating.

However, most development projects are not academic exercises. So, we do need to take into consideration the parameters and the realities that the business imposes. In this sense, I do find Objective-C to be quite lacking.

As I said above, C++ has already evolved several times and it has been improved quite a bit. C++ had many provisions to make it backward compatible with C, but Java (and I would assume C# also) has moved beyond it for the sake of clarity. Since OOP has become a predominant method of programming, there was no need to be concerned about backward compatibility.

Personally, I find ActionScript to be the best. In fact, it incorporated some of the strengths of dynamic typing. I think it strikes a good balance between the two.

Objective-C, on the other hand, has been hacked around since the late 80s. To me, it should be laid to rest and re-designed from scratch. For instance, the lack of namespaces has been addressed by prefixes like “NS”. I was like, “What the hell is ‘NS’?” It turned out to be short for “NextStep”. Oy. I’m not sure why they don’t just ditch the backward compatibility and implement namespaces.

The obvious superiority of the dot-syntax has been forced into Objective-C in a half-assed manner as “properties”.

I also find the use of header files to be annoying; something most other OOP languages have done away with.

I kind of suspect that Apple realizes how obscure their programming environment is for most people. And, I get a sense that their implementation of the custom CSS and HTML tags for iPhone is in response to this problem. If other mobile devices support the languages and the programming paradigms that are more familiar to mainstream developers, Apple could lose them to the competition. In the mobile market, the market share that these companies are concerned about is that of developers, not so much that of consumers. It’s very much like game consoles. It’s the games that determine the popularity of the consoles, not the other way around.