Auditing – Story of what is happening behind the scenes

More than an year back, I had chance to audit a Social Networking Website, pretty famous in India, developed in PHP. Looking at the code I thought that what matters more to these people, the immediate revenue model or the long term model. How really different developers and the management people perceive the code quality?

Some of the top level findings from that audit are:

In one of the PHP files profile.php twenty plus Ajax calls were made on page load for which the response time was very high. Indiscriminate Ajax use can multiply the concurrent load, which in turn can choke the web server, DB server, physical connections, etc.

No caching or searching strategy was used to avoid repeated data retrieval from DB, especially for data which doesn’t change too frequently.

No connection pooling for management of DB connections.

No pagination on UI pages, as the number would grow the whole data to be displayed on the page.

No domain model in the application.

No framework used even for very standard operations. Reuse tried-and-tested 3rd party components (PEAR) as far as possible instead of writing such components from scratch, since they’re likely to be more optimized.

No separation of the business logic, presentation logic and database layer. From view part to database connections, everything is done on a single page, which is hard to maintain the code and causes scalability and performance problems. For example in home.php, it does some business logic checks on top by selecting data from database and depending on that the html code is echoed.

Very common use of “Select * “ in sql queries even when single value is used.

In one of the PHP files home.php 24 times the database hits are done. eg. to find information of logged in user etc..

No logic of abstract or utility methods in the application. As all these checks are common for each user, so it can be separated in different class which will help in maintain the code.

Few days back, I got another chance to audit another Social Network Website, focusing on very niche market in US, developed in Java using latest J2ee technologies and latest frameworks. Initially it did sound very promising and surely they are doing pretty good business also. But let us see how the story goes once I started looking into what actually happening behind the scenes.

Some of the findings are:

No separation of business logic layer and database layer.

Lot of duplicate code through out the application. Same method doing the same thing copied in different classes. Same method in service layer and database layer. Lot of code duplicity in query building.

There are no test cases for some of the packages. Very minimal test cases for complex functionalities.

No default in switch statements at many places. Method contains a switch statement where one case branch will fall through to the next case.

Using non-short-circuit logic rather than short-circuit logic if (value != null & (!SomeEnum.VAL.equals(value)))

Null pointer dereference, there is a branch of statement that, if executed, guarantees that a null value will be dereferenced, which would generate a NullPointerException when the code is executed. Not enough test cases to cover such scenarios.

If we look closely at the above projects, both the websites are running fine and doing fare business. But the problems we see above are just an indication of the larger problems which may still to be discovered. The common problems at high level which both covers are:

Code Duplication

Big size files

No separation of concern

Performance inefficient

Large cyclomatic complexity

Unused variables, parameters, methods and even classes

Over complicated expressions

Design not evolving over a period of time

Lack of developer knowledge

Lack of adequate test cases.

Just looking at the code from the third person perspective, you can easily make out that the code is not suited for portability, maintainability and scalability etc.

In terms of recommendations I would say to pro actively look for the such code quality issues during the development phase of the project itself. There are so many Static Code Analysis tools available to make the job easy for you. Few to name which I used during these audits are Simian, PMD, Cobertura, Clover, FindBugs etc. The best option is to integrate these tools with the build process and keep a watch on the quality on regular basis. Catching the problem at early stage will help in minimizing the bigger problems which may occur later on. Most of the above issues can easily be caught with these tools and can be fixed in few minutes rather than to get these as bug. It is helpful even in case of lack of developer knowledge because you have the facts/suggestions from these tools in front of you.

Well, I have to agree that over the period of time the quality has improved a bit but still in what respect. Is it anything to do with the domain we are talking about here or is it just the two different projects or is it the business model people keep in mind to keep going. Is it really the process, may be code review practices or tight deadlines? Is it only the developers who write the code? Is it also the management team which does not provide enough time for re-factoring? Based on the experience and on interaction with the different development teams I could make out one thing that it is the way the developers and the management team perceive the code quality and take the necessary steps to keep it up to the mark.