Wednesday, February 23, 2011

"Ya ni llorar es bueno", this is a Mexican saying that translates to "it's not even worth crying anymore". I thought about this saying after a talked with few companies and friends regarding legacy application, specially legacy code that directly hit the ROI of the company. To add some context around this post, let me explain what does "legacy code" means to me:

It uses a relatively "old" technology (5+ years)

It is in production

The application plays an important role in the company

I have worked on a few of these applications and I hate most of them! Not because they are hard, but because of politics. It has been my perception that managers are "gun shy" when it comes to handling the rewrite of legacy code that is clearly out of control. I am not sure why this is, but these are some of the reasons I have gathered from my interviews with tech leaders and developers:

Time: aah, the Pandora's box of managers. Let's face it, it is hard to estimate software projects, and even harder to estimate the refactoring of a major legacy codebase. The stakes are high if the refactoring goes wrong. Managers have mentioned that these refactorings are a double-edged sword: they can go really well, or they can eventually damage the manager's reputation and the morale of the team.

Resources: many managers are fighting with other departments when it comes to resources/programmers. Many companies are trying to innovate and create new applications to latch on to new business models. These projects take priority over the legacy systems that they have in place (if it's not broken, don't fix it).

Domain experts: many of the large dev shops (20+ developers) have two types of development departments: engineering and support. The engineers are senior developers with at least 5 years of experience. They develop the application based on the specs of the project manager and product owners. The support team is a mixture of junior and junior-senior developers. They take over the code when the engineers deploy the application to production. The problem with this model is the lack of mentoring/coaching of the junior team. The support team does not have the same know-how or domain expertise as the engineers. The consequence is the introduction of bad code or bad practices when they need to add a feature or fix a bug.

Some of the dark sides of legacy code are the following:

Bloated controllers and DAOs

Business logic is ALL over the code (DAO, stored procedures, controllers, DTO, the list goes on)

Silo effect - just a few developers know about the application

Fragile code - tightly coupled code

Bug identification and turn-around times are long

Application is slow

I doubt that I have the right answer to these problems, but here is my advice to current and aspiring managers regarding this matter:

Bite the bullet: if the legacy application is a major part of your business, don't wait any longer. Get a grip on this application before it gets any worse and start refactoring in small iterations (no longer than 2 weeks each).

Candid conversations: ask the tough questions regarding the application, like "does the architecture need to be changed?" Have all the developers review the current app and see what should be changed. Identify the large parts of the code to change (architecture) and the smaller projects (low-hanging fruit) that can give more momentum to the project.

Go agile! Agile methodologies like XP, Scrum, and Kanban add great value to projects, especially through practices like pair programming, test-driven development (TDD), domain-driven design (DDD), and continuous integration (CI). The greatest thing about Agile is its rapid time to market. Small iterations mean that your customers can see what you did in a couple of weeks and give you their feedback. This also provides early risk reduction, since you can find bugs relatively quickly. Pair programming helps mentor junior developers and avoid the silo effect. TDD helps ensure that any code added to the application is tested before it ships to production, which gives better quality to your customers. DDD adds "depth" to your code and isolates all the business logic in one area of the code. This way, any developer knows that if there is a change in the business logic, they need to look in the domain packages. Finally, CI (build pipelines) is primarily focused on verifying that the code compiles successfully and passes the suite of unit and acceptance tests (including performance and scalability tests).

Legacy code is a dreaded term for developers. The fact is that no developer wants to develop in Java 1.4 or Struts 1.x. If you want your company to attract talented individuals, try to deal with your legacy code.

Again, I'm sure that I missed or don't understand ALL the reasons why so many companies have large legacy code in their core systems, so please... I welcome your thoughts on this matter.

Here we've used attribute methods to create a virtual instance variable. To the outside world, price_in_cents seems to be an attribute like any other. Internally, though, it has no corresponding instance variable.
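The listing this paragraph refers to did not survive in this post, so here is a minimal sketch of what a virtual attribute can look like. Only the name price_in_cents comes from the text; the BookInStock class, its isbn and price attributes, and the sample values are assumptions made for illustration.

```ruby
# Hypothetical class: only price_in_cents is named in the post.
class BookInStock
  attr_reader :isbn
  attr_accessor :price

  def initialize(isbn, price)
    @isbn = isbn
    @price = Float(price)
  end

  # Reader for the virtual attribute: computed from price on every call.
  def price_in_cents
    (price * 100).round
  end

  # Writer for the virtual attribute: converts and stores into @price.
  # Note that there is no @price_in_cents instance variable anywhere.
  def price_in_cents=(cents)
    @price = cents / 100.0
  end
end

book = BookInStock.new("isbn1", 33.80)
puts book.price_in_cents   # prints 3380
book.price_in_cents = 1234
puts book.price            # prints 12.34
```

Callers read and assign price_in_cents exactly as they would a real attribute, which is the point: the class is free to change how the value is stored without changing its interface.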

Default Hash

Let's say that you need to count the number of words in a file. One way to do the counting is to treat each line as an array of words. Create a hash and check whether it already has the word: if it doesn't, add the word with a count of 1; otherwise, increment its count.
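As a rough sketch of the counting described above, here are both variants: the explicit check and a hash with a default value of 0, which I assume is what the "Default Hash" title refers to. The method names, the word-splitting regular expression, and the sample lines are illustrative assumptions, not the original listing.

```ruby
# Manual version: check for the key before incrementing.
def count_words_manually(lines)
  counts = {}
  lines.each do |line|
    line.downcase.scan(/[\w']+/).each do |word|
      if counts.has_key?(word)
        counts[word] += 1
      else
        counts[word] = 1
      end
    end
  end
  counts
end

# Default-hash version: a missing key reads as 0, so no check is needed.
def count_words(lines)
  counts = Hash.new(0)
  lines.each do |line|
    line.downcase.scan(/[\w']+/).each { |word| counts[word] += 1 }
  end
  counts
end

# An array of strings stands in for the lines of a file here;
# File.foreach(path) could be passed in the same way.
lines = ["the quick brown fox", "the lazy dog"]
p count_words(lines)
```

Both methods return the same counts; the default value only removes the branching.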

Here, line 7 iterates over each of the 10 numbers, passing each one to the calculation that was selected based on the parameter. Then, it joins the results with commas.
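The original listing is missing from this post, so its line numbers cannot be matched here. A minimal sketch of the same idea, assuming the calculation is chosen as a lambda from an operator string (the build_calc name and its arguments are hypothetical), could look like this:

```ruby
# Pick a calculation based on the parameter and return it as a lambda.
def build_calc(operator, number)
  if operator =~ /^t/
    lambda { |n| n * number }   # "times"
  else
    lambda { |n| n + number }   # anything else is treated as "plus"
  end
end

calc = build_calc("times", 2)

# Pass each of the 10 numbers to the chosen calculation, then join with commas.
puts((1..10).collect(&calc).join(", "))
# prints: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
```

The last line does the work the paragraph describes: the & converts the lambda into the block that collect calls for every number.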

Slow?

The Ruby team has done a lot of work on its virtual machine, but (coming from a statically typed language perspective) it still hogs a lot of resources.

Multithreaded

In the previous version of Ruby (1.8), threads were handled by the VM itself. This approach is called green threads. In Ruby 1.9, threading is now performed by the operating system. The advantage is that native threads are, in principle, available to multiple processors. However, because of the interpreter's global lock, it still runs only a single thread at a time.
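To make the threading model a bit more concrete, here is a small sketch, assuming the standard interpreter (MRI). The squaring and the short sleep are placeholders for real work: Ruby code in different threads does not run in parallel, but threads that are waiting (for example on I/O or sleep) can overlap.

```ruby
# Start four threads; each one waits briefly and then computes a value.
threads = (1..4).map do |i|
  Thread.new(i) do |n|
    sleep 0.01   # stands in for blocking I/O, during which other threads can run
    n * n
  end
end

# Thread#value waits for the thread to finish and returns the block's result.
results = threads.map(&:value)
p results   # prints [1, 4, 9, 16]
```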

Learning Curve

If you are already using Groovy, Python, or any similar dynamic language, the learning curve is almost flat. The only problem is the lack of good editor support. NetBeans used to be a very nice editor, but they have decided to discontinue support for Ruby. I've been a Mac guy for quite some time now, so I've been using TextMate and it has worked great.

Where should I use it?

I try to "pigeonhole" applications carefully before I or my team decide which language to use. For example, if the application needs to be highly efficient with a large number of transactions, and responses need to be immediate, I do NOT trust Ruby (sorry). I would turn to a Java, Spring, Hibernate/iBatis, EHCache stack. However, if the application is a quick and simple set of CRUD pages with low usage, like a series of admin pages, then Ruby would be my choice.

Comparing Ruby to Java

It is worth keeping Ruby in your toolbox in case you need to write quick scripts or sites. I did end up learning different things, like Ruby's nice components. For example, if you would like to get the top five words in the previous word count example, you can just do the following:
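The snippet that followed this sentence is missing, so this is only a sketch of one way to do it with sort_by. The top_words name and the sample counts are made up for illustration.

```ruby
# Return the n most frequent [word, count] pairs, highest count first.
def top_words(counts, n = 5)
  counts.sort_by { |_word, count| -count }.first(n)
end

# Sample data standing in for the hash built by the word count example.
counts = {
  "the" => 12, "ruby" => 9, "and" => 7,
  "code" => 5, "of" => 4, "gem" => 2
}

top_words(counts).each { |word, count| puts "#{word}: #{count}" }

# inject can total the counts in a single call.
puts counts.values.inject(:+)   # prints 39
```

Words with equal counts may come back in any order, since sort_by does not promise a stable sort.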

The sort_by and inject methods are two handy components. Here are some other examples:

[ 1, 2, 3, 4, 5 ].inject(:+) # => 15

( 'a'..'m').inject(:+) # => "abcdefghijklm"

Also, being a TDD guy, I really enjoyed Behavior-Driven Development (BDD). Based on the books and forums that I've read, this is the Ruby community's preferred style of testing. It encourages people to write tests in terms of their expectations of the program’s behavior in a given set of circumstances. In many ways, this is like testing according to the content of user stories, a common requirements-gathering technique in agile methodologies. Some of the frameworks are RSpec and Shoulda. With these testing frameworks, the focus is not on assertions. Instead, you write expectations.

At the kinds of volumes that Twitter handles (and with what I assume is a somewhat scary growth curve), Twitter needs to improve concurrency—it needs an environment/language with low memory overhead, incredible performance, and super-efficient threading. I don't know if Scala fits that particular bill, but I know that current Ruby implementations don't. It isn't what Ruby's intended to be. So the move away is just sound thinking. (I suspect it also took some courage.) I applaud Alex and the team for this.

Instead of defending Ruby when it's clearly not an appropriate solution, let's think about things the other way around.

The good folks at Twitter started off with Ruby because they wanted to get something running quickly, and they wanted to experiment. And Ruby gave them that. And, what's more, Ruby saw them through at least two rounds of phenomenal growth. Could they have done it in another language? Sure. But I suspect Ruby, despite the occasional headache, helped them get where they are now.

Conclusion

Finally, I would recommend learning Ruby, and I will definitely keep doing more stuff with it. Although it shares some characteristics with Groovy, Python, and other dynamic languages, it does have some nice features of its own. Also, the Ruby community is very vibrant and active. There are thousands of programmers building packages/APIs, known as gems.

The way to install a gem is very similar to the "yum" command in Linux. Let's say you need to do the following:

Connect to a GMail account

Check for e-mails that have attachments

Do some type of business logic

Send an e-mail with your results

I could have created everything from scratch, but instead I was able to find a nice little gem named "gmail". I just ran "gem install gmail" and voila, I got the API! I also wanted to use MongoDB for a Ruby on Rails (RoR) project, and a podcast explained to me how to do it.

Again, this is just the beginning, but it was a really nice experience and it reminded me of why I got into programming. The challenge and the unknown are what drive most of us to find better solutions.

Wednesday, February 2, 2011

Persistence tuning was drilled into my head early in my career, and I understand the phrase "earlier is cheaper". At the beginning of my career I worked for a major bank on a data warehouse. I worked as a junior developer alongside some pretty solid data architects. I learned a lot about databases, especially that they are notorious for being the bottleneck in applications. Later in my career, I worked on web development projects. I fell in love with the Spring framework and ORMs (Hibernate and iBatis). Here are my thoughts on a presentation given by Thomas Risberg, a senior consultant from SpringSource. He stated the following:

There is no silver bullet when it comes to persistence tuning. You have to keep an open mind and approach the problem systematically from a variety of angles.

The presentation did not touch on "big data" and the NoSQL movement. However, there is still a lot of good stuff in it, especially if you are using Java, Spring, Hibernate, and an ODBC.

Currently, I am working on an application that needs to support up to 200 messages per second. Below are a few strategies and processes that I implemented to try to improve three major areas:

Performance: response time needs to be in milliseconds

Scalability: able to handle 200 messages per second

Availability: able to scale horizontally (not buying a bigger box, but getting a similar box)

DBA - Developer Relationship

When creating a database, there are two DBA roles:

Operational DBA:

ongoing monitoring and maintenance of database system

keep the data safe and available

Development DBA:

plan and design new application database usage

tune queries

The Operational DBA role covers the following:

Data volumes, row sizes

Growth estimates, update frequencies

Availability requirements

Testing/QA requirements

The Development DBA role covers the following:

Table design

Query tuning

Maintenance policies for purging/archiving

Database Design

Database design can play a critical role in application performance

Guidelines:

Pick appropriate data types

Limit number of columns per table

Split large, infrequently used columns into a separate one-to-one table

Choose your indexes carefully: they improve query performance but are expensive to maintain

Normalize your data

Partition your data

Application Tuning

Balance performance and scalability concerns. For example, a full table lock could improve performance but hurt scalability

Improve concurrency

Keep your transactions short

Do bulk processing off-hours

Understand your database system's locking strategy

some lock only on writes

some lock on reads

some escalate row locks to table locks

Performance improvements

Limit the amount of data you pull into the application layer

I have not been able to update my blog. Towards the end of last year, my daughter, Juliana Olivas, was born. Therefore, I'm now trying to change the world one diaper at a time. Also, I'm trying to stay awake and keep up with my current job at Up-Mobile. I am hoping to start updating my blog this week.