Sequentially arranged sentences composed of words (and punctuation)

Tag Archives: Ruby

Here’s an interesting observation, I needed to write a little script to automate some number calculating for me. I was wondering whether to do it in Ruby or Python. I’m doing a lot of Python at the moment so I felt I ought to give Ruby a little go. Share some of the love.

However the solution I had in mind really didn’t work with Ruby because while Ruby has open classes it has a comparatively fixed idea of attributes. In Python you can set attributes very freely on any object so I have got in the habit of creating something and then enhancing by applying a function. Example? Okay.

def make_captain(actor):
actor.rank = "Captain"
return actor

class Person:
pass

captain = make_captain(Person())

So this little trick doesn’t work, or rather is much more difficult to do in Ruby as Ruby, at is dynamic heart, is a language that believes in object-orientation and that classes should encapsulate rather than being little collections of data. You can use instance_variable_get/set but it lacks the elegance of the Python syntax.

In Ruby it would be easier to define the attributes in the class using the existing metaprogramming constructs and then have a class method to generate the content (effectively encapsulating my script logic).

Now this isn’t a straight “Ruby sux, no Python sux more” post. Between Scala, Clojure and Python I have been doing a lot more in a functional style that depreciates objects as anything more than value carriers. The Ruby vision of a class would give me something with a stronger sense of purpose and encapsulation, something that is hard to benefit from in a script for a particular purpose.

What is going to be interesting this year is trying to identify when the value of a piece of code is in the structure of it’s data-definition (i.e. objects) versus its process (functions). Having had a think about it I should perhaps rewrite my script to use some OO modelling because it may answer similar requirements down the line. However from a strict Lean/Waste point of view I should have gone with the Python solution as Ruby was imposing a restriction on me while providing benefits that I was unlikely to realise.

At the Geek Night last night there was a lot of questions about how Heroku compared with Google App Engine (and I guess implicit in that how it compares to GAE/J with the JRuby gem). Since it came up several times I thought it would be interesting to capture some of the answers.

Firstly there are the similarities, both deal with the application as the base unit of cloud scaling. Heroku offers several services that are similar to GAE, e.g. mailing, scheduled and background jobs. Some aspects like Memcache are in beta with Heroku but fundamentally both services seem to concur.

They are both free to start developing and then the pricing for the system varies a bit between the two, GAE might be cheaper but also has a more complex set of variables that go into the pricing, Heroku prices on: data size, web processes required and then options, so it’s easier to project your costs.

Differences then: well a big one is that Heroku uses a standard SQL database and also has excellent import/export facilities. GAE uses Google BigTable which is great for scaling and availability but data export and interactions are very limited. Heroku has an amazing deployment mechanism that involves simply pushing to a remote branch. GAE is slightly more involved but it is still relatively easy compared with deploying your own applications.

GAE provides access to a variety of Google services that are handy such as Google Account integration and image manipulation. GAE also has XMPP integration which is pretty unique if you have the requirement. However these do tie you to the platform and if you wanted to migrate you’d have to avoid using the Google webapp framework and the API. If flexibility is an issue you are probably going to prefer Heroku’s use of standard Ruby libraries and frameworks.

Heroku has few restrictions on what can and can’t be done in a web process, you also have quite powerful access to the user space your application runs in. GAE and particularly GAE/J has a lot of restrictions which can be quite painful particularly as there can be a discrepency between development and deployment environments.

However running a JVM in the cloud provides a lot of flexibility in terms of languages that can be used and deployed. Heroku is focussing on Ruby for now.

From my point of view, choose Heroku if you need: uncomplicated Rails/Rack application deployment, a conventional datastore that you want to have good access to, a simple pricing model. Choose GAE if you want to have out of the box access to Google’s services and a complete scaling solution.

Haml advertises itself as offering a haiku for HTML; that might mean it is short, beautiful and expressive or it could mean it is enigmatic, unsatisfying and cryptic. After a period away I have forgotten how Ruby-centric Haml is. If you want to include parameters in a Haml template within Sinatra you need to bind a map to the parameter locals.

haml :index, :locals => {:hello => “World”}

The easiest way to then include the data within a tag seems to be to use Ruby string interpolation

%p= “Hello #{locals[:hello]}”

There may be more elegant ways to do this but it seems relatively idiomatic. It is just hard to adjust to after months working on very restrictive templating systems.

Heroku is an amazing cloud computing service that I think is incredibly important but which really isn’t getting the visibility it should be at the moment.

Why does Heroku matter?

Firstly it gets deployment right. What is deployment in Heroku? It is pushing a version of your app to the cloud. That’s it. The whole transformation into a deployable application slug is handled by the Heroku, pushing it out to your dynamos is handled by Heroku. Everything is handled by Heroku except writing your application.

Secondly, it is the first cloud/scalable computing targeted directly at Ruby that has the right pricing structure to encourage experimentation and innovation. It has the low barrier to entry that Google App Engine has but requires no special code and imposes no special restrictions.

Thirdly it does the right thing with your data. It respects your right to access, mine and manipulate your data and provides a mechanism for doing this easily.

For me Heroku has set a new benchmark for Cloud Computing and build and deployment in the enterprise as well.

You know you have a good idea when there are half a dozen implementations of it. All essentially the same but filled with subtle nuance. The latest idea I have stumbled across is the “semi-static” website generator. The idea is so painfully obvious I cannot believe that it has taken us until 2009 to do it properly.

What is a semi-static website? Well it’s actually something like this blog. I spend ages writing and editing the articles but once I publish something I don’t generally change it unless it has factual errors. I need something that allows me to make changes but my content easily but my content doesn’t really benefit in anyway from being stored in a database or being generated dynamically on the fly because it is going to spend most of its life not being changed.

If you look at a lot of CMS systems the same is true. People want to be able to edit and reorganise material easily but they are rarely changing it more than once a week. The majority of the time the pages are static.

If you look at a number of CMS or Wiki systems they are actually quite complicated. Before you can get going and generate your content (which remember is the core of what you want to do) you might have to setup a database and get the application deployed to a webserver.

What the semi-static website generator does is provide a set of scripts that allow you to edit and create content and then “compile” your site. The result is set of static HTML, Javascript and CSS files that can be deployed to your static webserver (which is generally pretty easy to setup, most servers will come with Apache all ready to go).

When you combine it with a bit of DVCS magic deployment is even easier. You just checkin your new pages and then update the webserver copy of the content.

So if you want to get into this kind of content management what options do you have? Well you have a lot of choice indeed. Almost all of them are Ruby projects and they have a certain similarity to the Rails concept of code generation. You will be issuing commands like “create page”, “build site” and so on. You operate at one remove from it.

There are many, many, many choices but for myself I chose three to look at Nanoc, Webby and Jekyll. Nanoc I came across via Proggit and was struck by the simplicity and clarity of the documentation.

Installation of all three is via gems but there are some differences here. Nanoc is a very simple Gem and works with HTML and Erb by default, if you want to use a markup language or different templating language you will have to install the gem yourself. This means it also installs easily under JRuby too.

Webby by comparison is a very noisy install, I think it may have its dependencies slightly wrong as it installs a version of rake, flexmock and rspec which are surely development dependencies rather than anything I as a user would need to use for the framework. It also uses hpricot which means you can’t install it under JRuby.

Jekyll has a similarly demanding set of requirements but will install under JRuby. Jekyll did feel a lot like a personal solution as I was installing it. Textile is great but if I am going to be using Markdown why do I need to install RedCloth?

Overall I liked the nanoc install experience most.

Moving on to using them. Nanoc and Webby are very similar in that they use generators to create new elements in the site. Jekyll seems to use convention instead. At this point it is worth mentioning that there are some big differences between the three products in their intentions. Nanoc is a general website creator but it does have a certain amount of opinion as to how to do that. For example it generates very nice clean website urls by putting each page in a directory by itself and then creating the page as an index.html page within that directory. Make sure you have your webserver configured appropriately.

Webby is pretty flexible but comes with some built-in templates that will be build you a basic page, a tumblog style site, a normal blog and a presentation. That was a pretty neat feature and one the other frameworks didn’t have.

However the negative side to this is that the templating is based on Blueprint; which is both excellent as Blueprint is a great way of designing pages and a pain as Blueprint is actually pretty involved in terms of understanding the grid concept and what the various span-x css classes translate to. I felt that Webby did a great job of offering cross-browser styling that looked good but that in doing so it didn’t make the simple stuff simple. With my trial site my sidebar came out on the bottom of the page and I didn’t really know how to get on the right-hand side of the page. In nanoc I could just float: right it, in Webby I felt I had to understand the grid that had been created for me. I’m not sure it really allowed you to use any CSS knowledge someone might already have but the benefit is you know your site is going to look good across browsers and in print and so on. For more professional site development this might actually sway you to select Webby.

Jekyll is primarily a tool for publishing a blog style site so it has a lot of built-in support for special information relating to posts but in terms of layouts it uses general HTML templating like Nanoc does. It allows for the creation of little chunks of reusable content across pages via includes but lacks some of the flexibility of Nanoc’s tag helpers. While focussing on blog publishing Jekyll actually encompasses a powerful set of processing tools for content that effectively allows you to use it as CMS. Jekyll is used to process GitPages on GitHub, a real demonstration of its power beyond just blogging.

For what I wanted to do I initially felt that Nanoc was the best choice, it’s flexible, powerful and generates nice sites. Webby is powerful but didn’t expose that power to me in a way that made me feel in-control. However when I started to get to grips with Jekyll I was really amazed at how powerful the use of a few conventions were. Jekyll processes your site’s pages effectively in-place. It can choose to process your files if they match a naming convention (such as having a content extension like textile or markdown) and it uses an embedded YAML section in the file to do things like provide the title of the page or select the layout that the content should be displayed within. While Nanoc produces fancy URLs Jekyll made it easy to build more conventional websites consisting of clusters of pages.

Jekyll uses a small number of special directories (indicated by a leading underscore) to hold things like layouts, cross-page snippets and posts (if you are using the dated posting functionality). However for a simple site the number of directories and configuration files was significantly less than Nanoc. In fact you could get by with just a master page layout, which would require just one special directory and page within it. Everything that should have been simple was with Jekyll and the result was very close to a conventional website you might have coded by hand but extremely fast to create. It also uses just one command to press the pages and fire up a local server to view the pages. Both Webby and Nanoc seemed to make pressing and previewing slightly harder than it needed to be and at least once I forget to execute the Nanoc commands in the right order.

So, conclusions… Going forward I will be using both Nanoc and Jekyll. Webby I would be willing to consider but I don’t really have a task for it that would reward the additional effort I have to invest in its capabilities. If I just want to bang out pages for a website but minimise my effort and make it easy to abstract common content then I will definitely be using Jekyll. It is a joy to use. For sites that are maybe more based around articles or grouped content then I think I will be using Nanoc. The default URL construction is nice for hierarchical data (as long as you can stand the many directories) and there’s a sense that the configuration files make it easy to make changes that affect the site as a whole. However it does use a lot of scaffolding to achieve the effect and there are a lot of exposed options that could have been reduced by adopting some conventions Jekyll style.

I believe that Java developers are under a tremendous amount of pressure at the moment. However you may not feel it if you believe that Java is going to be around for a long time and you are happy to be the one maintaining the legacy apps in their twilight. Elliotte Rusty Harold has it right in the comments when someone says that there are a lot of Java jobs still being posted. If you enjoy feasting off the corpse then feel free to ignore the rest of this post because it is going to say nothing to you.

Java is in a tricky situation due to competition on all fronts. C# has managed to rally a lot of support. Some people talk nonsense about C# being what Java will look like in the future. C# is what Java would like if you could break backwards compatibility and indeed even runtime and development compatibility in some cases (with Service Packs). C# is getting mind share by leapfrogging ahead technology-wise at the expense of its early adoptors. Microsoft also does a far better job of selling to IDE dependent developers and risk-adverse managers.

Ruby and Python have also eaten Java’s lunch in the web space. When I am working on web project for fun I work with things like Sinatra, Django and Google App Engine. That’s because they are actually fun to work with and highly productive. You focus on your problem a lot sooner than you do in Java.

The scripting languages have also done a far better job of providing solutions to the small constant problems you face in programming. Automating tasks, building and deployment, prototyping. All these things are far easier to do in your favourite scripting language than they are in Java which will have to wait for JDK7 for a decent Filesystem abstraction for example.

Where does this leave Java? Well in the Enterprise server-side niche, where I first started to use it. Even there though issues of concurrency and performance are making people look to things like Erlang and JVM alternatives like Scala and Clojure.

While, like COBOL and Fortran there will always be a market for Java skills and development. The truth is that for Java developers who want to create new applications that lead in their field; a choice about what to do next is fast approaching. For myself I find my Java projects starting to contain more and more Groovy and I am very frustrated about the lack of support for mixed Java/Groovy projects in IDEs (although I know SpringSource is putting a lot of funding into the Eclipse effort to solve the problem).

If a client asks for an application using the now treadworn combination of Spring MVC and Hibernate I think there needs to be a good answer as to why they don’t want to use Grails which I think would increase productivity a lot without sacrificing the good things about the Java stack. Companies doing heavy lifting in Java ought to be investigating languages like Scala, particularly if they are arguing for the inclusion of properties and closures in the Java language spec.

Oracle’s purchase of Sun makes this an opportune moment to assess where Java might be going and whether you are going to be on the ride with it. It is hard to predict what Oracle will do, except that they will act in their perceived economic interest. The painful thing is that whatever you decide to do there is no clear answer at the moment and no bandwagon seems to be gaining discernible momentum. It is a tough time to be a Java developer.

Mocking calls to random number generators is a useful and important technique. Firstly it gives you a way into testing something that should operate randomly in production and because random number generation comes from built-in or system libraries normally it is also a measure of how well your mocking library actually works.

For Ruby I tend to use RSpec and its in-built mocking. Here the mocking is simple, of the form (depending on whether you are expecting or stubbing the interaction):

Receiver.should_receive(:rand)
Receiver.stub!(:rand)

However what is tricky is determining what the receiver should be. In Ruby random numbers are generated by Kernel so rand is Kernel.rand. This means that if the mocked call occurs in a class then the class is the receiver of the rand call. If the mocked call is in a module though the receiver is properly the Kernel module.

So in summary:

For a class: MyClass.should_receive(:rand)
For a module: Kernel.should_receive(:rand)

This is probably obvious if you a Ruby cognoscenti but is actually confusing compared to other languages.

In Python random functions are provided by a module, which is unambiguous but when using Mock I still had some difficultly as to how I set the expectation. Mock uses strings for the method called by the instance of the item for the mock anchor. This is how I got my test for shuffling working in the end.

This does the job and the decorator is a nice way of setting out the expectation for the test but it was confusing as to whether I am patching or patch_object’ing and wouldn’t it be nice if that mock variable could be localised (maybe it is and I haven’t figured it out yet).

Next time I’m starting some Python code I am going to give mocktest a go. It promises to combine in-built mock verification which should simplify things. It would be nice to try and combine it with Hamcrest matchers to get good test failure messages and assertions too.